SQR-052: Proposal for privilege separation in RSP Notebook Aspect containers

  • Adam Thornton

Latest Revision: 2021-02-05

The current design for automated user file storage provisioning requires that the user Lab pod be run without allowPrivilegeEscalation: False. This technote describes how to change that design to close that avenue of attack.

1   The Problem

In an RSP environment, we need some way to manage resources for users. The most obvious of these resources is per-user file storage space (which in our current design is presented as a home directory).

We most definitely do not want to maintain a separate container build per user. Because of the way we are currently handling file permissions and collaboration within the RSP, we create a Unix user with UIDs and GIDs mapped from our authentication provider, and then we provision a home directory for that user if it does not exist.

Currently, we use the provisionator (769:769) account to perform those privileged actions within the Lab container. To make that possible, the container build writes sudoers rules that allow the provisionator to act. The build also lets the provisionator write a further rule (via a script) that permits it to become the target user before starting JupyterLab.

This requires sudo in order to work, and therefore the Lab containers cannot currently be run (at least in environments that require filesystem provisioning) with allowPrivilegeEscalation: False. This violates the third item of section 4.1 of SQR-048.

2   The Partial Solution

A year or more ago we introduced a NO_SUDO flag environment variable that can be passed into the container, which bypasses all of the sudo steps to escalate privilege.

This comes with one major caveat: it only works if there is a pre-provisioned user home directory. This is fine at NCSA (which is where the flag was intended for use), since user home space is provisioned as part of account creation and NCSA forces all RSP users to have NCSA accounts. It is not fine in a GKE environment using GitHub as its authentication back-end. We have recently proven that everything still works as expected (assuming a pre-provisioned filesystem) if the container is run with allowPrivilegeEscalation: False in its Kubernetes securityContext.
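As an illustration of the combination described above, the following minimal sketch builds the relevant fragment of a Lab container spec as a plain Python dictionary. The function name and the value "TRUE" for NO_SUDO are assumptions made for this example; only the NO_SUDO variable name and the allowPrivilegeEscalation field come from the text.

```python
def lab_container_fragment(no_sudo: bool = True) -> dict:
    """Return the env and securityContext parts of a Lab container spec.

    Hypothetical helper: the real spawner builds its pod spec elsewhere.
    """
    env = []
    if no_sudo:
        # NO_SUDO is the flag named in the text; the value "TRUE" is assumed.
        env.append({"name": "NO_SUDO", "value": "TRUE"})
    return {
        "env": env,
        # Privilege escalation is only needed when the sudo steps still run.
        "securityContext": {"allowPrivilegeEscalation": not no_sudo},
    }


if __name__ == "__main__":
    print(lab_container_fragment())
```

With no_sudo left at its default, the fragment forbids privilege escalation, which is only safe when the home directory already exists.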

3   A Better Solution

What we really want is an administrative service, callable from within the RSP Kubernetes cluster, which can perform privileged actions on behalf of users. The first use-case of this service, to which we have given the moniker “Moneypenny,” will be to provision home directories for users.

3.1   Design

In order for this to work, we need a service that can receive web requests, authenticate them via some model, and spawn containers with sufficient privilege to perform the requested work.

For the initial case of provisioning the home directories, the scripts to create those directories with the right user and group IDs already exist. We simply need to move them into a minimal container of their own, and ensure that that container is created with sufficient privilege.
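The existing scripts are not reproduced here. As a rough sketch of the operation they perform, assuming a POSIX filesystem and a process with enough privilege to change ownership, the logic could look like the following. The function name, the 0o700 mode, and the flat base/username layout are assumptions for illustration.

```python
import os
from pathlib import Path


def provision_home(base: Path, username: str, uid: int, gid: int) -> bool:
    """Create base/username owned by uid:gid.

    Returns True if the directory was created, False if it already existed.
    """
    home = base / username
    if home.exists():
        # Never touch an existing directory; ownership is left as found.
        return False
    home.mkdir()
    os.chmod(home, 0o700)  # assumed mode; set explicitly to ignore the umask
    # Changing ownership to another user is the step that needs privilege.
    os.chown(home, uid, gid)
    return True
```

Because the function does nothing when the directory exists, calling it on every Lab spawn is harmless.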

Authentication can (and should) be handled by ingress annotations, just like everything else in the RSP. It should require the exec:admin mapping. That will require that nublado/nublado2 gain the ability to request or mint an admin token. The code to create that token already exists within mobu.

However, we may initially want to check requests within the application itself. Until we have restrictive intra-cluster network policies in place, nothing prevents any nublado user from talking to the Moneypenny service or endpoint directly and creating arbitrary user directories. This isn’t a data corruption problem yet, but it will become one when Moneypenny gains the ability to deprovision user directories (which will surely be a future enhancement). In any event, a user could create a home directory with an incorrect UID, which would block later creation of that directory with the correct ownership. The workflow dispatcher and the RubinSpawner implementation in the Hub already perform this sort of token verification, so borrowing the authorization implementation from one or the other will be easy.
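The in-application check could be as small as the sketch below. It assumes the token has already been verified and that its scopes are available as a space- or comma-separated string; that representation, and the function name, are assumptions for this example and are not taken from the workflow dispatcher or RubinSpawner code.

```python
REQUIRED_SCOPE = "exec:admin"  # the mapping named in the text


def is_authorized(token_scopes: str) -> bool:
    """Return True only if the verified token carries the admin scope.

    token_scopes: scopes from an already-verified token, separated by
    spaces or commas (an assumed format).
    """
    scopes = set(token_scopes.replace(",", " ").split())
    # Exact membership, so "exec:administrator" does not match.
    return REQUIRED_SCOPE in scopes
```

A request that fails this check would be rejected before any provisioning container is created.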

That leaves the web service itself. We should probably use a model like cachemachine, which already provides a RESTful interface via aiohttp, and which also creates containers on the back end. Thus, I propose to start with cachemachine and use a very similar set of objects and classes, but replace the handlers, since the work performed will be different.
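To make the replaced handler's job concrete, here is a framework-agnostic sketch of what it might do: validate the request body and produce a manifest for the provisioning container. None of this is cachemachine's API. The request fields, the environment variable names, the pod name prefix, and the image are all hypothetical, and volume mounts for the home filesystem are omitted.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvisionRequest:
    username: str
    uid: int
    gid: int


def parse_request(body: str) -> ProvisionRequest:
    """Parse a JSON body (assumed shape) and reject unusable values."""
    data = json.loads(body)
    req = ProvisionRequest(str(data["username"]), int(data["uid"]), int(data["gid"]))
    if not req.username or req.uid <= 0 or req.gid <= 0:
        raise ValueError("username must be set; uid and gid must be positive")
    return req


def provisioning_pod(req: ProvisionRequest, image: str) -> dict:
    """Build a Pod manifest that runs the provisioning container as root."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"generateName": "moneypenny-provision-"},
        "spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": "provisioner",
                "image": image,
                # Hypothetical variable names read by the provisioning script.
                "env": [
                    {"name": "TARGET_USER", "value": req.username},
                    {"name": "TARGET_UID", "value": str(req.uid)},
                    {"name": "TARGET_GID", "value": str(req.gid)},
                ],
                # Root is needed to chown the new directory to the user.
                "securityContext": {"runAsUser": 0},
            }],
        },
    }
```

The privilege lives only in this short-lived pod, so the Lab container itself no longer needs sudo.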

3.2   Diagram

Figure: Moneypenny Diagram (_images/Moneypenny.png)

3.3   Scope of Work

I think the whole project is less than two weeks’ worth of work for me. The creation of the container to provision user home directories is trivial, as are the required annotations.

Adapting cachemachine and gluing in an authentication/authorization mechanism should be relatively easy, though not trivial. The same is true of modifying the Hub’s spawning mechanism to acquire a token and then call Moneypenny.

Once Moneypenny is functional and deployed in all non-NCSA RSP environments, that will allow us to remove the need for privilege escalation from the Lab for future image builds; that is nearly negligible effort.

4   Conclusion

This task will improve our overall security posture and require fairly little effort. We should undertake it with alacrity.