TrueFoundry users can deploy Jupyter Notebooks on their personal cloud accounts, such as AWS, Azure, or GCP. This feature allows them to conduct machine learning experiments and training jobs on their own machines with ease. Initially, notebooks deployed through TrueFoundry were secured using a username-password combination. However, in response to widespread client requests, we have integrated Single Sign-On. This means users can now conveniently access their notebooks with the same login they use for TrueFoundry. This blog post delves into the specifics of how we implemented this feature.
TrueFoundry internally uses a fork of the Kubeflow Notebook Controller to orchestrate the deployment of notebooks, and we leverage several of the features the controller provides.
For context, here’s what a simple Kubeflow Notebook object looks like:
apiVersion: kubeflow.org/v1
kind: Notebook
metadata:
  name: my-notebook
spec:
  template:
    spec:
      containers:
        - name: my-notebook
          image: kubeflownotebookswg/jupyter:master
          args:
            [
              "start.sh",
              "lab",
              "--LabApp.token=''",
              "--LabApp.allow_remote_access='True'",
              "--LabApp.allow_root='True'",
              "--LabApp.ip='*'",
              "--LabApp.base_url=/test/my-notebook/",
              "--port=8888",
              "--no-browser",
            ]
Before implementing OAuth2, TrueFoundry let users secure their public notebooks with basic authentication. This layer of security ensured that only authorized individuals could access the sensitive content of these notebooks. To implement this feature, TrueFoundry used WebAssembly (Wasm) plugins running in Envoy, the proxy that Istio uses.
Istio, an open-source service mesh, offers a framework for managing network communications between various service workloads. With Istio, TrueFoundry was empowered to inject custom logic directly into the network layer, which is managed by the Envoy proxy. This approach allowed for effective control and security of the traffic flowing to and from their Jupyter Notebooks. The key to the implementation of basic auth was the WasmPlugin, a feature of Istio that facilitates the deployment of WebAssembly modules within the Envoy proxy.
This basic authentication WasmPlugin is inserted into the chain of network filters within the Envoy proxy. These filters carry out higher-level functions such as access control, transformation, data enrichment, and auditing, which extends both the security and the functionality of the service mesh. Here’s a simplified version of the spec for adding the basic auth filter to the Envoy filter chain:
apiVersion: extensions.istio.io/v1alpha1
kind: WasmPlugin
metadata:
  name: basic-auth
  namespace: istio-ingress
spec:
  phase: AUTHN
  pluginConfig:
    basic_auth_rules:
      - credentials:
          - user:pass
        hosts: www.example.com
        prefix: /secret/
  selector:
    matchLabels:
      istio: ingressgateway
  url: oci://ghcr.io/istio-ecosystem/wasm-extensions/basic_auth:1.12.0
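To make the rule above concrete, here is a minimal Python sketch of the kind of check a basic auth filter performs for a prefix-based rule. It is an illustration under assumptions, not the plugin's actual code: the function name `is_authorized` and the dictionary shape of `RULES` are hypothetical, and host matching is left out for brevity.

```python
import base64
import hmac


def is_authorized(path, authorization_header, rules):
    """Return True if the request may proceed under the given basic-auth rules."""
    for rule in rules:
        if not path.startswith(rule["prefix"]):
            continue  # this rule does not protect the requested path
        if not authorization_header or not authorization_header.startswith("Basic "):
            return False  # protected path, but no basic credentials supplied
        encoded = authorization_header[len("Basic "):]
        try:
            supplied = base64.b64decode(encoded, validate=True).decode("utf-8")
        except (ValueError, UnicodeDecodeError):
            return False  # malformed header value
        # Constant-time comparison against every allowed "user:password" pair.
        return any(
            hmac.compare_digest(supplied.encode("utf-8"), allowed.encode("utf-8"))
            for allowed in rule["credentials"]
        )
    return True  # no rule matched, so the path is not protected


# Hypothetical rule mirroring the prefix and credentials in the spec above.
RULES = [{"prefix": "/secret/", "credentials": ["user:pass"]}]
good_header = "Basic " + base64.b64encode(b"user:pass").decode("ascii")

print(is_authorized("/secret/data", good_header, RULES))
print(is_authorized("/secret/data", None, RULES))
```

The sketch also shows the limitation that pushed us toward Single Sign-On: the credentials are a static list baked into the configuration.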
For implementing OAuth2 in our notebooks, we utilized an Envoy filter, but the approach differed from that of basic authentication. Unlike the basic auth where we could conveniently insert a pre-built WasmPlugin into the filter chain, OAuth2 required a more tailored solution. To achieve this, we employed an HTTP filter specifically designed for OAuth. At TrueFoundry, our Single Sign-On system integrates with FusionAuth, serving as our OAuth provider.
Here’s what the EnvoyFilter spec looks like; refer to the comments in the file for more details:
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: truefoundry-notebook-tfy-oauth2 # Name of the EnvoyFilter
  namespace: auth-test # Namespace where the EnvoyFilter is deployed
spec:
  workloadSelector:
    labels:
      truefoundry.com/application: truefoundry-notebook # Selector targeting workloads with specific labels
  configPatches:
    - applyTo: CLUSTER
      match:
        context: SIDECAR_OUTBOUND
      patch:
        operation: ADD
        value:
          name: tfy-oauth2 # Name of the cluster for OAuth2 authentication service
          type: LOGICAL_DNS # Type of service discovery (DNS)
          connect_timeout: 5s # Timeout for establishing a connection
          lb_policy: ROUND_ROBIN # Load balancing policy
          # other load balancing config
    - applyTo: HTTP_FILTER
      match:
        context: SIDECAR_INBOUND
        listener:
          filterChain:
            filter:
              name: "envoy.filters.network.http_connection_manager"
              subFilter:
                name: envoy.filters.http.jwt_authn
      patch:
        operation: INSERT_BEFORE # Inserting this filter before the JWT auth filter
        value:
          name: envoy.filters.http.tfy-oauth # Name of the OAuth filter
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.http.oauth2.v3.OAuth2
            config:
              use_refresh_token: false # Whether to use a refresh token
              pass_through_matcher:
                - name: Authorization
                  present_match: true # Pass through if Authorization header is present
              forward_bearer_token: true # Forward bearer token to upstream
              auth_type: BASIC_AUTH # Type of authentication used
              token_endpoint:
                cluster: tfy-oauth2 # Cluster for token endpoint
                uri: <token-endpoint-uri-of-oauth-provider>
                timeout: 5s # Timeout for token endpoint
              authorization_endpoint: <authorization-endpoint-uri-of-oauth-provider>
              redirect_uri: https://%REQ(:authority)%/truefoundry-notebook/_auth/callback # Redirect URI for callback
              redirect_path_matcher:
                path:
                  exact: /truefoundry-notebook/_auth/callback # Path for redirect URI
              signout_path:
                path:
                  exact: /truefoundry-notebook/_auth/signout # Path for signout
              credentials:
                client_id: <client-id-for-oauth>
                token_secret:
                  # configuration to fetch token secret
                  # read more about how we fetch secrets here:
                  # https://www.envoyproxy.io/docs/envoy/latest/configuration/security/secret
                hmac_secret:
                  # configuration to fetch hmac
When a user attempts to access a service protected by the OAuth2 filter for the first time, they are redirected to the authorization_endpoint. This endpoint is the URL of our external OAuth Provider, which, in our implementation, is the FusionAuth-based TrueFoundry login modal. This redirection is a critical step in the OAuth process, guiding users to a secure location where they can authenticate and consequently grant the necessary permissions for access to the service.
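As a rough illustration of that redirect, the Python sketch below builds an authorization URL the way a generic OAuth2 authorization code flow does. It is a sketch under assumptions rather than the filter's exact behaviour: the function name, the example URLs, the `openid` scope, and the random `state` value are all hypothetical.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorization_redirect(authorization_endpoint, client_id, redirect_uri, scope="openid"):
    """Return the URL to redirect the browser to, plus the state value to remember."""
    state = secrets.token_urlsafe(16)  # opaque value echoed back by the provider
    query = urlencode(
        {
            "client_id": client_id,
            "response_type": "code",  # ask for an authorization code
            "redirect_uri": redirect_uri,
            "scope": scope,
            "state": state,
        }
    )
    return f"{authorization_endpoint}?{query}", state


# Hypothetical values; the real endpoint and client ID come from the OAuth provider.
url, state = build_authorization_redirect(
    "https://auth.example.com/oauth2/authorize",
    "my-client-id",
    "https://notebooks.example.com/truefoundry-notebook/_auth/callback",
)
print(url)
```

The provider sends the same `state` back on the callback, which lets the receiving side tie the response to the request it issued.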
Once the login is complete, FusionAuth redirects the user to the redirect_uri configured in the filter spec, appending a secret, temporary authorization code. The filter intercepts this request and calls the token_endpoint to exchange the code for a JWT. Finally, the filter sets cookies containing the JWT.
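The code-for-token exchange can be sketched in Python as follows. This only builds the request rather than sending it, and it assumes the standard authorization code grant with the client credentials passed as HTTP basic auth, which is what `auth_type: BASIC_AUTH` selects in the spec. The function name and the sample values are hypothetical.

```python
import base64
from urllib.parse import urlencode


def build_token_request(code, redirect_uri, client_id, client_secret):
    """Return (headers, body) for a POST to the provider's token endpoint."""
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {basic}",  # client authenticates itself with basic auth
        "Content-Type": "application/x-www-form-urlencoded",
    }
    body = urlencode(
        {
            "grant_type": "authorization_code",
            "code": code,  # the temporary code from the callback request
            "redirect_uri": redirect_uri,  # must match the one used in the first step
        }
    )
    return headers, body


# Hypothetical values for illustration only.
headers, body = build_token_request(
    "temporary-code",
    "https://notebooks.example.com/truefoundry-notebook/_auth/callback",
    "my-client-id",
    "my-client-secret",
)
print(body)
```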
Subsequent requests pass straight through the HTTP filter: the JWT stored in the cookie is sent as the value of the Authorization header, and the filter is configured to let such requests through (see pass_through_matcher in the spec). To check that the JWT is valid, we create a RequestAuthentication policy that verifies it against the OAuth provider:
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  # ...
spec:
  selector:
    # ...
  jwtRules:
    - issuer: "truefoundry.com"
      fromHeaders:
        - name: Authorization
          prefix: "Bearer "
      audiences:
        - <client-id>
      jwksUri: <oauth-provider-jwks-uri>
      forwardOriginalToken: true
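To show what the rule checks, here is a simplified Python sketch of the header extraction and claim checks. It deliberately omits the most important step, verifying the token's signature against the keys published at the jwksUri, so it is an illustration only and not something to use for real validation. All function names and the demo token are hypothetical.

```python
import base64
import json
import time


def bearer_token(headers, name="Authorization", prefix="Bearer "):
    """Extract the token from the configured header, or None if it is absent."""
    value = headers.get(name, "")
    return value[len(prefix):] if value.startswith(prefix) else None


def decode_claims(token):
    """Decode the JWT payload segment WITHOUT verifying the signature."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))


def claims_acceptable(claims, issuer, audience, now=None):
    """Check issuer, audience, and expiry, mirroring the fields in the jwtRules entry."""
    now = time.time() if now is None else now
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    return claims.get("iss") == issuer and audience in audiences and claims.get("exp", 0) > now


def _encode(segment):
    """Helper used only to build an unsigned demo token."""
    raw = json.dumps(segment).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


demo_token = ".".join(
    [
        _encode({"alg": "none"}),
        _encode({"iss": "truefoundry.com", "aud": "my-client-id", "sub": "user-1", "exp": 4102444800}),
        "",  # empty signature segment: this demo token is unsigned
    ]
)
claims = decode_claims(bearer_token({"Authorization": "Bearer " + demo_token}))
print(claims_acceptable(claims, "truefoundry.com", "my-client-id"))
```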
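Finally, we add an AuthorizationPolicy that specifies which requests must be authenticated. On its own, RequestAuthentication only rejects requests that carry an invalid JWT; requests with no token at all are still accepted. The policy below therefore denies every request on port 8888 that has no authenticated request principal: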
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: best-notebook-tfy-oauth2
  namespace: auth-test
spec:
  selector:
    matchLabels:
      truefoundry.com/application: best-notebook
  action: DENY
  rules:
    - from:
        - source:
            notRequestPrincipals: ["*"]
      to:
        - operation:
            ports:
              - "8888"
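The effect of this DENY rule can be sketched in a few lines of Python. This is a simplified model rather than Istio's implementation: it assumes the request principal is derived from a validated token as `<iss>/<sub>`, and the function names are hypothetical.

```python
def request_principal(claims):
    """Return '<iss>/<sub>' for a validated token, or None when there is no valid token."""
    if not claims or "iss" not in claims or "sub" not in claims:
        return None
    return f'{claims["iss"]}/{claims["sub"]}'


def is_denied(port, claims, protected_ports=("8888",)):
    """Model of the policy: deny requests on protected ports that lack any request principal."""
    if str(port) not in protected_ports:
        return False  # the rule only targets the listed ports
    # notRequestPrincipals: ["*"] matches requests that have no principal at all.
    return request_principal(claims) is None


print(is_denied(8888, None))
print(is_denied(8888, {"iss": "truefoundry.com", "sub": "user-1"}))
```

Unauthenticated requests to the notebook port are rejected, while requests with a validated token are left for the notebook to serve.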