fix: Presto spark add X509 certificates to SessionContext#27183
fix: Presto spark add X509 certificates to SessionContext#27183kevintang2022 wants to merge 1 commit intoprestodb:masterfrom
Conversation
Summary: For authentication, the certiifcates in the request should be accessible Differential Revision: D93777694
Reviewer's GuidePropagates X509 client certificates from Presto Session/Identity through the Presto Spark session/launcher/runner plumbing so they are available in PrestoSparkSessionContext, while adding corresponding fields, constructors, builders, and accessors. Sequence diagram for X509Certificate propagation from launcher to session contextsequenceDiagram
participant LauncherCommand as PrestoSparkLauncherCommand
participant Runner as PrestoSparkRunner
participant RunnerContext as PrestoSparkRunnerContext
participant Factory as IPrestoSparkQueryExecutionFactory
participant Session as PrestoSparkSession
participant SessionContext as PrestoSparkSessionContext
participant Id as Identity
LauncherCommand->>Runner: run(sql, user, principal, extraCredentials, certificates, catalog, schema, source, ...)
Runner->>RunnerContext: new PrestoSparkRunnerContext(user, principal, extraCredentials, certificates, catalog, schema, source, ...)
RunnerContext-->>Runner: PrestoSparkRunnerContext
Runner->>Factory: createSession(user, principal, extraCredentials, RunnerContext.getCertificates(), catalog, schema, source, ...)
Factory-->>Session: PrestoSparkSession(user, principal, extraCredentials, certificates, catalog, schema, source, ...)
Session-->>SessionContext: used in createFromSessionInfo(sessionInfo, Session)
SessionContext->>Session: getCertificates()
Session-->>SessionContext: certificates
SessionContext->>Id: new Identity(..., certificates)
Id-->>SessionContext: Identity
SessionContext->>SessionContext: getCertificates()
SessionContext->>Id: getCertificates()
Id-->>SessionContext: certificates
Class diagram for X509Certificate propagation in Presto Spark session plumbingclassDiagram
class PrestoSparkSession {
- String user
- Optional_Principal principal
- Map_String_String extraCredentials
- List_X509Certificate certificates
- Optional_String catalog
- Optional_String schema
- Optional_String source
PrestoSparkSession(String user, Optional_Principal principal, Map_String_String extraCredentials, List_X509Certificate certificates, Optional_String catalog, Optional_String schema, Optional_String source, ...)
Map_String_String getExtraCredentials()
List_X509Certificate getCertificates()
Optional_String getCatalog()
Optional_String getSchema()
Optional_String getSource()
}
class PrestoSparkRunnerContext {
- String user
- Optional_Principal principal
- Map_String_String extraCredentials
- List_X509Certificate certificates
- String catalog
- String schema
- Optional_String source
- ... otherFields
PrestoSparkRunnerContext(String user, Optional_Principal principal, Map_String_String extraCredentials, List_X509Certificate certificates, String catalog, String schema, Optional_String source, ...)
String getUser()
Optional_Principal getPrincipal()
Map_String_String getExtraCredentials()
List_X509Certificate getCertificates()
String getCatalog()
String getSchema()
Optional_String getSource()
}
class PrestoSparkRunnerContext_Builder {
- String user
- Optional_Principal principal
- Map_String_String extraCredentials
- List_X509Certificate certificates
- String catalog
- String schema
- Optional_String source
- ... otherFields
PrestoSparkRunnerContext_Builder(PrestoSparkRunnerContext prestoSparkRunnerContext)
PrestoSparkRunnerContext build()
}
class PrestoSparkRunner {
+ void run(String sql, String user, Optional_Principal principal, Map_String_String extraCredentials, List_X509Certificate certificates, String catalog, String schema, Optional_String source, ...)
- void execute(IPrestoSparkQueryExecutionFactory queryExecutionFactory, PrestoSparkRunnerContext prestoSparkRunnerContext)
}
class PrestoSparkSessionContext {
- Identity identity
+ static PrestoSparkSessionContext createFromSessionInfo(SessionInfo sessionInfo, PrestoSparkSession prestoSparkSession)
+ Identity getIdentity()
+ List_X509Certificate getCertificates()
+ String getCatalog()
+ String getSchema()
+ String getSource()
}
class Identity {
- List_X509Certificate certificates
+ List_X509Certificate getCertificates()
}
class PrestoSparkLauncherCommand {
+ void run()
}
class IPrestoSparkQueryExecutionFactory {
+ PrestoSparkSession createSession(String user, Optional_Principal principal, Map_String_String extraCredentials, List_X509Certificate certificates, Optional_String catalog, Optional_String schema, Optional_String source, ...)
}
PrestoSparkRunnerContext_Builder --> PrestoSparkRunnerContext : builds
PrestoSparkRunner --> PrestoSparkRunnerContext : creates
PrestoSparkRunner --> IPrestoSparkQueryExecutionFactory : uses
IPrestoSparkQueryExecutionFactory --> PrestoSparkSession : creates
PrestoSparkSessionContext --> PrestoSparkSession : readsCertificates
PrestoSparkSessionContext --> Identity : composes
Identity --> X509Certificate : contains
PrestoSparkLauncherCommand --> PrestoSparkRunner : callsRun
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- PrestoSparkRunnerContext stores the certificates list directly without copying or wrapping it, unlike PrestoSparkSession which defensively copies and wraps as unmodifiable; consider defensive copying here as well to avoid external mutation of the context.
- PrestoSparkRunnerContext.Builder does not initialize
certificateswith a default value, so callers that don’t explicitly set it may end up passing null into the PrestoSparkRunnerContext constructor; consider defaulting this to an empty list or enforcing non-null in the builder.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- PrestoSparkRunnerContext stores the certificates list directly without copying or wrapping it, unlike PrestoSparkSession which defensively copies and wraps as unmodifiable; consider defensive copying here as well to avoid external mutation of the context.
- PrestoSparkRunnerContext.Builder does not initialize `certificates` with a default value, so callers that don’t explicitly set it may end up passing null into the PrestoSparkRunnerContext constructor; consider defaulting this to an empty list or enforcing non-null in the builder.
## Individual Comments
### Comment 1
<location> `presto-spark-base/src/test/java/com/facebook/presto/spark/PrestoSparkQueryRunner.java:662` </location>
<code_context>
session.getIdentity().getUser(),
session.getIdentity().getPrincipal(),
session.getIdentity().getExtraCredentials(),
+ session.getIdentity().getCertificates(),
session.getCatalog(),
session.getSchema(),
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test that verifies X509 certificates are propagated end-to-end into the Spark session/context
The new `getCertificates()` plumbing is threaded through `PrestoSparkSession` and `PrestoSparkSessionContext`, but there’s no test confirming that a `Session` with non-empty certificates yields the same certificates on the Spark side (e.g., via `PrestoSparkSessionContext.getIdentity().getCertificates()` or whatever surface is used during Spark query execution). Please add an integration/e2e test that:
- Builds a `Session`/`Identity` with one or more test `X509Certificate` instances.
- Runs a simple Spark query via the existing Presto Spark test harness (e.g., `PrestoSparkQueryRunner`).
- Asserts the certificates are visible in the Spark execution path via the appropriate context.
This validates that the new field is not only set at construction but is actually available where authentication will rely on it.
Suggested implementation:
```java
session.getIdentity().getExtraCredentials(),
session.getIdentity().getCertificates(),
session.getCatalog(),
session.getSchema(),
session.getSource(),
@Test
public void testX509CertificatesPropagatedToSparkSessionContext()
throws Exception
{
// Arrange: build a Session with a non-empty certificates list
X509Certificate certificate = mock(X509Certificate.class);
List<X509Certificate> certificates = ImmutableList.of(certificate);
Session sessionWithCertificates = createSessionBuilder()
.setIdentity(
Identity.forUser("test_user")
.withPrincipal("test_principal")
.withCertificates(certificates)
.build())
.build();
// Act: run a simple Spark query using the existing Presto Spark test harness
PrestoSparkQueryRunner queryRunner = createPrestoSparkQueryRunner();
MaterializedResult result = queryRunner.execute(sessionWithCertificates, "SELECT 1");
// Sanity check query executed successfully
assertEquals(result.getOnlyValue(), 1L);
// Assert: the certificates are visible from the Spark execution context
PrestoSparkSessionContext sparkSessionContext = queryRunner.getLastSparkSessionContext();
assertNotNull(sparkSessionContext, "Spark session context should be initialized");
assertEquals(
sparkSessionContext.getIdentity().getCertificates(),
certificates,
"X509 certificates should be propagated end-to-end into the Spark session context");
}
```
To make the above compile and behave as intended, you will also need to:
1. **Add imports** at the top of this file (or adjust existing ones) for:
- `org.testng.annotations.Test` (or JUnit equivalent, depending on the rest of the file).
- `java.security.cert.X509Certificate`
- `java.util.List`
- `com.google.common.collect.ImmutableList`
- `static org.testng.Assert.assertEquals`
- `static org.testng.Assert.assertNotNull`
- `static org.mockito.Mockito.mock` (if Mockito is already used in the test suite)
- `com.facebook.presto.Session`
- `com.facebook.presto.spi.security.Identity`
- `com.facebook.presto.testing.MaterializedResult`
- `com.facebook.presto.spark.classloader_interface.PrestoSparkSessionContext` (or the correct package for `PrestoSparkSessionContext` in your tree)
2. **Provide or reuse helpers referenced in the test**:
- `createSessionBuilder()` – if not already present, this should return a `Session.SessionBuilder` configured consistently with other tests in this class (e.g., using an existing `SessionPropertyManager` and default catalog/schema).
- `createPrestoSparkQueryRunner()` – if this file is itself the query runner, replace this call with the appropriate factory/constructor used elsewhere (for example `new PrestoSparkQueryRunner(...)` or a static factory).
- `getLastSparkSessionContext()` – you will need to expose the last `PrestoSparkSessionContext` used during query execution from `PrestoSparkQueryRunner`. A typical pattern is:
- Add a `volatile PrestoSparkSessionContext lastSparkSessionContext;` field.
- In the code path where you construct the `PrestoSparkSessionContext` (the block containing the `session.getIdentity().getUser()`, `getPrincipal()`, `getExtraCredentials()`, `getCertificates()`, etc.), assign it to `lastSparkSessionContext`.
- Implement `public PrestoSparkSessionContext getLastSparkSessionContext()` that returns this field.
- If the class is not currently a TestNG/JUnit test class, either:
- Move the new test method into an appropriate `Test*` class that has access to the query runner, or
- Annotate this class as a test class following your existing test framework conventions.
3. **Align Identity API calls**:
- The example uses `Identity.forUser(...).withPrincipal(...).withCertificates(...).build()`. Adjust to the actual API of `Identity` in your codebase (e.g., use a builder or constructor you already rely on; the key requirement is that the resulting `Identity` returns the `certificates` list from `getCertificates()`).
4. **Align execute signature and result assertion**:
- If `PrestoSparkQueryRunner.execute(...)` has a different signature or return type than `MaterializedResult`, adjust the test to match (the objective is only to run a simple query and verify the session context, so any trivial query is fine).
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| session.getIdentity().getUser(), | ||
| session.getIdentity().getPrincipal(), | ||
| session.getIdentity().getExtraCredentials(), | ||
| session.getIdentity().getCertificates(), |
There was a problem hiding this comment.
suggestion (testing): Add a test that verifies X509 certificates are propagated end-to-end into the Spark session/context
The new getCertificates() plumbing is threaded through PrestoSparkSession and PrestoSparkSessionContext, but there’s no test confirming that a Session with non-empty certificates yields the same certificates on the Spark side (e.g., via PrestoSparkSessionContext.getIdentity().getCertificates() or whatever surface is used during Spark query execution). Please add an integration/e2e test that:
- Builds a
Session/Identitywith one or more testX509Certificateinstances. - Runs a simple Spark query via the existing Presto Spark test harness (e.g.,
PrestoSparkQueryRunner). - Asserts the certificates are visible in the Spark execution path via the appropriate context.
This validates that the new field is not only set at construction but is actually available where authentication will rely on it.
Suggested implementation:
session.getIdentity().getExtraCredentials(),
session.getIdentity().getCertificates(),
session.getCatalog(),
session.getSchema(),
session.getSource(),
@Test
public void testX509CertificatesPropagatedToSparkSessionContext()
throws Exception
{
// Arrange: build a Session with a non-empty certificates list
X509Certificate certificate = mock(X509Certificate.class);
List<X509Certificate> certificates = ImmutableList.of(certificate);
Session sessionWithCertificates = createSessionBuilder()
.setIdentity(
Identity.forUser("test_user")
.withPrincipal("test_principal")
.withCertificates(certificates)
.build())
.build();
// Act: run a simple Spark query using the existing Presto Spark test harness
PrestoSparkQueryRunner queryRunner = createPrestoSparkQueryRunner();
MaterializedResult result = queryRunner.execute(sessionWithCertificates, "SELECT 1");
// Sanity check query executed successfully
assertEquals(result.getOnlyValue(), 1L);
// Assert: the certificates are visible from the Spark execution context
PrestoSparkSessionContext sparkSessionContext = queryRunner.getLastSparkSessionContext();
assertNotNull(sparkSessionContext, "Spark session context should be initialized");
assertEquals(
sparkSessionContext.getIdentity().getCertificates(),
certificates,
"X509 certificates should be propagated end-to-end into the Spark session context");
}To make the above compile and behave as intended, you will also need to:
-
Add imports at the top of this file (or adjust existing ones) for:
org.testng.annotations.Test(or JUnit equivalent, depending on the rest of the file).java.security.cert.X509Certificatejava.util.Listcom.google.common.collect.ImmutableListstatic org.testng.Assert.assertEqualsstatic org.testng.Assert.assertNotNullstatic org.mockito.Mockito.mock(if Mockito is already used in the test suite)com.facebook.presto.Sessioncom.facebook.presto.spi.security.Identitycom.facebook.presto.testing.MaterializedResultcom.facebook.presto.spark.classloader_interface.PrestoSparkSessionContext(or the correct package forPrestoSparkSessionContextin your tree)
-
Provide or reuse helpers referenced in the test:
createSessionBuilder()– if not already present, this should return aSession.SessionBuilderconfigured consistently with other tests in this class (e.g., using an existingSessionPropertyManagerand default catalog/schema).createPrestoSparkQueryRunner()– if this file is itself the query runner, replace this call with the appropriate factory/constructor used elsewhere (for examplenew PrestoSparkQueryRunner(...)or a static factory).getLastSparkSessionContext()– you will need to expose the lastPrestoSparkSessionContextused during query execution fromPrestoSparkQueryRunner. A typical pattern is:- Add a
volatile PrestoSparkSessionContext lastSparkSessionContext;field. - In the code path where you construct the
PrestoSparkSessionContext(the block containing thesession.getIdentity().getUser(),getPrincipal(),getExtraCredentials(),getCertificates(), etc.), assign it tolastSparkSessionContext. - Implement
public PrestoSparkSessionContext getLastSparkSessionContext()that returns this field.
- Add a
- If the class is not currently a TestNG/JUnit test class, either:
- Move the new test method into an appropriate
Test*class that has access to the query runner, or - Annotate this class as a test class following your existing test framework conventions.
- Move the new test method into an appropriate
-
Align Identity API calls:
- The example uses
Identity.forUser(...).withPrincipal(...).withCertificates(...).build(). Adjust to the actual API ofIdentityin your codebase (e.g., use a builder or constructor you already rely on; the key requirement is that the resultingIdentityreturns thecertificateslist fromgetCertificates()).
- The example uses
-
Align execute signature and result assertion:
- If
PrestoSparkQueryRunner.execute(...)has a different signature or return type thanMaterializedResult, adjust the test to match (the objective is only to run a simple query and verify the session context, so any trivial query is fine).
- If
| @Override | ||
| public List<X509Certificate> getCertificates() | ||
| { | ||
| return identity.getCertificates(); |
There was a problem hiding this comment.
It can never be null. The getCertificates() method on Identity class always returns a list
There was a problem hiding this comment.
PrestoSparkSessionContext constructor also require non null Identity
Description
The X509 certificates are available from the Session of the query, so it can also be passed to the PrestoSparkSession, PrestoSparkRunnerContext, and PrestoSparkSessionContext so that they can be stored as a field.
Motivation and Context
Compared to HttpRequestSessionContext, PrestoSparkSessionContext does not have the certificates, and it always returns an empty list because it uses the default implementation of the SessionContext interface.
Impact
Adds extra field to:
PrestoSparkSessionContext, PrestoSparkRunnerContext, PrestoSparkSession
Test Plan
No impact on logic, and the field was accessible in the SessionContext when running PrestoSpark
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.
Differential Revision: D93777694