dimpavloff
@dimpavloff dimpavloff commented Jul 7, 2025

Part one for grpc/proposal#492 (A97).
This is done in a new credentials/jwt package to provide file-based PerRPCCredentials. It can be used beyond xDS. The package handles token reloading, caching, and validation as per A97.

There will be a separate PR which uses it in xds/bootstrap.

Whilst implementing the above, I considered credentials/oauth and credentials/xds packages instead of creating a new one. The former package has NewJWTAccessFromKey and jwtAccess which seem very relevant at first. However, I think the jwtAccess behaviour seems more tailored towards Google services. Also, the refresh, caching, and error behaviour for A97 is quite different than what's already there and therefore a separate implementation would have still made sense.
WRT credentials/xds, it could have been extended to handle both transport and call credentials. However, this is a bit at odds with A97, which says that the implementation should be non-xDS-specific and, from reading between the lines, usable beyond xDS.
I think the current approach makes review easier but because of the similarities with the other two packages, it is a bit confusing to navigate. Please let me know whether the structure should change.

Relates to istio/istio#53532

RELEASE NOTES:

  • credentials: add credentials/jwt package providing file-based JWT PerRPCCredentials (A97)

codecov bot commented Jul 7, 2025

Codecov Report

❌ Patch coverage is 96.80000% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.13%. Comparing base (dd718e4) to head (774d83e).
⚠️ Report is 58 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| credentials/jwt/jwt_token_file_call_creds.go | 95.18% | 3 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8431      +/-   ##
==========================================
- Coverage   82.27%   82.13%   -0.14%     
==========================================
  Files         414      415       +1     
  Lines       40424    40648     +224     
==========================================
+ Hits        33259    33387     +128     
- Misses       5795     5883      +88     
- Partials     1370     1378       +8     

| Files with missing lines | Coverage Δ |
|---|---|
| credentials/jwt/jwt_file_reader.go | 100.00% <100.00%> (ø) |
| credentials/jwt/jwt_token_file_call_creds.go | 95.18% <95.18%> (ø) |

... and 260 files with indirect coverage changes

@dimpavloff dimpavloff changed the title from "xds: implement file-based JWT authentication (A97)" to "xds: implement file-based JWT Call Credentials (A97)" Jul 7, 2025
@dimpavloff
Author

@dfawley hey 👋 Given you approved A97, would you mind having a cursory look at the PR to confirm if at least at a high level the approach looks good?

@eshitachandwani
Member

I will take a look at this; I need to go through the gRFC first.

@dfawley dfawley self-assigned this Jul 22, 2025
@dfawley dfawley requested review from easwars and eshitachandwani and removed request for dfawley July 25, 2025 20:39
@dfawley dfawley assigned easwars and unassigned dfawley Jul 25, 2025
@dfawley
Member

dfawley commented Jul 25, 2025

Sorry for the delay here.

@easwars would you be able to review this change? I think you have more background into some of the things than I do, like the bootstrap integration. Thank you!

@easwars
Contributor

easwars commented Jul 28, 2025

Thank you for your contribution @dimpavloff. Yes, it would be nice if you can split this into smaller PRs. I will continue to use this PR to review the JWT call credentials implementation. If you can move the xDS implementation out to one or more PRs, I would greatly appreciate that and would be happy to review them as well.


// Verify cached expiration is 30 seconds before actual token expiration
impl := creds.(*jwtTokenFileCallCreds)
impl.mu.RLock()
Contributor

Please only test the API surface. Relying on implementation internals in tests makes them brittle and forces test changes whenever the implementation changes.

Author

I assume you are referring to using a private field rather than obtaining mu specifically.
In general I agree -- white box tests may get fragile and break during a refactor. However, this test and the next couple of ones are about the caching behaviour -- it is meant to be transparent to the external API. If I don't make assertions about the private fields, the tests may pass trivially and become more flaky (e.g. when testing the backoff in the next test).
One alternative could be factoring these behaviours out into a separate private struct with "public" functions which expose the same information. Given that it would require shifting the majority of the implementation into that struct, I'm not sure it is an improvement on the current approach.
Please do let me know your thoughts and if you have other suggestions.

@easwars easwars assigned dimpavloff and unassigned easwars Jul 28, 2025
Contributor

@easwars easwars left a comment

Looks quite good. Most of the comments are nits.

I still have to review some of the tests in credentials/jwt/jwt_token_file_call_creds_test.go though (starting with TestTokenFileCallCreds_CacheExpirationIsBeforeTokenExpiration).

if err == nil {
t.Fatalf("GetRequestMetadata() expected error, got nil")
}
if !strings.Contains(err.Error(), tt.wantErrContains) {
Contributor

Here as well, we should be checking for the status code instead of the actual contents of the error string.

Also, status.Code(err) works with a nil error and returns codes.OK, which would make the test logic much simpler, as it would be mostly the same for success and failure cases.

Author

Nice point, updated.

Btw, this has helped uncover that credentials.CheckSecurityLevel and its call in GetRequestMetadata don't return a gRPC status code. This is also true in credentials/oauth/oauth.go and credentials/sts/sts.go. Therefore, this will map to codes.Unknown. Is this appropriate?

Contributor

Will check with folks on the team and will get back to you soon on this. Thanks.

Contributor

I chatted with @dfawley and we definitely don't want to use strings to determine what error is being returned. We have two options:

  • Define a new error type that is returned by the jWTFileReader that allows callers to inspect it and determine how to proceed, or,
  • Return a grpc status error

Given that the jWTFileReader is not a generic JWT file reader that can be used outside of this package, we feel that returning a grpc status error is totally fine (and thereby moving the logic of deciding what status code to return based on the type of error encountered into the file reader). But we are fine with either of the above two approaches.

Contributor

Oops, the above comment was intended for the other thread about returning error values from the reader.

Contributor

> Btw, this has helped uncover that credentials.CheckSecurityLevel and its call in GetRequestMetadata don't return a gRPC status code. This is also true in credentials/oauth/oauth.go and credentials/sts/sts.go. Therefore, this will map to codes.Unknown. Is this appropriate?

Where does this map to codes.Unknown?

Returning non grpc status errors from GetRequestMetadata is fine. The transport will convert them into Unauthenticated. See here: https://github.com/grpc/grpc-go/blob/master/internal/transport/http2_client.go#L691

So, it should be fine for this call creds implementation to return a non grpc status error when the transport creds does not support credentials.PrivacyAndIntegrity.

Hope that helps. Thanks.

Member

Re: "Define a new error type that is returned by the jWTFileReader that allows callers to inspect it and determine how to proceed"

I would prefer something more like

var someErrorKindA = errors.New("...")
var someErrorKindB = errors.New("...")

return someErrorKindA // or fmt.Errorf("%w: some explanatory text", someErrorKindA)

if err == someErrorKindA { ... } // or errors.Is(err, someErrorKindA)

instead of a new type, i.e.

type MyError struct {
  ....
}
func (MyError) Error() string {...}

return MyError{...}

if _, ok := err.(MyError); ok { ... } // or errors.As(...) if wrapping

Author

> Where does this map to codes.Unknown?
>
> Returning non grpc status errors from GetRequestMetadata is fine. The transport will convert them into Unauthenticated. See here: https://github.com/grpc/grpc-go/blob/master/internal/transport/http2_client.go#L691

Ah, I wasn't aware of this. Then it should be fine.

I'll look into the errors, thanks for your inputs!

@easwars easwars assigned dimpavloff and unassigned easwars Aug 25, 2025
@dimpavloff dimpavloff requested a review from easwars August 26, 2025 17:52
@dimpavloff dimpavloff removed their assignment Aug 26, 2025
@easwars easwars self-assigned this Aug 29, 2025
Contributor

@easwars easwars left a comment

Thanks for taking care of all of my comments. I think I'm mostly happy with where we are at now. I'll follow up on the two open issues and get back to you soon.

}
}

// createTestJWT creates a test JWT token with the specified audience and
Contributor

Nit: remove mention of audience from the comment.

}
}

// ReadToken reads and parses a JWT token from the configured file.
Contributor

Will check with folks on the team and will get back to you soon on this. Thanks.

@easwars easwars assigned dimpavloff and unassigned easwars Aug 29, 2025
@dimpavloff dimpavloff requested a review from easwars September 1, 2025 10:25
@dimpavloff dimpavloff removed their assignment Sep 1, 2025
Labels
  • Area: Auth (Includes regular credentials API and implementation. Also includes advancedtls, authz, rbac etc.)
  • Type: Feature (New features or improvements in behavior)
4 participants