Conversation

**markdroth** (Member):

No description provided.

@markdroth markdroth marked this pull request as ready for review September 18, 2025 22:50
@markdroth markdroth requested a review from dfawley September 18, 2025 22:50
> will be set to the local address of the connection that the request
> came in on.
> - [principal](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L82):
>   If TLS is used, this will be set to the server's cert's first URI SAN
**Contributor:**

The `tls` package in Go does not provide a way to retrieve the local certificate used in the TLS handshake. We filed an issue for this with the Go team ages ago, but there has not been much progress: golang/go#24673

If the configured channel credentials contain a certificate provider, though, we would be able to retrieve the local certs from the provider. But the provider could return more than one cert (if the server serves multiple domain names and expects the client to specify an SNI during the TLS handshake). Do other languages have a mechanism to retrieve the actual cert used for the handshake?
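
To illustrate the ambiguity, here is a minimal Go sketch (the `certsByName` map and `defaultCert` are hypothetical stand-ins for what a certificate provider might supply): the certificate actually used is chosen per handshake inside `GetCertificate`, and the connection state does not expose which one was picked afterward.

```go
package tlsexample

import "crypto/tls"

// Hypothetical inventory of certificates keyed by SNI server name; a
// certificate provider could hand all of these back at once.
var certsByName map[string]*tls.Certificate
var defaultCert *tls.Certificate

// serverConfig selects the serving cert per handshake based on the SNI
// the client sends, so which cert was actually used is only known
// inside this callback; tls.ConnectionState does not expose it after
// the handshake (golang/go#24673).
func serverConfig() *tls.Config {
	return &tls.Config{
		GetCertificate: func(hello *tls.ClientHelloInfo) (*tls.Certificate, error) {
			if cert, ok := certsByName[hello.ServerName]; ok {
				return cert, nil
			}
			return defaultCert, nil
		},
	}
}
```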

**markdroth** (Member, Author):

In C-core, we control the TLS code, so I think we'll be able to handle this.

I'm not sure how hard this will be in Java and Go. If we can't support it, I'm open to dropping it from the gRFC.

I'd like input from @ejona86, @dfawley, @matthewstevenson88, and @gtcooke94 on this.

**Contributor:**

There is an open PR to support this in Go: golang/go#75699. The changes seem very straightforward, but it may need some pushing to get through, since we originally asked for this about 6 years ago.

> - [disallow_all](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L70)
> - [allow_expression](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L75)
> - [disallow_expression](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L79)
> - [disallow_is_error](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L87)
**Contributor:**

The Envoy docs say that when this is true, if the rules in this list cause a header mutation to be disallowed, the filter using this configuration will terminate the request with a 500 error.

Does that mean UNAVAILABLE for us, or UNKNOWN?

**markdroth** (Member, Author):

According to the normal HTTP-to-gRPC status mapping, an HTTP 500 status would map to UNKNOWN.

I could also see an argument for using INTERNAL here. I don't feel strongly, but maybe @ejona86 or @dfawley have thoughts on this.
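
For reference, a minimal Go sketch of that standard mapping from gRPC's http-grpc-status-mapping.md; 500 is not one of the specially mapped codes, so it falls through to UNKNOWN:

```go
package statusmapping

import "google.golang.org/grpc/codes"

// httpToGRPCCode follows the standard HTTP-to-gRPC status mapping from
// gRPC's doc/http-grpc-status-mapping.md.
func httpToGRPCCode(httpStatus int) codes.Code {
	switch httpStatus {
	case 400:
		return codes.Internal
	case 401:
		return codes.Unauthenticated
	case 403:
		return codes.PermissionDenied
	case 404:
		return codes.Unimplemented
	case 429, 502, 503, 504:
		return codes.Unavailable
	default:
		return codes.Unknown // HTTP 500 lands here.
	}
}
```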


> The following server-side metrics will be exported:
>
> | Name | Type | Unit | Labels | Description |
**Contributor:**

Does it make sense to add the data plane RPC method name as a label to the server-side metrics? Or would that be considered a risk for cardinality explosion?

**markdroth** (Member, Author):

Hmm, that might make sense. But if we do that, then we probably also need to apply the same filtering as described in A66 for per-call metrics, so that we don't have a cardinality explosion for non-generated method names.
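
As a minimal sketch of the A66-style filtering being referenced (the `isRegisteredMethod` lookup is an assumption standing in for the server's method registry): only registered, generated method names are recorded verbatim, and everything else collapses to a single bucket.

```go
package metricsexample

// methodLabel returns the value to record for the method label,
// following the A66-style filtering: only registered (generated)
// method names are recorded verbatim. isRegisteredMethod is an assumed
// lookup against the server's registered services.
func methodLabel(fullMethod string, isRegisteredMethod func(string) bool) string {
	if isRegisteredMethod(fullMethod) {
		return fullMethod
	}
	// Everything else collapses to one bucket so that arbitrary
	// client-supplied paths can't explode label cardinality.
	return "other"
}
```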

This makes me realize that these are really per-call metrics, but we'd be defining them via the non-per-call metrics framework. @ctiller, are we okay with that for C-core performance-wise, given the work we're doing to move collection into C-core instead of in the stats plugins, or do we need to consider alternatives here?
