TOKEN_PORT=4100
KMS_SERVER_CONFIG_FILE="config.toml"
AWS_WEB_IDENTITY_TOKEN_FILE="token"
KEEPIDLE=30
As (in)convenient as it is, one should avoid hardcoding configuration parameters in the enclave init script. There can be rare exceptions (like the ports for the logger and for the token reader) if they're critical for starting the application and aren't supposed to ever change, but keepalive parameters aren't among them, IMHO.
The reason to not do that is that whenever we want to change these parameters, we'll have to rebuild and redeploy the enclave image. It takes at least an hour in our current workflow.
I can accept hardcoded values for testing purposes but if they stay for longer than validating a solution, we have to extract them from the config file with yq in the init script.
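A minimal sketch of what such an extraction could look like, assuming the mikefarah `yq` (v4, with TOML input support) is present in the enclave image. The helper name `read_cfg` and the key path used in the usage comment are hypothetical; the real location of the keepalive fields in config.toml may differ.

```shell
#!/bin/sh
# Hypothetical helper: read one key from the KMS config file and fall back
# to a default when the key is missing or yq fails.
# Assumes mikefarah yq v4 (-p selects the input format, -o the output format).
KMS_SERVER_CONFIG_FILE="${KMS_SERVER_CONFIG_FILE:-config.toml}"

read_cfg() {
    # $1 = yq expression (assumed key path), $2 = default value
    cfg_value=$(yq -p toml -o yaml "$1" "$KMS_SERVER_CONFIG_FILE" 2>/dev/null) || cfg_value=""
    # yq prints "null" for a missing key; treat that like "unset".
    if [ -z "$cfg_value" ] || [ "$cfg_value" = "null" ]; then
        cfg_value="$2"
    fi
    printf '%s\n' "$cfg_value"
}

# Possible use in the init script (key name is an assumption):
#   KEEPIDLE=$(read_cfg '.tcp_keep_alive_secs' 30)
```

With something along these lines, changing a keepalive value would only require shipping a new config file rather than rebuilding the enclave image.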
agreed. makes sense to make them configurable somehow. possibly via ENV vars or arguments to the script
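As a rough illustration of the ENV-var or argument route, the sketch below keeps the currently hardcoded value (30) only as a fallback. The function name `resolve_keepidle` is hypothetical, and how the variable or argument would actually reach the enclave is outside this sketch.

```shell
#!/bin/sh
# Use KEEPIDLE from the environment when it is set and non-empty;
# otherwise fall back to the value currently hardcoded in the script.
: "${KEEPIDLE:=30}"

# Hypothetical helper: a positional argument, when given, takes
# precedence over the environment value.
resolve_keepidle() {
    # $1 = optional command-line override
    printf '%s\n' "${1:-$KEEPIDLE}"
}
```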
Is the KMS config.toml file the right place for these parameters, then?
Yes, and you already added tcp_keep_alive there.
Indeed, but it works OK because these are expected fields on the KMS app.
Afaict, I can't add a section for socat in the toml file as KMS won't parse it correctly (unless ofc we had a socat struct in our KMS conf?)
I don't think we need a separate section for socat, we can just use the same values we set for kms-server. Is there a good reason to be able to configure the socat keepalive settings separately from the kms-server keepalive settings?
For KEEPIDLE and KEEPINTVL we could map them to tcp_keep_alive_secs and http2_keep_alive_interval_secs respectively, but there's no equivalent to KEEPCNT (tbh I also have no clue what a sensible value for this would be, so I'd be fine with leaving it at the default, whatever the default is).
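The mapping described above could be sketched as follows, using socat's `keepalive`, `keepidle`, `keepintvl` and `keepcnt` address options. The helper name `socat_keepalive_opts` is hypothetical, and it is an assumption that the enclave proxies use an address type where these TCP options apply.

```shell
#!/bin/sh
# Hypothetical helper: build the socat address options for TCP keepalive.
# $1 = idle seconds   (e.g. the value of tcp_keep_alive_secs)
# $2 = probe interval (e.g. the value of http2_keep_alive_interval_secs)
# $3 = optional probe count
socat_keepalive_opts() {
    ka_opts="keepalive,keepidle=$1,keepintvl=$2"
    # Only pass keepcnt when explicitly provided; otherwise the kernel
    # default applies (net.ipv4.tcp_keepalive_probes on Linux).
    if [ -n "${3:-}" ]; then
        ka_opts="$ka_opts,keepcnt=$3"
    fi
    printf '%s\n' "$ka_opts"
}
```

The resulting string would be appended to the TCP address given to socat. Leaving the third argument out keeps the system default for the probe count, which on Linux is 9 unless the sysctl has been changed.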
Consolidated Tests Results 2025-12-12 - 10:30:55
test-reporter: Run #360
🎉 All tests passed!
🍂 No flaky tests in this run.
7d9f4ce to 8330749
Note that I removed the 60s timeout on the socat process, as my hope is that keepalive is better suited to this job (and, if we can, I guess we'd rather keep the same socat process than create a new one).
This is slightly outdated and is only needed for the enclave proxies.
Description of changes
Adds keepalive to the gRPC connections (both client and server) as well as on the socat proxies.
Note that we may also want to add keepalive to the gRPC client in the connector.
Issue ticket number and link
Shot at https://github.com/zama-ai/kms-internal/issues/2835
PR Checklist
I attest that all checked items are satisfied. Any deviation is clearly justified above.
chore: ...).
TODO(#issue).
unwrap/expect/panic only in tests or for invariant bugs (documented if present).
devops label + infra notified + infra-team reviewer assigned.
! and affected teams notified.
Zeroize + ZeroizeOnDrop implemented.
unsafe; if unavoidable: minimal, justified, documented, and test/fuzz covered.
Dependency Update Questionnaire (only if deps changed or added)
Answer in the Cargo.toml next to the dependency (or here if updating):
More details and explanations for the checklist and dependency updates can be found in CONTRIBUTING.md.