Skip to content

feat: add TLS certificate support for Docker contexts#3728

Open
Itx-Psycho0 wants to merge 2 commits into
knative:mainfrom
Itx-Psycho0:feat/docker-context-tls
Open

feat: add TLS certificate support for Docker contexts#3728
Itx-Psycho0 wants to merge 2 commits into
knative:mainfrom
Itx-Psycho0:feat/docker-context-tls

Conversation

@Itx-Psycho0
Copy link
Copy Markdown
Contributor

Description

Extends Docker context detection (from #3684) to support TLS certificates stored in Docker contexts. This enables secure connections to remote Docker daemons configured via Docker contexts.

Fixes #3719

Changes

Extended getDockerContextHost() to getDockerContextConfig() which returns both host and TLS configuration
Added DockerContextConfig struct to hold host and TLS settings
Load TLS certificates from context's tls/ directory
Write certificates to temp directory and configure via environment variables
Reuse existing TLS functionality via newHttpClient()
Added comprehensive test with mock TLS daemon

Benefits

Remote Docker setups work automatically with TLS
Consistent with Docker CLI behavior - if docker commands work, func commands work
No manual environment variables needed
Proper TLS support for secure connections

Testing

Added TestNewClient_DockerContextTLS with mock TLS-enabled daemon
All existing Docker tests pass
make check passes

Related

@knative-prow
Copy link
Copy Markdown

knative-prow Bot commented May 14, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Itx-Psycho0
Once this PR has been reviewed and has the lgtm label, please assign dsimansk for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow Bot requested review from dsimansk and jrangelramos May 14, 2026 11:11
@knative-prow
Copy link
Copy Markdown

knative-prow Bot commented May 14, 2026

Hi @Itx-Psycho0. Thanks for your PR.

I'm waiting for a knative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@knative-prow knative-prow Bot added size/L 🤖 PR changes 100-499 lines, ignoring generated files. needs-ok-to-test 🤖 Needs an org member to approve testing labels May 14, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 14, 2026

Codecov Report

❌ Patch coverage is 45.73643% with 70 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.07%. Comparing base (e435a89) to head (1277ea5).

Files with missing lines Patch % Lines
pkg/docker/docker_client.go 45.73% 64 Missing and 6 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3728      +/-   ##
==========================================
+ Coverage   56.95%   57.07%   +0.12%     
==========================================
  Files         181      181              
  Lines       21116    21202      +86     
==========================================
+ Hits        12026    12101      +75     
- Misses       7866     7874       +8     
- Partials     1224     1227       +3     
Flag Coverage Δ
e2e 35.65% <0.77%> (-0.16%) ⬇️
e2e go 32.30% <1.00%> (-0.13%) ⬇️
e2e node 28.09% <1.00%> (-0.11%) ⬇️
e2e python 32.67% <1.00%> (-0.13%) ⬇️
e2e quarkus 28.23% <1.00%> (-0.11%) ⬇️
e2e rust 27.62% <1.00%> (?)
e2e springboot 26.14% <1.00%> (-0.12%) ⬇️
e2e typescript 28.20% <1.00%> (-0.13%) ⬇️
e2e-config-ci 17.63% <0.00%> (-0.08%) ⬇️
integration 17.22% <1.00%> (-0.08%) ⬇️
unit macos-14 44.89% <6.00%> (-0.17%) ⬇️
unit macos-latest 44.89% <6.00%> (-0.17%) ⬇️
unit ubuntu-24.04-arm 45.45% <44.96%> (+0.13%) ⬆️
unit ubuntu-latest 46.10% <42.00%> (+0.09%) ⬆️
unit windows-latest 44.95% <13.00%> (-0.14%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Extends Docker context detection (from knative#3684) to support TLS certificates
stored in Docker contexts. This enables secure connections to remote Docker
daemons configured via Docker contexts.

Changes:
- Extended getDockerContextHost() to getDockerContextConfig() which returns
  both host and TLS configuration
- Added DockerContextConfig struct to hold host and TLS settings
- Modified newHttpClient() to check Docker context first, then fall back
  to environment variables
- Added newHttpClientFromContext() to create HTTP client from context config
- Load TLS certificates directly from context (no temp files or env vars)
- Added comprehensive test with mock TLS daemon

Benefits:
- Remote Docker setups work automatically with TLS
- Consistent with Docker CLI behavior
- No manual environment variables needed
- Proper TLS support for secure connections
- Clean implementation without temp files

Fixes knative#3719
@Itx-Psycho0 Itx-Psycho0 force-pushed the feat/docker-context-tls branch from 5ecc50a to 1af6e75 Compare May 14, 2026 13:27
Comment thread pkg/docker/docker_client.go Fixed
@matejvasek
Copy link
Copy Markdown
Contributor

/ok-to-test

@knative-prow knative-prow Bot added ok-to-test 🤖 Non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test 🤖 Needs an org member to approve testing labels May 14, 2026
@matejvasek
Copy link
Copy Markdown
Contributor

Review: feat: add TLS certificate support for Docker contexts

Branch: Itx-Psycho0/feat/docker-context-tls
Commit: 1af6e757b
Files: pkg/docker/docker_client.go, pkg/docker/docker_client_test.go


Overall

The approach of building the TLS config in-memory via newHttpClientFromContext() is sound. The test with a real mTLS mock daemon is solid. There are a few issues that should be addressed before merging.


Significant

1. Wrong fallback TLS path

docker_client.gogetDockerContextConfig(), around the fallback block:

// Docker stores context TLS files in contexts/meta/<sha256-hash>/
hash := sha256.Sum256([]byte(contexts[0].Name))
tlsPath = filepath.Join(dockerConfigDir, "contexts", "meta", fmt.Sprintf("%x", hash))

Docker stores TLS files under contexts/tls/<hash>/docker/, not contexts/meta/<hash>/. The meta directory holds meta.json metadata, not certificates. The comment is also incorrect.

The test masks this because it always sets Storage.TLSPath explicitly, so the fallback is never exercised. A test that omits Storage.TLSPath (or sets it to "<IN MEMORY>") and places certs at the correct fallback location would catch this.

2. Precedence inversion: context TLS overrides explicit env vars

docker_client.gonewHttpClient():

func newHttpClient() *http.Client {
	// First, try to get TLS config from Docker context
	if contextConfig := getDockerContextConfig(); contextConfig != nil && len(contextConfig.TLSCert) > 0 && len(contextConfig.TLSKey) > 0 {
		return newHttpClientFromContext(contextConfig)
	}

	// Fall back to environment variables
	tlsVerifyStr, tlsVerifyChanged := os.LookupEnv("DOCKER_TLS_VERIFY")
	...

Context is checked before DOCKER_TLS_VERIFY. If someone explicitly sets DOCKER_TLS_VERIFY=0 to disable TLS, but a Docker context with certs exists, context TLS will be used and the env var is never consulted. Convention is that env vars override config files, not the other way around. The env var check should come first, and context TLS should only be tried when DOCKER_TLS_VERIFY is not set.

3. Context TLS applied even when DOCKER_HOST is set manually

newHttpClient() is called unconditionally for all TCP connections, including when DOCKER_HOST was explicitly set by the user (not from context detection). If someone sets DOCKER_HOST=tcp://my-host:2376 without DOCKER_TLS_VERIFY, the code will now silently pick up TLS certs from whatever Docker context happens to be active. This mixes two independent configuration sources in a surprising way. Context TLS should only activate when the host itself came from context detection.

4. docker context inspect is executed twice

NewClient() calls GetDockerContextHostFunc()getDockerContextHost()getDockerContextConfig() (first subprocess). Then inside the isTCP branch, newHttpClient() calls getDockerContextConfig() again (second subprocess). Each call forks the docker CLI. The config should be fetched once and reused.


Minor

5. DockerContextConfig is exported but internal-only

The struct is only used within the docker package. It should be unexported (dockerContextConfig).

6. Redundant DOCKER_CONFIG passthrough

if dockerConfig := os.Getenv("DOCKER_CONFIG"); dockerConfig != "" {
    cmd.Env = append(os.Environ(), "DOCKER_CONFIG="+dockerConfig)
}

When cmd.Env is nil, the child process inherits the parent's full environment, which already includes DOCKER_CONFIG. This block just adds a duplicate entry and can be deleted.

7. getDockerContextHost() wrapper does unnecessary disk I/O

The backward-compat wrapper calls getDockerContextConfig() which reads cert files from disk, only to discard everything except .Host. When only the host is needed (the common case during host detection in NewClient), this is wasted I/O.

8. Silent error on malformed cert/key

In newHttpClientFromContext():

cert, err := tls.X509KeyPair(contextConfig.TLSCert, contextConfig.TLSKey)
if err == nil {
    ...
}

If X509KeyPair fails (malformed cert/key), the error is silently swallowed and the client is returned without client certificate auth. This will produce a confusing TLS handshake failure at connection time instead of a clear error at setup time. Consider returning an error or at least logging a warning.

@Itx-Psycho0
Copy link
Copy Markdown
Contributor Author

Thanks for the detailed review @matejvasek! I see the issues, let me fix them:

  1. Wrong TLS path - will change to contexts/tls/<hash>/
  2. Precedence - will check env vars first, only use context if not set
  3. Will only apply context TLS when host came from context detection
  4. Will cache the config to avoid calling docker context inspect twice

Working on the fixes now!

Fixes based on @matejvasek's review:

Significant fixes:
1. Fixed TLS path: use contexts/tls/<hash>/ not contexts/meta/<hash>/
2. Fixed precedence: env vars (DOCKER_TLS_VERIFY) now override context
3. Context TLS only applies when host came from context detection
4. Cache context config to avoid calling 'docker context inspect' twice

Minor improvements:
5. Unexported dockerContextConfig (internal-only struct)
6. Removed redundant DOCKER_CONFIG passthrough (auto-inherited)
7. Added error logging for malformed certificates

The implementation now correctly:
- Checks DOCKER_TLS_VERIFY env var first
- Only uses context TLS when env vars are not set
- Only applies context TLS when host came from context
- Calls docker CLI once instead of twice
- Logs warnings for cert loading failures
Comment thread pkg/docker/docker_client.go Dismissed
@matejvasek
Copy link
Copy Markdown
Contributor

Review: feat: add TLS certificate support for Docker contexts

Branch: Itx-Psycho0/feat/docker-context-tls
Commits: 1af6e757b, 1277ea5df
Files: pkg/docker/docker_client.go, pkg/docker/docker_client_test.go


Overall

Good iteration. The core approach — building TLS config in-memory via newHttpClientFromContext(), caching context config to avoid double subprocess spawns, and respecting env var precedence — is sound. Most issues from the previous round have been addressed. Two items remain before this is mergeable.


What was fixed

  • Double subprocess spawn — fixed. contextConfig is fetched once in NewClient() and passed to newHttpClient(contextConfig).
  • Wrong fallback TLS path — fixed. Now uses contexts/tls/<hash>/ with a corrected comment.
  • Precedence inversion — fixed. newHttpClient() checks DOCKER_TLS_VERIFY first; context TLS is only tried when env vars are not set.
  • Context TLS applied when DOCKER_HOST set manually — fixed. contextConfig is only populated when the host came from context detection; when DOCKER_HOST is set via env var, it stays nil.
  • Exported type — fixed. dockerContextConfig is now unexported.
  • Redundant DOCKER_CONFIG passthrough — fixed. Replaced with a comment.
  • Silent error on bad cert/key — fixed. Now logs a warning to stderr.

Remaining issues

1. GetDockerContextHostFunc and getDockerContextHost() are now dead code

On upstream/main, NewClient() calls GetDockerContextHostFunc(). On this branch, NewClient() calls getDockerContextConfig() directly — GetDockerContextHostFunc is defined and exported but never called anywhere:

$ git grep 'GetDockerContextHostFunc' upstream/main
pkg/docker/docker_client.go: if contextHost := GetDockerContextHostFunc(); contextHost != "" {   # <-- used
pkg/docker/docker_client.go: var GetDockerContextHostFunc = getDockerContextHost

$ git grep 'GetDockerContextHostFunc' Itx-Psycho0/feat/docker-context-tls
pkg/docker/docker_client.go: var GetDockerContextHostFunc = getDockerContextHost                  # <-- defined but never called

This is a minor breaking change: any downstream code that mocked GetDockerContextHostFunc to inject test behavior will find their mock silently ignored. Two options:

  • Remove GetDockerContextHostFunc and getDockerContextHost() entirely if no external consumers depend on them.
  • Replace them with a GetDockerContextConfigFunc (or similar) mockable variable and use that in NewClient(), preserving the testability contract.

2. Test doesn't exercise the fallback TLS path

The fix to use contexts/tls/<hash>/ is correct, but the test always sets Storage.TLSPath explicitly via createDockerContextConfigWithTLS, so the fallback branch is never executed. Adding a test case where Storage.TLSPath is empty or "<IN MEMORY>" with certs placed at contexts/tls/<hash>/ would validate the fix actually works.


Nits

  • The warning on stderr (fmt.Fprintf(os.Stderr, "Warning: ...")) works but log.Printf would be more conventional Go. Not blocking.

@matejvasek
Copy link
Copy Markdown
Contributor

Also please add test for the previous TLS functionality using the envvars too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ok-to-test 🤖 Non-member PR verified by an org member that is safe to test. size/L 🤖 PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add TLS certificate support for Docker contexts

3 participants