Skip to content

Conversation

@joelanford
Copy link
Member

@joelanford joelanford commented Aug 30, 2024

Description

See https://docs.google.com/document/d/1s-nfpJqecWGuriFIVXr2REbZLUB8N8QxLp1MQKLPhlE/edit#heading=h.x3tfh25grvnv

Reviewer Checklist

  • API Go Documentation
  • Tests: Unit Tests (and E2E Tests, if appropriate)
  • Comprehensive Commit Messages
  • Links to related GitHub Issue(s)

@joelanford joelanford requested a review from a team as a code owner August 30, 2024 03:08
@netlify
Copy link

netlify bot commented Aug 30, 2024

Deploy Preview for olmv1 ready!

Name Link
🔨 Latest commit 41f0aaf
🔍 Latest deploy log https://app.netlify.com/sites/olmv1/deploys/66e0e49a75d7440008172f95
😎 Deploy Preview https://deploy-preview-1194--olmv1.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@codecov
Copy link

codecov bot commented Aug 30, 2024

Codecov Report

Attention: Patch coverage is 74.64789% with 36 lines in your changes missing coverage. Please review.

Project coverage is 76.82%. Comparing base (d09f325) to head (41f0aaf).
Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
internal/rukpak/source/containers_image.go 72.30% 19 Missing and 17 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1194      +/-   ##
==========================================
+ Coverage   76.49%   76.82%   +0.32%     
==========================================
  Files          40       40              
  Lines        2336     2369      +33     
==========================================
+ Hits         1787     1820      +33     
+ Misses        392      383       -9     
- Partials      157      166       +9     
Flag Coverage Δ
e2e 58.21% <57.74%> (+0.63%) ⬆️
unit 52.93% <67.60%> (+0.53%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@joelanford joelanford force-pushed the containers-image branch 3 times, most recently from 8222c11 to 6558ab3 Compare August 30, 2024 14:11
@joelanford joelanford force-pushed the containers-image branch 7 times, most recently from 295179a to 7c8a6e0 Compare September 3, 2024 14:07
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 3, 2024
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 3, 2024
@joelanford joelanford force-pushed the containers-image branch 3 times, most recently from 0145e41 to df384cd Compare September 4, 2024 12:13
@LalatenduMohanty LalatenduMohanty self-requested a review September 4, 2024 14:16
everettraven
everettraven previously approved these changes Sep 4, 2024
}
unpacker := &source.ImageRegistry{
BaseCachePath: filepath.Join(cachePath, "unpack"),
CertPoolWatcher: certPoolWatcher,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loss of CertPoolWatcher: certPoolWatcher here...

Copy link
Contributor

@tmshort tmshort Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not seeing ContainersImageRegistry use certificate, so is the certPoolWatcher even necessary?
And if that's the case, this could lead to more code being deleted... but presumably it's used somewhere?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in BuildHTTPClient

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the containers/image source implementation lets us set the CA files on a context struct, which I think are reloaded on every pull.

And yeah, the certPoolWatcher should still be used to connect to catalogd.

Comment on lines 206 to 208
defaultAuthConfigPath := filepath.Join("/etc", "containers", "auth.json")
if _, err := os.Stat(defaultAuthConfigPath); err == nil {
unpacker.SourceContext.AuthFilePath = defaultAuthConfigPath
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to be a default for containers/image, interestingly. Researching this a bit more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Primary read/write file for auth seems to be ${XDG_RUNTIME_DIR}/containers/auth.json or $HOME/.config/containers/auth.json: https://github.com/containers/image/blob/main/docs/containers-auth.json.5.md#description

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this, we should probably revisit the RFC and highlight the default location for all the configs we may care about

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it turns out that file location is just not a thing (yet at least): containers/image#1746

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could follow the pattern that podman and skopeo do. If REGISTRY_AUTH_FILE is set in the environment, we set its value on this field. I'll switch the PR to do that for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took all of the auth stuff back out of the PR for now. We should discuss and handle separately.

This PR is now focused on containers/image and use of /etc/containers system configuration (as an unsupported/subject-to-change internal API)

@joelanford joelanford force-pushed the containers-image branch 7 times, most recently from ee7fe25 to 1506cea Compare September 10, 2024 11:59
Copy link
Contributor

@tmshort tmshort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments. Mostly surrounding the need for a new Issuer?

Makefile Outdated
Comment on lines 166 to 167


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: extra blank line.

Comment on lines +159 to +164
CGO_ENABLED=1 go test \
-tags '$(GO_BUILD_TAGS)' \
-cover -coverprofile ${ROOT_DIR}/coverage/unit.out \
-count=1 -race -short \
$(UNIT_TEST_DIRS) \
-test.gocoverdir=$(ROOT_DIR)/coverage/unit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if you want to go all out and put each flag on its own line?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤷‍♂️

I was mainly focused on "like flags" and "can I see it without side-to-side scrolling?"

Notes

  • I wanted UNIT_TEST_DIRS as close to the end as possible because I sometimes find myself copying/pasting the command line from a full run so that I can run unit tests in a specific package (and yes, I know we could make UNIT_TEST_DIRS overrideable, but in other projects that do that I always forget what the var name is, so I end up copy/pasting anyway)
  • I was forced to put test.govoerdir after the UNIT_TEST_DIRS because those flags are passed to the test binary.

test-e2e: KIND_CLUSTER_NAME := operator-controller-e2e
test-e2e: KUSTOMIZE_BUILD_DIR := config/overlays/e2e
test-e2e: GO_BUILD_FLAGS := -cover
test-e2e: E2E_REGISTRY_CERT_REF := Issuer/selfsigned-issuer
Copy link
Contributor

@tmshort tmshort Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be ClusterIssuer/olmv1-ca? Did you really want to create another CA/Issuer? Note that we already have a self-signed-issuer in the cert-manager namespace.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention here is to setup our standard e2e test with:

  • an image registry the serves with an untrusted certificate
  • a mounted /etc/containers/registries.conf file that instructs operator-controller to allow insecure connections to the image registry.

This setup proves that our image puller respect what shows up in /etc/containers/registries.conf (if the registries.conf volume mount is removed, the tests fail)

The extension developer e2e and the upgrade e2e are left using the standard olmv1-ca to:

  1. Verify that op-con works without a mounted /etc/containers/registry.conf
  2. Avoid issues with pre-upgrade setup which depends on using the unmodified manifest of the previous release (meaning we can't mount /etc/containers to allow op-con to trust an insecure registry)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I honestly don't love the current setup. It seems overly complex and brittle, and I do acknowledge that this PR makes it ever so slightly more complex and brittle.

I'd be in favor of looking at our e2e test code and fixtures holistically in order to do a bit of cleanup and refactoring, but IMO, in a separate PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as the intent was to have a separate trust chain for the registries, then ok. The current CAs are used for communication catalogd <-> operator-controller, and api-server -> catalogd (for webhooks)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the intent is to explicitly have a separate (untrusted) trust chain. The other places the CA were used are:

  • catalogd -> e2e image registry
  • operator-controller -> e2e image registry

Prior to this PR, the same exact image registry setup was used for every e2e (e2e, upgrade-e2e, and developer-extension-e2e).

With this PR, the vanilla e2e is explicitly changed to use the untrusted self-signed cert (while the other e2e variants remain unchanged)

Comment on lines +36 to +43
---
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
name: selfsigned-issuer
namespace: ${namespace}
spec:
selfSigned: {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why you want another issuer (a few comments on this)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +19 to +21
type Unrecoverable struct {
error
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an explicit check for this type of error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. It appears that we do not check for this error type in our reconciler. I brought this over from the now-deleted image_registry.go where it was used.

I do check for it in unit tests.

WDYT about addressing this in a follow-up?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on how many other changes will occur if you update... but at least it's isolated.

Copy link
Contributor

@tmshort tmshort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 12, 2024
@LalatenduMohanty
Copy link
Member

LalatenduMohanty commented Sep 12, 2024

/hold Adding a hold as @everettraven and me can use few hours to review and test the PR . Will remove the hold before EOD today.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 12, 2024
Copy link
Member

@LalatenduMohanty LalatenduMohanty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@LalatenduMohanty
Copy link
Member

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 12, 2024
@joelanford joelanford added this pull request to the merge queue Sep 12, 2024
Merged via the queue into operator-framework:main with commit 097d897 Sep 12, 2024
17 of 18 checks passed
@joelanford joelanford deleted the containers-image branch September 12, 2024 19:22
@skattoju skattoju mentioned this pull request Sep 25, 2024
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants