[RFC] Cloud Controller Blobstore Type: storage-cli #1253
I like the proposed solution. No issues from me.
- create a new "Storage CLI" area
- add existing approvers of areas "VM deployment lifecycle (BOSH)" (FI) and "CAPI" (ARI) as initial approvers to this new area
- create a new repository `storage-cli` in this area with the goal to consolidate all existing bosh storage CLIs here
Should this be `storage-clis`, since it will host an arbitrary set of them, or is the proposal that there is one single `storage-cli` which is the interface to all supported backing blobstores?
My understanding was that there will be one `storage-cli` which supports different providers. Which provider is used could be determined via the config file, which needs to be provided as a CLI argument.
E.g. for Azure:

```json
{
  "provider": "azure",
  "account_name": "<string> (required)",
  "account_key": "<string> (required)",
  "container_name": "<string> (required)",
  "environment": "<string> (optional, default: 'AzureCloud')"
}
```
So the `provider` field would be required and all other fields would be provider-specific. S3 for example has completely different parameters (example).
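To make that concrete, here is a minimal, purely illustrative sketch (not the proposed implementation; struct and field names are assumptions loosely based on the existing bosh-azure-storage-cli config) of decoding the `provider` field first and the provider-specific fields in a second pass:

```go
// Hypothetical sketch only: decode "provider" first, then decode the
// provider-specific fields in a second pass. Field names are assumptions.
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"os"
)

type azureConfig struct {
	AccountName   string `json:"account_name"`
	AccountKey    string `json:"account_key"`
	ContainerName string `json:"container_name"`
	Environment   string `json:"environment"`
}

func main() {
	if len(os.Args) < 2 {
		log.Fatal("usage: <cli> <config.json>")
	}
	raw, err := os.ReadFile(os.Args[1]) // config file path passed as CLI argument
	if err != nil {
		log.Fatal(err)
	}

	// First pass: only determine which provider the config is for.
	var head struct {
		Provider string `json:"provider"`
	}
	if err := json.Unmarshal(raw, &head); err != nil {
		log.Fatal(err)
	}

	// Second pass: decode the provider-specific fields.
	switch head.Provider {
	case "azure":
		var cfg azureConfig
		if err := json.Unmarshal(raw, &cfg); err != nil {
			log.Fatal(err)
		}
		fmt.Println("using azure container:", cfg.ContainerName)
	default:
		log.Fatalf("unsupported provider %q", head.Provider)
	}
}
```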
The format of the config can be discussed after the RFC is accepted.
This idea makes sense.
I think it would be helpful for this to be a bit more structured, so that `provider` determines the shape of a sub-section (maybe named `credential`), leaving the top level with keys that are consistent across providers (e.g. "bucket name").
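As a purely illustrative sketch of that shape (the key names, `bucket_name`, and the credential fields are assumptions, not part of the proposal), the provider-specific part could be decoded lazily from a raw sub-section:

```go
// Hypothetical sketch: common keys at the top level, provider-specific
// credentials in a sub-section whose shape is decided by "provider".
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

type Config struct {
	Provider   string          `json:"provider"`
	BucketName string          `json:"bucket_name"` // consistent across providers
	Credential json.RawMessage `json:"credential"`  // shape depends on Provider
}

type azureCredential struct {
	AccountName string `json:"account_name"`
	AccountKey  string `json:"account_key"`
}

func main() {
	raw := []byte(`{
	  "provider": "azure",
	  "bucket_name": "my-container",
	  "credential": {"account_name": "acct", "account_key": "secret"}
	}`)

	var cfg Config
	if err := json.Unmarshal(raw, &cfg); err != nil {
		log.Fatal(err)
	}

	if cfg.Provider == "azure" {
		var cred azureCredential
		if err := json.Unmarshal(cfg.Credential, &cred); err != nil {
			log.Fatal(err)
		}
		fmt.Println("bucket:", cfg.BucketName, "account:", cred.AccountName)
	}
}
```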
For folks perusing, here are the various configs:
- https://github.com/cloudfoundry/bosh-azure-storage-cli/blob/main/config/config.go#L29-L34
- https://github.com/cloudfoundry/bosh-gcscli/blob/main/config/config.go#L28-L51
- https://github.com/cloudfoundry/bosh-ali-storage-cli/blob/main/config/config.go#L8-L13
- https://github.com/cloudfoundry/bosh-s3cli/blob/main/config/config.go#L12-L32
It sounds like the goal is a wholesale replacement of the existing four CLIs with a single unified CLI.
I am a little worried about this becoming a complete rewrite, and needing to integrate this into the various bosh codebases in one big cut-over.
If we take the approach of consolidating into a single binary with an updated interface, it makes #1253 (comment) less relevant.
The git history of the individual repos should be preserved however the consolidation takes place, though.
I think there's a lot of value in having the CLIs be distinct and simple vs. trying to cram them all together with lots of different flags and configs. I do support consolidating the repos for sure.
@rkoster brought up the idea to consolidate the CLIs in one repo. Arguments were:
- code reuse: command line parsing, config handling (the blobstore access itself probably won't allow for any reuse)
- easy refactoring and consistency across all CLIs

Having the old CLIs and the new CLI side by side for some time decouples development, for better and for worse. From the CC point of view we would start the implementation with Azure until it runs in production. This ensures that we get the interface and the needed blobstore operations right (e.g. the POC allows resource matching from a functional point of view, but it likely won't perform, so we may need a CLI command that supports resource matching for a list of blobs; see the sketch below). Once the Azure CLI has stabilised, we would go for the remaining IaaS providers. At least this was the plan.
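For illustration only, one possible shape for such a bulk resource-matching command (the command's I/O format and the helper are hypothetical, not something the existing CLIs provide); the point is simply to check many digests in one CLI invocation instead of spawning one process per blob:

```go
// Hypothetical sketch of a bulk "resource matching" command.
package main

import (
	"encoding/json"
	"log"
	"os"
)

// blobExists stands in for a real provider call
// (e.g. an existence check against the container/bucket).
func blobExists(key string) bool {
	return false // placeholder
}

func main() {
	// Read candidate keys (e.g. digests of app packages) from stdin as JSON.
	var keys []string
	if err := json.NewDecoder(os.Stdin).Decode(&keys); err != nil {
		log.Fatal(err)
	}

	// Print only the keys that already exist, so the Cloud Controller can
	// skip re-uploading those bits.
	existing := make([]string, 0, len(keys))
	for _, k := range keys {
		if blobExists(k) {
			existing = append(existing, k)
		}
	}
	if err := json.NewEncoder(os.Stdout).Encode(existing); err != nil {
		log.Fatal(err)
	}
}
```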
I don't see a blocker for adding the new blobstore commands to the existing bosh-azure-storage-cli in a compatible way that doesn't break bosh. I understood that the existing CI would check this compatibility, which is good.
I'm open to different ways forward, whatever fits best:
1. New repo: copy bosh-azure-storage-cli, extend it for CC, do the same for the other IaaS providers, bosh switches to the new storage-cli when ready
   - the approach currently proposed in the RFC
   - development of CC commands is decoupled, no risk for bosh
   - higher effort for the later migration from old to new CLI in bosh, risk that incompatibilities sneak in unnoticed (could be mitigated by additional CI validation in bosh)
2. New repo: consolidate first, then extend for CC
   - consolidate the CLIs first into a new `storage-cli[s]` repo (e.g. using the approach proposed below) without adding the new commands needed by CC; bosh switches to the new, consolidated CLI asap
   - development of CC commands gets initially delayed a bit but might benefit from the now possible code reuse and easier refactoring
   - no late surprises
3. No new repo: add the new commands in the existing CLIs
   - this is the approach we used for our POC: "POC: bosh-azure-storage-cli based blobstore client" (cloud_controller_ng#4397) and https://github.com/johha/bosh-azure-storage-cli/tree/capi
   - no late surprises as the existing CLIs are validated in CI
   - more code duplication, slight risk of inconsistencies between IaaS layers
   - consolidation is still possible at a later point in time

For options 2 and 3 we should move the existing CLIs into the new area so that CAPI approvers can work on them.
My vote would be for option 2 above. I do not have strong feelings about whether the CLIs are unified (though I think I lean towards not unifying them), and I do feel strongly that they should be in the same repository / pipeline.
The option to consolidate the CLIs remains open as a possible incremental step, or even a build-time option ¯\_(ツ)_/¯.
I changed the RFC to weaker wording for the CLI consolidation, i.e. leaving the option open as an implementation detail. We should then move the 4 existing storage CLIs into the new area and go from there.
A pragmatic way forward could be to implement the missing operations in bosh-azure-storage-cli, validate the approach with an adapted CC (CATS, perf tests), then consolidate into one repo and implement for the remaining IaaS providers. But we can sort this out in a WG meeting.
I'm all for moving the CLIs to an area that's accessible to the CAPI folks.
I would like the CLIs to be consolidated into a single repo as part of this move.
- copy one existing bosh storage cli into this new repository as starting point (most likely [bosh-azure-storage-cli](https://github.com/cloudfoundry/bosh-azure-storage-cli))
- setup CI, implement missing commands and configuration parameters
- continue with the remaining bosh storage CLIs
I am worried about consolidating these repositories step-wise. It seems like there will be a long-ish period of time where the original CLI repos exist and are tested with automation while the consolidated code continues to evolve, eventually leading to a big-bang integration point where `bosh`, `bosh-cli` and `bosh-agent` will all need to cut over and integrate the new CLI (CLIs?).
I would prefer if we went the route of consolidating the repositories (and CI pipelines) first, keeping those green, and updating the current consumers of these CLIs so that any changes are continuously integrated and tested.
The repositories used for windows-stemcell creation were recently consolidated in this manner. Relevant commit points can be seen here:
- cloudfoundry/bosh-windows-stemcell-builder@2e38c2689
- cloudfoundry/bosh-windows-stemcell-builder@87f9ed7a0
- cloudfoundry/bosh-windows-stemcell-builder@6e4ef50bd
- cloudfoundry/bosh-windows-stemcell-builder@974659b56
Roughly the process was:
- Modify the repo-to-be-consolidated so that its directory structure matches its new location
  - Example: with `greenhouse-ci` the contents of the repo were nested under `ci/` so that they could exist under `bosh-windows-stemcell-builder/ci/` on a branch
- Re-configure pipelines to consume this branch, and validate changes work in CI
- Merge the source repository into the target repository
- Re-configure pipelines to consume the same code from the new repository, and validate changes work in CI
- Archive the old repository, adding a `README` describing where the code went
- move existing bosh storage CLIs into the new area
- keep the consolidation process open as an implementation detail

We plan to start the FCP for this during the TOC meeting on the 29th of July, with the goal of accepting the RFC. Please add your review and comments so that we can decide next week.
These storage CLIs are implemented in Go and use the respective provider golang SDKs that are well supported for the foreseeable future.

Cloud Controller shall implement a new blobstore type `storage-cli` that uses the mentioned storage CLIs for blobstore operations. Missing functionality needed by the Cloud Controller shall be added to the storage CLIs in a compatible way:
CAPI supports sending arbitrary connection parameters to Fog:
- https://github.com/cloudfoundry/capi-release/blob/9f69a02ef292c1e594f8da718ca524f378e573ae/jobs/cloud_controller_ng/spec#L553
- https://docs.cloudfoundry.org/deploying/common/cc-blobstore-config.html#fog-aws-creds:~:text=(-,Optional)%20Provide%20additional%20configuration%20through%20the,hash%2C%20which%20is%20passed%20through%20to%20the%20Fog%20gem.,-Fog%20with%20AWS
- https://github.com/cloudfoundry/cloud_controller_ng/blob/bb92b9ef3c0dc9808e9945c04848561aab9b9def/lib/cloud_controller/blobstore/fog/fog_client.rb#L202-L208
Operators could theoretically be configuring stuff in there that isn't supported by storage-cli (I haven't dug into the Fog library to see what all the connection options are).
I know of at least 2 more special connection parameters that we configure:
- Google: Uniform Bucket Access
- Azure: timeout for block writes
I see two strategies for dealing with this:
- add connection parameters as needed/requested by operators over time
- add a generic config param list, similar to fog, that somehow gets mapped to the connection configuration of the Go SDK used per IaaS

Option 1 is simpler, and I would prefer it if there is a way to keep the fog implementation for some transition period. If we have to remove fog due to a Ruby version update, then option 2 might be the way to go if we see at least an indication from operators that they can't live with the reduced set of configuration params.
I would postpone a decision for option 2 until we have a storage-cli provider running for at least one IaaS in (near-)production.
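For what option 2 could look like, here is a minimal, purely hypothetical sketch (the `connection_options` key and the timeout mapping are illustrative assumptions, not an agreed interface): each provider implementation decides which options it understands and how they translate to its SDK settings.

```go
// Hypothetical sketch of a free-form connection_options block that each
// provider maps onto its own client settings. Keys are examples only.
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"time"
)

type Config struct {
	Provider          string                     `json:"provider"`
	ConnectionOptions map[string]json.RawMessage `json:"connection_options"`
}

func main() {
	raw := []byte(`{
	  "provider": "azure",
	  "connection_options": {"block_write_timeout_seconds": 120}
	}`)

	var cfg Config
	if err := json.Unmarshal(raw, &cfg); err != nil {
		log.Fatal(err)
	}

	// Unknown keys could be rejected or ignored; known keys are applied to
	// the provider's client configuration (here: a plain HTTP client timeout
	// as a stand-in for the real SDK option).
	client := &http.Client{}
	if v, ok := cfg.ConnectionOptions["block_write_timeout_seconds"]; ok {
		var secs int
		if err := json.Unmarshal(v, &secs); err != nil {
			log.Fatal(err)
		}
		client.Timeout = time.Duration(secs) * time.Second
	}
	fmt.Println("client timeout:", client.Timeout)
}
```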
We agreed during the TOC that this is not blocking and could be decided during the implementation phase.
Assigning number 0043 to RFC from PR #1253
This is the result of RFC cloudfoundry/community#1253 and cloudfoundry/community#1275. There is a new storage-cli area which will own all the storage CLI repos.