
Conversation

stephanme (Member) commented Jul 18, 2025

@stephanme stephanme force-pushed the rfc-cc-storage-cli branch from 0941f6e to 8fc25e9 on July 21, 2025 11:53
@stephanme stephanme marked this pull request as ready for review July 21, 2025 11:54
@stephanme stephanme added the rfc CFF community RFC label Jul 21, 2025
@stephanme stephanme requested review from a team, rkoster, beyhan, Gerg and cweibel and removed request for a team July 21, 2025 11:56
@beyhan beyhan added the toc label Jul 21, 2025
@beyhan beyhan moved this from Inbox to In Progress in CF Community Jul 22, 2025
cweibel commented Jul 22, 2025

I like the proposed solution. No issues from me.


- create a new "Storage CLI" area
- add existing approvers of areas "VM deployment lifecycle (BOSH)" (FI) and "CAPI" (ARI) as initial approvers to this new area
- create a new repository `storage-cli` in this area with the goal to consolidate all existing bosh storage CLIs here
Member

Should this be storage-clis since it will host an arbitrary set of them, or is this proposal that there is one single storage-cli which is the interface to all supported backing blobstores?

Contributor

My understanding was that there will be one storage-cli which supports different providers. The provider in use would be determined by the config file, which is passed as a CLI argument.
E.g. for azure:

{
  "provider": "azure",
  "account_name": "<string> (required)",
  "account_key": "<string> (required)",
  "container_name": "<string> (required)",
  "environment": "<string> (optional, default: 'AzureCloud')"
}

So the provider field would be required and all other fields would be provider specific. S3, for example, has completely different parameters.
The format of the config can be discussed after the RFC is accepted.
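A minimal Go sketch of this "dispatch on provider" idea (purely illustrative; `parseConfig`, the struct names, and field handling are assumptions, not part of the RFC):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config mirrors the proposed file layout: only "provider" is common,
// everything else is provider specific. Rest keeps the raw JSON so the
// selected provider can decode its own fields.
type Config struct {
	Provider string          `json:"provider"`
	Rest     json.RawMessage `json:"-"`
}

// azureConfig holds the Azure-specific fields from the example above.
type azureConfig struct {
	AccountName   string `json:"account_name"`
	AccountKey    string `json:"account_key"`
	ContainerName string `json:"container_name"`
	Environment   string `json:"environment"`
}

func parseConfig(raw []byte) (*Config, error) {
	var c Config
	if err := json.Unmarshal(raw, &c); err != nil {
		return nil, err
	}
	if c.Provider == "" {
		return nil, fmt.Errorf("config: \"provider\" is required")
	}
	c.Rest = append([]byte(nil), raw...)
	return &c, nil
}

func main() {
	raw := []byte(`{"provider":"azure","account_name":"acc","account_key":"key","container_name":"blobs"}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		panic(err)
	}
	switch cfg.Provider {
	case "azure":
		var az azureConfig
		if err := json.Unmarshal(cfg.Rest, &az); err != nil {
			panic(err)
		}
		if az.Environment == "" {
			az.Environment = "AzureCloud" // documented default
		}
		fmt.Println(cfg.Provider, az.ContainerName, az.Environment)
	default:
		panic("unsupported provider: " + cfg.Provider)
	}
}
```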

Member

This idea makes sense.

I think it would be helpful for this to be a bit more structured so that provider determines the shape of a sub-section (maybe named credential), leaving the top level with keys that are consistent across providers (e.g. "bucket name")

For folks perusing, here are the various configs:
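One possible shape for the more structured variant (purely illustrative; the `credential` key and field names are assumptions, not from the RFC):

```json
{
  "provider": "s3",
  "bucket_name": "cc-resources",
  "credential": {
    "access_key_id": "<string>",
    "secret_access_key": "<string>",
    "region": "<string>"
  }
}
```

Here the top-level keys (`provider`, `bucket_name`) stay consistent across providers, while `provider` determines the shape of `credential`.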

Member

It sounds like the goal is a wholesale replacement of the existing four CLIs with a single unified CLI.

I am a little worried about this becoming a complete re-write, and needing to integrate this into the various bosh codebases in one big cut-over.

If we take the approach of consolidating into a single binary with an updated interface it makes #1253 (comment) less relevant.

The git history of the individual repos should be preserved, however the consolidation takes place.

Member

I think there's a lot of value in having the clis be distinct and simple vs trying to cram them all together with lots of different flags and configs. I do support consolidating the repos for sure.

Member Author

@rkoster brought up the idea to consolidate the CLIs in one repo. Arguments were:

  • code reuse: cmd line parsing, config handling (the blobstore access itself probably won't allow for any reuse)
  • easy refactoring and consistency across all CLIs
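The reuse argument could look roughly like this: a common interface with shared command dispatch, and one implementation per provider (a sketch only; the `Blobstore` interface, its methods, and the in-memory stand-in are assumptions, not the actual CLI code):

```go
package main

import (
	"fmt"
	"os"
)

// Blobstore is a hypothetical common interface; each provider (azure, s3,
// gcs, alicloud) would supply its own implementation, while command-line
// parsing and config handling are shared across all of them.
type Blobstore interface {
	Put(src, dst string) error
	Get(src, dst string) error
	Delete(key string) error
	Exists(key string) (bool, error)
}

// inMemoryStore stands in for a real provider so the sketch is runnable.
type inMemoryStore struct{ objects map[string][]byte }

func (s *inMemoryStore) Put(src, dst string) error { s.objects[dst] = []byte(src); return nil }
func (s *inMemoryStore) Get(src, dst string) error {
	if _, ok := s.objects[src]; !ok {
		return fmt.Errorf("not found: %s", src)
	}
	return nil
}
func (s *inMemoryStore) Delete(key string) error          { delete(s.objects, key); return nil }
func (s *inMemoryStore) Exists(key string) (bool, error)  { _, ok := s.objects[key]; return ok, nil }

// run is the shared command dispatcher used for every provider.
func run(store Blobstore, args []string) error {
	if len(args) == 0 {
		return fmt.Errorf("usage: storage-cli <command> [args]")
	}
	switch args[0] {
	case "put":
		if len(args) != 3 {
			return fmt.Errorf("usage: put <src> <dst>")
		}
		return store.Put(args[1], args[2])
	case "delete":
		if len(args) != 2 {
			return fmt.Errorf("usage: delete <key>")
		}
		return store.Delete(args[1])
	case "exists":
		if len(args) != 2 {
			return fmt.Errorf("usage: exists <key>")
		}
		ok, err := store.Exists(args[1])
		fmt.Println(ok)
		return err
	default:
		return fmt.Errorf("unknown command %q", args[0])
	}
}

func main() {
	store := &inMemoryStore{objects: map[string][]byte{}}
	if err := run(store, os.Args[1:]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```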

Having the old CLIs and the new CLI side by side for some time decouples development, for better and for worse. From the CC point of view we would start the implementation with Azure until it runs in production. This ensures that we get the interface and the needed blobstore operations right (e.g. the POC allows resource matching from a functional point of view, but it likely won't perform, so we may need a CLI command that supports resource matching for a list of blobs). Once the Azure CLI has stabilised, we would go for the remaining IaaS. At least this was the plan.

I don't see a blocker for adding the new blobstore cmds in the existing bosh-azure-storage-cli in a compatible way that doesn't break bosh. I understood that existing CI would check this compatibility which is good.

I'm open to different ways forward, whatever fits best:

  1. new repo, copy bosh-azure-storage-cli, extend for CC, same for other IaaS, bosh switches to new storage-cli when ready

    • the approach currently proposed in the RFC
    • development of CC commands is decoupled, no risk on bosh
    • higher effort for later migration from old to new CLI in bosh, risk that incompatibilities sneak in unnoticed (could be mitigated by additional CI validation in bosh)
  2. new repo, consolidate first, then extend for CC

    • consolidate CLIs first into a new storage-cli[s] repo (e.g. using the approach proposed below) without adding new commands needed by CC, bosh switches to the new, consolidated CLI asap
    • development of CC commands gets initially delayed a bit but might benefit from the now possible code reuse and easier refactoring
    • no late surprises
  3. no new repo, add new cmds in existing CLIs

For options 2 and 3 we should move the existing CLIs into the new area so that CAPI approvers can work on them.

Member

@aramprice aramprice Jul 28, 2025

My vote would be for #2 above. I do not have strong feelings about whether the CLIs are unified (though I think I lean towards not unifying them) and I do feel strongly that they should be in the same repository / pipeline.

The option to consolidate the CLIs remains open as a possible incremental step, or even a build-time option ¯\_(ツ)_/¯.

Member Author

I changed to weaker wording for the CLI consolidation, i.e. leaving the option open in the RFC as an implementation detail. We should then move the 4 existing storage CLIs into the new area and go from there.

A pragmatic way forward could be to implement the missing operations in bosh-azure-storage-cli, validate the approach with adapted CC (CATS, perf tests), then consolidate into one repo and implement for the remaining IaaS. But we can sort this out in a WG meeting.

Member

I'm all for moving the CLIs to an area that's accessible to the CAPI folks.

I would like the CLIs to be consolidated into a single repo as part of this move.

Comment on lines 46 to 48
- copy one existing bosh storage cli into this new repository as starting point (most likely [bosh-azure-storage-cli](https://github.com/cloudfoundry/bosh-azure-storage-cli))
- setup CI, implement missing commands and configuration parameters
- continue with the remaining bosh storage CLIs
Member

I am worried about consolidating these repositories step-wise. It seems like there will be a long-ish period of time where the original CLI repos exist and are tested with automation while the consolidated code continues to evolve, eventually leading to a big-bang integration point where bosh, bosh-cli and bosh-agent will all need to cut over and integrate the new CLI (CLIs?).

I would prefer if we went the route of consolidating the repositories (and CI pipelines) first, keeping those green, and updating the current consumers of these CLIs so that any changes are continuously integrated and tested.

The repositories used for windows-stemcell creation were recently consolidated in this manner. Relevant commit points can be seen here:

Roughly the process was:

  • Modify the repo-to-be-consolidated so that its directory structure matches its new location
    • Example: with greenhouse-ci the contents of the repo were nested under ci/ so that they could exist under bosh-windows-stemcell-builder/ci/ on a branch
  • Re-configure pipelines to consume this branch, and validate changes work in CI
  • Merge the source repository into the target repository
  • Re-configure pipelines to consume the same code from new repository, and validate changes work in CI
  • Archive the old repository, adding a README describing where the code went
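The restructure-then-merge step above can be demonstrated with two throwaway repos (a sketch only; the repo and file names are placeholders, not the real storage-cli repositories):

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
cd "$work"

# Repo to be consolidated, with its content already nested under ci/
# (the restructuring step done on a branch in the real process).
git init -q -b main source
cd source
git config user.email ci@example.com && git config user.name ci
mkdir ci && echo "task" > ci/task.yml
git add ci && git commit -q -m "nest content under ci/"
cd ..

# Target repository that absorbs the source repo, history included.
git init -q -b main target
cd target
git config user.email ci@example.com && git config user.name ci
git commit -q --allow-empty -m "existing target history"
git fetch -q ../source main
git merge -q --allow-unrelated-histories -m "merge source repo" FETCH_HEAD
```

After the merge, `git log` in the target repo shows the source repo's commits alongside its own, so no history is lost when the old repository is archived.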

rkoster
rkoster previously approved these changes Jul 29, 2025
- move existing bosh storage CLIs into new area
- keep the consolidation process open as implementation detail
beyhan (Member) commented Jul 29, 2025

During the TOC meeting on the 29th of July we agreed to start the FCP for this, with the goal of accepting the RFC. Please add your reviews and comments so that we can decide next week.


These storage CLIs are implemented in Go and use the respective provider golang SDKs that are well supported for the foreseeable future.

Cloud Controller shall implement a new blobstore type `storage-cli` that uses the mentioned storage CLIs for blobstore operations. Missing functionality needed by the Cloud Controller shall be added to the storage CLIs in a compatible way:
Member

@Gerg Gerg Aug 5, 2025

Member Author

I know of at least 2 more special connection parameters that we configure:

  • Google: Uniform Bucket Access
  • Azure: timeout for block writes

I see two strategies how to deal with it:

  1. add connection parameters as needed/requested by operators over time
  2. add a generic config param list similar to fog that gets somehow mapped to the connection configuration in the used golang SDK per IaaS

Option 1 is simpler, and I would prefer it if there is a way to keep the fog implementation for some transition period. If we have to remove fog due to a Ruby version update, then option 2 might be the way to go, provided we see at least an indication from operators that they can't live with the reduced set of configuration parameters.

I would postpone a decision for option 2 until we have a storage-cli provider running for at least one IaaS in (near-)production.
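For illustration, option 2 might look something like this, with untyped options passed through to the provider's SDK (a hypothetical shape mirroring fog's connection options; the key names are assumptions):

```json
{
  "provider": "google",
  "bucket_name": "cc-droplets",
  "connection_options": {
    "uniform_bucket_level_access": true
  }
}
```

The CLI would map each entry of `connection_options` to the corresponding setting in the provider's Go SDK, so new tuning knobs would not require a schema change.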

Member

We agreed during the TOC that this is not blocking and could be decided during the implementation phase.

@beyhan beyhan merged commit 7213843 into main Aug 5, 2025
1 check passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in CF Community Aug 5, 2025
@beyhan beyhan deleted the rfc-cc-storage-cli branch August 5, 2025 14:56
beyhan added a commit that referenced this pull request Aug 5, 2025
beyhan added a commit that referenced this pull request Aug 6, 2025
beyhan added a commit to cloudfoundry/concourse-infra-for-fiwg that referenced this pull request Aug 8, 2025
This is the result of:
rfc: cloudfoundry/community#1253
and: cloudfoundry/community#1275

There is a new storage-cli area which will own all the storage cli repos
8 participants