WhitWaldo

Increasingly, while writing applications that use Dapr, I keep running into the need to persist data that's too large to reasonably store using Dapr: often because it would exhaust the memory resources of the sidecar, and frequently because it's simply too large to put in a key/value store.

It doesn't make much sense to rely exclusively on bindings for this, since a binding really just provides a Dapr-hosted alternative to the provider's SDK for something that we should increasingly have broad provider support for. "Object store" and "blob store" are heavily overloaded terms that mean all manner of things depending on the provider, and I think there's a fine opportunity to tackle them in the future. This proposal isn't that.

Here, I propose an API devoid of List and even Metadata operations so it can accommodate the broadest of possible storage providers and instead suggest that we increasingly lean on the SDKs to provide the state management instead of putting all that weight on the runtime and the components. It's a slim implementation that should be pretty easily added, but which would provide immediate benefits for popular Dapr features: Workflows and the new Agentic operations come to mind, but it would be beneficial for Actor and Cryptographic operations as well.
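
As a rough illustration of how small that surface could be, here is a minimal sketch in Python. The names `BinaryStore`, `LocalDirectoryStore`, `put`, `get`, and `delete` are hypothetical and do not come from the proposal or from any Dapr SDK; the directory-backed provider exists only to exercise the three operations and to show chunked copying that never holds the full payload in memory.

```python
import io
import os
import shutil
import tempfile
from typing import BinaryIO, Protocol


class BinaryStore(Protocol):
    """Hypothetical three-operation surface: no list, no metadata."""

    def put(self, key: str, data: BinaryIO) -> None: ...
    def get(self, key: str) -> BinaryIO: ...
    def delete(self, key: str) -> None: ...


class LocalDirectoryStore:
    """Toy provider backed by a directory (an assumption for illustration)."""

    def __init__(self, root: str, chunk_size: int = 64 * 1024) -> None:
        self._root = root
        self._chunk_size = chunk_size

    def _path(self, key: str) -> str:
        # Hex-encode the key so any key maps to a safe, flat file name.
        return os.path.join(self._root, key.encode("utf-8").hex())

    def put(self, key: str, data: BinaryIO) -> None:
        # Copy in fixed-size chunks so the payload is never fully buffered.
        with open(self._path(key), "wb") as out:
            shutil.copyfileobj(data, out, self._chunk_size)

    def get(self, key: str) -> BinaryIO:
        # The caller reads and closes the stream.
        # Raises FileNotFoundError if the key was never written.
        return open(self._path(key), "rb")

    def delete(self, key: str) -> None:
        # Deleting a missing key is treated as a no-op in this sketch.
        try:
            os.remove(self._path(key))
        except FileNotFoundError:
            pass


def roundtrip_demo() -> bytes:
    """Write, read back, and delete one payload using a temporary directory."""
    with tempfile.TemporaryDirectory() as root:
        store: BinaryStore = LocalDirectoryStore(root)
        store.put("workflow/payload-1", io.BytesIO(b"x" * 200_000))
        with store.get("workflow/payload-1") as stream:
            data = stream.read()
        store.delete("workflow/payload-1")
        return data
```

Because nothing in this interface enumerates keys or returns metadata, a provider only needs to be able to write, read, and remove bytes by key, which is what keeps the set of eligible backends broad.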

I look forward to your feedback!

@WhitWaldo WhitWaldo self-assigned this Aug 23, 2025
@WhitWaldo WhitWaldo added the enhancement New feature or request label Aug 23, 2025
…a few details, removed an extraneous bullet and generally cleaned it up some

Signed-off-by: Whit Waldo <[email protected]>
@WhitWaldo WhitWaldo changed the title Proposal: File Store Building Block Proposal: Binary Store Building Block Aug 24, 2025
Signed-off-by: Whit Waldo <[email protected]>

olitomlinson commented Sep 1, 2025

I'm massively in support, but how does this differ from the Object Store proposal? (Other than no support for metadata, anything else?)

@WhitWaldo
Author

> I'm massively in support, but how does this differ from the Object Store proposal? (Other than no support for metadata, anything else?)

There are a few differences:

  1. This proposal does not anticipate ever supporting a list operation, so that it can be more readily and broadly supported by providers without that capability. Most specific object and blob stores that come to mind do offer such a feature. Leaving it as a possible differentiator for a future object/blob store API, even though that would limit that API to a smaller set of matching providers, is a fine trade-off for doing without it altogether here. This proposal aims to be little more than a way to store large files that the current state store cannot handle, without all the other current state management add-ons.
  2. This does not purport to offer behaviors that are more specific to object and blob stores, such as performing operations on data through signed URLs. Again, that might be a fine feature for a future store that's more narrowly tailored to that sort of operation. This isn't that.
  3. As you indicated, object and blob stores often persist and maintain a lot of metadata. In my experience, blob stores mostly just store it, but object stores will often act on it (e.g. checksum validation). No need to deal with any of that here, including several of the points brought up in your linked discussion (e.g. Content-Length, Content-Hash, ETag, and other metadata being used for other extraneous purposes).
  4. We talked about my goal of avoiding having the SDKs deal with serialization here. An object or blob store often handles unstructured data in some format or another, and I think we should absolutely create more specialized data stores that support operations better suited to one type or another (which could certainly be useful from an agentic tooling and pluggable component perspective). Here, though, in the name of simplification and starting with a low threshold, I would like to make the developer responsible for ensuring that their data can be serialized and encoded, and have the API exclusively persist, retrieve, and delete that data, with no room for any other possibilities.
  5. Object and blob stores often support hierarchical or operational permission structures such as append-only writes, write-only permissions (e.g. no deletion via API), etc. That's also intentionally excluded from consideration here.

Put more simply: those other stores anticipate the developer wanting to do both simple and far more advanced operations with their data. I'd certainly like to build more specialized data stores to accommodate such requirements, but this proposal seeks to do away with any complexities and do one thing really well: manage the reading, writing, and deletion of large files in a resource-efficient and highly performant manner, which is not possible in today's Dapr state management.
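
Point 4 above (the developer owns serialization, the store only moves bytes) could look roughly like the following sketch. Everything here is an assumption for illustration: `Invoice`, `encode`, `decode`, and the in-memory `OpaqueStore` are hypothetical names, and JSON is just one encoding a developer might choose.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Invoice:
    """Example application type; the store never learns about it."""

    id: str
    total_cents: int


def encode(invoice: Invoice) -> bytes:
    # The developer chooses the format; JSON is only an example.
    return json.dumps(asdict(invoice), sort_keys=True).encode("utf-8")


def decode(raw: bytes) -> Invoice:
    return Invoice(**json.loads(raw.decode("utf-8")))


class OpaqueStore:
    """Hypothetical in-memory stand-in that accepts bytes and nothing else."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, key: str, data: bytes) -> None:
        # No serialization happens here: non-bytes input is rejected outright.
        if not isinstance(data, (bytes, bytearray)):
            raise TypeError("data must already be encoded as bytes")
        self._blobs[key] = bytes(data)

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def delete(self, key: str) -> None:
        self._blobs.pop(key, None)
```

A caller would write `store.put("invoices/inv-1", encode(invoice))` and later `decode(store.get("invoices/inv-1"))`; passing the `Invoice` object itself raises `TypeError`, which is the division of responsibility the list describes.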
