Open
Description
The documentation currently does not clearly explain how Quilt versioning integrates with Amazon S3 versioning. This lack of clarity makes it difficult for users to understand the guarantees Quilt provides around package updates, reproducibility, and storage.
Clarifications Needed
- S3 Versioned Buckets
  - State explicitly that Quilt requires using S3 buckets with versioning enabled.
  - Without this, users may mistakenly assume Quilt handles all versioning internally.
- Package Revisions and S3 Version IDs
  - Document that each Quilt package revision maps to a specific S3 version ID for the underlying objects.
  - Clarify that a revision “locks” the package to those object versions for reproducibility.
- Updating Packages
  - Provide a concrete description of how updates work.
  - Example: when a user runs aws s3 sync (or the Quilt equivalent), files are updated in place, but the package revision remains tied to fixed S3 version IDs.
- General Applicability
  - Emphasize that this workflow applies broadly to all datasets managed in Quilt on S3, not just specific examples.
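
To make the behavior described above concrete, here is a toy in-memory model of it. This is not Quilt or boto3 code: `VersionedBucket`, `make_revision`, and `read_revision` are hypothetical names invented for illustration, and the `v1`, `v2` version IDs are stand-ins for the opaque IDs S3 actually assigns. It only sketches the idea that an in-place overwrite creates a new object version while an earlier revision keeps pointing at the old one.

```python
import itertools


class VersionedBucket:
    """Toy stand-in for an S3 bucket with versioning enabled (hypothetical)."""

    def __init__(self):
        self._versions = {}  # (key, version_id) -> bytes
        self._latest = {}    # key -> most recent version_id
        self._counter = itertools.count(1)

    def put(self, key, data):
        # Overwriting a key adds a new version; older versions are kept.
        version_id = f"v{next(self._counter)}"
        self._versions[(key, version_id)] = data
        self._latest[key] = version_id
        return version_id

    def get(self, key, version_id=None):
        # Without an explicit version ID, return the latest version.
        if version_id is None:
            version_id = self._latest[key]
        return self._versions[(key, version_id)]

    def latest(self, key):
        return self._latest[key]


def make_revision(bucket, keys):
    """Pin each key to the version ID that is current right now."""
    return {key: bucket.latest(key) for key in keys}


def read_revision(bucket, revision):
    """Read a revision's contents through its pinned version IDs."""
    return {key: bucket.get(key, vid) for key, vid in revision.items()}


bucket = VersionedBucket()
bucket.put("data.csv", b"a,b\n1,2\n")
rev1 = make_revision(bucket, ["data.csv"])

# In-place update of the same key, as a sync would do.
bucket.put("data.csv", b"a,b\n1,2\n3,4\n")
rev2 = make_revision(bucket, ["data.csv"])

old_contents = read_revision(bucket, rev1)  # still the original bytes
new_contents = read_revision(bucket, rev2)  # the updated bytes
```

If the docs confirm this model, the key point to state is that `rev1` stays readable after the update only because the bucket keeps old versions; on an unversioned bucket the overwrite would destroy the bytes `rev1` refers to.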
Suggested Improvement
Add a dedicated section (or expand an existing one) to explicitly walk through:
- Enabling versioning on the S3 bucket.
- Creating a new Quilt package revision.
- Showing how the revision ties back to S3 version IDs.
- Demonstrating how updates preserve reproducibility while allowing incremental changes.
This would make the versioning model much clearer and reduce confusion for new users.
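
As a possible starting point for the first and third steps of that walkthrough, here is a minimal sketch. `ensure_versioning` and `split_version_id` are hypothetical helper names, not part of Quilt. The first expects a client that behaves like `boto3.client("s3")` and uses only the `get_bucket_versioning` and `put_bucket_versioning` calls. The second assumes a revision records object locations as `s3://bucket/key?versionId=...` URIs; that format should be confirmed against the Quilt manifest documentation before it goes into the docs.

```python
from urllib.parse import parse_qs, urlparse


def ensure_versioning(s3_client, bucket):
    """Enable versioning on `bucket` if it is not already enabled.

    `s3_client` is expected to behave like boto3.client("s3").
    Returns True if versioning was switched on by this call.
    """
    status = s3_client.get_bucket_versioning(Bucket=bucket).get("Status")
    if status == "Enabled":
        return False
    s3_client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )
    return True


def split_version_id(physical_key):
    """Split 's3://bucket/key?versionId=X' into (bucket, key, version_id).

    Assumption: object locations are recorded in this URI form.
    version_id is None when the URI carries no versionId parameter.
    """
    parsed = urlparse(physical_key)
    version_ids = parse_qs(parsed.query).get("versionId", [None])
    return parsed.netloc, parsed.path.lstrip("/"), version_ids[0]
```

With real credentials this would be called as `ensure_versioning(boto3.client("s3"), "my-bucket")`, where `my-bucket` is a placeholder. Passing the client in as an argument keeps the helper easy to exercise with a stub, which may be useful for doc examples that should not touch a live bucket.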