Commit b0b8629

SBrandeis and coyotte508 authored and committed
✨ New create_commit API (#888)
* ✨ New `commit_files` API
* ✨ commit_folder utility
* 🔊 Add debug logs
* 💄 Code quality
* 🚚 Move `hf_hub_url` to `.utils.endpoint_helpers`
  To avoid a circular import
* ♻ Refactoring
  - Rename `commit_files` -> `create_commit`
  - Refactor `upload_file`, `commit_folder` and `delete_file` to use `create_commit`
  - Add `create_pr` kwarg to create a Pull Request
  - Add support for File objects and `bytes` in `create_commit`
  - Move `hf_hub_url` to `utils.endpoint_helpers` to avoid a circular import
* 🧪 Add tests
* 🚚 `commit_folder` -> `upload_folder`
* Re-export CommitOperations
* Re-exports
* ⏪ Revert changes to testing_constants
* 💚 Post-rebase fixes
* ♻ Refactoring and docstrings
* Refacto
* 💚 Fix imports for 3.7
* 🔥 Remove dubious type hints
* Cleaner use of SliceFileObj
* 🚚 Regroup LFS utils together
* ✅ Add tests for sha_fileobj
* 🎨 More consistent naming
* ✅ Test UploadInfo
* ✅ Add unit tests for SliceFileObj ctx manager
* 💄 Code quality 🙈
* 🩹 Fix import in commands 🤡
* 🏗 Change staging URL
* ✅ Fix tests ?
* ✅ Use `hf_hub_download` API in tests
* 📝 Add docstring for `create_commit` API
* 🚧 Multi-threaded upload in `create_commit`
* ♻ Refactoring of multi-threaded upload
  Thanks @Wauplin !!
* 👌 Changes from code review
  - 📝 Add and update documentation
  - ♻ Remove TypedDict usage and introduce ad-hoc validation
  - 🧪 Re-add a removed test
  - 🎨 Simplify `CommitOperationAdd.b64content`
* ✅ Various small fixes
  Fix typo
  Fix URL prefix
* Add support for verify action
* ✅ Fix url prefix logic in post_lfs_batch_info
* ✨ Make SliceFileObj seekable
* 👌 Changes from code review
  - 💄 HF Hub -> Hub
  - 💄 Documentation conventions: single backticks
  - 💄 Documentation consistency / improvements
  - 🎨 `commit_summary` -> `commit_message`
  Co-authored-by: Omar Sansevioro <[email protected]>
* 💄 Testing conventions
* 📝 Some more documentation
* ✨ Update update_metadata method
* ↩ Backwards-compatibility & deprecation
* 🔒 Pass `token` to Hub requests
* ✅ Test binary file upload
* ✅ Fix verify call
* ✅ Also test uploads in private repos
* ✨ Support revisions with slashes in hf_hub_url
* ✅ Add a test for creating a PR with `create_commit`
* Change deprecation version to 0.10.0
* 🎨 Code quality
* Fixup merge
* Revert "Fixup merge"
  This reverts commit 3e82aad.
* ✅ Fix create_pr test
* ✅ Fix upload_folder test
* 🩹 Remove Authorization header when downloading a LFS file
* 💄 Code quality
* 📝 Deprecation period consistency
* ⚰ Remove _legacy_upload_file
* 🚚 `commit_api` -> `_commit_api`
* 🎨 API changes
  - `CommitOperationAdd.upload_info` -> `CommitOperationAdd._upload_info` (make private)
  - `CommitOperationAdd.fileobj` -> `CommitOperationAdd.as_file` (clearer)
  - More thorough documentation for `CommitOperationAdd.validate`
* ♻ `upload_file` & `upload_folder` consistency
  - Backwards-compatibility (don't change the semantic nor the return value)
  - Consistent behavior between the two methods
  - Add some tests
* ⏪ Revert move of hf_hub_url method
* ⏪ re-add FileSlice and deprecate it
* 🩹 Small fixes & dedup
* ✅ Fix revision and raise RuntimeEror
* 🔗 Correct syntax for hyperlinks
* ✅ Oopsie
* 🔥 Remove FileSlice once again
* 💄 Code quality
* 📝 Add some documentation about `create_commit`
* ♻ Make validation func private
* 🩹 Fix typo

Co-authored-by: Eliott Coyac <[email protected]>
Co-authored-by: Eliott Coyac <[email protected]>
1 parent 8e0566b commit b0b8629

12 files changed

+1907
-146
lines changed

docs/source/how-to-upstream.mdx

Lines changed: 38 additions & 1 deletion
@@ -5,6 +5,7 @@ Sharing your files and work is a very important aspect of the Hub. The `huggingf
 * Push files with a `commit` context manager.
 * Push files with the [`~Repository.push_to_hub`] function.
 * Upload very large files with [Git LFS](https://git-lfs.github.com/).
+* Push files without Git installed with [`HfApi`]

 Whenever you want to upload files to the Hub, you need to log in to your Hugging Face account:

@@ -113,4 +114,40 @@ For huge files (>5GB), you need to install a custom transfer agent for Git LFS:
 huggingface-cli lfs-enable-largefiles
 ```

-You should install this for each model repository that contains a model file. Once installed, you are now able to push files larger than 5GB.
+You should install this for each model repository that contains a model file. Once installed, you are now able to push files larger than 5GB.
+
+## Managing files in a repo without Git with the `create_commit` API
+
+`huggingface_hub` also offers a way to upload files to the Hub without Git installed on your system with the [`create_commit`] method of [`HfApi`].
+For example, if you want to upload two files and delete another file in a Hub repo:
+
+```py
+>>> from huggingface_hub import HfApi, CommitOperationAdd, CommitOperationDelete
+>>> api = HfApi()
+>>> operations = [
+...     CommitOperationAdd(path_in_repo="LICENSE.md", path_or_fileobj="~/repo/LICENSE.md"),
+...     CommitOperationAdd(path_in_repo="weights.h5", path_or_fileobj="~/repo/weights-final.h5"),
+...     CommitOperationDelete(path_in_repo="old-weights.h5"),
+... ]
+>>> api.create_commit(
+...     repo_id="lysandre/test-model",
+...     operations=operations,
+... )
+```
+
+[`create_commit`] uses the HTTP protocol to upload files to the Hub. It automatically takes care of uploading large files and binary files with the Git LFS protocol.
+There are currently two kind of operations supported by the [`create_commit`] method:
+
+1. [`CommitOperationAdd`] to upload a file to the Hub. If the file already exists, its content will be overwritten. It takes two arguments:
+   * `path_in_repo`: the path in the repository where the file should be uploaded
+   * `path_or_fileobj`: either a path to a file on your filesystem, or a file-like object. The content of the file to upload to the Hub.
+2. [`CommitOperationDelete`] to remove a file from a repository. It takes `path_in_repo` as an argument.
+
+Instead of [`create_commit`], you can also use the following convenience methods:
+* [`upload_file`] to upload a single file to a repo on the Hub
+* [`upload_folder`] to upload a local directory to a repo on the Hub
+* [`delete_file`] to delete a single file from a repo on the Hub
+* [`metadata_update`] to update a repo's metadata
+
+All these methods use the `create_commit` API under the hood.
+For a more detailed description, visit the [`hf_api`] documentation page.
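
The added documentation says that `create_commit` handles large and binary files through the Git LFS protocol. Git LFS identifies each object by its SHA-256 digest and its size in bytes, so a client needs both values before it can upload. The sketch below shows one way to compute them from a file object without loading the whole file into memory. It is an illustration only, not the code from this commit: the helper name `sha256_and_size` and the default chunk size are assumptions made for the example.

```python
import hashlib
import io


def sha256_and_size(fileobj, chunk_size=1024 * 1024):
    """Return (hex digest, byte count) for a binary file object.

    Hypothetical helper for illustration. The file is read in chunks of
    `chunk_size` bytes so that large files never sit fully in memory.
    """
    digest = hashlib.sha256()
    size = 0
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:  # an empty read means end of file
            break
        digest.update(chunk)
        size += len(chunk)
    return digest.hexdigest(), size


# Usage with an in-memory stand-in for a weights file.
payload = b"example weights" * 1000
oid, size = sha256_and_size(io.BytesIO(payload), chunk_size=4096)
print(oid, size)
```

Reading in fixed-size chunks keeps memory use bounded by `chunk_size`, which matters for the multi-gigabyte files discussed earlier in the same documentation page.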

src/huggingface_hub/__init__.py

Lines changed: 5 additions & 0 deletions
@@ -143,10 +143,14 @@ def __dir__():
     ],
     "file_download": ["cached_download", "hf_hub_download", "hf_hub_url"],
     "hf_api": [
+        "CommitOperation",
+        "CommitOperationAdd",
+        "CommitOperationDelete",
         "DatasetSearchArguments",
         "HfApi",
         "HfFolder",
         "ModelSearchArguments",
+        "create_commit",
         "create_repo",
         "dataset_info",
         "delete_file",
@@ -168,6 +172,7 @@ def __dir__():
         "unset_access_token",
         "update_repo_visibility",
         "upload_file",
+        "upload_folder",
         "whoami",
     ],
     "hub_mixin": ["ModelHubMixin", "PyTorchModelHubMixin"],
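
The dictionary touched by this diff maps submodule names to the public names they provide, which suggests a lazy re-export table, although the diff does not show how the package consumes it. Assuming that reading, the sketch below shows how such a table can drive on-demand imports through a module-level `__getattr__` (PEP 562). Everything here is hypothetical: `SUBMODULE_ATTRS`, `make_lazy_module`, and the standard-library modules stand in for the package's real table and submodules.

```python
import importlib
import types

# Hypothetical table in the same shape as the one in the diff:
# submodule name -> public names that submodule provides.
SUBMODULE_ATTRS = {
    "json": ["dumps", "loads"],
    "math": ["sqrt"],
}


def make_lazy_module(name, submodule_attrs):
    """Build a module object that imports a submodule on first attribute access."""
    attr_to_submodule = {
        attr: submodule
        for submodule, attrs in submodule_attrs.items()
        for attr in attrs
    }
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Only called when normal attribute lookup on the module fails.
        if attr not in attr_to_submodule:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}")
        value = getattr(importlib.import_module(attr_to_submodule[attr]), attr)
        setattr(module, attr, value)  # cache so later lookups skip __getattr__
        return value

    def __dir__():
        # Advertise the exported names even before they are imported.
        return sorted(attr_to_submodule)

    module.__getattr__ = __getattr__
    module.__dir__ = __dir__
    return module


lazy = make_lazy_module("lazy_demo", SUBMODULE_ATTRS)
print(lazy.sqrt(9.0))
print(dir(lazy))
```

Under this scheme, exposing a new name such as `create_commit` or `upload_folder` at the top level only requires adding a string to the table, which matches the shape of the change above.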
