Commit 396bc14 (parent 0833efa): first version of README (README.md, +90 lines)

<img alt="GitHub release" src="https://img.shields.io/github/release/huggingface/huggingface_hub.svg">
</a>
</p>

> **Do you have an open source ML library?**
> We're looking to partner with a small number of other cool open source ML libraries to provide model hosting + versioning.
> https://twitter.com/julien_c/status/1336374565157679104 https://twitter.com/mnlpariente/status/1336277058062852096
> Let us know if you're interested 😎

<br>

### ♻️ Partial list of implementations in third-party libraries:

- http://github.com/asteroid-team/asteroid [[initial PR 👀](https://github.com/asteroid-team/asteroid/pull/377)]
- https://github.com/pyannote/pyannote-audio [[initial PR 👀](https://github.com/pyannote/pyannote-audio/pull/549)]
- https://github.com/flairNLP/flair [[work-in-progress, initial PR 👀](https://github.com/flairNLP/flair/pull/1974)]

<br>

## Download files from the huggingface.co hub

Integration inside a library is super simple. We expose two functions, `hf_hub_url()` and `cached_download()`.

### `hf_hub_url`

`hf_hub_url()` takes:
- a model id (like `julien-c/EsperBERTo-small`),
- a filename (like `pytorch_model.bin`),
- and an optional git revision id (which can be a branch name, a tag, or a commit hash)

and returns the URL we'll use to download the actual file: `https://huggingface.co/julien-c/EsperBERTo-small/resolve/main/pytorch_model.bin`
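
As a sketch of the URL scheme (the `build_hub_url` helper below is hypothetical, not the library's implementation), the resolve URL is built from the model id, revision, and filename:

```python
def build_hub_url(model_id: str, filename: str, revision: str = "main") -> str:
    # Hypothetical helper mirroring the huggingface.co "resolve" URL scheme
    # described above; the real logic lives in hf_hub_url().
    return f"https://huggingface.co/{model_id}/resolve/{revision}/{filename}"


print(build_hub_url("julien-c/EsperBERTo-small", "pytorch_model.bin"))
# https://huggingface.co/julien-c/EsperBERTo-small/resolve/main/pytorch_model.bin
```

Passing a branch name, tag, or commit hash as `revision` simply swaps out the middle path segment.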

If you check out this URL's headers with an HTTP `HEAD` request (which you can do from the command line with `curl -I`) for a few different files, you'll see that:
- small files are returned directly
- large files (i.e. the ones stored through [git-lfs](https://git-lfs.github.com/)) are returned via a redirect to a CloudFront URL. CloudFront is a Content Delivery Network (CDN) that ensures downloads are as fast as possible from anywhere on the globe.

### `cached_download`

`cached_download()` takes the following parameters, downloads the remote file, stores it on disk (in a versioning-aware way), and returns its local file path.

Parameters:
- a remote `url`
- your library's name and version (`library_name` and `library_version`), which will be added to the HTTP requests' user-agent so that we can gather some usage stats.
- a `cache_dir`, which you can specify if you want to control where the files are cached on disk.

Check out the source code for all possible params (we'll create a real doc page in the future).
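
To illustrate the versioning-aware caching idea (this is a hypothetical sketch, not the library's actual implementation), one approach is to key each cache entry on the URL plus the file's ETag, so a new revision of the same file gets a fresh cache entry:

```python
import hashlib
import os
import urllib.request


def cached_download_sketch(url: str, cache_dir: str, etag: str = "") -> str:
    # Hypothetical helper: key the cache entry on both the URL and the
    # server-provided ETag, so a new revision of the same file does not
    # collide with the old cached copy.
    key = hashlib.sha256(f"{url}:{etag}".encode()).hexdigest()
    path = os.path.join(cache_dir, key)
    if not os.path.exists(path):
        # Cache miss: download the file to the computed path.
        urllib.request.urlretrieve(url, path)
    return path
```

On a cache hit the local path is returned immediately, without any network traffic.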

<br>

## Publish models to the huggingface.co hub

Uploading a model to the hub is super simple too:
- create a model repo directly from the website, at huggingface.co/new (models can be public or private, and are namespaced under either a user or an organization)
- clone it with git
- install [git-lfs](https://git-lfs.github.com/) with `git lfs install` if you haven't done so before
- add, commit, and push your files from git as you usually do

**We are intentionally not wrapping git too much, so that you can go on with the workflow you're used to and the tools you already know.**
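
The steps above can be sketched as follows (the repo URL is hypothetical; create your own at huggingface.co/new first):

```shell
# One-time git-lfs setup on your machine
git lfs install

# Clone the (hypothetical) repo you created on the website
git clone https://huggingface.co/your-username/your-model
cd your-model

# Add your model file, then commit and push as usual
cp /path/to/pytorch_model.bin .
git add pytorch_model.bin
git commit -m "Add model weights"
git push
```

Nothing here is specific to the hub: it's the plain git + git-lfs workflow against a remote that happens to live on huggingface.co.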

> 👀 To see an example of how we document the model sharing process in `transformers`, check out https://huggingface.co/transformers/model_sharing.html

### API utilities in `hf_api.py`

You don't need them for the standard publishing workflow. However, if you need a programmatic way of creating a repo, deleting it (`⚠️ caution`), or listing models from the hub, you'll find helpers in `hf_api.py`.

### `huggingface-cli`

Those API utilities are also exposed through a CLI:

```bash
huggingface-cli login
huggingface-cli logout
huggingface-cli whoami
huggingface-cli repo create
```

### Need to upload large (>5GB) files?

To upload large files (>5GB 🔥), you need to install the custom transfer agent for git-lfs, which is bundled in this package. The spec for LFS custom transfer agents is here:
https://github.com/git-lfs/git-lfs/blob/master/docs/custom-transfers.md

To install it, just run:

```bash
$ huggingface-cli lfs-enable-largefiles
```

This should be executed once for each model repo that contains a model file larger than 5GB. It's documented in the error message you get if you try to `git push` a 5GB file without enabling it first.

Finally, there's a `huggingface-cli lfs-multipart-upload` command, but that one is internal (called by lfs directly) and is not meant to be called by the user.

## Feedback (feature requests, bugs, etc.) is super welcome 💙💚💛💜♥️🧡
