hf cli: Extremely slow download of shards toward the end (or hanging)

### Describe the bug

```bash
$ hf version
1.1.6
```

When downloading a publicly available model, the download becomes extremely slow once the progress bar reaches 99-100%, while a small part of the shard still remains to be downloaded.

Whether I download all files at once in parallel or one by one, the bug still happens.

I tried with and without `HF_XET_HIGH_PERFORMANCE=1`.

The process is reported as sleeping:
```bash
$ cat /proc/<pid>/status
Name:   hf
Umask:  0022
State:  S (sleeping)
```
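To sample the state repeatedly instead of running `cat` by hand, a minimal sketch like the following could be used. The helper names `parse_proc_state` and `read_proc_state` are hypothetical, and the sketch assumes a Linux `/proc` filesystem.

```python
from pathlib import Path
from typing import Optional


def parse_proc_state(status_text: str) -> Optional[str]:
    """Return the value of the `State:` field, e.g. 'S (sleeping)', or None."""
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "State":
            return value.strip()
    return None


def read_proc_state(pid: int) -> Optional[str]:
    """Read /proc/<pid>/status (Linux only) and extract the process state."""
    return parse_proc_state(Path(f"/proc/{pid}/status").read_text())


# Sample text copied from the output above.
sample = "Name:   hf\nUmask:  0022\nState:  S (sleeping)\n"
print(parse_proc_state(sample))  # S (sleeping)
```

Note that `S` is an interruptible sleep, which is also the normal state of a process waiting on network I/O, so on its own it does not prove a hang.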

`hf` prints the following.

With `HF_XET_HIGH_PERFORMANCE: False` (I waited a long time before copying the progress bar; in that time it went from `49.5` to `49.9`):
```
Downloading (incomplete total...): 100%|███████▉| 49.9G/50.0G [12:30<02:19, 960kB/s]
```
With `HF_XET_HIGH_PERFORMANCE: True`:

```bash
# first hang
Downloading (incomplete total...):  99%|███████▉| 49.5G/50.0G [07:30<00:01, 304MB/s]
# after 5 minutes sleeping (according to /proc/<pid>/status)
Downloading (incomplete total...):  99%|███████▉| 49.6G/50.0G [12:30<07:44, 867kB/s]
# update 
Downloading (incomplete total...):  99%|███████▉| 49.7G/50.0G [14:10<06:47, 823kB/s]
# update
Downloading (incomplete total...):  99%|███████▉| 49.7G/50.0G [29:00<20:28, 218kB/s]
# update
Downloading (incomplete total...): 100%|████▉| 49.8G/50.0G [1:08:01<44:27, 75.5kB/s]
```
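The rate shown in the bar is a smoothed figure, so it can help to compute the average rate between two samples directly. The sketch below parses two of the progress lines above; `parse_bar` and `average_rate` are hypothetical helpers, and it assumes `G` means 10^9 bytes. Because the bar only shows one decimal, the result is a coarse estimate.

```python
import re
from typing import Tuple

# Matches e.g. "49.5G/50.0G [07:30<" or "49.8G/50.0G [1:08:01<".
BAR = re.compile(r"(?P<done>\d+(?:\.\d+)?)G/\d+(?:\.\d+)?G \[(?P<elapsed>[\d:]+)<")


def parse_bar(line: str) -> Tuple[float, int]:
    """Return (bytes downloaded, elapsed seconds) from one progress line."""
    match = BAR.search(line)
    if match is None:
        raise ValueError(f"not a progress line: {line!r}")
    seconds = 0
    for part in match.group("elapsed").split(":"):
        seconds = seconds * 60 + int(part)  # handles MM:SS and H:MM:SS
    # Assumption: "G" is a decimal gigabyte (1e9 bytes).
    return float(match.group("done")) * 1e9, seconds


def average_rate(first: str, second: str) -> float:
    """Average bytes per second between two progress lines."""
    bytes_1, t_1 = parse_bar(first)
    bytes_2, t_2 = parse_bar(second)
    if t_2 <= t_1:
        raise ValueError("second sample must be later than the first")
    return (bytes_2 - bytes_1) / (t_2 - t_1)


first = "Downloading (incomplete total...):  99%|███████▉| 49.5G/50.0G [07:30<00:01, 304MB/s]"
second = "Downloading (incomplete total...):  99%|███████▉| 49.6G/50.0G [12:30<07:44, 867kB/s"
print(f"{average_rate(first, second) / 1e3:.0f} kB/s")  # roughly 333 kB/s
```

Under these assumptions, the first two samples correspond to about 0.1G in 5 minutes, far below the 304MB/s seen before the slowdown.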

I can see that the file is being downloaded and stored in `.cache`:
```
47G     /mnt/disk1/kimi_gguf/.cache
0       /mnt/disk1/kimi_gguf/.check_for_update_done
144G    /mnt/disk1/kimi_gguf/UD-IQ2_M
38M     /mnt/disk1/kimi_gguf/cli
16K     /mnt/disk1/kimi_gguf/hub
16M     /mnt/disk1/kimi_gguf/xet
47G     /mnt/disk1/kimi_gguf/UD-IQ2_M/Kimi-K2-Thinking-UD-IQ2_M-00001-of-00008.gguf
47G     /mnt/disk1/kimi_gguf/UD-IQ2_M/Kimi-K2-Thinking-UD-IQ2_M-00002-of-00008.gguf
45G     /mnt/disk1/kimi_gguf/UD-IQ2_M/Kimi-K2-Thinking-UD-IQ2_M-00007-of-00008.gguf
6.7G    /mnt/disk1/kimi_gguf/UD-IQ2_M/Kimi-K2-Thinking-UD-IQ2_M-00008-of-00008.gguf
```
(Two shards were downloaded successfully in parallel without hf_xet, and two more one by one, also without hf_xet.)
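To check whether the partial file is still growing while the bar appears stuck, something like the following sketch could be run in a second terminal. The helper names are hypothetical, and it assumes the downloader appends to the `*.incomplete` file named in the debug log rather than preallocating it; if the file is preallocated, its size will not change and this check is not informative.

```python
import time
from pathlib import Path
from typing import Dict


def incomplete_sizes(root: str) -> Dict[str, int]:
    """Map each `*.incomplete` file under `root` to its current size in bytes."""
    return {
        str(path): path.stat().st_size
        for path in Path(root).rglob("*.incomplete")
        if path.is_file()
    }


def growth(root: str, interval_s: float = 60.0) -> Dict[str, int]:
    """Bytes gained by each `.incomplete` file over `interval_s` seconds."""
    before = incomplete_sizes(root)
    time.sleep(interval_s)
    after = incomplete_sizes(root)
    # Files that appeared during the interval count from zero.
    return {path: size - before.get(path, 0) for path, size in after.items()}
```

For example, `growth("/mnt/disk1/kimi_gguf/.cache")` would return the number of bytes each partial file gained in one minute, which can be compared with the rate printed by the progress bar.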

`HF_HUB_ENABLE_HF_TRANSFER` was never set because it is deprecated.

I also tried running outside of a tmux session in case tmux was the problem.

The internet connection seems fine while `hf download` is running slowly at the end:
```bash
$ speedtest --simple --no-upload
Ping: 13.908 ms
Download: 526.27 Mbit/s
```

### Reproduction

```bash
export HF_DEBUG=1 # only set to report this issue
export HF_HOME=/mnt/disk1/kimi_gguf
export HF_XET_HIGH_PERFORMANCE=1 # or 0, doesn't seem to matter
```

`hf download unsloth/Kimi-K2-Thinking-GGUF   --include "UD-IQ2_M/Kimi-K2-Thinking-UD-IQ2_M-00003*.gguf"   --local-dir /mnt/disk1/kimi_gguf/`
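To help tell whether the slowdown is specific to the CLI or to the Xet code path, the same download could be attempted through the Python API. This is an untested sketch, not something I ran: `reproduce` is a hypothetical wrapper, it does not set `HF_HOME` as the shell commands above do, and it assumes `HF_HUB_DISABLE_XET` is only honored if set before `huggingface_hub` is first imported in the process.

```python
import os
from fnmatch import fnmatch

REPO_ID = "unsloth/Kimi-K2-Thinking-GGUF"
PATTERN = "UD-IQ2_M/Kimi-K2-Thinking-UD-IQ2_M-00003*.gguf"
LOCAL_DIR = "/mnt/disk1/kimi_gguf/"


def reproduce(disable_xet: bool = False) -> str:
    """Download the shard via the Python API and return the local folder path."""
    if disable_xet:
        # Assumption: must be set before huggingface_hub is first imported.
        os.environ["HF_HUB_DISABLE_XET"] = "1"
    # Imported lazily so the environment variable above can take effect.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=[PATTERN],
        local_dir=LOCAL_DIR,
    )


# Sanity check: the pattern selects the shard named in the debug log.
shard = "UD-IQ2_M/Kimi-K2-Thinking-UD-IQ2_M-00003-of-00008.gguf"
print(fnmatch(shard, PATTERN))  # True
```

Calling `reproduce()` and `reproduce(disable_xet=True)` in two separate processes would show whether the tail of the download behaves differently with and without Xet.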

(I don't think the specific model repo matters, but I included it in case I'm wrong.)

### Logs

```shell
Request d377d236-8d48-487c-9714-ac7b01e51eb3: GET https://huggingface.co/api/models/unsloth/Kimi-K2-Thinking-GGUF/revision/main (authenticated: False)
Send: curl -X GET -H 'accept: */*' -H 'accept-encoding: gzip, deflate' -H 'connection: keep-alive' -H 'host: huggingface.co' -H 'user-agent: huggingface-cli/None; hf_hub/1.1.6; python/3.10.12' -H 'x-amzn-trace-id: d377d236-8d48-487c-9714-ac7b01e51eb3' -d https://huggingface.co/api/models/unsloth/Kimi-K2-Thinking-GGUF/revision/main
Downloading (incomplete total...): 0.00B [00:00, ?B/s]                             Request 4b77b23a-b687-4bc3-a744-f0de9d8f6f2a: HEAD https://huggingface.co/unsloth/Kimi-K2-Thinking-GGUF/resolve/1763c1383c0241abc444c88dd1b1549383fa2a0f/UD-IQ2_M/Kimi-K2-Thinking-UD-IQ2_M-00003-of-00008.gguf (authenticated: False)
Send: curl -X HEAD -H 'accept: */*' -H 'accept-encoding: identity' -H 'connection: keep-alive' -H 'host: huggingface.co' -H 'user-agent: huggingface-cli/None; hf_hub/1.1.6; python/3.10.12' -H 'x-amzn-trace-id: 4b77b23a-b687-4bc3-a744-f0de9d8f6f2a' -d https://huggingface.co/unsloth/Kimi-K2-Thinking-GGUF/resolve/1763c1383c0241abc444c88dd1b1549383fa2a0f/UD-IQ2_M/Kimi-K2-Thinking-UD-IQ2_M-00003-of-00008.gguf
Downloading 'UD-IQ2_M/Kimi-K2-Thinking-UD-IQ2_M-00003-of-00008.gguf' to '/mnt/disk1/kimi_gguf/.cache/huggingface/download/UD-IQ2_M/NDuTTget7Ici5CVSFSM_iO3-hBY=.64d043b0ffb3f4eb07bed045d734f6ded0f122f54912269898169a7b3a96c21a.incomplete'
Xet Storage is enabled for this repo. Downloading file from Xet Storage..
Request 2705e74e-84b9-4387-b586-104639d98944: GET https://huggingface.co/api/models/unsloth/Kimi-K2-Thinking-GGUF/xet-read-token/1763c1383c0241abc444c88dd1b1549383fa2a0f (authenticated: False)
Send: curl -X GET -H 'accept: */*' -H 'accept-encoding: gzip, deflate' -H 'connection: keep-alive' -H 'host: huggingface.co' -H 'user-agent: huggingface-cli/None; hf_hub/1.1.6; python/3.10.12' -H 'x-amzn-trace-id: 2705e74e-84b9-4387-b586-104639d98944' -d https://huggingface.co/api/models/unsloth/Kimi-K2-Thinking-GGUF/xet-read-token/1763c1383c0241abc444c88dd1b1549383fa2a0f
```

### System info

```shell
- huggingface_hub version: 1.1.6
- Platform: Linux-6.8.0-87-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Running in Google Colab Enterprise ?: No
- Token path ?: /mnt/disk1/kimi_gguf/token
- Has saved token ?: False
- Configured git credential helpers: 
- Installation method: hf_installer
- httpx: 0.28.1
- hf_xet: 1.2.0
- gradio: N/A
- tensorboard: N/A
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: /mnt/disk1/kimi_gguf/hub
- HF_ASSETS_CACHE: /mnt/disk1/kimi_gguf/assets
- HF_TOKEN_PATH: /mnt/disk1/kimi_gguf/token
- HF_STORED_TOKENS_PATH: /mnt/disk1/kimi_gguf/stored_tokens
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_DISABLE_XET: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10
- HF_XET_HIGH_PERFORMANCE: True
```

hf cli: Extremely slow download of shards toward the end (or hanging) #3580