Hello, and I apologize for filing multiple issues in a single day.
When pulling good-sized models from Hugging Face, it is very easy to hit 429 Too Many Requests, especially on unauthenticated fetches.
Ideally, docker model pull would check the environment for HF_TOKEN and, when it is set, pass it along as a Bearer token on requests to Hugging Face.
For reference, this is the error I get:
writing blob: get blob contents: GET https://huggingface.co/v2/unsloth/kimi-k2-instruct-0905-gguf/blobs/sha256:c4cb7c5746e236dc288cb74da606d8f49509eaa4df21b853e6476c942726727c: unexpected status code 429 Too Many Requests: {"error":"We had to rate limit your IP (X.X.X.X). To continue using our service, create a HF account or login to your existing account, and make sure you pass a HF_TOKEN if you're using the API."}
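
To illustrate the behavior I am asking for, here is a minimal sketch in Python. It is not how docker model pull is implemented; the function name build_blob_request and the example URL are hypothetical, and I am assuming the blob endpoint accepts a standard "Authorization: Bearer <token>" header, as the Hugging Face API does. It only builds the request and does not send it.

```python
import os
import urllib.request
from typing import Mapping, Optional


def build_blob_request(
    url: str, env: Optional[Mapping[str, str]] = None
) -> urllib.request.Request:
    """Build a GET request, adding a Bearer token when HF_TOKEN is set.

    Hypothetical helper for illustration only. `env` defaults to the
    process environment; pass a dict to make the behavior testable.
    """
    env = os.environ if env is None else env
    request = urllib.request.Request(url, method="GET")

    # Treat an empty or whitespace-only HF_TOKEN the same as an unset one.
    token = env.get("HF_TOKEN", "").strip()
    if token:
        # Assumption: the endpoint honors a standard Bearer authorization header.
        request.add_header("Authorization", f"Bearer {token}")
    return request


if __name__ == "__main__":
    # Placeholder URL, not a real blob.
    req = build_blob_request("https://huggingface.co/v2/example/model/blobs/sha256:abc")
    print("authenticated" if req.has_header("Authorization") else "anonymous")
```

With HF_TOKEN unset, the request goes out exactly as it does today, so users who do not have a token would see no change in behavior.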