@lhoestq
Member

@lhoestq lhoestq commented Jul 10, 2025

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@lhoestq lhoestq marked this pull request as ready for review July 11, 2025 14:04
@hanouticelina hanouticelina self-assigned this Jul 11, 2025
@lhoestq
Member Author

lhoestq commented Jul 11, 2025

This is ready for review ahead of the launch in the coming days! It would be cool to do a release right after we merge.

Btw I integrated your addition @davanstrien from lhoestq/hfjobs#8 and added some useful uv options: --with and --python (we could add more later if needed)
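For readers unfamiliar with the two uv options mentioned above, here is a minimal sketch of how a wrapper might assemble a `uv run` command line. `--with` and `--python` are existing uv flags; the helper name `build_uv_command` and its parameters are hypothetical and are not taken from this PR.

```python
from typing import List, Optional, Sequence


def build_uv_command(
    script: str,
    script_args: Sequence[str] = (),
    dependencies: Sequence[str] = (),
    python: Optional[str] = None,
) -> List[str]:
    """Assemble the argv for `uv run` (hypothetical helper, for illustration only)."""
    command = ["uv", "run"]
    # Each extra dependency becomes its own `--with <requirement>` pair.
    for dependency in dependencies:
        command += ["--with", dependency]
    # Pin the interpreter only when the caller asks for one.
    if python is not None:
        command += ["--python", python]
    # uv options come before the script; everything after it is passed to the script.
    command.append(script)
    command.extend(script_args)
    return command


if __name__ == "__main__":
    print(build_uv_command("train.py", ["--epochs", "3"], ["trl", "datasets"], "3.12"))
```

Returning a list rather than a single string avoids shell quoting issues if the command is later handed to a container runtime.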

Member

It's fine as an experiment, but I'm not a huge fan of uploading the local file to a remote repo.

Is there any way to either:

  • pass the file content as an argument (string) to uv (and thus to the Jobs creation API)
  • ask the infra team to add a new feature to the Jobs creation API where you can pass a dict of file name to file contents and they are exposed to the docker command? (not sure if it's feasible @christophe-rannou)

Member

@davanstrien davanstrien Jul 14, 2025

pass the file content as an argument (string) to uv (and thus to the Jobs creation API)

I don't think this is directly possible in uv at the moment.

ask the infra team to add a new feature to the Jobs creation API where you can pass a dict of file name to file contents and they are exposed to the docker command? (not sure if it's feasible @christophe-rannou)

I think this would be nice if it were possible. @christophe-rannou, would this be difficult to implement?

Member

@davanstrien davanstrien Jul 14, 2025

Part of the logic of using a repo as a backend was to open up options to explore approaches where you could do something like

huggingface-cli jobs uv run --from-repo davanstrien/nice-data-generation-pipeline

I think for that to fully make sense, it would probably also be better to have a "generic" or "code" repo type rather than using a dataset as the storage repo.

@Wauplin
Contributor

Wauplin commented Jul 15, 2025

As a high-level comment, it'd be good to have all the API logic added to HfApi (and therefore callable from a Python script) and the CLI logic (e.g. args, result formatting, etc.) kept in the current ./src/huggingface_hub/commands/jobs folder. @lhoestq Let us know if you have bandwidth to work on this or if you want some help.

@lhoestq
Member Author

lhoestq commented Jul 15, 2025

As a high-level comment, it'd be good to have all the API logic added to HfApi (and therefore callable from a Python script) and the CLI logic (e.g. args, result formatting, etc.) kept in the current ./src/huggingface_hub/commands/jobs folder. @lhoestq Let us know if you have bandwidth to work on this or if you want some help.

I can take care of this by tomorrow.

Contributor

@hanouticelina hanouticelina left a comment

Very nice UX, I like it! 🔥 I left some initial comments.
As mentioned by @Wauplin, let's centralize the logic into HfApi:

HfApi.run_job(...)
HfApi.list_jobs(...)
HfApi.inspect_job(...)
HfApi.cancel_job(...)
HfApi.fetch_job_logs(...)

That way, the CLI subcommands become lightweight wrappers, and maybe we can put all the subparsers (run, ps, inspect, logs, cancel, uv) inside one src/huggingface_hub/commands/jobs.py file.
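To illustrate the split suggested above, here is a minimal sketch in which an API object owns the job logic and the CLI only parses arguments and formats results. Everything in it is hypothetical: `FakeJobsApi` is an in-memory stand-in (not the real `HfApi`), and the parser covers only three of the subcommands.

```python
import argparse
from typing import Dict, List


class FakeJobsApi:
    """In-memory stand-in for the API layer (hypothetical, not the real HfApi)."""

    def __init__(self) -> None:
        self._jobs: Dict[str, Dict[str, str]] = {}

    def run_job(self, image: str, command: List[str]) -> str:
        job_id = f"job-{len(self._jobs) + 1}"
        self._jobs[job_id] = {"image": image, "command": " ".join(command), "status": "RUNNING"}
        return job_id

    def list_jobs(self) -> List[str]:
        return sorted(self._jobs)

    def inspect_job(self, job_id: str) -> Dict[str, str]:
        return dict(self._jobs[job_id])

    def cancel_job(self, job_id: str) -> None:
        self._jobs[job_id]["status"] = "CANCELED"


def build_parser() -> argparse.ArgumentParser:
    """Register all subparsers in one place, as a single jobs.py file might."""
    parser = argparse.ArgumentParser(prog="jobs")
    subparsers = parser.add_subparsers(dest="subcommand", required=True)
    run_parser = subparsers.add_parser("run")
    run_parser.add_argument("image")
    # Everything after the image is forwarded untouched as the job command.
    run_parser.add_argument("command", nargs=argparse.REMAINDER)
    subparsers.add_parser("ps")
    cancel_parser = subparsers.add_parser("cancel")
    cancel_parser.add_argument("job_id")
    return parser


def main(argv: List[str], api: FakeJobsApi) -> str:
    """Thin wrapper: parse arguments, delegate to the API, format the result."""
    args = build_parser().parse_args(argv)
    if args.subcommand == "run":
        return api.run_job(args.image, args.command)
    if args.subcommand == "ps":
        return "\n".join(api.list_jobs())
    api.cancel_job(args.job_id)
    return f"canceled {args.job_id}"


if __name__ == "__main__":
    api = FakeJobsApi()
    print(main(["run", "ubuntu", "echo", "hello"], api))
    print(main(["ps"], api))
```

Because the wrapper holds no job logic of its own, the same `run_job` or `cancel_job` calls stay usable from a plain Python script.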

@lhoestq
Member Author

lhoestq commented Jul 22, 2025

I took your comments into account and added namespace in the HfApi methods and in the CLI :)

I also added the documentation page

@davanstrien
Member

Imo it would be good to still have support for passing an image when using uv run. Many AI-focused images include uv now, and this can make the setup much more reliable, e.g. vLLM works out of the box using their default image. If we pass the astral image, there are still a lot more steps to get things working smoothly.

@lhoestq
Member Author

lhoestq commented Jul 23, 2025

I added the image arg to run_uv_job(). And also --image to huggingface-cli jobs uv run :)

@Wauplin
Contributor

Wauplin commented Jul 23, 2025

I also just pushed a commit (eaaa6a1) to use SpaceHardware as the list of possible flavors (I forgot we had this one^^). Hopefully this makes it easier for users to know what they can choose from:

✗ huggingface-cli jobs run --help                                                     
usage: huggingface-cli <command> [<args>] jobs run [-h] [-e ENV] [-s SECRETS] [--env-file ENV_FILE] [--secrets-file SECRETS_FILE] [--flavor FLAVOR] [--timeout TIMEOUT] [-d]
                                                   [--namespace NAMESPACE] [--token TOKEN]
                                                   image ...

positional arguments:
  image                 The Docker image to use.
  command               The command to run.

options:
  -h, --help            show this help message and exit
  -e ENV, --env ENV     Set environment variables.
  -s SECRETS, --secrets SECRETS
                        Set secret environment variables.
  --env-file ENV_FILE   Read in a file of environment variables.
  --secrets-file SECRETS_FILE
                        Read in a file of secret environment variables.
  --flavor FLAVOR       Flavor for the hardware, as in HF Spaces. Defaults to `cpu-basic`. Possible values: cpu-basic, cpu-upgrade, cpu-xl, zero-a10g, t4-small, t4-medium, l4x1, l4x4,
                        l40sx1, l40sx4, l40sx8, a10g-small, a10g-large, a10g-largex2, a10g-largex4, a100-large, h100, h100x8.
  --timeout TIMEOUT     Max duration: int/float with s (seconds, default), m (minutes), h (hours) or d (days).
  -d, --detach          Run the Job in the background and print the Job ID.
  --namespace NAMESPACE
                        The namespace where the Job will be created. Defaults to the current user's namespace.
  --token TOKEN         A User Access Token generated from https://huggingface.co/settings/tokens
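The `--timeout` help text above describes a number with an optional unit suffix. A minimal sketch of how such a value could be converted to seconds follows; `parse_timeout` and the regular expression are assumptions made for illustration and are not the PR's actual parsing code.

```python
import re

# Seconds per unit; a missing suffix means seconds, as the help text states.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
_TIMEOUT_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([smhd]?)\s*$")


def parse_timeout(value: str) -> float:
    """Convert strings such as "90", "5m" or "1.5h" to seconds (hypothetical helper)."""
    match = _TIMEOUT_RE.match(value)
    if match is None:
        raise ValueError(f"Invalid timeout: {value!r}")
    number, unit = match.groups()
    return float(number) * _UNIT_SECONDS[unit or "s"]


if __name__ == "__main__":
    for example in ("90", "5m", "1.5h", "2d"):
        print(example, "->", parse_timeout(example))
```

Rejecting unknown suffixes with a `ValueError` surfaces typos such as `10x` before a job is submitted.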

Contributor

@Wauplin Wauplin left a comment

Finally approved ✔️ 😄
Thanks a lot @lhoestq @davanstrien for the work on this tool! 🔥

@julien-c
Member

Possibly exclude zero-a10g from the flavors, because I don't think this is a valid hardware option here.

@Wauplin
Contributor

Wauplin commented Jul 23, 2025

Possibly exclude zero-a10g from the flavors, because I don't think this is a valid hardware option here.

Good call, removed in e6043ae
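As a rough illustration of deriving the flavor list from a hardware enum while leaving out `zero-a10g`, here is a self-contained sketch. The `Hardware` enum is a shortened local stand-in for `SpaceHardware`, and the names `UNSUPPORTED_FOR_JOBS`, `JOB_FLAVORS` and `flavor_help` are hypothetical; the actual change in e6043ae may look different.

```python
from enum import Enum


class Hardware(str, Enum):
    """Shortened local stand-in for the SpaceHardware enum (illustrative subset)."""

    CPU_BASIC = "cpu-basic"
    ZERO_A10G = "zero-a10g"
    T4_SMALL = "t4-small"
    A10G_SMALL = "a10g-small"
    H100 = "h100"


# Hardware that exists for Spaces but is assumed not to be valid for Jobs.
UNSUPPORTED_FOR_JOBS = {Hardware.ZERO_A10G}

# Enum iteration follows definition order, so the help text stays stable.
JOB_FLAVORS = [item.value for item in Hardware if item not in UNSUPPORTED_FOR_JOBS]


def flavor_help() -> str:
    """Build the 'Possible values' fragment shown in the --flavor help text."""
    return "Possible values: " + ", ".join(JOB_FLAVORS) + "."


if __name__ == "__main__":
    print(flavor_help())
```

Filtering once at import time keeps the CLI help and any validation in sync with a single list.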

Contributor

@Wauplin Wauplin left a comment

👍

@lhoestq
Member Author

lhoestq commented Jul 23, 2025

Thanks for the reviews :) Merging once the CI is green (edit: and also after a final commit to add a missing docstring).

@lhoestq lhoestq merged commit 6e31992 into main Jul 23, 2025
23 of 25 checks passed
@lhoestq lhoestq deleted the jobs branch July 23, 2025 13:37
Comment on lines +16 to +18
Usage:
# run a job
huggingface-cli jobs run image command
Contributor

(nit) Expand the top-level docstring to list the other jobs subcommands.
(cc @Wauplin, since the PR has been merged, we can do that in the follow-up PR that will switch huggingface-cli jobs -> hf jobs)

Suggested change
Usage:
# run a job
huggingface-cli jobs run image command
Usage:
# run a job
huggingface-cli jobs run <image> <command>
# List running or completed jobs
huggingface-cli jobs ps [-a] [-f key=value] [--format TEMPLATE]
# Stream logs from a job
huggingface-cli jobs logs <job-id>
# Inspect detailed information about a job
huggingface-cli jobs inspect <job-id>
# Cancel a running job
huggingface-cli jobs cancel <job-id>

Contributor

yes!

Contributor

done in #3250
