Commit 2ae80bd

mishig25 and claude committed

Add huggingface-spaces skill

Leverages the new `hf spaces search` CLI and the per-Space `/agents.md` endpoint (see huggingface#122) so agents can discover Gradio Spaces and call their APIs without manual setup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

1 parent 061ab49 · commit 2ae80bd

4 files changed

Lines changed: 88 additions & 0 deletions

.claude-plugin/marketplace.json

Lines changed: 6 additions & 0 deletions

```diff
@@ -73,6 +73,12 @@
       "source": "./skills/huggingface-vision-trainer",
       "skills": "./",
       "description": "Train and fine-tune object detection models (RTDETRv2, YOLOS, DETR and others) and image classification models (timm and transformers models — MobileNetV3, MobileViT, ResNet, ViT/DINOv3) using Transformers Trainer API on Hugging Face Jobs infrastructure or locally. Includes COCO dataset format support, Albumentations augmentation, mAP/mAR metrics, trackio tracking, hardware selection, and Hub persistence."
+    },
+    {
+      "name": "huggingface-spaces",
+      "source": "./skills/huggingface-spaces",
+      "skills": "./",
+      "description": "Find and call Hugging Face Spaces to generate AI artifacts (images, audio, 3D models, etc). Uses semantic search to discover Spaces, then calls their Gradio APIs."
     }
   ]
 }
```

README.md

Lines changed: 1 addition & 0 deletions

```diff
@@ -97,6 +97,7 @@ This repository contains a few skills to get you started. You can also contribut
 | `huggingface-llm-trainer` | Train or fine-tune language models using TRL on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for local deployment. Includes hardware selection, cost estimation, Trackio monitoring, and Hub persistence. | [SKILL.md](skills/huggingface-llm-trainer/SKILL.md) |
 | `huggingface-paper-publisher` | Publish and manage research papers on Hugging Face Hub. Supports creating paper pages, linking papers to models/datasets, claiming authorship, and generating professional markdown-based research articles. | [SKILL.md](skills/huggingface-paper-publisher/SKILL.md) |
 | `huggingface-papers` | Look up and read Hugging Face paper pages in markdown, and use the papers API for structured metadata like authors, linked models, datasets, Spaces, and media URLs when needed. | [SKILL.md](skills/huggingface-papers/SKILL.md) |
+| `huggingface-spaces` | Find and call Hugging Face Spaces to generate AI artifacts (images, audio, 3D models, etc). Uses semantic search to discover Spaces, then calls their Gradio APIs. | [SKILL.md](skills/huggingface-spaces/SKILL.md) |
 | `huggingface-tool-builder` | Build reusable scripts for Hugging Face Hub and API workflows. Useful for chaining API calls, enriching Hub metadata, or automating repeated tasks. | [SKILL.md](skills/huggingface-tool-builder/SKILL.md) |
 | `huggingface-trackio` | Track and visualize ML training experiments with Trackio. Log metrics via Python API and retrieve them via CLI. Supports real-time dashboards synced to HF Spaces. | [SKILL.md](skills/huggingface-trackio/SKILL.md) |
 | `huggingface-vision-trainer` | Train and fine-tune object detection models (RTDETRv2, YOLOS, DETR and others) and image classification models (timm and transformers models — MobileNetV3, MobileViT, ResNet, ViT/DINOv3) using Transformers Trainer API on Hugging Face Jobs infrastructure or locally. Includes COCO dataset format support, Albumentations augmentation, mAP/mAR metrics, trackio tracking, hardware selection, and Hub persistence. | [SKILL.md](skills/huggingface-vision-trainer/SKILL.md) |
```

agents/AGENTS.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -10,6 +10,7 @@ These skills are:
 - huggingface-llm-trainer -> "skills/huggingface-llm-trainer/SKILL.md"
 - huggingface-paper-publisher -> "skills/huggingface-paper-publisher/SKILL.md"
 - huggingface-papers -> "skills/huggingface-papers/SKILL.md"
+- huggingface-spaces -> "skills/huggingface-spaces/SKILL.md"
 - huggingface-tool-builder -> "skills/huggingface-tool-builder/SKILL.md"
 - huggingface-trackio -> "skills/huggingface-trackio/SKILL.md"
 - huggingface-vision-trainer -> "skills/huggingface-vision-trainer/SKILL.md"
@@ -26,6 +27,7 @@ huggingface-gradio: `Build Gradio web UIs and demos in Python. Use when creating
 huggingface-llm-trainer: `Train or fine-tune language and vision models using TRL (Transformer Reinforcement Learning) or Unsloth with Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for local deployment. Includes guidance on the TRL Jobs package, UV scripts with PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, model selection/leaderboards and model persistence. Use for tasks involving cloud GPU training, GGUF conversion, or when users mention training on Hugging Face Jobs without local GPU setup.`
 huggingface-paper-publisher: `Publish and manage research papers on Hugging Face Hub. Supports creating paper pages, linking papers to models/datasets, claiming authorship, and generating professional markdown-based research articles.`
 huggingface-papers: `Look up and read Hugging Face paper pages in markdown, and use the papers API for structured metadata such as authors, linked models/datasets/spaces, Github repo and project page. Use when the user shares a Hugging Face paper page URL, an arXiv URL or ID, or asks to summarize, explain, or analyze an AI research paper.`
+huggingface-spaces: `Find and call Hugging Face Spaces to generate AI artifacts (images, audio, 3D models, etc). Uses semantic search to discover Spaces, then calls their Gradio APIs.`
 huggingface-tool-builder: `Use this skill when the user wants to build tool/scripts or achieve a task where using data from the Hugging Face API would help. This is especially useful when chaining or combining API calls or the task will be repeated/automated. This Skill creates a reusable script to fetch, enrich or process data.`
 huggingface-trackio: `Track and visualize ML training experiments with Trackio. Use when logging metrics during training (Python API), firing alerts for training diagnostics, or retrieving/analyzing logged metrics (CLI). Supports real-time dashboard visualization, alerts with webhooks, HF Space syncing, and JSON output for automation.`
 huggingface-vision-trainer: `Trains and fine-tunes vision models for object detection (D-FINE, RT-DETR v2, DETR, YOLOS), image classification (timm models — MobileNetV3, MobileViT, ResNet, ViT/DINOv3 — plus any Transformers classifier), and SAM/SAM2 segmentation using Hugging Face Transformers on Hugging Face Jobs cloud GPUs. Covers COCO-format dataset preparation, Albumentations augmentation, mAP/mAR evaluation, accuracy metrics, SAM segmentation with bbox/point prompts, DiceCE loss, hardware selection, cost estimation, Trackio monitoring, and Hub persistence. Use when users mention training object detection, image classification, SAM, SAM2, segmentation, image matting, DETR, D-FINE, RT-DETR, ViT, timm, MobileNet, ResNet, bounding box models, or fine-tuning vision models on Hugging Face Jobs.`
```

skills/huggingface-spaces/SKILL.md

Lines changed: 79 additions & 0 deletions
---
name: huggingface-spaces
description: Find and call Hugging Face Spaces to generate AI artifacts (images, audio, 3D models, etc). Uses semantic search to discover Spaces, then calls their Gradio APIs.
user-invocable: true
allowed-tools: Bash WebFetch Read Write
argument-hint: <prompt describing what to generate>
---
# Hugging Face Spaces Tool

You have access to thousands of AI apps hosted on Hugging Face Spaces. Use them to generate artifacts like images, audio, 3D models, videos, text, and more.
## Authentication

Install the latest `hf` CLI and log in:

```bash
curl -LsSf https://hf.co/cli/install.sh | bash
hf auth login
```

Always include the user's HF token in API requests:

- **REST calls**: `Authorization: Bearer $(hf auth token)`
- **Python client**: `Client("space-url", hf_token=subprocess.check_output(["hf", "auth", "token"]).decode().strip())`
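The Python-client one-liner above can be unpacked into a small helper. This is an illustrative sketch, not part of the skill file: the function names are mine, and the `gradio_client` usage (shown as a comment) assumes `pip install gradio_client`.

```python
import subprocess

def parse_token(raw: bytes) -> str:
    # `hf auth token` prints the token followed by a trailing newline.
    return raw.decode().strip()

def get_hf_token() -> str:
    # Shell out to the logged-in `hf` CLI, exactly as the bullet above does.
    return parse_token(subprocess.check_output(["hf", "auth", "token"]))

# The token is then passed to the Gradio Python client, e.g.:
#   from gradio_client import Client
#   client = Client("owner/spacename", hf_token=get_hf_token())
```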
## Workflow

### Step 1: Find the right Space

Use the `hf` CLI to search for a Space matching the user's request:

```bash
hf spaces search --sdk gradio "<search query>"
```

- Always filter by `--sdk gradio` (only Gradio Spaces have callable APIs)
- The output lists Space IDs sorted by relevance, with descriptions
- Prefer Spaces that are running and have high trending scores
- The Space domain is derived from the `id`: `owner-spacename.hf.space` (replace `/` with `-`, lowercase)
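The domain rule in the last bullet can be sketched as a tiny helper (hypothetical name; it encodes only the two transformations the bullet states: replace `/` with `-`, then lowercase):

```python
def space_domain(space_id: str) -> str:
    # "owner/SpaceName" -> "owner-spacename.hf.space", per the rule above.
    return space_id.replace("/", "-").lower() + ".hf.space"
```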
### Step 2: Call the Space

Fetch the Space's `agents.md` and follow its instructions to call the Space:

```bash
curl https://huggingface.co/spaces/<owner>/<spacename>/agents.md
```

This returns a Markdown document with everything needed to call the Space: available endpoints, parameters, input/output types, and usage examples, purpose-built for agents like this one.
### Step 3: Handle the output

- **Files (images, audio, 3D models)**: Download from the returned URL and save locally
- **Open/play the result**: Use `open <file>` (macOS) or `afplay <file>` (audio)
- File URLs from Gradio look like: `https://<space>.hf.space/gradio_api/file=<path>`
## Tips

- Read `agents.md` carefully: it often documents exact parameter names, accepted values, and example calls
- For the Python client, use `handle_file("/path/to/file")` or `handle_file("https://url")` for file/image inputs
- ZeroGPU Spaces have usage quotas; if you get "GPU quota exceeded", wait or try another Space
- Multi-step pipelines (e.g., image-to-3D) often require session state, so use the Python client
- If the user provides a specific Space URL, skip the search step and use it directly
## Examples

**User says**: "generate an image of a sunset"

1. Search: `hf spaces search --sdk gradio "text to image generation"`
2. Pick a top result (e.g., `mrfakename/Z-Image-Turbo`)
3. Fetch `agents.md`, follow its instructions, download and open the result

**User says**: "convert this image to 3D"

1. Search: `hf spaces search --sdk gradio "image to 3d model"`
2. Pick a result (e.g., a Trellis Space)
3. Fetch `agents.md` and follow its instructions

**User says**: "say hello world in speech"

1. Search: `hf spaces search --sdk gradio "text to speech"`
2. Pick a TTS Space, call the generate endpoint, download and play the audio
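The search step these examples share could be scripted as follows (the function names are mine; the CLI flags are exactly those shown in Step 1):

```python
import subprocess

def search_command(query: str) -> list[str]:
    # Always filter to Gradio Spaces, as the workflow above requires.
    return ["hf", "spaces", "search", "--sdk", "gradio", query]

def search_spaces(query: str) -> str:
    # Returns the CLI's plain-text listing of matching Space IDs.
    return subprocess.check_output(search_command(query), text=True)
```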
