Commit 326df9d

feat(cli): add Hugging Face integration to discover,search command
1 parent 7d73150 commit 326df9d

5 files changed

+476
-53
lines changed

docs/cli/README.md

Lines changed: 3 additions & 3 deletions
@@ -53,7 +53,7 @@ locallab models info microsoft/phi-2
 | `locallab models list` | List locally cached models | [Model Management](../guides/model-management.md#list-models) |
 | `locallab models download <model_id>` | Download a model locally | [Model Management](../guides/model-management.md#download-models) |
 | `locallab models remove <model_id>` | Remove a cached model | [Model Management](../guides/model-management.md#remove-models) |
-| `locallab models discover` | Discover available models | [Model Management](../guides/model-management.md#discover-models) |
+| `locallab models discover` | Discover models from registry and HuggingFace Hub | [Model Management](../guides/model-management.md#discover-models) |
 | `locallab models info <model_id>` | Show detailed model information | [Model Management](../guides/model-management.md#model-info) |
 | `locallab models clean` | Clean up orphaned cache files | [Model Management](../guides/model-management.md#cache-cleanup) |
 | `locallab logs` | View server logs | [CLI Guide](../guides/cli.md#view-logs) |
@@ -74,8 +74,8 @@ LocalLab includes comprehensive model management capabilities to help you downlo
 ### Common Workflows

 ```bash
-# Discover and download a new model
-locallab models discover --search "phi"
+# Discover models from registry and HuggingFace Hub
+locallab models discover --search "code generation"
 locallab models download microsoft/phi-2

 # Check model details before using

docs/guides/cli.md

Lines changed: 8 additions & 2 deletions
@@ -125,9 +125,15 @@ LocalLab includes comprehensive model management capabilities to help you downlo
 ### Quick Examples

 ```bash
-# Discover available models
+# Discover models from registry and HuggingFace Hub
 locallab models discover

+# Search for specific types of models
+locallab models discover --search "code generation"
+
+# Filter by tags
+locallab models discover --tags "conversational,chat"
+
 # Download a model for faster startup
 locallab models download microsoft/phi-2

@@ -148,7 +154,7 @@ locallab models clean
 | `locallab models list` | List locally cached models |
 | `locallab models download <model_id>` | Download a model locally |
 | `locallab models remove <model_id>` | Remove a cached model |
-| `locallab models discover` | Discover available models |
+| `locallab models discover` | Discover models from registry and HuggingFace Hub |
 | `locallab models info <model_id>` | Show detailed model information |
 | `locallab models clean` | Clean up orphaned cache files |

docs/guides/model-management.md

Lines changed: 27 additions & 8 deletions
@@ -131,28 +131,47 @@ locallab models discover [OPTIONS]
 ```

 **Options:**
-- `--search <term>` - Search models by name or description
+- `--search <keywords>` - Search models by keywords, tags, or description
 - `--limit <number>` - Maximum number of models to show (default: 20)
 - `--format [table|json]` - Output format (default: table)
+- `--registry-only` - Show only LocalLab registry models
+- `--hub-only` - Show only HuggingFace Hub models
+- `--sort [downloads|likes|recent]` - Sort HuggingFace models by popularity or recency
+- `--tags <tags>` - Filter by comma-separated tags (e.g., "conversational,chat")
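
As the `locallab/cli/models.py` changes in this commit show, `discover` skips the registry when `--hub-only` is set and skips the Hub when `--registry-only` is set, so passing both flags would leave nothing to list. The diff does not show a check for that combination. Below is a minimal sketch of one possible guard; `resolve_sources` is a hypothetical helper and not part of LocalLab.

```python
from typing import Tuple


def resolve_sources(registry_only: bool, hub_only: bool) -> Tuple[bool, bool]:
    """Return (use_registry, use_hub) for a flag combination.

    Hypothetical helper for illustration: rejects the case where both
    flags are set, because each flag disables the other source.
    """
    if registry_only and hub_only:
        raise ValueError("--registry-only and --hub-only cannot be combined")
    # --hub-only turns the registry off; --registry-only turns the Hub off.
    return (not hub_only, not registry_only)


print(resolve_sources(registry_only=True, hub_only=False))
```

Inside a Click command, the same condition could raise `click.UsageError` so that the message is reported as a usage problem.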

 **Examples:**
 ```bash
-# Discover all available models
+# Discover all available models (registry + HuggingFace Hub)
 locallab models discover

-# Search for specific models
-locallab models discover --search "dialog"
+# Search for specific models across all sources
+locallab models discover --search "code generation"

-# Limit results and export as JSON
-locallab models discover --limit 10 --format json
+# Search only in LocalLab registry
+locallab models discover --search "phi" --registry-only
+
+# Find models by tags
+locallab models discover --tags "conversational,chat" --limit 10
+
+# Get popular models sorted by downloads
+locallab models discover --sort downloads --limit 15
+
+# Export results as JSON for processing
+locallab models discover --format json --limit 5
 ```

 **Information shown:**
 - Model ID and name
-- Model size and type
+- Model size and type (Registry/HuggingFace)
+- Download count and popularity metrics
 - Cache status (cached/available)
 - Brief description
-- Download availability
+- Author information
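
The download count shown in the table is abbreviated by the code this commit adds to `locallab/cli/models.py`. The sketch below restates that logic as a standalone function so its behavior is easy to check. The name `format_downloads` is an assumption made here; the commit inlines the logic in `discover`.

```python
def format_downloads(downloads: int) -> str:
    """Abbreviate a download count the way the table column does.

    Mirrors the thresholds in the diff: empty for zero, K for thousands,
    M for millions, one decimal place in both cases.
    """
    if downloads <= 0:
        return ""
    if downloads >= 1_000_000:
        return f"{downloads / 1_000_000:.1f}M"
    if downloads >= 1_000:
        return f"{downloads / 1_000:.1f}K"
    return str(downloads)


for count in (0, 999, 1_500, 2_300_000):
    print(repr(format_downloads(count)))
```

One edge case of this rounding is that a count just under one million, such as 999,999, renders as `1000.0K` rather than `1.0M`.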
+
+**Network Requirements:**
+- Registry models: Always available (offline)
+- HuggingFace Hub models: Requires internet connection
+- Graceful fallback when HuggingFace Hub is unavailable
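
When scripting against `--format json`, note that the command in this commit also prints status lines (for example the "Discovering available models..." message) before the JSON payload. If those lines go to the same stream as the JSON, which is an assumption about how the `console` object is configured, the combined output is not directly parseable. Below is a minimal sketch of one workaround. `extract_json_payload` is a hypothetical helper, and the sample text is made up for illustration.

```python
import json
from typing import Any, List


def extract_json_payload(output: str) -> List[Any]:
    """Parse the JSON array that follows any leading status lines.

    Assumes the payload is a JSON array whose opening bracket starts a line,
    which is what json.dumps(list, indent=2) produces.
    """
    lines = output.splitlines()
    for index, line in enumerate(lines):
        if line.startswith("["):
            return json.loads("\n".join(lines[index:]))
    raise ValueError("no JSON array found in output")


# Made-up sample of what the combined output might look like.
sample_output = """🔍 Discovering available models...
📚 Found 1 LocalLab registry models
[
  {
    "id": "microsoft/phi-2",
    "type": "Registry",
    "is_cached": false
  }
]"""

models = extract_json_payload(sample_output)
print(models[0]["id"])
```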

 ### Model Information {#model-info}

locallab/cli/models.py

Lines changed: 168 additions & 40 deletions
@@ -19,6 +19,7 @@
 from ..utils.system import format_model_size, get_system_resources
 from ..utils.progress import configure_hf_hub_progress
 from ..utils.model_cache import model_cache_manager
+from ..utils.huggingface_search import hf_searcher
 from ..logger.logger import logger

 # Initialize rich console
@@ -203,89 +204,216 @@ def remove(model_id: str, force: bool):
         sys.exit(1)

 @models.command()
-@click.option('--search', help='Search models by name or description')
-@click.option('--limit', type=int, default=20, help='Maximum number of models to show')
+@click.option('--search', help='Search models by keywords, tags, or description')
+@click.option('--limit', type=int, default=20, help='Maximum number of models to show (default: 20)')
 @click.option('--format', 'output_format', type=click.Choice(['table', 'json']), default='table',
               help='Output format (table or json)')
-def discover(search: Optional[str], limit: int, output_format: str):
-    """Discover available models from HuggingFace Hub and registry"""
+@click.option('--registry-only', is_flag=True, help='Show only LocalLab registry models')
+@click.option('--hub-only', is_flag=True, help='Show only HuggingFace Hub models')
+@click.option('--sort', type=click.Choice(['downloads', 'likes', 'recent']), default='downloads',
+              help='Sort HuggingFace models by downloads, likes, or recent updates')
+@click.option('--tags', help='Filter by tags (comma-separated, e.g., "conversational,chat")')
+def discover(search: Optional[str], limit: int, output_format: str, registry_only: bool,
+             hub_only: bool, sort: str, tags: Optional[str]):
+    """Discover available models from HuggingFace Hub and LocalLab registry"""
     try:
         console.print("🔍 Discovering available models...", style="blue")

-        # Start with registry models
-        available_models = []
+        all_models = []

-        # Add registry models
-        for model_id, config in MODEL_REGISTRY.items():
-            model_info = {
-                "id": model_id,
-                "name": config.get("name", model_id),
-                "description": config.get("description", ""),
-                "size": config.get("size", "Unknown"),
-                "type": "Registry",
-                "requirements": config.get("requirements", {}),
-                "is_cached": False
-            }
-            available_models.append(model_info)
+        # Get registry models (unless hub-only is specified)
+        if not hub_only:
+            registry_models = _get_registry_models()
+            all_models.extend(registry_models)
+            console.print(f"📚 Found {len(registry_models)} LocalLab registry models", style="dim")
+
+        # Get HuggingFace Hub models (unless registry-only is specified)
+        if not registry_only:
+            hf_models, hf_success = _get_huggingface_models(search, limit, sort, tags)
+            if hf_success:
+                all_models.extend(hf_models)
+                console.print(f"🤗 Found {len(hf_models)} HuggingFace Hub models", style="dim")
+            else:
+                console.print("⚠️ Could not search HuggingFace Hub (network issue or missing dependencies)", style="yellow")
+                if search or tags:
+                    console.print("💡 Try using --registry-only to search LocalLab registry models only.", style="dim")
+
+        # Apply search filter to registry models if search is specified
+        if search and not hub_only:
+            search_lower = search.lower()
+            registry_filtered = [
+                m for m in all_models
+                if m.get("type") == "Registry" and (
+                    search_lower in m["name"].lower() or
+                    search_lower in m["description"].lower()
+                )
+            ]
+            # Keep HF models and filtered registry models
+            hf_models_in_list = [m for m in all_models if m.get("type") == "HuggingFace"]
+            all_models = registry_filtered + hf_models_in_list

         # Check which models are already cached
         cached_models = model_cache_manager.get_cached_models()
         cached_ids = {m["id"] for m in cached_models}

-        for model in available_models:
+        for model in all_models:
             model["is_cached"] = model["id"] in cached_ids

-        # Apply search filter
-        if search:
-            search_lower = search.lower()
-            available_models = [
-                m for m in available_models
-                if search_lower in m["name"].lower() or search_lower in m["description"].lower()
-            ]
+        # Sort models: Registry first, then by specified sort order
+        all_models.sort(key=lambda x: (
+            0 if x.get("type") == "Registry" else 1, # Registry models first
+            -x.get("downloads", 0) if sort == "downloads" else 0,
+            -x.get("likes", 0) if sort == "likes" else 0,
+            x.get("updated_at", "") if sort == "recent" else ""
+        ))

-        # Limit results
-        available_models = available_models[:limit]
+        # Apply final limit
+        all_models = all_models[:limit]

         if output_format == 'json':
-            click.echo(json.dumps(available_models, indent=2))
+            click.echo(json.dumps(all_models, indent=2))
             return

-        if not available_models:
+        if not all_models:
             console.print("📭 No models found matching your criteria.", style="yellow")
+            if not registry_only:
+                console.print("💡 Try adjusting your search terms or check your internet connection.", style="dim")
             return

         # Create table
         table = Table(title="🌟 Available Models")
-        table.add_column("Model ID", style="cyan", no_wrap=True)
-        table.add_column("Name", style="green")
-        table.add_column("Size", style="magenta")
+        table.add_column("Model ID", style="cyan", no_wrap=True, max_width=30)
+        table.add_column("Name", style="green", max_width=20)
+        table.add_column("Size", style="magenta", justify="right")
         table.add_column("Type", style="blue")
-        table.add_column("Status", style="yellow")
-        table.add_column("Description", style="dim")
+        table.add_column("Downloads", style="yellow", justify="right")
+        table.add_column("Status", style="bright_green")
+        table.add_column("Description", style="dim", max_width=40)

-        for model in available_models:
+        for model in all_models:
             status = "✅ Cached" if model["is_cached"] else "📥 Available"
+            downloads_str = ""
+            if model.get("downloads", 0) > 0:
+                downloads = model["downloads"]
+                if downloads >= 1000000:
+                    downloads_str = f"{downloads/1000000:.1f}M"
+                elif downloads >= 1000:
+                    downloads_str = f"{downloads/1000:.1f}K"
+                else:
+                    downloads_str = str(downloads)
+
             table.add_row(
                 model["id"],
                 model["name"],
-                model["size"],
-                model["type"],
+                model.get("size", "Unknown"),
+                model.get("type", "Unknown"),
+                downloads_str,
                 status,
                 model["description"][:50] + "..." if len(model["description"]) > 50 else model["description"]
             )

         console.print(table)

         # Show summary
-        cached_count = sum(1 for m in available_models if m["is_cached"])
-        console.print(f"\n📊 Found {len(available_models)} models ({cached_count} cached, {len(available_models) - cached_count} available for download)")
+        cached_count = sum(1 for m in all_models if m["is_cached"])
+        registry_count = sum(1 for m in all_models if m.get("type") == "Registry")
+        hf_count = len(all_models) - registry_count
+
+        console.print(f"\n📊 Found {len(all_models)} models:")
+        console.print(f" • {registry_count} LocalLab registry models")
+        console.print(f" • {hf_count} HuggingFace Hub models")
+        console.print(f" • {cached_count} already cached locally")
         console.print("\n💡 Use 'locallab models download <model_id>' to download a model locally.")

+        if not registry_only and hf_count > 0:
+            console.print("🔍 Use --search to find specific models or --tags to filter by categories.")
+
     except Exception as e:
         logger.error(f"Error discovering models: {e}")
         console.print(f"❌ Error discovering models: {str(e)}", style="red")
         sys.exit(1)
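
The composite sort key in `discover` places registry models first and negates the numeric fields to get descending order. The `updated_at` string cannot be negated, so with `--sort recent` the key as written orders Hub models oldest first. This assumes the timestamps are ISO-8601 strings, which sort lexicographically by time. The sketch below reproduces the key on made-up data and shows one possible alternative built from two stable sorts. `committed_key` and `sort_recent_first` are names introduced here for illustration only.

```python
from typing import Any, Dict, List, Tuple

Model = Dict[str, Any]


def committed_key(model: Model, sort: str) -> Tuple[int, int, int, str]:
    """The sort key from the diff, lifted into a named function."""
    return (
        0 if model.get("type") == "Registry" else 1,
        -model.get("downloads", 0) if sort == "downloads" else 0,
        -model.get("likes", 0) if sort == "likes" else 0,
        model.get("updated_at", "") if sort == "recent" else "",
    )


def sort_recent_first(models: List[Model]) -> List[Model]:
    """Alternative sketch: newest first, registry entries still on top."""
    # First pass: newest timestamps first.
    by_date = sorted(models, key=lambda m: m.get("updated_at", ""), reverse=True)
    # Second pass is stable, so the date order survives within each type.
    return sorted(by_date, key=lambda m: 0 if m.get("type") == "Registry" else 1)


# Made-up sample data.
models = [
    {"id": "hf/old", "type": "HuggingFace", "updated_at": "2023-01-01T00:00:00"},
    {"id": "hf/new", "type": "HuggingFace", "updated_at": "2024-06-01T00:00:00"},
    {"id": "reg/a", "type": "Registry", "updated_at": ""},
]

as_committed = [m["id"] for m in sorted(models, key=lambda m: committed_key(m, "recent"))]
newest_first = [m["id"] for m in sort_recent_first(models)]
print(as_committed)
print(newest_first)
```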

+def _get_registry_models():
+    """Get models from LocalLab registry"""
+    registry_models = []
+
+    for model_id, config in MODEL_REGISTRY.items():
+        model_info = {
+            "id": model_id,
+            "name": config.get("name", model_id),
+            "description": config.get("description", "LocalLab registry model"),
+            "size": config.get("size", "Unknown"),
+            "type": "Registry",
+            "downloads": 0, # Registry models don't have download counts
+            "likes": 0,
+            "requirements": config.get("requirements", {}),
+            "is_cached": False,
+            "tags": [],
+            "author": "LocalLab",
+            "updated_at": ""
+        }
+        registry_models.append(model_info)
+
+    return registry_models
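
For registry entries, `discover` applies the `--search` term as a case-insensitive substring match on the name and description. The sketch below isolates that filter on a made-up registry list. It shows that a multi-word term such as "code generation" matches only when the whole phrase appears. `filter_registry_models` is a hypothetical name, and the entries are invented for illustration.

```python
from typing import Dict, List


def filter_registry_models(models: List[Dict[str, str]], search: str) -> List[Dict[str, str]]:
    """Keep models whose name or description contains the search term."""
    search_lower = search.lower()
    return [
        m for m in models
        if search_lower in m["name"].lower() or search_lower in m["description"].lower()
    ]


# Made-up registry entries.
registry = [
    {"id": "example/coder", "name": "Example Coder", "description": "Tuned for code generation"},
    {"id": "example/chat", "name": "Example Chat", "description": "Generation of dialog; some code"},
]

phrase_hits = [m["id"] for m in filter_registry_models(registry, "Code Generation")]
word_hits = [m["id"] for m in filter_registry_models(registry, "code")]
print(phrase_hits)
print(word_hits)
```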
+
+def _get_huggingface_models(search: Optional[str], limit: int, sort: str, tags: Optional[str]):
+    """Get models from HuggingFace Hub"""
+    try:
+        # Parse tags if provided
+        tag_list = []
+        if tags:
+            tag_list = [tag.strip() for tag in tags.split(',') if tag.strip()]
+
+        # Convert sort parameter
+        hf_sort = "downloads"
+        if sort == "likes":
+            hf_sort = "likes"
+        elif sort == "recent":
+            hf_sort = "lastModified"
+
+        # Search HuggingFace Hub
+        if search:
+            hf_models, success = hf_searcher.search_models(
+                search_query=search, limit=limit, sort=hf_sort
+            )
+        elif tag_list:
+            hf_models, success = hf_searcher.search_models(
+                search_query=None, limit=limit, sort=hf_sort, filter_tags=tag_list
+            )
+        else:
+            hf_models, success = hf_searcher.search_models(
+                search_query=None, limit=limit, sort=hf_sort
+            )
+
+        if not success:
+            return [], False
+
+        # Convert to our format
+        converted_models = []
+        for hf_model in hf_models:
+            model_info = {
+                "id": hf_model.id,
+                "name": hf_model.name,
+                "description": hf_model.description,
+                "size": hf_model.size_formatted,
+                "type": "HuggingFace",
+                "downloads": hf_model.downloads,
+                "likes": hf_model.likes,
+                "is_cached": False,
+                "tags": hf_model.tags,
+                "author": hf_model.author,
+                "updated_at": hf_model.updated_at or "",
+                "pipeline_tag": hf_model.pipeline_tag,
+                "library_name": hf_model.library_name
+            }
+            converted_models.append(model_info)
+
+        return converted_models, True
+
+    except Exception as e:
+        logger.debug(f"Error getting HuggingFace models: {e}")
+        return [], False
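
The argument handling at the top of `_get_huggingface_models` can be restated as two small pure functions, which makes the tag parsing and the sort-name mapping easy to test without network access. The names `parse_tags` and `to_hub_sort` are hypothetical. The mapping values are the ones used in the diff.

```python
from typing import List, Optional

# CLI sort choice -> sort value passed to the searcher in the diff.
_HUB_SORT = {"downloads": "downloads", "likes": "likes", "recent": "lastModified"}


def parse_tags(tags: Optional[str]) -> List[str]:
    """Split a comma-separated tag string, dropping blanks and whitespace."""
    if not tags:
        return []
    return [tag.strip() for tag in tags.split(",") if tag.strip()]


def to_hub_sort(sort: str) -> str:
    """Map the CLI sort choice, falling back to downloads as the diff does."""
    return _HUB_SORT.get(sort, "downloads")


print(parse_tags("conversational, chat,,"))
print(to_hub_sort("recent"))
```

In the diff, `filter_tags` is passed only when no `--search` term is given, so combining `--search` with `--tags` sends the search term alone.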
+
 @models.command()
 @click.argument('model_id')
 def info(model_id: str):
