Hugging Face pattern matching and allowedOrganization support (#2041)
* fix(catalog): Forbid a few invalid HF patterns
It's invalid to put a wildcard in the HF org (`foo*bar/`) or omit the
model name (`foo/`).
Signed-off-by: Paul Boyd <[email protected]>
* feat(catalog): wildcard pattern support for Hugging Face
Extends the Hugging Face source to support wildcard patterns like:
- org/* (all models from organization)
- org/prefix* (models with specific prefix)
This was already supported when previewing a source.
Signed-off-by: Paul Boyd <[email protected]>
* feat(catalog): implement allowedOrganization for HF
Signed-off-by: Paul Boyd <[email protected]>
* fix(catalog): clarify HF wildcard docs
Signed-off-by: Paul Boyd <[email protected]>
---------
Signed-off-by: Paul Boyd <[email protected]>
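
The pattern rules stated in the commit messages above (no wildcard in the organization, no empty model name, and `org/*` or `org/prefix*` as the supported wildcard forms) can be illustrated with a minimal Go sketch. The helper names `validateHFPattern` and `matchesHFPattern` are hypothetical and are not taken from this PR; the sketch also assumes case-sensitive matching and treats only a trailing `*` as a wildcard, which the actual implementation may or may not do.

```go
package main

import (
	"fmt"
	"strings"
)

// validateHFPattern is a hypothetical helper (not the PR's code) that rejects
// the pattern shapes the commit message calls invalid.
func validateHFPattern(pattern string) error {
	org, name, found := strings.Cut(pattern, "/")
	if !found || org == "" {
		// A bare "*" or "model" has no organization part.
		return fmt.Errorf("pattern %q must have the form org/model", pattern)
	}
	if strings.Contains(org, "*") {
		// e.g. "foo*bar/baz": wildcards are not allowed in the organization.
		return fmt.Errorf("pattern %q has a wildcard in the organization", pattern)
	}
	if name == "" {
		// e.g. "foo/": the model name is missing.
		return fmt.Errorf("pattern %q omits the model name", pattern)
	}
	return nil
}

// matchesHFPattern reports whether modelID matches pattern. A trailing "*"
// is treated as a prefix wildcard (assumption); anything else must match exactly.
func matchesHFPattern(pattern, modelID string) bool {
	if strings.HasSuffix(pattern, "*") {
		return strings.HasPrefix(modelID, strings.TrimSuffix(pattern, "*"))
	}
	return pattern == modelID
}
```

Under these assumptions, `validateHFPattern("foo*bar/baz")` and `validateHFPattern("foo/")` return errors, while `org/*` and `org/prefix*` are accepted and then matched by prefix.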
catalog/README.md: 86 additions & 6 deletions
@@ -320,33 +320,113 @@ catalogs:
     enabled: true
     # Required: List of model identifiers to include
     # Format: "organization/model-name" or "username/model-name"
+    # Supports wildcard patterns: "organization/*" or "organization/prefix*"
     includedModels:
       - "meta-llama/Llama-3.1-8B-Instruct"
-      - "ibm-granite/granite-4.0-h-small"
       - "microsoft/phi-2"
-
+      - "microsoft/phi-3*"  # All models starting with "phi-3"
+
     # Optional: Exclude specific models or patterns
     # Supports exact matches or patterns ending with "*"
     excludedModels:
       - "some-org/unwanted-model"
       - "another-org/test-*"  # Excludes all models starting with "test-"
-
+
     # Optional: Configure a custom environment variable name for the API key
     # Defaults to "HF_API_KEY" if not specified
     properties:
       apiKeyEnvVar: "MY_CUSTOM_API_KEY_VAR"
 ```

+#### Organization-Restricted Sources
+
+You can restrict a source to only fetch models from a specific organization using the `allowedOrganization` property. This automatically prefixes all model patterns with the organization name:
+
+```yaml
+catalogs:
+  - name: "Meta LLaMA Models"
+    id: "meta-llama-models"
+    type: "hf"
+    enabled: true
+    properties:
+      allowedOrganization: "meta-llama"
+      apiKeyEnvVar: "HF_API_KEY"
+    includedModels:
+      # These patterns are automatically prefixed with "meta-llama/"
+      - "*"            # Expands to: meta-llama/*
+      - "Llama-3*"     # Expands to: meta-llama/Llama-3*
+      - "CodeLlama-*"  # Expands to: meta-llama/CodeLlama-*
+    excludedModels:
+      - "*-4bit"  # Excludes: meta-llama/*-4bit
+      - "*-GGUF"  # Excludes: meta-llama/*-GGUF
+```
+
+**Benefits of organization-restricted sources:**
+- **Simplified configuration**: No need to repeat organization name in every pattern
+- **Security**: Prevents accidental inclusion of models from other organizations
+- **Convenience**: Use `"*"` to get all models from an organization
+- **Performance**: Optimized API calls when fetching from a single organization
+
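
The prefixing that `allowedOrganization` performs, as described in the README section above, might look roughly like the following Go sketch. `expandPatterns` is a hypothetical name, not the function added by this PR, and the sketch assumes every pattern is prefixed unconditionally (it does not handle a pattern that already carries the organization). The error wording is loosely modeled on the message visible elsewhere in this diff.

```go
package main

import (
	"fmt"
	"strings"
)

// expandPatterns is a hypothetical helper: when allowedOrg is non-empty, each
// pattern is prefixed with "<allowedOrg>/". Patterns that still lack a
// concrete organization afterwards are rejected.
func expandPatterns(allowedOrg string, patterns []string) ([]string, error) {
	expanded := make([]string, 0, len(patterns))
	for _, p := range patterns {
		if allowedOrg != "" {
			// Assumption: always prefix, even if p already contains a "/".
			p = allowedOrg + "/" + p
		}
		org, _, found := strings.Cut(p, "/")
		if !found || org == "" || strings.Contains(org, "*") {
			return nil, fmt.Errorf("wildcard pattern %q is not supported: a specific organization is required", p)
		}
		expanded = append(expanded, p)
	}
	return expanded, nil
}
```

With `allowedOrg` set to `"meta-llama"`, the patterns `"*"`, `"Llama-3*"`, and `"CodeLlama-*"` expand to the forms shown in the YAML comments; with an empty `allowedOrg`, a bare `"*"` is rejected because no organization can be inferred.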
 #### Model Filtering

 Both `includedModels` and `excludedModels` are top-level properties (not nested under `properties`):

-- **`includedModels`** (required): List of model identifiers to fetch from Hugging Face. Format: `"organization/model-name"` or `"username/model-name"`
+- **`includedModels`** (required): List of model identifiers to fetch from Hugging Face
 - **`excludedModels`** (optional): List of models or patterns to exclude from the results

-The `excludedModels` property supports:
+#### Supported Pattern Types
+
+**Exact Model Names:**
+```yaml
+includedModels:
+  - "meta-llama/Llama-3.1-8B-Instruct"  # Specific model
+  - "microsoft/phi-2"                   # Specific model
+```
+
+**Wildcard Patterns:**
+
+In `includedModels`, wildcards can match model names by a prefix.
+
+```yaml
+includedModels:
+  - "microsoft/phi-*"      # All models starting with "phi-"
+  - "meta-llama/Llama-3*"  # All models starting with "Llama-3"
+  - "huggingface/*"        # All models from huggingface organization
return nil, fmt.Errorf("wildcard pattern %q is not supported - Hugging Face requires a specific organization (e.g., 'ibm-granite/*' or 'meta-llama/Llama-2-*')", pattern)