2 changes: 1 addition & 1 deletion README.md
@@ -33,7 +33,7 @@ These recipes are designed to be modular, auditable, and production-ready - with
| [Enforce Upload Time Window](https://github.com/cloudsmith-io/rego-recipes/tree/main?tab=readme-ov-file#recipe-16---suspicious-package-upload-window) | Allow uploads during business hours (9 AM – 5 PM UTC), to catch anomalous behaviour like late-night uploads | Link |
| [Tag-based bypass Exception](https://github.com/cloudsmith-io/rego-recipes/tree/main?tab=readme-ov-file#recipe-17---tag-based-exception-policy) | This is a simple tag-based exception. | Link |
| [Exact allowlist with CVSS limit exemption](https://github.com/cloudsmith-io/rego-recipes/tree/main?tab=readme-ov-file#recipe-18---exact-allowlist-exception-policy-with-cvss-ceiling) | Use when you want tight control per version, but still prevent exemptions if a CVSS exceeds a ceiling. | Link |

| [Hugging Face Recipes](https://github.com/cloudsmith-io/rego-recipes/blob/huggingface-recipes/README.md/) | Policies relating to Hugging Face models and datasets. | Link |

***

65 changes: 65 additions & 0 deletions huggingface-recipes/README.md
@@ -0,0 +1,65 @@
# Cloudsmith Hugging Face EPM Recipes
A curated collection of Enterprise Policy Management (EPM) recipes for Hugging Face models and datasets in Cloudsmith.
<br/><br/>

These recipes are designed to be modular, auditable, and production-ready - with a strong emphasis on policy-as-code best practices.

***

### Table of Rego Samples

| Name | Description | Rego Playground |
| -------- | ------- | ------- |
| [Enforcing Signed Packages](https://github.com/cloudsmith-io/rego-recipes?tab=readme-ov-file#recipe-1---enforcing-signed-packages) | This policy enforces mandatory ```GPG/DSA signature``` checks on packages during their sync/import into Cloudsmith | Link |

***

### A whitelist of Trusted Publishers

If a package comes from an upstream, block it unless its publisher is on the approved publishers list.
Set up the policy with a QUARANTINE action and a tag action that applies 'untrusted-publisher'.

Download ```upstream_verified.rego``` and create the associated ```payload.json``` with the command below:
```
TODO
```
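
The behaviour described above can be checked locally with OPA unit tests before the policy is uploaded. The following is a minimal sketch, not part of the recipe: the test package name `cloudsmith_test`, the file name `upstream_verified_test.rego`, and the model names are assumptions made for illustration, and the input is trimmed to the three fields the policy reads.

```rego
package cloudsmith_test

import rego.v1

import data.cloudsmith

# Hypothetical input: an upstream-fetched model whose publisher is not on the list.
test_unverified_publisher_matches if {
    cloudsmith.match with input as {"v0": {"package": {
        "format": "huggingface",
        "name": "some-org/some-model",
        "uploader": {"slug_perm": "Cloudsmith"},
    }}}
}

# Hypothetical input: same shape, but the publisher prefix is in verified_publishers.
test_verified_publisher_does_not_match if {
    not cloudsmith.match with input as {"v0": {"package": {
        "format": "huggingface",
        "name": "google/some-model",
        "uploader": {"slug_perm": "Cloudsmith"},
    }}}
}
```

Assuming the `opa` CLI is installed and the policy file parses, `opa test upstream_verified.rego upstream_verified_test.rego` should run both tests. Passing the two files explicitly, rather than the whole directory, is deliberate: every recipe here declares `default match` in the same `cloudsmith` package, so loading them together would likely conflict.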

***

### Block models that have a copyleft license or no license

Download ```unknown_or_copyleft_licenses.rego``` and create the associated ```payload.json``` with the command below:
```
TODO
```
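
The two ways this policy can match (a copyleft SPDX identifier, or no identifier at all) can be exercised with a small OPA test file. This is a sketch under stated assumptions: the package name `cloudsmith_test` and the file name `unknown_or_copyleft_licenses_test.rego` are hypothetical, the inputs are trimmed to the fields the policy reads, and the policy file is assumed to parse under the OPA version in use (it relies on the `if`, `in`, and `contains` keywords).

```rego
package cloudsmith_test

import rego.v1

import data.cloudsmith

# A license that appears in copyleft_ids should match.
test_copyleft_license_matches if {
    cloudsmith.match with input as {"v0": {"package": {
        "format": "huggingface",
        "license": {"oss_license": {"spdx_identifier": "gpl-3.0"}},
    }}}
}

# A package with no license information at all should also match.
test_missing_license_matches if {
    cloudsmith.match with input as {"v0": {"package": {"format": "huggingface"}}}
}

# A license outside copyleft_ids ("mit" is used here as an illustration) should not match.
test_permissive_license_does_not_match if {
    not cloudsmith.match with input as {"v0": {"package": {
        "format": "huggingface",
        "license": {"oss_license": {"spdx_identifier": "mit"}},
    }}}
}
```

The tests can be run with `opa test unknown_or_copyleft_licenses.rego unknown_or_copyleft_licenses_test.rego`. Note that the policy compares identifiers exactly as written, so an input such as `"GPL-3.0"` would not match the lowercase entries in the set.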

***

### Block models that contain risky file types

Block any upstream Hugging Face package that contains risky files, particularly pickle files and other
risky formats such as zip archives and PyTorch, Keras, and TensorFlow H5 models.
Set up the policy with a QUARANTINE action that applies when the policy matches.
Consider combining it with the whitelist policy to add exceptions for packages as your team sees fit.

Download ```risky_files.rego``` and create the associated ```payload.json``` with the command below:
```
TODO
```
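
A short OPA test file can confirm which file lists trigger the policy. This is an illustrative sketch only: the package name `cloudsmith_test`, the file name `risky_files_test.rego`, the model names, and the `.safetensors` and `.json` extensions are assumptions, and the inputs assume that `file_extension` carries a leading dot, as most entries in `risky_file_extensions` do.

```rego
package cloudsmith_test

import rego.v1

import data.cloudsmith

# One pickle file among the package files is enough to match.
test_pickle_file_matches if {
    cloudsmith.match with input as {"v0": {"package": {
        "format": "huggingface",
        "name": "some-org/some-model",
        "uploader": {"slug_perm": "Cloudsmith"},
        "files": [{"file_extension": ".json"}, {"file_extension": ".pkl"}],
    }}}
}

# A package whose files all fall outside risky_file_extensions should not match.
test_safe_files_do_not_match if {
    not cloudsmith.match with input as {"v0": {"package": {
        "format": "huggingface",
        "name": "some-org/some-model",
        "uploader": {"slug_perm": "Cloudsmith"},
        "files": [{"file_extension": ".json"}, {"file_extension": ".safetensors"}],
    }}}
}

# The policy only targets upstream packages, so a direct upload is ignored.
test_direct_upload_does_not_match if {
    not cloudsmith.match with input as {"v0": {"package": {
        "format": "huggingface",
        "name": "some-org/some-model",
        "uploader": {"slug_perm": "some-user"},
        "files": [{"file_extension": ".pkl"}],
    }}}
}
```

Run it with `opa test risky_files.rego risky_files_test.rego`.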

***

### A whitelist for particular models - as an override to prior policies.

A final policy that acts as a whitelist, or exception-based policy. The prior policies
might have quarantined the package, but you want to encode some known exceptions.
Set up this policy as terminal, with an ALLOW action and an action that tags the
package as 'whitelisted'.

Download ```whitelisted_pkgs.rego``` and create the associated ```payload.json``` with the command below:
```
TODO
```
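
Because this policy overrides earlier decisions, it is worth checking that only exact names match. The sketch below is a hypothetical OPA test file (the package name `cloudsmith_test`, the file name `whitelisted_pkgs_test.rego`, and the non-listed model name are assumptions); it uses one name taken from `white_listed_pkgs` and one that is not in the set.

```rego
package cloudsmith_test

import rego.v1

import data.cloudsmith

# A name that appears verbatim in white_listed_pkgs should match.
test_listed_model_matches if {
    cloudsmith.match with input as {"v0": {"package": {
        "format": "huggingface",
        "name": "Qwen/Qwen3-0.6B",
    }}}
}

# Any other name, including one from the same publisher, should not match.
test_unlisted_model_does_not_match if {
    not cloudsmith.match with input as {"v0": {"package": {
        "format": "huggingface",
        "name": "Qwen/some-other-model",
    }}}
}
```

Run it with `opa test whitelisted_pkgs.rego whitelisted_pkgs_test.rego`. The membership check is an exact, case-sensitive string comparison, so each excepted model has to be listed by its full `publisher/name`.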

***
20 changes: 20 additions & 0 deletions huggingface-recipes/risky_files.rego
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
package cloudsmith

import rego.v1

default match := false

pkg := input.v0.package

hf_pkg if "huggingface" == pkg.format

is_upstream_pkg if pkg.uploader.slug_perm == "Cloudsmith"

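# Extensions are written with a leading dot; ".pt" previously lacked one.
risky_file_extensions := {".bin", ".h5", ".keras", ".pkl", ".pt", ".zip"}

# Match upstream Hugging Face packages that contain at least one risky file.
match if {
    hf_pkg
    is_upstream_pkg
    some file in pkg.files
    file.file_extension in risky_file_extensions
}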
50 changes: 50 additions & 0 deletions huggingface-recipes/unknown_or_copyleft_licenses.rego
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
package cloudsmith

# Required for the `if`, `in`, and `contains` keywords on OPA versions before 1.0.
import rego.v1

default match := false

# A list of copyleft SPDX identifiers (lowercase).
copyleft_ids := {
    "lgpl-3.0",
    "agpl-3.0-only",
    "qpl-1.0",
    "gpl-2.0-or-later",
    "cpol-1.02",
    "lgpl-2.1",
    "agpl-3.0-or-later",
    "gpl-2.0",
    "ngpl",
    "agpl-3.0",
    "gpl-3.0",
    "gpl-3.0-or-later",
    "sleepycat",
    "gpl-3.0-only",
    "osl-3.0",
    "gpl-2.0-only",
    # NOTE: apache-1.1 is usually classified as permissive rather than copyleft;
    # confirm it belongs in this list.
    "apache-1.1",
}

pkg := input.v0.package

hf_pkg if "huggingface" == pkg.format

# Collect a message when the package's SPDX identifier is in the copyleft list.
copy_left_msg contains msg if {
    some id in copyleft_ids
    id == pkg.license.oss_license.spdx_identifier
    msg := sprintf("License '%s' is considered copyleft", [pkg.license.oss_license.spdx_identifier])
}

# Collect a message when the package has no SPDX identifier.
missing_license contains msg if {
    not pkg.license.oss_license.spdx_identifier
    msg := "No license specified"
}

# Logical OR: match if the license is copyleft, or if no license is identified.
match if {
    hf_pkg
    count(copy_left_msg) > 0
}

match if {
    hf_pkg
    count(missing_license) > 0
}
17 changes: 17 additions & 0 deletions huggingface-recipes/upstream_verified.rego
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
package cloudsmith

import rego.v1

default match := false

is_upstream_pkg if input.v0.package.uploader.slug_perm == "Cloudsmith"

verified_publishers := {"amazon", "apple", "facebook", "FacebookAI", "google", "Intel", "microsoft", "openai"}

publisher := split(input.v0.package.name, "/")[0]

# Match upstream Hugging Face packages whose publisher is not on the verified list.
match if {
    "huggingface" == input.v0.package.format
    is_upstream_pkg
    not publisher in verified_publishers
}
12 changes: 12 additions & 0 deletions huggingface-recipes/whitelisted_pkgs.rego
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
package cloudsmith

import rego.v1

default match := false

white_listed_pkgs := {"jinaai/jina-embeddings-v3", "Qwen/Qwen3-0.6B"}

match if {
    "huggingface" == input.v0.package.format
    input.v0.package.name in white_listed_pkgs
}