-
-
Notifications
You must be signed in to change notification settings - Fork 681
Huggingface revision pinning #1281
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In much the same way as unpinned container images benefit from digest pinning, fixing a model, dataset or file to a revision digest uniquely and immutably fixes use to a paricular model snapshot (commit)
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mostly good to go after noted fixes.
- Add an entry for CWE 494 - Use string.hexdigits - Set to 18.6 release - Remove Copywright - Order after markupsafe
bandit/core/issue.py
Outdated
@@ -31,6 +31,7 @@ class Cwe: | |||
IMPROPER_CHECK_OF_EXCEPT_COND = 703 | |||
INCORRECT_PERMISSION_ASSIGNMENT = 732 | |||
INAPPROPRIATE_ENCODING_FOR_OUTPUT_CONTEXT = 838 | |||
DOWNLOAD_OF_CODE_WITHOUT_INTEGRITY_CHECK = 494 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please keep these sorted by number. Makes it easier to detect whether we already have a constant for that CWE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
for more information, see https://pre-commit.ci
# Commit hashes: 40 chars (full SHA) or 7+ chars (short SHA) | ||
if isinstance(revision_to_check, str): | ||
# Remove quotes if present | ||
revision_str = str(revision_to_check).strip("\"'") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might result in some unexpected results in the case of more complicated string formation. Like f"{foo} '{bar}' """
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will iterate on this some more, thanks for the quick reviews! much appreciated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good enough to merge IMO
What happens for library code that accepts user-supplied revisions? Think of any function wrapping def load_dataset_wrapped(dataset_name, dataset_kwargs):
return load_dataset(
self.dataset_name,
revision=dataset_kwargs["revision"],
... ? cc @ElenaKhaustova (Please let us know if we should rather open a new issue about this) |
In much the same way as unpinned container images benefit from digest pinning, fixing a model, dataset or file to a revision digest uniquely and immutably fixes use to a particular model snapshot (commit)
Example run: