@stephantul
Contributor

Closes #234

This PR adds support for cloudpathlib indirectly.

@codecov

codecov bot commented May 6, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Files with missing lines | Coverage Δ
model2vec/hf_utils.py | 75.28% <100.00%> (+0.28%) ⬆️
model2vec/model.py | 94.23% <100.00%> (+0.07%) ⬆️

@sybrenjansen sybrenjansen left a comment

Unfortunately, this fails when loading the tokenizer. I guess that, because it's Rust-based, it doesn't accept cloud paths :/ I'm not sure what the best approach is here.


if isinstance(path, Path):
    # Only check if we're sure this is a path.
    # It could be a cloudpathlib path, or something else.

Inside this if statement, path is always a Path: AnyPath doesn't have Path as a base class, so this comment isn't accurate.

@stephantul
Contributor Author

@sybrenjansen If tokenizer loading fails with a CloudPath, I think it's best to just download the folder locally and load everything using from_pretrained.
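
A rough sketch of that workaround, not part of this PR: copy the remote folder to a temporary local directory, then hand the local path to whatever loader you use. The helper name `load_from_remote_folder` and the `loader` parameter are hypothetical, and the sketch assumes the remote object exposes a `download_to(destination)` method, as cloudpathlib's `CloudPath` does.

```python
import tempfile
from pathlib import Path
from typing import Any, Callable


def load_from_remote_folder(remote_dir: Any, loader: Callable[[Path], Any]) -> Any:
    """Download `remote_dir` to a temporary local folder, then call `loader` on it.

    Hypothetical helper: `remote_dir` is assumed to provide
    `download_to(destination)` (cloudpathlib's CloudPath has this method).
    """
    with tempfile.TemporaryDirectory() as tmp:
        local_dir = Path(tmp) / "model"
        # Copy the whole remote folder to local disk first, so that
        # anything requiring a real filesystem path can read it.
        remote_dir.download_to(local_dir)
        # The loader must finish reading the files before this block exits,
        # because the temporary folder is deleted afterwards.
        return loader(local_dir)
```

With model2vec, `loader` would presumably be `StaticModel.from_pretrained`, so the tokenizer only ever sees a plain local path. If the loader reads files lazily, a persistent download directory would be safer than a temporary one.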

@sybrenjansen

Yeah, I guess that's the only way. Oh well. Thanks anyway.

@stephantul stephantul closed this May 9, 2025

Development

Successfully merging this pull request may close these issues.

Support cloud paths in from_pretrained

3 participants