Skip to content

Commit 070d5d6

Browse files
authored
Add windows machine for CI (#1112)
* Add windows machine for CI * add simple windows test * run only windows tests in CI * remove apt get in windows ci * disable symlinks on windows machine in CI * env variale * fix disable_symlinks_on_windows_ci fixture * fix fixture ? * try to fix permission issues on windows ci * refacto Offline repository tests * Update use_tmp_repo behavior * tmp work * split between shared vs unique repo tests * working repository tests * use credential for repo tests * refacto dataset repository tests * run repo test concurrently * fix github workflow * fix ci * fix ci * echo ci * fix ci? * finally fix ci?g * make test_correct_helper more robust * delete test_correct_helper * fix commit context manager * robust tmp dir deletiong * Implement more robust TemporaryDirectory * adapt existing codebase * cleanup at last * code quality * Rename to softTemporaryDirectory * Skip test on windows + force str path in Repository * typing * skip symlink test on windows * always disable symlink on windows during tests * fix pytest conf * xfail_on_windows decorator * cleaner xfail_on_windows * deactivate more tests * another xfail * skip umask test * add pending questions * other xfail * longer but more robust test_backoff_sleep_time test * another xfail * Make windows-specific version of test_scan_cache_on_valid_cache * fix file size * 2 cache tests should fail * resolve path in test missing cache * test_scan_cache_last_modified_and_last_accessed should fail on Windows * string instead of path in test_init_from_existing_local_clone * fix \r\n issue in modelcards on windows * preserve newlines when saving Repocard * fix linebreak issue in test_updating_text_updates_content * add repocard test to ensure linebreaks are kept * fix for 3.7 * fix flag when saving * document known issues on windows * requested change
1 parent 5b4fa6a commit 070d5d6

20 files changed

+312
-75
lines changed

.github/workflows/python-tests.yml

Lines changed: 39 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -16,7 +16,7 @@ env:
1616
HUGGINGFACE_PRODUCTION_USER_TOKEN: ${{ secrets.HUGGINGFACE_PRODUCTION_USER_TOKEN }}
1717

1818
jobs:
19-
build:
19+
build-ubuntu:
2020
runs-on: ubuntu-latest
2121
strategy:
2222
fail-fast: false
@@ -123,3 +123,41 @@ jobs:
123123
with:
124124
files: ./coverage.xml
125125
verbose: true
126+
127+
build-windows:
128+
# (almost) Duplicate config compared to `build-ubuntu` but running on Windows.
129+
# Please make sure to keep it updated as well.
130+
# Lighter version of the tests with only 1 test suite running on Python3.7.
131+
runs-on: windows-latest
132+
env:
133+
DISABLE_SYMLINKS_IN_WINDOWS_TESTS: 1
134+
strategy:
135+
fail-fast: false
136+
matrix:
137+
python-version: ["3.7"]
138+
139+
steps:
140+
- uses: actions/checkout@v2
141+
- name: Set up Python ${{ matrix.python-version }}
142+
uses: actions/setup-python@v2
143+
with:
144+
python-version: ${{ matrix.python-version }}
145+
146+
# Install dependencies
147+
- name: Configure and install dependencies
148+
run: |
149+
pip install --upgrade pip
150+
pip install .[testing]
151+
# sudo apt install -y libsndfile1-dev
152+
153+
# Run tests
154+
- name: Run tests
155+
working-directory: ./src # For code coverage to work
156+
run: python -m pytest --cov=./huggingface_hub --cov-report=xml:../coverage.xml ../tests
157+
158+
# Upload code coverage
159+
- name: Upload coverage reports to Codecov with GitHub Action
160+
uses: codecov/codecov-action@v3
161+
with:
162+
files: ./coverage.xml
163+
verbose: true

codecov.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@ comment:
22
# https://docs.codecov.com/docs/pull-request-comments#requiring-changes
33
require_changes: true
44
# https://docs.codecov.com/docs/pull-request-comments#after_n_builds
5-
after_n_builds: 11
5+
after_n_builds: 12
66

77
github_checks:
88
annotations: false

docs/source/installation.mdx

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -131,6 +131,25 @@ Tags: ['pytorch', 'tf', 'jax', 'tflite', 'rust', 'safetensors', 'gpt2', 'text-ge
131131
Task: text-generation
132132
```
133133

134+
## Windows limitations
135+
136+
With our goal of democratizing good ML everywhere, we built `huggingface_hub` to be a
137+
cross-platform library and in particular to work correctly on both Unix-based and Windows
138+
systems. However, there are a few cases where `huggingface_hub` has some limitations when
139+
run on Windows. Here is an exhaustive list of known issues. Please let us know if you
140+
encounter any undocumented problem by opening [an issue on Github](https://github.com/huggingface/huggingface_hub/issues/new/choose).
141+
142+
- `huggingface_hub`'s cache system relies on symlinks to efficiently cache files downloaded
143+
from the Hub. On Windows, you must activate developer mode or run your script as admin to
144+
enable symlinks. If they are not activated, the cache-system still works but in an non-optimized
145+
manner. Please read [this page](./how-to-cache#limitations) for more details.
146+
- Filepaths on the Hub can have special characters (e.g. `"path/to?/my/file"`). Windows is
147+
more restrictive on [special characters](https://learn.microsoft.com/en-us/windows/win32/intl/character-sets-used-in-file-names)
148+
which makes it impossible to download those files on Windows. Hopefully this is a rare case.
149+
Please reach out to the repo owner if you think this is a mistake or to us to figure out
150+
a solution.
151+
152+
134153
## Next steps
135154

136155
Once `huggingface_hub` is properly installed on your machine, you might want

setup.cfg

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -60,4 +60,5 @@ max-line-length = 88
6060
# --durations=0 -> print execution time of each test
6161
addopts = -Werror::FutureWarning --log-cli-level=INFO -sv --durations=0
6262
env =
63-
HUGGINGFACE_CO_STAGING=1
63+
HUGGINGFACE_CO_STAGING=1
64+
DISABLE_SYMLINKS_IN_WINDOWS_TESTS=1

src/huggingface_hub/repocard.py

Lines changed: 8 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -29,7 +29,8 @@
2929
)
3030

3131
# exact same regex as in the Hub server. Please keep in sync.
32-
REGEX_YAML_BLOCK = re.compile(r"^(\s*---[\n\r]+)([\S\s]*?)([\n\r]+---([\n\r]|$))")
32+
# See https://github.com/huggingface/moon-landing/blob/main/server/lib/ViewMarkdown.ts#L18
33+
REGEX_YAML_BLOCK = re.compile(r"^(\s*---[\r\n]+)([\S\s]*?)([\r\n]+---(\r\n|\n|$))")
3334

3435
logger = get_logger(__name__)
3536

@@ -127,7 +128,9 @@ def save(self, filepath: Union[Path, str]):
127128
"""
128129
filepath = Path(filepath)
129130
filepath.parent.mkdir(parents=True, exist_ok=True)
130-
filepath.write_text(str(self), encoding="utf-8")
131+
# Preserve newlines as in the existing file.
132+
with open(filepath, mode="w", newline="", encoding="utf-8") as f:
133+
f.write(str(self))
131134

132135
@classmethod
133136
def load(
@@ -516,11 +519,11 @@ def metadata_save(local_path: Union[str, Path], data: Dict) -> None:
516519
# try to detect existing newline character
517520
if os.path.exists(local_path):
518521
with open(local_path, "r", newline="") as readme:
519-
if type(readme.newlines) is tuple:
522+
content = readme.read()
523+
if isinstance(readme.newlines, tuple):
520524
line_break = readme.newlines[0]
521-
if type(readme.newlines) is str:
525+
elif isinstance(readme.newlines, str):
522526
line_break = readme.newlines
523-
content = readme.read()
524527

525528
# creates a new file if it not
526529
with open(local_path, "w", newline="") as readme:

src/huggingface_hub/repository.py

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -438,7 +438,7 @@ class Repository:
438438
@validate_hf_hub_args
439439
def __init__(
440440
self,
441-
local_dir: str,
441+
local_dir: Union[str, Path],
442442
clone_from: Optional[str] = None,
443443
repo_type: Optional[str] = None,
444444
token: Union[bool, str] = True,
@@ -459,7 +459,7 @@ def __init__(
459459
or the `git_user`/`git_email` pair will be used instead.
460460
461461
Args:
462-
local_dir (`str`):
462+
local_dir (`str` or `Path`):
463463
path (e.g. `'my_trained_model/'`) to the local directory, where
464464
the `Repository` will be initialized.
465465
clone_from (`str`, *optional*):
@@ -494,7 +494,8 @@ def __init__(
494494
- [`EnvironmentError`](https://docs.python.org/3/library/exceptions.html#EnvironmentError)
495495
if the remote repository set in `clone_from` does not exist.
496496
"""
497-
497+
if isinstance(local_dir, Path):
498+
local_dir = str(local_dir)
498499
os.makedirs(local_dir, exist_ok=True)
499500
self.local_dir = os.path.join(os.getcwd(), local_dir)
500501
self._repo_type = repo_type

tests/conftest.py

Lines changed: 24 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,14 +1,17 @@
1+
import os
2+
import shutil
13
from pathlib import Path
24
from typing import Generator
35

46
import pytest
57

8+
import huggingface_hub
69
from _pytest.fixtures import SubRequest
710
from huggingface_hub import HfApi, HfFolder
811
from huggingface_hub.utils import SoftTemporaryDirectory
912

1013
from .testing_constants import ENDPOINT_PRODUCTION, PRODUCTION_TOKEN
11-
from .testing_utils import repo_name
14+
from .testing_utils import repo_name, set_write_permission_and_retry
1215

1316

1417
@pytest.fixture
@@ -28,6 +31,9 @@ def test_cache_dir(self) -> None:
2831
with SoftTemporaryDirectory() as cache_dir:
2932
request.cls.cache_dir = Path(cache_dir).resolve()
3033
yield
34+
# TemporaryDirectory is not super robust on Windows when a git repository is
35+
# cloned in it. See https://www.scivision.dev/python-tempfile-permission-error-windows/.
36+
shutil.rmtree(cache_dir, onerror=set_write_permission_and_retry)
3137

3238

3339
@pytest.fixture(autouse=True, scope="session")
@@ -47,6 +53,23 @@ def clean_hf_folder_token_for_tests() -> Generator:
4753
HfFolder().save_token(token)
4854

4955

56+
@pytest.fixture(autouse=True)
57+
def disable_symlinks_on_windows_ci(monkeypatch: pytest.MonkeyPatch) -> None:
58+
class FakeSymlinkDict(dict):
59+
def __contains__(self, __o: object) -> bool:
60+
return True # consider any `cache_dir` to be already checked
61+
62+
def __getitem__(self, __key: str) -> bool:
63+
return False # symlinks are never supported
64+
65+
if os.name == "nt" and os.environ.get("DISABLE_SYMLINKS_IN_WINDOWS_TESTS"):
66+
monkeypatch.setattr(
67+
huggingface_hub.file_download,
68+
"_are_symlinks_supported_in_dir",
69+
FakeSymlinkDict(),
70+
)
71+
72+
5073
@pytest.fixture
5174
def fx_production_space(request: SubRequest) -> Generator[None, None, None]:
5275
"""Add a `repo_id` attribute referencing a Space repo on the production Hub.

tests/test_cache_layout.py

Lines changed: 11 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,12 @@
88
from huggingface_hub.utils._errors import EntryNotFoundError
99

1010
from .testing_constants import ENDPOINT_STAGING, TOKEN, USER
11-
from .testing_utils import expect_deprecation, repo_name, with_production_testing
11+
from .testing_utils import (
12+
expect_deprecation,
13+
repo_name,
14+
with_production_testing,
15+
xfail_on_windows,
16+
)
1217

1318

1419
logger = logging.get_logger(__name__)
@@ -24,6 +29,7 @@ def get_file_contents(path):
2429

2530
@with_production_testing
2631
class CacheFileLayoutHfHubDownload(unittest.TestCase):
32+
@xfail_on_windows(reason="Symlinks are deactivated in Windows tests.")
2733
def test_file_downloaded_in_cache(self):
2834
for revision, expected_reference in (
2935
(None, "main"),
@@ -136,6 +142,7 @@ def test_file_download_happens_once(self):
136142

137143
self.assertEqual(creation_time_0, creation_time_1)
138144

145+
@xfail_on_windows(reason="Symlinks are deactivated in Windows tests.")
139146
def test_file_download_happens_once_intra_revision(self):
140147
# Tests that a file is only downloaded once if it's not updated, even across different revisions.
141148

@@ -152,6 +159,7 @@ def test_file_download_happens_once_intra_revision(self):
152159

153160
self.assertEqual(creation_time_0, creation_time_1)
154161

162+
@xfail_on_windows(reason="Symlinks are deactivated in Windows tests.")
155163
def test_multiple_refs_for_same_file(self):
156164
with SoftTemporaryDirectory() as cache:
157165
hf_hub_download(MODEL_IDENTIFIER, "file_0.txt", cache_dir=cache)
@@ -192,6 +200,7 @@ def test_multiple_refs_for_same_file(self):
192200

193201
@with_production_testing
194202
class CacheFileLayoutSnapshotDownload(unittest.TestCase):
203+
@xfail_on_windows(reason="Symlinks are deactivated in Windows tests.")
195204
def test_file_downloaded_in_cache(self):
196205
with SoftTemporaryDirectory() as cache:
197206
snapshot_download(MODEL_IDENTIFIER, cache_dir=cache)
@@ -230,6 +239,7 @@ def test_file_downloaded_in_cache(self):
230239

231240
self.assertTrue(all([os.path.isfile(l) for l in resolved_snapshot_links]))
232241

242+
@xfail_on_windows(reason="Symlinks are deactivated in Windows tests.")
233243
def test_file_downloaded_in_cache_several_revisions(self):
234244
with SoftTemporaryDirectory() as cache:
235245
snapshot_download(MODEL_IDENTIFIER, cache_dir=cache, revision="file-3")

tests/test_fastai_integration.py

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,4 @@
11
import os
2-
import shutil
32
from unittest import TestCase, skip
43

54
from huggingface_hub import HfApi
@@ -15,7 +14,7 @@
1514
)
1615

1716
from .testing_constants import ENDPOINT_STAGING, TOKEN, USER
18-
from .testing_utils import expect_deprecation, repo_name, set_write_permission_and_retry
17+
from .testing_utils import expect_deprecation, repo_name, rmtree_with_retry
1918

2019

2120
WORKING_REPO_SUBDIR = f"fixtures/working_repo_{__name__.split('.')[-1]}"
@@ -76,7 +75,7 @@ def setUpClass(cls):
7675

7776
def tearDown(self) -> None:
7877
try:
79-
shutil.rmtree(WORKING_REPO_DIR, onerror=set_write_permission_and_retry)
78+
rmtree_with_retry(WORKING_REPO_DIR)
8079
except FileNotFoundError:
8180
pass
8281

tests/test_file_download.py

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -57,6 +57,7 @@
5757
OfflineSimulationMode,
5858
offline,
5959
with_production_testing,
60+
xfail_on_windows,
6061
)
6162

6263

@@ -243,6 +244,7 @@ def test_dataset_lfs_object(self):
243244
(url, '"95aa6a52d5d6a735563366753ca50492a658031da74f301ac5238b03966972c9"'),
244245
)
245246

247+
@xfail_on_windows(reason="umask is UNIX-specific")
246248
def test_hf_hub_download_custom_cache_permission(self):
247249
"""Checks `hf_hub_download` respect the cache dir permission.
248250
@@ -473,13 +475,15 @@ def test_hf_hub_url_on_awful_subfolder_and_filename(self):
473475
self.expected_url,
474476
)
475477

478+
@xfail_on_windows(reason="Windows paths cannot contain a '?'.")
476479
def test_hf_hub_download_on_awful_filepath(self):
477480
local_path = hf_hub_download(
478481
self.repo_id, self.filepath, cache_dir=self.cache_dir
479482
)
480483
# Local path is not url-encoded
481484
self.assertTrue(local_path.endswith(self.filepath))
482485

486+
@xfail_on_windows(reason="Windows paths cannot contain a '?'.")
483487
def test_hf_hub_download_on_awful_subfolder_and_filename(self):
484488
local_path = hf_hub_download(
485489
self.repo_id,
@@ -492,6 +496,7 @@ def test_hf_hub_download_on_awful_subfolder_and_filename(self):
492496

493497

494498
class CreateSymlinkTest(unittest.TestCase):
499+
@unittest.skipIf(os.name == "nt", "No symlinks on Windows")
495500
@patch("huggingface_hub.file_download.are_symlinks_supported")
496501
def test_create_relative_symlink_concurrent_access(
497502
self, mock_are_symlinks_supported: Mock

0 commit comments

Comments
 (0)