Commit b21cac1

Implement HTTP(s) support (#468)
* initial implementation of http
* remove unused func
* add https support
* Change dir detection to /
* lint
* Update https tests
* make sigs match
* Add parsed_url
* Add tests for verb methods
* lint
* test parsed_url
* test preserved properties
* spread out ports; fix warnings
* lint
* fix full_match
* sleepy upload test
* docs wip
* Update docs
* lint
* improve http docs
* add table
* lint
* try skipping http rigs on windows in CI
* more stable tests
* test flakiness
* refresh cert
* flaky test fix
* simplify test servers
* possibly?
* redo certs for 127.0.0.1
* update command
* Remove pytz and adjust sleep
* update rigs
* update missing timestamp
* more resilient
* sleepier
* Tweaks
* changelog
* Add explicit filename tests
* add warning on partial move
1 parent 863e884 commit b21cac1

24 files changed: +1395 -154 lines changed

HISTORY.md

Lines changed: 1 addition & 0 deletions
@@ -3,6 +3,7 @@
 ## v0.21.1 (2025-05-14)

 - Fixed `rmtree` fail on Azure with no `hns` and more than 256 blobs to drop (Issue [#509](https://github.com/drivendataorg/cloudpathlib/issues/509), PR [#508](https://github.com/drivendataorg/cloudpathlib/pull/508), thanks @alikefia)
+- Added support for http(s) urls with `HttpClient`, `HttpPath`, `HttpsClient`, and `HttpsPath`. (Issue [#455](https://github.com/drivendataorg/cloudpathlib/issues/455 ), PR [#468](https://github.com/drivendataorg/cloudpathlib/pull/468))

 ## v0.21.0 (2025-03-03)

README.md

Lines changed: 91 additions & 82 deletions
@@ -124,88 +124,97 @@ list(root_dir.glob('**/*.txt'))

 Most methods and properties from `pathlib.Path` are supported except for the ones that don't make sense in a cloud context. There are a few additional methods or properties that relate to specific cloud services or specifically for cloud paths.

-| Methods + properties | `AzureBlobPath` | `S3Path` | `GSPath` |
-|:-----------------------|:------------------|:-----------|:-----------|
-| `absolute` ||||
-| `anchor` ||||
-| `as_uri` ||||
-| `drive` ||||
-| `exists` ||||
-| `glob` ||||
-| `is_absolute` ||||
-| `is_dir` ||||
-| `is_file` ||||
-| `is_relative_to` ||||
-| `iterdir` ||||
-| `joinpath` ||||
-| `match` ||||
-| `mkdir` ||||
-| `name` ||||
-| `open` ||||
-| `parent` ||||
-| `parents` ||||
-| `parts` ||||
-| `read_bytes` ||||
-| `read_text` ||||
-| `relative_to` ||||
-| `rename` ||||
-| `replace` ||||
-| `resolve` ||||
-| `rglob` ||||
-| `rmdir` ||||
-| `samefile` ||||
-| `stat` ||||
-| `stem` ||||
-| `suffix` ||||
-| `suffixes` ||||
-| `touch` ||||
-| `unlink` ||||
-| `with_name` ||||
-| `with_stem` ||||
-| `with_suffix` ||||
-| `write_bytes` ||||
-| `write_text` ||||
-| `as_posix` ||||
-| `chmod` ||||
-| `cwd` ||||
-| `expanduser` ||||
-| `group` ||||
-| `hardlink_to` ||||
-| `home` ||||
-| `is_block_device` ||||
-| `is_char_device` ||||
-| `is_fifo` ||||
-| `is_mount` ||||
-| `is_reserved` ||||
-| `is_socket` ||||
-| `is_symlink` ||||
-| `lchmod` ||||
-| `link_to` ||||
-| `lstat` ||||
-| `owner` ||||
-| `readlink` ||||
-| `root` ||||
-| `symlink_to` ||||
-| `as_url` ||||
-| `clear_cache` ||||
-| `cloud_prefix` ||||
-| `copy` ||||
-| `copytree` ||||
-| `download_to` ||||
-| `etag` ||||
-| `fspath` ||||
-| `is_junction` ||||
-| `is_valid_cloudpath` ||||
-| `rmtree` ||||
-| `upload_from` ||||
-| `validate` ||||
-| `walk` ||||
-| `with_segments` ||||
-| `blob` ||||
-| `bucket` ||||
-| `container` ||||
-| `key` ||||
-| `md5` ||||
+| Methods + properties | `AzureBlobPath` | `GSPath` | `HttpsPath` | `S3Path` |
+|:-----------------------|:------------------|:-----------|:--------------|:-----------|
+| `absolute` |||||
+| `anchor` |||||
+| `as_uri` |||||
+| `drive` |||||
+| `exists` |||||
+| `glob` |||||
+| `is_absolute` |||||
+| `is_dir` |||||
+| `is_file` |||||
+| `is_junction` |||||
+| `is_relative_to` |||||
+| `iterdir` |||||
+| `joinpath` |||||
+| `match` |||||
+| `mkdir` |||||
+| `name` |||||
+| `open` |||||
+| `parent` |||||
+| `parents` |||||
+| `parts` |||||
+| `read_bytes` |||||
+| `read_text` |||||
+| `relative_to` |||||
+| `rename` |||||
+| `replace` |||||
+| `resolve` |||||
+| `rglob` |||||
+| `rmdir` |||||
+| `samefile` |||||
+| `stat` |||||
+| `stem` |||||
+| `suffix` |||||
+| `suffixes` |||||
+| `touch` |||||
+| `unlink` |||||
+| `walk` |||||
+| `with_name` |||||
+| `with_segments` |||||
+| `with_stem` |||||
+| `with_suffix` |||||
+| `write_bytes` |||||
+| `write_text` |||||
+| `as_posix` |||||
+| `chmod` |||||
+| `cwd` |||||
+| `expanduser` |||||
+| `group` |||||
+| `hardlink_to` |||||
+| `home` |||||
+| `is_block_device` |||||
+| `is_char_device` |||||
+| `is_fifo` |||||
+| `is_mount` |||||
+| `is_reserved` |||||
+| `is_socket` |||||
+| `is_symlink` |||||
+| `lchmod` |||||
+| `lstat` |||||
+| `owner` |||||
+| `readlink` |||||
+| `root` |||||
+| `symlink_to` |||||
+| `as_url` |||||
+| `clear_cache` |||||
+| `client` |||||
+| `cloud_prefix` |||||
+| `copy` |||||
+| `copytree` |||||
+| `download_to` |||||
+| `from_uri` |||||
+| `fspath` |||||
+| `full_match` |||||
+| `is_valid_cloudpath` |||||
+| `parser` |||||
+| `rmtree` |||||
+| `upload_from` |||||
+| `validate` |||||
+| `etag` |||||
+| `blob` |||||
+| `bucket` |||||
+| `md5` |||||
+| `container` |||||
+| `delete` |||||
+| `get` |||||
+| `head` |||||
+| `key` |||||
+| `parsed_url` |||||
+| `post` |||||
+| `put` |||||

 ----
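The new table rows include `parsed_url`, which suggests that an http(s) path exposes the pieces of its URL. As a rough illustration only, the standard library's `urllib.parse.urlparse` produces that kind of breakdown. The helper name `split_http_url` and the example URL below are hypothetical, and this sketch makes no claim about what `HttpPath.parsed_url` actually returns.

```python
from urllib.parse import urlparse


def split_http_url(url: str) -> dict:
    """Hypothetical helper: break an http(s) URL into its main components."""
    parsed = urlparse(url)
    return {
        "scheme": parsed.scheme,  # e.g. "https"
        "netloc": parsed.netloc,  # host (and port, if any)
        "path": parsed.path,      # path portion, starting with "/"
        "query": parsed.query,    # query string without the leading "?"
    }


parts = split_http_url("https://example.com/data/report.csv?version=2")
print(parts)
```

Unlike bucket-style URIs, an http(s) URL can carry a query string and a port, which is one reason a parsed form is useful alongside the plain path string.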

cloudpathlib/__init__.py

Lines changed: 8 additions & 2 deletions
@@ -4,9 +4,11 @@
 from .azure.azblobclient import AzureBlobClient
 from .azure.azblobpath import AzureBlobPath
 from .cloudpath import CloudPath, implementation_registry
-from .s3.s3client import S3Client
-from .gs.gspath import GSPath
 from .gs.gsclient import GSClient
+from .gs.gspath import GSPath
+from .http.httpclient import HttpClient, HttpsClient
+from .http.httppath import HttpPath, HttpsPath
+from .s3.s3client import S3Client
 from .s3.s3path import S3Path


@@ -27,6 +29,10 @@
     "implementation_registry",
     "GSClient",
     "GSPath",
+    "HttpClient",
+    "HttpsClient",
+    "HttpPath",
+    "HttpsPath",
     "S3Client",
     "S3Path",
 ]

cloudpathlib/cloudpath.py

Lines changed: 11 additions & 9 deletions
@@ -27,7 +27,6 @@
     Generator,
     List,
     Optional,
-    Sequence,
     Tuple,
     Type,
     TYPE_CHECKING,
@@ -299,11 +298,11 @@ def __setstate__(self, state: Dict[str, Any]) -> None:

     @property
     def _no_prefix(self) -> str:
-        return self._str[len(self.cloud_prefix) :]
+        return self._str[len(self.anchor) :]

     @property
     def _no_prefix_no_drive(self) -> str:
-        return self._str[len(self.cloud_prefix) + len(self.drive) :]
+        return self._str[len(self.anchor) + len(self.drive) :]

     @overload
     @classmethod
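To see why slicing by `anchor` instead of `cloud_prefix` matters, here is a minimal standalone sketch of the same slice. It assumes, without confirmation from this hunk, that bucket-style paths have an anchor equal to their prefix (such as `s3://`) and that an http(s) path's anchor also covers the host (such as `https://example.com/`). The function `no_prefix` and the example URLs are hypothetical.

```python
def no_prefix(raw: str, anchor: str) -> str:
    """Drop the leading anchor from a raw path string (same slice as `_no_prefix`)."""
    return raw[len(anchor):]


# Assumed bucket-style anchor: identical to the cloud prefix.
s3_rest = no_prefix("s3://bucket/dir/file.txt", "s3://")

# Assumed http(s) anchor: scheme, host, and trailing slash.
http_rest = no_prefix("https://example.com/dir/file.txt", "https://example.com/")

# Slicing by the bare scheme prefix would leave the host in the result.
http_rest_old = no_prefix("https://example.com/dir/file.txt", "https://")

print(s3_rest, http_rest, http_rest_old)
```

Under those assumptions, the bucket-style result is unchanged by the switch, while the http(s) result no longer starts with the host name.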
@@ -909,9 +908,9 @@ def relative_to(self, other: Self, walk_up: bool = False) -> PurePosixPath:
         # absolute)
         if not isinstance(other, CloudPath):
             raise ValueError(f"{self} is a cloud path, but {other} is not")
-        if self.cloud_prefix != other.cloud_prefix:
+        if self.anchor != other.anchor:
             raise ValueError(
-                f"{self} is a {self.cloud_prefix} path, but {other} is a {other.cloud_prefix} path"
+                f"{self} is a {self.anchor} path, but {other} is a {other.anchor} path"
             )

         kwargs = dict(walk_up=walk_up)
@@ -939,6 +938,9 @@ def full_match(self, pattern: str, case_sensitive: Optional[bool] = None) -> boo
         # strip scheme from start of pattern before testing
         if pattern.startswith(self.anchor + self.drive):
             pattern = pattern[len(self.anchor + self.drive) :]
+        elif pattern.startswith(self.anchor):
+            # for http paths, keep leading slash
+            pattern = pattern[len(self.anchor) - 1 :]

         # remove drive, which is kept on normal dispatch to pathlib
         return PurePosixPath(self._no_prefix_no_drive).full_match( # type: ignore[attr-defined]
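The two branches in this hunk can be exercised in isolation. The sketch below copies the branching into a hypothetical free function, `strip_pattern_prefix`. The anchor and drive values are assumptions chosen for illustration (`s3://` plus a bucket name, and `https://example.com/` with the host as drive), not values confirmed by the diff.

```python
def strip_pattern_prefix(pattern: str, anchor: str, drive: str) -> str:
    """Remove the scheme/host portion of a glob pattern, keeping a leading slash."""
    if pattern.startswith(anchor + drive):
        # Bucket-style: "s3://bucket/..." -> "/..."
        return pattern[len(anchor + drive):]
    elif pattern.startswith(anchor):
        # Anchor already ends in "/": step back one character to keep that slash.
        return pattern[len(anchor) - 1:]
    # Relative patterns are left untouched.
    return pattern


s3_pattern = strip_pattern_prefix("s3://bucket/**/*.txt", "s3://", "bucket")
http_pattern = strip_pattern_prefix(
    "https://example.com/a/*.txt", "https://example.com/", "example.com"
)
relative_pattern = strip_pattern_prefix("**/*.txt", "s3://", "bucket")

print(s3_pattern, http_pattern, relative_pattern)
```

In both absolute cases the remaining pattern starts with `/`, so it can be compared against a rooted POSIX-style path.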
@@ -969,7 +971,7 @@ def parent(self) -> Self:
         return self._dispatch_to_path("parent")

     @property
-    def parents(self) -> Sequence[Self]:
+    def parents(self) -> Tuple[Self, ...]:
         return self._dispatch_to_path("parents")

     @property
@@ -1224,7 +1226,7 @@ def copytree(self, destination, force_overwrite_to_cloud=None, ignore=None):
                 )
             elif subpath.is_dir():
                 subpath.copytree(
-                    destination / subpath.name,
+                    destination / (subpath.name + ("" if subpath.name.endswith("/") else "/")),
                     force_overwrite_to_cloud=force_overwrite_to_cloud,
                     ignore=ignore,
                 )
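The commit message mentions "Change dir detection to /", which fits the change above: the subdirectory name is given a trailing slash before it is joined onto the destination. A minimal sketch of that normalization, using a hypothetical helper name `as_directory_name`:

```python
def as_directory_name(name: str) -> str:
    """Append a trailing slash unless the name already ends with one."""
    return name + ("" if name.endswith("/") else "/")


plain = as_directory_name("images")
already_dir = as_directory_name("images/")

print(plain, already_dir)
```

The expression is idempotent, so applying it to a name that already ends in `/` does not produce a double slash.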
@@ -1258,8 +1260,8 @@ def _new_cloudpath(self, path: Union[str, os.PathLike]) -> Self:
             path = path[1:]

         # add prefix/anchor if it is not already
-        if not path.startswith(self.cloud_prefix):
-            path = f"{self.cloud_prefix}{path}"
+        if not path.startswith(self.anchor):
+            path = f"{self.anchor}{path}"

         return self.client.CloudPath(path)
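A standalone sketch of the prefixing step may help when reading this hunk. The condition guarding `path = path[1:]` is not visible in the diff; the sketch assumes it strips a single leading slash. The function name `with_anchor` and the anchor values are likewise hypothetical.

```python
def with_anchor(path: str, anchor: str) -> str:
    """Return `path` with `anchor` prepended, unless it is already present."""
    # Assumption: a leading slash is dropped first (the guard is outside the hunk).
    if path.startswith("/"):
        path = path[1:]

    # add prefix/anchor if it is not already there
    if not path.startswith(anchor):
        path = f"{anchor}{path}"
    return path


from_relative = with_anchor("/dir/file.txt", "https://example.com/")
already_full = with_anchor("s3://bucket/key.txt", "s3://")
bare_bucket = with_anchor("bucket/key.txt", "s3://")

print(from_relative, already_full, bare_bucket)
```

With an anchor that includes the host, a host-relative string such as `/dir/file.txt` is rebuilt into a full URL on the same host, which a bare scheme prefix could not do.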

cloudpathlib/http/__init__.py

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+from .httpclient import HttpClient, HttpsClient
+from .httppath import HttpPath, HttpsPath
+
+__all__ = [
+    "HttpClient",
+    "HttpPath",
+    "HttpsClient",
+    "HttpsPath",
+]
