## What changes are proposed in this pull request?
**WHAT**
- Update the documentation of `FilesExt` interfaces
  - `upload`
  - `upload_from`
  - `download`
  - `download_to`
**WHY**
Reword parts of the documentation to avoid confusion and improve
clarity for users.
## How is this tested?
N/A
NO_CHANGELOG=true
databricks/sdk/mixins/files.py: 25 additions & 15 deletions
@@ -784,12 +784,11 @@ def download(
     ) -> DownloadResponse:
         """Download a file.
 
-        Downloads a file of any size. The file contents are the response body.
-        This is a standard HTTP file download, not a JSON RPC.
+        Downloads a file as a stream into memory.
 
-        It is strongly recommended, for fault tolerance reasons,
-        to iteratively consume from the stream with a maximum read(size)
-        defined instead of using indefinite-size reads.
+        Use this when you want to process the downloaded file in memory or pipe it into another system. Supports files of any size in SDK v0.72.0+. Earlier versions have a 5 GB file size limit.
+
+        If the download is successful, the function returns the downloaded file result. If the download is unsuccessful, the function raises an exception.
 
         :param file_path: str
             The remote path of the file, e.g. /Volumes/path/to/your/file
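The wording removed in this hunk recommended consuming the stream with bounded reads rather than a single indefinite-size read. A minimal sketch of that pattern follows. `copy_in_chunks` and the 1 MiB chunk size are illustrative and not part of the SDK, and the sketch assumes the download result exposes a binary stream; an `io.BytesIO` stands in for it here.

```python
import io
from typing import BinaryIO

CHUNK_SIZE = 1024 * 1024  # illustrative value: 1 MiB per read


def copy_in_chunks(src: BinaryIO, dst: BinaryIO, chunk_size: int = CHUNK_SIZE) -> int:
    """Copy src to dst using bounded reads; return the number of bytes copied."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


# Stand-in for a downloaded stream. With the SDK this would be the stream on
# the object returned by download() (assumed here to be its `contents` field).
source = io.BytesIO(b"x" * 2_500_000)
sink = io.BytesIO()
copied = copy_in_chunks(source, sink)
```

Bounding each read keeps peak memory near `chunk_size` regardless of the file size, which matters now that the docstring advertises files of any size.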
@@ -817,14 +816,18 @@ def download_to(
         use_parallel: bool = False,
         parallelism: Optional[int] = None,
     ) -> DownloadFileResult:
-        """Download a file to a local path. There would be no responses returned if the download is successful.
+        """Downloads a file directly to a local file path.
+
+        Use this when you want to write the file straight to disk instead of holding it in memory. Supports files of any size in SDK v0.72.0+. Earlier versions have a 5 GB file size limit.
+
+        Supports parallel download (use_parallel=True), which may improve performance for large files. This is available on all operating systems except Windows.
 
         :param file_path: str
             The remote path of the file, e.g. /Volumes/path/to/your/file
         :param destination: str
             The local path where the file will be saved.
         :param overwrite: bool
-            If true, an existing file will be overwritten. When not specified, assumed True.
+            If true, an existing file will be overwritten. When not specified, defaults to True.
         :param use_parallel: bool
             If true, the download will be performed using multiple threads.
         :param parallelism: int
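Since the new docstring says parallel download is unavailable on Windows, a caller might gate `use_parallel` on the platform. The sketch below is one way to do that. `parallel_download_supported` is a hypothetical helper, not an SDK function, and the commented call site assumes a `WorkspaceClient` named `w`.

```python
import platform
from typing import Optional


def parallel_download_supported(system: Optional[str] = None) -> bool:
    """Return False on Windows and True elsewhere, following the docstring."""
    name = system if system is not None else platform.system()
    return name != "Windows"


# Hypothetical call site, not executed here:
# w.files.download_to(
#     file_path="/Volumes/path/to/your/file",
#     destination="local-file.bin",
#     use_parallel=parallel_download_supported(),
# )
```

The optional `system` argument exists only so the check can be exercised without running on each platform.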
@@ -1078,18 +1081,22 @@ def upload(
         parallelism: Optional[int] = None,
     ) -> UploadStreamResult:
         """
-        Upload a file with stream interface.
+        Uploads a file from memory or a stream interface.
+
+        Use this when you want to upload data already in memory or piped from another system. Supports files of any size in SDK v0.72.0+. Earlier versions have a 5 GB file size limit.
+
+        Limitations: If the storage account is on Azure and has firewall enabled, the maximum file size is 5GB.
 
         :param file_path: str
             The absolute remote path of the target file, e.g. /Volumes/path/to/your/file
         :param contents: BinaryIO
             The contents of the file to upload. This must be a BinaryIO stream.
         :param overwrite: bool (optional)
-            If true, an existing file will be overwritten. When not specified, assumed True.
+            If true, an existing file will be overwritten. When not specified, defaults to True.
         :param part_size: int (optional)
-            If set, multipart upload will use the value as its size per uploading part.
+            If set, multipart upload will use the value as its size per uploading part. If not set, an appropriate value will be automatically used.
         :param use_parallel: bool (optional)
-            If true, the upload will be performed using multiple threads. Be aware that this will consume more memory
+            If true, the upload will be performed using multiple threads. Note that this will consume more memory
             because multiple parts will be buffered in memory before being uploaded. The amount of memory used is proportional
             to `parallelism * part_size`.
             If false, the upload will be performed in a single thread.
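The `parallelism * part_size` note can be turned into a quick back-of-the-envelope check before enabling `use_parallel`. The helper below is a hypothetical sketch. Because the docstring only says memory use is proportional to this product, treat the result as an order-of-magnitude estimate, not an exact figure. The values 8 and 16 MiB are illustrative, not SDK defaults.

```python
MIB = 1024 * 1024


def estimated_upload_buffer_bytes(parallelism: int, part_size: int) -> int:
    """Rough buffer estimate for a parallel upload: parallelism * part_size."""
    if parallelism < 1 or part_size < 1:
        raise ValueError("parallelism and part_size must be positive")
    return parallelism * part_size


# Illustrative values only: 8 threads, each buffering a 16 MiB part.
estimate = estimated_upload_buffer_bytes(parallelism=8, part_size=16 * MIB)
estimate_mib = estimate // MIB
```

With these inputs the estimate is 128 MiB, so halving either `parallelism` or `part_size` halves the expected buffer footprint.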
@@ -1166,16 +1173,19 @@ def upload_from(
         use_parallel: bool = True,
         parallelism: Optional[int] = None,
     ) -> UploadFileResult:
-        """Upload a file directly from a local path.
+        """
+        Uploads a file from a local file path.
+
+        Use this when your data already exists on disk and you want to upload it directly without manually opening it yourself. Supports files of any size in SDK v0.72.0+. Earlier versions have a 5 GB file size limit.
 
         :param file_path: str
             The absolute remote path of the target file.
         :param source_path: str
             The local path of the file to upload. This must be a path to a local file.
-        :param part_size: int
-            The size of each part in bytes for multipart upload. This is a required parameter for multipart uploads.
+        :param part_size: int (optional)
+            If set, multipart upload will use the value as its size per uploading part. If not set, an appropriate default value will be automatically used.
         :param overwrite: bool (optional)
-            If true, an existing file will be overwritten. When not specified, assumed True.
+            If true, an existing file will be overwritten. When not specified, defaults True.
         :param use_parallel: bool (optional)
             If true, the upload will be performed using multiple threads. Default is True.
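The `source_path` requirement ("must be a path to a local file") can be checked before the call is made. The sketch below wraps `upload_from` with that check, using only the keyword parameters documented in the hunk above. `upload_local_file` and `RecordingFiles` are hypothetical names, and `RecordingFiles` is a test double, not the SDK's files API.

```python
from pathlib import Path
from typing import Any, Dict, List


def upload_local_file(files_api: Any, source_path: str, file_path: str, overwrite: bool = True) -> Any:
    """Verify source_path is a local file, then delegate to upload_from."""
    if not Path(source_path).is_file():
        raise FileNotFoundError(f"not a local file: {source_path}")
    return files_api.upload_from(
        file_path=file_path,
        source_path=source_path,
        overwrite=overwrite,
    )


class RecordingFiles:
    """Test double standing in for the files API; it only records calls."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    def upload_from(self, **kwargs: Any) -> None:
        self.calls.append(kwargs)
```

In real use, `files_api` would be the `files` attribute of a workspace client. `part_size` is left unset so the automatically chosen value described in the docstring applies.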