Commit d479934

formated and documented
1 parent 483a732 commit d479934

5 files changed: 77 additions and 22 deletions

databricks/sdk/__init__.py

Lines changed: 5 additions & 5 deletions
Some generated files are not rendered by default.

databricks/sdk/config.py

Lines changed: 4 additions & 2 deletions
@@ -6,7 +6,7 @@
 import pathlib
 import sys
 import urllib.parse
-from typing import Dict, Iterable, Optional, List
+from typing import Dict, Iterable, List, Optional

 import requests

@@ -110,7 +110,9 @@ class Config:

     disable_async_token_refresh: bool = ConfigAttribute(env="DATABRICKS_DISABLE_ASYNC_TOKEN_REFRESH")

-    disable_experimental_files_api_client: bool = ConfigAttribute(env="DATABRICKS_DISABLE_EXPERIMENTAL_FILES_API_CLIENT")
+    disable_experimental_files_api_client: bool = ConfigAttribute(
+        env="DATABRICKS_DISABLE_EXPERIMENTAL_FILES_API_CLIENT"
+    )

     files_ext_client_download_streaming_chunk_size: int = 2 * 1024 * 1024  # 2 MiB

docs/workspace/files/files.rst

Lines changed: 66 additions & 13 deletions
@@ -2,7 +2,7 @@
 ==================
 .. currentmodule:: databricks.sdk.service.files

-.. py:class:: FilesAPI
+.. py:class:: FilesExt

     The Files API is a standard HTTP API that allows you to read, write, list, and delete files and
     directories by referring to their URI. The API makes working with file content as raw bytes easier and
@@ -61,15 +61,39 @@

     .. py:method:: download(file_path: str) -> DownloadResponse

-        Downloads a file. The file contents are the response body. This is a standard HTTP file download, not
-        a JSON RPC. It supports the Range and If-Unmodified-Since HTTP headers.
+        Download a file.
+
+        Downloads a file of any size. The file contents are the response body.
+        This is a standard HTTP file download, not a JSON RPC.
+
+        It is strongly recommended, for fault tolerance reasons,
+        to iteratively consume from the stream with a maximum read(size)
+        defined instead of using indefinite-size reads.

         :param file_path: str
-          The absolute path of the file.
+          The remote path of the file, e.g. /Volumes/path/to/your/file

         :returns: :class:`DownloadResponse`


+    .. py:method:: download_to(file_path: str, destination: str [, overwrite: bool = True, use_parallel: bool = False, parallelism: Optional[int]]) -> DownloadFileResult
+
+        Download a file to a local path. There would be no responses returned if the download is successful.
+
+        :param file_path: str
+          The remote path of the file, e.g. /Volumes/path/to/your/file
+        :param destination: str
+          The local path where the file will be saved.
+        :param overwrite: bool
+          If true, an existing file will be overwritten. When not specified, assumed True.
+        :param use_parallel: bool
+          If true, the download will be performed using multiple threads.
+        :param parallelism: int
+          The number of parallel threads to use for downloading. If not specified, defaults to the number of CPU cores.
+
+        :returns: :class:`DownloadFileResult`
+
+
     .. py:method:: get_directory_metadata(directory_path: str)

         Get the metadata of a directory. The response HTTP headers contain the metadata. There is no response
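
The new download() text in the hunk above recommends reading the stream in bounded read(size) calls rather than with one indefinite-size read. A minimal sketch of that pattern follows, using only the standard library. The helper names iter_chunks and copy_stream are hypothetical and not part of the SDK, the io.BytesIO object is a stand-in for the download stream (how the stream is obtained from DownloadResponse is not shown in this diff), and the default chunk size merely mirrors the 2 MiB value visible in config.py.

```python
import io
from typing import BinaryIO, Iterator

# Assumed default: mirrors the 2 MiB value shown for
# files_ext_client_download_streaming_chunk_size in config.py.
CHUNK_SIZE = 2 * 1024 * 1024


def iter_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield successive chunks of at most chunk_size bytes until EOF."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = stream.read(chunk_size)  # bounded read, never read() with no size
        if not chunk:  # an empty bytes object signals end of stream
            break
        yield chunk


def copy_stream(source: BinaryIO, sink: BinaryIO, chunk_size: int = CHUNK_SIZE) -> int:
    """Copy source into sink using bounded reads; return the bytes written."""
    total = 0
    for chunk in iter_chunks(source, chunk_size):
        sink.write(chunk)
        total += len(chunk)
    return total


# Stand-in for a download stream (hypothetical data, not a real response).
source = io.BytesIO(b"x" * 10_000)
sink = io.BytesIO()
copied = copy_stream(source, sink, chunk_size=4096)
```

Because each read is capped, memory use stays near one chunk regardless of file size, and a failure partway through loses at most the chunk in flight.
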
@@ -124,19 +148,48 @@
         :returns: Iterator over :class:`DirectoryEntry`


-    .. py:method:: upload(file_path: str, contents: BinaryIO [, overwrite: Optional[bool]])
+    .. py:method:: upload(file_path: str, content: BinaryIO [, overwrite: Optional[bool], part_size: Optional[int], use_parallel: bool = True, parallelism: Optional[int]]) -> UploadStreamResult

-        Uploads a file of up to 5 GiB. The file contents should be sent as the request body as raw bytes (an
-        octet stream); do not encode or otherwise modify the bytes before sending. The contents of the
-        resulting file will be exactly the bytes sent in the request body. If the request is successful, there
-        is no response body.
+
+        Upload a file with stream interface.

         :param file_path: str
-          The absolute path of the file.
-        :param contents: BinaryIO
+          The absolute remote path of the target file, e.g. /Volumes/path/to/your/file
+        :param content: BinaryIO
+          The contents of the file to upload. This must be a BinaryIO stream.
         :param overwrite: bool (optional)
-          If true or unspecified, an existing file will be overwritten. If false, an error will be returned if
-          the path points to an existing file.
+          If true, an existing file will be overwritten. When not specified, assumed True.
+        :param part_size: int (optional)
+          If set, multipart upload will use the value as its size per uploading part.
+        :param use_parallel: bool (optional)
+          If true, the upload will be performed using multiple threads. Be aware that this will consume more memory
+          because multiple parts will be buffered in memory before being uploaded. The amount of memory used is proportional
+          to `parallelism * part_size`.
+          If false, the upload will be performed in a single thread.
+          Default is True.
+        :param parallelism: int (optional)
+          The number of threads to use for parallel uploads. This is only used if `use_parallel` is True.
+
+        :returns: :class:`UploadStreamResult`
+

+    .. py:method:: upload_from(file_path: str, source_path: str [, overwrite: Optional[bool], part_size: Optional[int], use_parallel: bool = True, parallelism: Optional[int]]) -> UploadFileResult

+        Upload a file directly from a local path.
+
+        :param file_path: str
+          The absolute remote path of the target file.
+        :param source_path: str
+          The local path of the file to upload. This must be a path to a local file.
+        :param part_size: int
+          The size of each part in bytes for multipart upload. This is a required parameter for multipart uploads.
+        :param overwrite: bool (optional)
+          If true, an existing file will be overwritten. When not specified, assumed True.
+        :param use_parallel: bool (optional)
+          If true, the upload will be performed using multiple threads. Default is True.
+        :param parallelism: int (optional)
+          The number of threads to use for parallel uploads. This is only used if `use_parallel` is True.
+          If not specified, the default parallelism will be set to config.multipart_upload_default_parallelism
+
+        :returns: :class:`UploadFileResult`
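
The upload() documentation in this file states that parallel uploads buffer several parts at once, with memory use proportional to `parallelism * part_size`. The sketch below turns that statement into a rough sizing aid. Both helper functions are hypothetical and not part of the SDK. It assumes a proportionality constant of 1, assumes a single buffered part when use_parallel is false, and assumes the last part may be shorter than part_size; none of these details are confirmed by the diff.

```python
MIB = 1024 * 1024


def part_count(total_size: int, part_size: int) -> int:
    """Number of parts needed, assuming only the last part may be short."""
    if total_size < 0 or part_size <= 0:
        raise ValueError("total_size must be >= 0 and part_size must be > 0")
    return -(-total_size // part_size)  # ceiling division on integers


def buffered_bytes_estimate(part_size: int, parallelism: int, use_parallel: bool = True) -> int:
    """Rough estimate of bytes buffered at once during an upload.

    Assumption: one part per worker thread is held in memory, so the
    estimate is parallelism * part_size; a single-threaded upload is
    assumed to hold one part.
    """
    if part_size <= 0 or parallelism <= 0:
        raise ValueError("part_size and parallelism must be positive")
    return part_size * parallelism if use_parallel else part_size


# Hypothetical workload: a 1 GiB file, 10 MiB parts, 8 upload threads.
parts = part_count(1024 * MIB, 10 * MIB)
estimate = buffered_bytes_estimate(10 * MIB, parallelism=8)
```

With these illustrative numbers the file splits into 103 parts and the estimate is 80 MiB of buffered data, which shows why lowering either part_size or parallelism is the lever to pull on memory-constrained hosts.
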

tests/test_files.py

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@
 from enum import Enum
 from tempfile import NamedTemporaryFile
 from threading import Lock
-from typing import Any, Callable, List, Optional, Type, Union, Dict
+from typing import Any, Callable, Dict, List, Optional, Type, Union
 from urllib.parse import parse_qs, urlparse

 import pytest

tests/test_files_utils.py

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 import os
 from abc import ABC, abstractmethod
 from io import BytesIO, RawIOBase, UnsupportedOperation
-from typing import BinaryIO, Callable, Optional, Tuple, List
+from typing import BinaryIO, Callable, List, Optional, Tuple

 import pytest
