The `LocalFile` source imports files from a local file system.
### Spec
The spec takes the following fields:
* `path` (type: `str`, required): full path of the root directory to import files from
* `binary` (type: `bool`, optional): whether to read files as binary (instead of text)
@@ -24,7 +26,45 @@ The spec takes the following fields:
:::
### Schema
The output is a table with the following subfields:

* `filename` (key, type: `str`): the filename of the file, including the path, relative to the root directory, e.g. `"dir1/file1.md"`
* `content` (type: `str` if `binary` is `False`, otherwise `bytes`): the content of the file
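
To make the shape of these rows concrete, here is a minimal standard-library sketch that walks a directory and yields rows with the same two fields. It is not the source's actual implementation: the function name `iter_local_files` is hypothetical, and UTF-8 decoding for text mode is an assumption.

```python
from pathlib import Path
from typing import Iterator, Union


def iter_local_files(path: str, binary: bool = False) -> Iterator[dict[str, Union[str, bytes]]]:
    """Yield one row per file under `path`, shaped like the schema above.

    Hypothetical helper for illustration only; UTF-8 is assumed for text mode.
    """
    root = Path(path)
    for file in sorted(root.rglob("*")):
        if not file.is_file():
            continue  # skip directories
        yield {
            # Key: path relative to the root directory, using forward slashes.
            "filename": file.relative_to(root).as_posix(),
            # `bytes` when binary=True, otherwise decoded text.
            "content": file.read_bytes() if binary else file.read_text(encoding="utf-8"),
        }
```

With a root directory containing `dir1/file1.md`, the yielded row has `filename` set to `"dir1/file1.md"`, matching the example in the schema.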
## GoogleDrive
The `GoogleDrive` source imports files from Google Drive.
### Setup for Google Drive
To access files in Google Drive, the `GoogleDrive` source authenticates with a service account. Set one up as follows:
1. Register for or log in to **Google Cloud**.
2. In [**Google Cloud Console**](https://console.cloud.google.com/), search for *Service Accounts* to open the *IAM & Admin / Service Accounts* page.
    - **Create a new service account**: Click *+ Create Service Account* and follow the instructions to finish creating the service account.
    - **Add a key and download the credential**: Under "Actions" for the new service account, click *Manage keys* → *Add key* → *Create new key* → *JSON*.
      Download the key file to a safe place.
3. In **Google Cloud Console**, search for *Google Drive API* and enable it.
4. In **Google Drive**, share the folders containing the files to import with the service account's email address.
    **Viewer permission** is sufficient.
    - The email address can be found on the *IAM & Admin / Service Accounts* page (from Step 2), in the format `{service-account-id}@{gcp-project-id}.iam.gserviceaccount.com`.
    - Copy the folder ID. It is the last part of the folder's URL, e.g. `https://drive.google.com/drive/u/0/folders/{folder-id}` or `https://drive.google.com/drive/folders/{folder-id}?usp=drive_link`.
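
Two of the manual steps above can be scripted: reading the service account's email address from the downloaded key file, and extracting a folder ID from either URL form. The sketch below uses only the standard library. The helper names are hypothetical, and it relies on the `client_email` field that Google includes in JSON service account keys.

```python
import json
from urllib.parse import urlparse


def service_account_email(credential_path: str) -> str:
    """Read the `client_email` field from a downloaded JSON key file."""
    with open(credential_path, encoding="utf-8") as f:
        return json.load(f)["client_email"]


def folder_id_from_url(url: str) -> str:
    """Return the path segment that follows `folders/`, ignoring any query string."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if len(segments) < 2 or segments[-2] != "folders":
        raise ValueError(f"not a Google Drive folder URL: {url!r}")
    return segments[-1]
```

`folder_id_from_url` handles both URL shapes listed in Step 4, because `urlparse` separates the `?usp=drive_link` query string from the path before the last segment is taken.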
### Spec
The spec takes the following fields:
* `service_account_credential_path` (type: `str`, required): full path to the service account credential file in JSON format.
* `root_folder_ids` (type: `list[str]`, required): a list of Google Drive folder IDs to import files from.
* `binary` (type: `bool`, optional): whether to read files as binary (instead of text).
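
As an illustration of how these three fields fit together, the following sketch defines a stand-in dataclass with the same field names and a small sanity check. `GoogleDriveSpec` and `problems` are hypothetical names, not the library's API, and the `False` default for `binary` is an assumption (the spec only says the field is optional).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoogleDriveSpec:
    """Hypothetical stand-in mirroring the three documented fields."""

    service_account_credential_path: str
    root_folder_ids: list[str]
    binary: bool = False  # assumed default; the doc only says "optional"

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means none were found."""
        issues = []
        if not self.root_folder_ids:
            issues.append("root_folder_ids is empty")
        for folder_id in self.root_folder_ids:
            # A common slip is pasting the whole folder URL instead of the ID.
            if "/" in folder_id:
                issues.append(f"{folder_id!r} looks like a URL, not a folder ID")
        return issues
```

The check for `/` targets the most likely mistake: `root_folder_ids` expects the bare IDs copied during setup, not the folder URLs they came from.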
### Schema
The output is a table with the following subfields:

* `file_id` (key, type: `str`): the ID of the file in Google Drive.
* `filename` (type: `str`): the filename of the file, without the path, e.g. `"file1.md"`.
* `mime_type` (type: `str`): the MIME type of the file.
* `content` (type: `str` if `binary` is `False`, otherwise `bytes`): the content of the file.
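
Unlike `LocalFile`, this table is keyed by `file_id` rather than `filename`. A plausible reason is that `filename` omits the path, so files in different folders can share a name. The sketch below models a row and a keyed table to show the effect. `DriveFileRow`, `index_by_key`, and the IDs used are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Iterable, Union


@dataclass(frozen=True)
class DriveFileRow:
    """Hypothetical row type with the four documented subfields."""

    file_id: str  # key
    filename: str
    mime_type: str
    content: Union[str, bytes]


def index_by_key(rows: Iterable[DriveFileRow]) -> dict[str, DriveFileRow]:
    """Build a table keyed by `file_id`; a later row with the same key replaces an earlier one."""
    return {row.file_id: row for row in rows}


# Two files with the same name (made-up IDs) remain distinct rows.
table = index_by_key([
    DriveFileRow("id-aaa", "file1.md", "text/markdown", "# Notes"),
    DriveFileRow("id-bbb", "file1.md", "text/markdown", "# Other notes"),
])
```

Keying the same two rows by `filename` would collapse them into one entry, which is why downstream code should carry `file_id` as the identifier.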