Commit c56a9fa

Add documentation for the GoogleDrive source. (#171)
#108
1 parent 765a524 commit c56a9fa

1 file changed: +40 -0 lines changed

docs/docs/ops/sources.md

Lines changed: 40 additions & 0 deletions
@@ -9,6 +9,8 @@ description: CocoIndex Built-in Sources

The `LocalFile` source imports files from a local file system.

### Spec

The spec takes the following fields:
* `path` (type: `str`, required): full path of the root directory to import files from
* `binary` (type: `bool`, optional): whether to read files as binary (instead of text)
@@ -24,7 +26,45 @@ The spec takes the following fields:

:::

### Schema

The output is a table with the following sub fields:
* `filename` (key, type: `str`): the filename of the file, including the path, relative to the root directory, e.g. `"dir1/file1.md"`
* `content` (type: `str` if `binary` is `False`, otherwise `bytes`): the content of the file
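
To make the row shape concrete, here is a minimal standard-library sketch of rows that match this spec and schema. It is not CocoIndex's implementation: the function name `iter_local_files` is hypothetical, and both the UTF-8 decoding in text mode and the `binary=False` default are assumptions of the sketch.

```python
from pathlib import Path
from typing import Iterator, Union


def iter_local_files(
    path: str, binary: bool = False
) -> Iterator[dict[str, Union[str, bytes]]]:
    """Yield one row per file under `path`, shaped like the schema above.

    Hypothetical helper for illustration only; not part of CocoIndex.
    """
    root = Path(path)
    for file_path in sorted(root.rglob("*")):
        if not file_path.is_file():
            continue
        # `filename` is the key: the path relative to the root directory,
        # written with forward slashes, e.g. "dir1/file1.md".
        filename = file_path.relative_to(root).as_posix()
        # `content` is `bytes` when binary=True, otherwise `str`
        # (UTF-8 decoding is an assumption of this sketch).
        if binary:
            content: Union[str, bytes] = file_path.read_bytes()
        else:
            content = file_path.read_text(encoding="utf-8")
        yield {"filename": filename, "content": content}
```

Because `filename` includes the relative path, it is unique within one root directory, which is what lets it serve as the key.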

## GoogleDrive

The `GoogleDrive` source imports files from Google Drive.

### Setup for Google Drive

To access files in Google Drive, the `GoogleDrive` source authenticates with a service account.

1. Register for or log in to **Google Cloud**.
2. In [**Google Cloud Console**](https://console.cloud.google.com/), search for *Service Accounts* to open the *IAM & Admin / Service Accounts* page.
    - **Create a new service account**: Click *+ Create Service Account* and follow the instructions to finish creating the service account.
    - **Add a key and download the credential**: Under "Actions" for this new service account, click *Manage keys* → *Add key* → *Create new key* → *JSON*.
      Download the key file to a safe place.
3. In **Google Cloud Console**, search for *Google Drive API* and enable it.
4. In **Google Drive**, share the folders containing the files to import with the service account's email address.
    **Viewer permission** is sufficient.
    - The email address is listed on the *IAM & Admin / Service Accounts* page (from Step 2), in the format `{service-account-id}@{gcp-project-id}.iam.gserviceaccount.com`.
    - Copy the folder ID. It is the last part of the folder's URL, e.g. `https://drive.google.com/drive/u/0/folders/{folder-id}` or `https://drive.google.com/drive/folders/{folder-id}?usp=drive_link`.
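
The two lookups in Step 4 can also be scripted. The sketch below uses only the Python standard library; `extract_folder_id` and `service_account_email` are hypothetical helper names, not part of CocoIndex. It assumes the folder URL has one of the two shapes shown above and that the downloaded JSON key file contains a `client_email` field, as Google's service account key files do.

```python
import json
from urllib.parse import urlparse


def extract_folder_id(url: str) -> str:
    """Return the folder ID from a Google Drive folder URL.

    Hypothetical helper; handles URLs whose path contains
    `/folders/{folder-id}`, with or without a query string.
    """
    # urlparse().path excludes the query string (e.g. `?usp=drive_link`).
    segments = [s for s in urlparse(url).path.split("/") if s]
    if "folders" not in segments:
        raise ValueError(f"not a Google Drive folder URL: {url!r}")
    index = segments.index("folders")
    if index + 1 >= len(segments):
        raise ValueError(f"no folder ID found in URL: {url!r}")
    return segments[index + 1]


def service_account_email(credential_path: str) -> str:
    """Read the service account's email address from its JSON key file."""
    with open(credential_path, encoding="utf-8") as f:
        return json.load(f)["client_email"]
```

The email address returned by `service_account_email` is the one to share the folders with, and the IDs returned by `extract_folder_id` are the values to list in `root_folder_ids`.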

### Spec

The spec takes the following fields:

* `service_account_credential_path` (type: `str`, required): full path to the service account credential file in JSON format.
* `root_folder_ids` (type: `list[str]`, required): a list of Google Drive folder IDs to import files from.
* `binary` (type: `bool`, optional): whether to read files as binary (instead of text).
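
As a rough illustration of these fields and their types, the sketch below mirrors the spec in a plain dataclass. `GoogleDriveSpecSketch` is a hypothetical stand-in, not CocoIndex's own spec class; the `binary=False` default and the two validation checks are assumptions added to show a common mistake (passing a single ID string where a list is expected).

```python
from dataclasses import dataclass


@dataclass
class GoogleDriveSpecSketch:
    """Hypothetical mirror of the spec fields above, for illustration only."""

    service_account_credential_path: str  # required
    root_folder_ids: list[str]            # required
    binary: bool = False                  # optional; default is an assumption

    def __post_init__(self) -> None:
        # A bare string would otherwise be treated as a sequence of characters.
        if isinstance(self.root_folder_ids, str):
            raise TypeError(
                "root_folder_ids must be a list of folder IDs, not a single string"
            )
        if not self.root_folder_ids:
            raise ValueError("root_folder_ids must contain at least one folder ID")


# Example values are placeholders, not real paths or IDs.
spec = GoogleDriveSpecSketch(
    service_account_credential_path="/path/to/service_account.json",
    root_folder_ids=["folder-id-1", "folder-id-2"],
)
```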

### Schema

The output is a table with the following sub fields:

* `file_id` (key, type: `str`): the ID of the file in Google Drive.
* `filename` (type: `str`): the filename of the file, without the path, e.g. `"file1.md"`.
* `mime_type` (type: `str`): the MIME type of the file.
* `content` (type: `str` if `binary` is `False`, otherwise `bytes`): the content of the file.
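
Unlike `LocalFile`, the key here is `file_id` rather than `filename`. Since `filename` carries no path, files in different folders can share a name. The sketch below shows the row shape and a lookup keyed by `file_id`; `GoogleDriveRow` and `index_by_key` are hypothetical names, and the sample IDs are placeholders.

```python
from typing import TypedDict, Union


class GoogleDriveRow(TypedDict):
    """Hypothetical type mirroring the output schema above."""

    file_id: str                # key
    filename: str               # no path, e.g. "file1.md"
    mime_type: str
    content: Union[str, bytes]  # `bytes` when `binary` is True


def index_by_key(rows: list[GoogleDriveRow]) -> dict[str, GoogleDriveRow]:
    """Build a lookup table keyed by `file_id`, rejecting duplicate keys."""
    table: dict[str, GoogleDriveRow] = {}
    for row in rows:
        if row["file_id"] in table:
            raise ValueError(f"duplicate file_id: {row['file_id']!r}")
        table[row["file_id"]] = row
    return table


# Two files with the same name stay distinct because their IDs differ.
sample_rows: list[GoogleDriveRow] = [
    {"file_id": "id-1", "filename": "notes.md", "mime_type": "text/markdown", "content": "a"},
    {"file_id": "id-2", "filename": "notes.md", "mime_type": "text/markdown", "content": "b"},
]
table = index_by_key(sample_rows)
```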
