docs/docs/sources/googledrive.md
Lines changed: 7 additions & 0 deletions
Original file line number
Diff line number
Diff line change
@@ -29,9 +29,16 @@ The spec takes the following fields:
29
29
* `service_account_credential_path` (`str`): full path to the service account credential file in JSON format.
30
30
* `root_folder_ids` (`list[str]`): a list of Google Drive folder IDs to import files from.
31
31
* `binary` (`bool`, optional): whether to read files as binary (instead of text).
32
+
* `included_patterns` (`list[str]`, optional): a list of glob patterns for files to include, e.g. `["*.txt", "docs/**/*.md"]`.
33
+
If not specified, all files will be included.
34
+
* `excluded_patterns` (`list[str]`, optional): a list of glob patterns for files to exclude, e.g. `["tmp", "**/node_modules"]`.
35
+
Any file or directory matching these patterns will be excluded, even if it also matches `included_patterns`.
36
+
If not specified, no files will be excluded.
32
37
* `recent_changes_poll_interval` (`datetime.timedelta`, optional): when set, this source provides a change capture mechanism by periodically polling Google Drive for recently modified files.
33
38
34
39
:::info
40
+
41
+
`included_patterns` and `excluded_patterns` use Unix-style glob syntax. See [globset syntax](https://docs.rs/globset/latest/globset/index.html#syntax) for details.
35
42
36
43
Since it only retrieves metadata for recently modified files (up to the previous poll) during polling,
37
44
it's typically cheaper than a full refresh triggered by setting the [refresh interval](/docs/core/flow_def#refresh-interval), especially when the folder contains a large number of files.