Commit 6c5d352

Google Drive source connector: workflow run trigger (#678)
1 parent c3efd81 commit 6c5d352

File tree

2 files changed

+173
-0
lines changed

docs.json

Lines changed: 1 addition & 0 deletions
@@ -272,6 +272,7 @@
272272
{
273273
"group": "Tool demos",
274274
"pages": [
275+
"examplecode/tools/google-drive-events",
275276
"examplecode/tools/jq",
276277
"examplecode/tools/firecrawl",
277278
"examplecode/tools/langflow",
Lines changed: 172 additions & 0 deletions
@@ -0,0 +1,172 @@
1+
---
2+
title: Google Drive event triggers
3+
---
4+
5+
You can use Google Drive events, such as adding new files to—or updating existing files in—Google Drive shared folders or shared drives, to automatically run Unstructured ETL+ workflows
6+
that rely on those folders or drives as sources. This enables a no-touch approach to having Unstructured automatically process new and updated files in Google Drive as they are added or updated.
7+
8+
This example shows how to automate this process by adding a custom [Google Apps Script](https://developers.google.com/apps-script) project in your Google account. This project runs
9+
a script on a regular time interval. This script automatically checks for new or updated files within the specified Google Drive shared folder or shared drive. If the script
10+
detects at least one new or updated file, it then calls the [Unstructured Workflow Endpoint](/api-reference/workflow/overview) to automatically run the
11+
specified corresponding Unstructured ETL+ workflow in your Unstructured account.
12+
13+
<Note>
14+
This example uses a custom Google Apps Script that you create and maintain.
15+
Any issues with file detection, timing, or script execution could be related to your custom script,
16+
rather than to Unstructured. If you are getting unexpected or no results, be sure to check your custom
17+
script's execution logs first for any informational and error messages.
18+
</Note>
19+
20+
## Requirements
21+
22+
import GetStartedSimpleApiOnly from '/snippets/general-shared-text/get-started-simple-api-only.mdx'
23+
24+
To use this example, you will need the following:
25+
26+
- An Unstructured account, and an Unstructured API key for your account, as follows:
27+
28+
<GetStartedSimpleApiOnly />
29+
30+
- The Unstructured Workflow Endpoint URL for your account, as follows:
31+
32+
1. In the Unstructured UI, click **API Keys** on the sidebar.<br/>
33+
2. Note the value of the **Unstructured Workflow Endpoint** field.
34+
35+
- A Google Drive source connector in your Unstructured account. [Learn how](/ui/sources/google-drive).
36+
- An available [destination connector](/ui/destinations/overview) in your Unstructured account.
37+
- A workflow that uses the preceding source and destination connectors. [Learn how](/ui/workflows).
38+
39+
## Step 1: Create the Google Apps Script project
40+
41+
1. Sign in to your Google account.
42+
2. Go to [https://script.google.com/](https://script.google.com/).
43+
3. Click **+ New project**.
44+
4. Click the new project's default name (such as **Untitled project**), and change it to something more descriptive, such as **Unstructured ETL Scripts**.
45+
46+
## Step 2: Add the script
47+
48+
1. With the project still open, on the sidebar, click the **< >** (**Editor**) icon.
49+
2. In the **Files** tab, click **Code.gs**.
50+
3. Replace the contents of the `Code.gs` file with the following code:
51+
52+
```javascript
function checkForNewOrUpdatedFiles() {
  const folder = DriveApp.getFolderById(FOLDER_ID);
  const files = folder.getFiles();
  const now = new Date();
  const thresholdMillis = 5 * 60 * 1000; // 5 minutes (adjust as needed).

  while (files.hasNext()) {
    const file = files.next();
    const created = file.getDateCreated();
    const lastUpdated = file.getLastUpdated();
    const fileName = file.getName();

    // Calculate time differences.
    const millisSinceCreated = now - created;
    const createdWithinThreshold = millisSinceCreated < thresholdMillis;
    const millisSinceUpdated = now - lastUpdated;
    const updatedWithinThreshold = millisSinceUpdated < thresholdMillis;

    // Log file details and calculations.
    console.log('File Name: ' + fileName);
    console.log('Created: ' + created);
    console.log('Last updated: ' + lastUpdated);
    console.log('Milliseconds since created: ' + millisSinceCreated);
    console.log('Milliseconds since last updated: ' + millisSinceUpdated);
    console.log('Created within threshold of ' + thresholdMillis + ' milliseconds? ' + createdWithinThreshold);
    console.log('Updated within threshold of ' + thresholdMillis + ' milliseconds? ' + updatedWithinThreshold);
    console.log('-----');

    // If this file was created or updated within the threshold...
    if (createdWithinThreshold || updatedWithinThreshold) {
      // ...then make the HTTP POST request.
      UrlFetchApp.fetch(UNSTRUCTURED_API_URL, {
        method: 'post',
        headers: {
          'accept': 'application/json',
          'unstructured-api-key': UNSTRUCTURED_API_KEY
        }
      });
      console.log('At least one file created or updated within threshold of ' + thresholdMillis + ' milliseconds.');
      console.log('Unstructured workflow request sent to ' + UNSTRUCTURED_API_URL);
      // Stop the script after the first request (no need to check any more files).
      return;
    }
  }
  console.log('No files created or updated within threshold of ' + thresholdMillis + ' milliseconds. No Unstructured workflow request sent.');
}
```
100+
101+
4. Click the **Save project to Drive** button.
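The created-or-updated check in `checkForNewOrUpdatedFiles` can also be isolated from Drive and network calls, which makes it easier to reason about and test. The following is a minimal sketch, not part of the original script: the names `hasRecentChange` and `toPlainFiles`, and the plain `{ created, lastUpdated }` object shape, are hypothetical and used only for illustration.

```javascript
// Pure decision helper: no Drive or network calls.
// `files` is an array of { created: Date, lastUpdated: Date } objects
// (a hypothetical shape assumed for this sketch).
function hasRecentChange(files, now, thresholdMillis) {
  return files.some(function (file) {
    const millisSinceCreated = now - file.created;
    const millisSinceUpdated = now - file.lastUpdated;
    return millisSinceCreated < thresholdMillis ||
           millisSinceUpdated < thresholdMillis;
  });
}

// Adapter: drains anything with hasNext()/next(), such as the iterator
// returned by folder.getFiles(), into the plain shape used above.
function toPlainFiles(fileIterator) {
  const result = [];
  while (fileIterator.hasNext()) {
    const file = fileIterator.next();
    result.push({
      created: file.getDateCreated(),
      lastUpdated: file.getLastUpdated()
    });
  }
  return result;
}
```

Unlike the original loop, this version reads every file before deciding, so it gives up the early exit in exchange for a function that can be exercised with plain objects.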
102+
103+
## Step 3: Customize the script for your workflow
104+
105+
1. With the project still open, on the **Files** tab, click the **Add a file** button, and then click **Script**.
106+
2. Name the new file `Constants`. The `.gs` extension is added automatically.
107+
3. Replace the contents of the `Constants.gs` file with the following code:
108+
109+
```javascript
const FOLDER_ID = '<folder-id>';
const UNSTRUCTURED_API_URL = '<unstructured-api-url>' + '/workflows/<workflow-id>/run';
const UNSTRUCTURED_API_KEY = '<unstructured-api-key>';
```
114+
115+
Replace the following placeholders:
116+
117+
- Replace `<folder-id>` with the ID of your Google Drive shared folder or shared drive. This is the same ID that you specified
118+
when you created your Google Drive source connector in your Unstructured account.
119+
- Replace `<unstructured-api-url>` with your Unstructured Workflow Endpoint URL value, as noted in the requirements.
120+
- Replace `<workflow-id>` with the ID of your Unstructured workflow.
121+
- Replace `<unstructured-api-key>` with your Unstructured API key value.
122+
123+
4. Click the disk (**Save project to Drive**) icon.
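A placeholder that was never replaced, or a doubled slash between the endpoint and the path, can produce a request URL that fails in a way that is hard to spot. As a sketch only, a small helper could assemble and sanity-check the run URL. The function name `buildRunUrl` is hypothetical, and the endpoint in the usage comment is a made-up example value, not a real Unstructured URL.

```javascript
// Builds the workflow run URL from its two parts.
// Throws if either part is empty or still contains a <placeholder>.
function buildRunUrl(workflowEndpoint, workflowId) {
  [workflowEndpoint, workflowId].forEach(function (value) {
    if (!value || /[<>]/.test(value)) {
      throw new Error('Placeholder not replaced: ' + value);
    }
  });
  // Drop any trailing slashes so the result never contains '//workflows'.
  const base = workflowEndpoint.replace(/\/+$/, '');
  return base + '/workflows/' + encodeURIComponent(workflowId) + '/run';
}

// Example (hypothetical values):
// const UNSTRUCTURED_API_URL = buildRunUrl('https://example.invalid/api/v1/', 'my-workflow-id');
```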
124+
125+
## Step 4: Create the script trigger
126+
127+
1. With the project still open, on the sidebar, click the alarm clock (**Triggers**) icon.
128+
2. Click the **+ Add Trigger** button.
129+
3. Set the following values:
130+
131+
- For **Choose which function to run**, select `checkForNewOrUpdatedFiles`.
132+
- For **Choose which deployment should run**, select **Head**.
133+
- For **Select event source**, select **Time-driven**.
134+
- For **Select type of time based trigger**, select **Minutes timer**.
135+
- For **Select minute interval**, select **Every 5 minutes**.
136+
137+
<Note>
138+
If you change **Minutes timer** or **Every 5 minutes** to a different interval, you should also go back and change the number `5` in the following
139+
line of code in the `checkForNewOrUpdatedFiles` function. Change the number `5` to the number of minutes that corresponds to the alternate interval you
140+
selected:
141+
142+
```javascript
const thresholdMillis = 5 * 60 * 1000;
```
145+
</Note>
146+
147+
- For **Failure notification settings**, select an interval such as immediately, hourly, or daily.
148+
149+
4. Click **Save**.
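The same time-driven trigger can also be created from code with the Apps Script `ScriptApp` service, which keeps the trigger interval and the script's threshold next to each other. The following is a minimal sketch under a few assumptions: the helper names `thresholdMillisFor` and `createPollingTrigger` are hypothetical, and `ScriptApp` is passed in as a parameter only so the function can be exercised outside Apps Script.

```javascript
// Minute intervals accepted by everyMinutes() in Apps Script.
const ALLOWED_MINUTE_INTERVALS = [1, 5, 10, 15, 30];

// Converts the trigger interval into the threshold used by the script.
function thresholdMillisFor(intervalMinutes) {
  return intervalMinutes * 60 * 1000;
}

// Creates a time-driven trigger for checkForNewOrUpdatedFiles.
function createPollingTrigger(scriptApp, intervalMinutes) {
  if (ALLOWED_MINUTE_INTERVALS.indexOf(intervalMinutes) === -1) {
    throw new Error('Unsupported minute interval: ' + intervalMinutes);
  }
  return scriptApp
    .newTrigger('checkForNewOrUpdatedFiles')
    .timeBased()
    .everyMinutes(intervalMinutes)
    .create();
}

// In Apps Script, run once from the editor:
// createPollingTrigger(ScriptApp, 5);
```

Running the helper more than once creates more than one trigger, so check the **Triggers** list afterward.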
150+
151+
## Step 5: View trigger results
152+
153+
1. With the project still open, on the sidebar, click the three lines (**Executions**) icon.
154+
2. As soon as the first script execution completes, you should see a corresponding message appear in the **Executions** list. If the **Status** column shows
155+
**Completed**, continue with this procedure.
156+
157+
If the **Status** column shows **Failed**, expand the message to
158+
get any details about the failure. Fix the failure, and then wait for the next script execution to complete.
159+
160+
3. When the **Status** column shows **Completed**, go to your Unstructured account and click **Jobs** on the sidebar to see if a new job
161+
is running for that workflow.
162+
163+
If no new job is running for that workflow, then add at least one new file to&mdash;or update at least one existing file in&mdash;the Google Drive shared folder or shared drive,
164+
no more than 5 minutes before the next script execution. After the next script execution, check the **Jobs** list again.
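When an execution shows **Failed** because of the HTTP request, it can help to have the response status code and body in the error message. The following is a sketch, assuming you are willing to wrap the request in a helper. The names `runWorkflow` and `fetchFn` are hypothetical. `muteHttpExceptions`, `getResponseCode()`, and `getContentText()` are part of the Apps Script `UrlFetchApp` service.

```javascript
// Sends the workflow run request and throws a descriptive error on a
// non-2xx response, so the execution still shows as Failed.
function runWorkflow(fetchFn, url, apiKey) {
  const response = fetchFn(url, {
    method: 'post',
    headers: {
      'accept': 'application/json',
      'unstructured-api-key': apiKey
    },
    // Return non-2xx responses instead of throwing, so the body can be read.
    muteHttpExceptions: true
  });
  const code = response.getResponseCode();
  if (code < 200 || code >= 300) {
    throw new Error('Unstructured workflow request failed with HTTP ' +
                    code + ': ' + response.getContentText());
  }
  return code;
}

// In Apps Script:
// runWorkflow(function (u, p) { return UrlFetchApp.fetch(u, p); },
//             UNSTRUCTURED_API_URL, UNSTRUCTURED_API_KEY);
```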
165+
166+
## Step 6 (Optional): Delete the trigger
167+
168+
1. To stop the script from automatically executing on a regular basis, with the project still open, on the sidebar, click the alarm clock (**Triggers**) icon.
169+
2. Rest your mouse pointer on the trigger you created in Step 4.
170+
3. Click the ellipsis (three dots) icon, and then click **Delete trigger**.
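The trigger can also be removed from code. The following is a minimal sketch using the `ScriptApp` methods `getProjectTriggers()` and `deleteTrigger()`, along with the trigger's `getHandlerFunction()`. The helper name `deleteTriggersFor` is hypothetical, and `ScriptApp` is passed in as a parameter only so the function can be exercised outside Apps Script.

```javascript
// Deletes every project trigger that runs the given function and
// returns how many were deleted.
function deleteTriggersFor(scriptApp, handlerName) {
  let deleted = 0;
  scriptApp.getProjectTriggers().forEach(function (trigger) {
    if (trigger.getHandlerFunction() === handlerName) {
      scriptApp.deleteTrigger(trigger);
      deleted += 1;
    }
  });
  return deleted;
}

// In Apps Script, run once from the editor:
// deleteTriggersFor(ScriptApp, 'checkForNewOrUpdatedFiles');
```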
171+
172+
