`src/content/docs/workers-ai/guides/tutorials/using-bigquery-with-workers-ai.mdx`
import { WranglerConfig } from "~/components";
The easiest way to get started with [Workers AI](/workers-ai/) is to try it out in the [Multi-modal Playground](https://multi-modal.ai.cloudflare.com/) and the [LLM playground](https://playground.ai.cloudflare.com/). If you then want to integrate Workers AI with your code, you can use its [REST API endpoints](/workers-ai/get-started/rest-api/) or a [Worker binding](/workers-ai/configuration/bindings/).
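For instance, a minimal REST API call might look like the following sketch; the account ID and API token are placeholders, and the model is an arbitrary choice for illustration:

```sh
curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/meta/llama-3.1-8b-instruct \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  -d '{ "prompt": "What is BigQuery?" }'
```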
But what about the data? What if you want these models to ingest data that is stored outside Cloudflare?
In this tutorial, you will learn how to bring data from Google BigQuery to a Cloudflare Worker so that it can be used as input for Workers AI models.
## 1. Set up your Cloudflare Worker
To ingest the data into Cloudflare and feed it into Workers AI, you will be using a [Cloudflare Worker](/workers/). If you have not created one yet, please review our [tutorial on how to get started](/workers/get-started/).
After following the steps to create a Worker, you should have the following code in your new Worker project:
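As a sketch, the default template generates a handler along these lines; the exact scaffold may differ slightly between template versions:

```javascript
export default {
	async fetch(request, env, ctx) {
		return new Response("Hello World!");
	},
};
```

Run `npx wrangler dev` and open the local URL it prints to test it.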
You should see `Hello World!` in your browser:

```
Hello World!
```
If you run into any issues during this step, please review the Workers [Get Started Guide](/workers/get-started/guide/).
## 2. Import GCP Service key into the Worker as Secrets
Your downloaded key JSON file from Google Cloud Platform should have the following format.
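As a sketch of that format (the values below are placeholders, and a real key file includes additional fields such as `token_uri` and `client_id`):

```json
{
	"type": "service_account",
	"project_id": "<project_id>",
	"private_key_id": "<private_key_id>",
	"private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
	"client_email": "<service_account_name>@<project_id>.iam.gserviceaccount.com"
}
```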
For this tutorial, you will only need the values of the following fields: `client_email`, `private_key`, `private_key_id`, and `project_id`.
Instead of storing this information in plain text in the Worker, you will use [Secrets](/workers/configuration/secrets/) to make sure its unencrypted content is only accessible via the Worker itself.
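For example, to store the `client_email` value under a secret named `BQ_CLIENT_EMAIL` (the `BQ_`-prefixed names are a convention used in this tutorial, not a requirement), run:

```sh
npx wrangler secret put BQ_CLIENT_EMAIL
```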
You will be asked to enter a secret value, which will be the value of the corresponding field in the JSON key file.
:::note
Do not include any double quotes in the secret that you store, as it will already be interpreted as a string.
:::
Repeat this process for the remaining fields:

```sh
npx wrangler secret put BQ_PRIVATE_KEY
npx wrangler secret put BQ_PRIVATE_KEY_ID
npx wrangler secret put BQ_PROJECT_ID
```
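To confirm that the secrets were stored, you can list them:

```sh
npx wrangler secret list
```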
At this point, you have successfully imported these fields from the JSON key file downloaded from Google Cloud Platform into Cloudflare Secrets to be used in a Worker.
[Secrets](/workers/configuration/secrets/) are only made available to Workers once they are deployed. To make them available during development, [create a `.dev.vars`](/workers/configuration/secrets/#local-development-with-secrets) file to locally store these credentials and reference them as environment variables.
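A `.dev.vars` file for this tutorial might look like the following sketch; the values are placeholders, and the variable names match the secrets created above:

```
BQ_CLIENT_EMAIL="<service_account_name>@<project_id>.iam.gserviceaccount.com"
BQ_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"
BQ_PRIVATE_KEY_ID="<private_key_id>"
BQ_PROJECT_ID="<project_id>"
```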
Make sure to include `.dev.vars` in your project's `.gitignore` file to prevent your credentials from being uploaded to a repository when using version control.
Check that the secrets are loaded correctly in `src/index.js` by logging their values to the console, as follows:
```javascript
export default {
	async fetch(request, env, ctx) {
		// Temporarily log the secrets to confirm they are loaded.
		// Remove these lines once you have verified them.
		console.log("BQ_CLIENT_EMAIL: ", env.BQ_CLIENT_EMAIL);
		console.log("BQ_PRIVATE_KEY: ", env.BQ_PRIVATE_KEY);
		console.log("BQ_PRIVATE_KEY_ID: ", env.BQ_PRIVATE_KEY_ID);
		console.log("BQ_PROJECT_ID: ", env.BQ_PROJECT_ID);
		return new Response("Hello World!");
	},
};
```
For this tutorial, you will be using the [jose](https://www.npmjs.com/package/jose) library. To install it, run:
```sh
npm i jose
```
To verify that the installation succeeded, run `npm list`, which lists all installed packages, and check that the `jose` dependency has been added:
```sh
<project_name>@0.0.0
└── jose@<version>
```
Now that you have installed the `jose` library, it is time to import it and add a function to your code that generates a signed JSON Web Token (JWT):
```javascript
import * as jose from 'jose';

// Sketch of a JWT-generation function following Google's service account
// authorization flow: the issuer and subject are the service account's
// client email, and the audience is the BigQuery API.
const generateBQJWT = async (env) => {
	const algorithm = 'RS256';
	// If the key was stored with literal "\n" sequences, restore real newlines.
	const pkcs8 = env.BQ_PRIVATE_KEY.replace(/\\n/g, '\n');
	const privateKey = await jose.importPKCS8(pkcs8, algorithm);
	const now = Math.floor(Date.now() / 1000);
	return new jose.SignJWT({
		iss: env.BQ_CLIENT_EMAIL,
		sub: env.BQ_CLIENT_EMAIL,
		aud: 'https://bigquery.googleapis.com/',
		iat: now,
		exp: now + 3600, // Expire the token after one hour.
	})
		.setProtectedHeader({ alg: algorithm, kid: env.BQ_PRIVATE_KEY_ID, typ: 'JWT' })
		.sign(privateKey);
};
```
Now that you have created a JWT, it is time to do an API call to BigQuery to fetch some data.
With the JWT created in the previous step, issue a request to the BigQuery API to retrieve data from a table.
You will now query the table that you created in BigQuery earlier in this tutorial. This example uses a sampled version of the [Hacker News Corpus](https://www.kaggle.com/datasets/hacker-news/hacker-news-corpus) that was used under its MIT licence and uploaded to BigQuery.
```javascript
const queryBQ = async (bqJWT, path) => {
	const bqEndpoint = `https://bigquery.googleapis.com${path}`;
	// Update the table reference in the FROM clause to match your own dataset.
	const query = 'SELECT text FROM `<project_id>.<dataset_name>.<table_name>` LIMIT 3';
	const response = await fetch(bqEndpoint, {
		method: 'POST',
		body: JSON.stringify({
			query,
			useLegacySql: false, // The backtick-quoted table name requires standard SQL.
		}),
		headers: {
			Authorization: `Bearer ${bqJWT}`,
		},
	});
	return await response.json();
};

export default {
	async fetch(request, env, ctx) {
		// Sign a JWT, then use it to run the query against the BigQuery REST API.
		const bqJWT = await generateBQJWT(env);
		const queryResult = await queryBQ(
			bqJWT,
			`/bigquery/v2/projects/${env.BQ_PROJECT_ID}/queries`,
		);
		return new Response(JSON.stringify(queryResult));
	},
};
```
Now that you have the raw row data from BigQuery, you can format it in a JSON-like style next.
## 6. Format results from the query
Now that you have retrieved the data, the BigQuery API response should look something like this:
```json
{
	"kind": "bigquery#queryResponse",
	"schema": {
		"fields": [
			{ "name": "<field_name_1>", "type": "STRING", "mode": "NULLABLE" },
			{ "name": "<field_name_2>", "type": "STRING", "mode": "NULLABLE" }
		]
	},
	"jobReference": {
		"projectId": "<project_id>",
		"jobId": "<job_id>",
		"location": "<location>"
	},
	"totalRows": "3",
	"rows": [
		{ "f": [{ "v": "<value>" }, { "v": "<value>" }] },
		{ "f": [{ "v": "<value>" }, { "v": "<value>" }] },
		{ "f": [{ "v": "<value>" }, { "v": "<value>" }] }
	],
	"jobComplete": true
}
```
This format may be difficult to read and work with when iterating through results, so you will now implement a function that maps the schema onto each individual value. The resulting output will be easier to read, as shown below: each row corresponds to an object within an array.
```javascript
[
	{
		"<field_name_1>": "<value>",
		"<field_name_2>": "<value>",
	},
	// ...one object per row
];
```

Create a `formatRows` function that takes the rows and fields returned from the BigQuery response and maps each row into an object keyed by field name:
```javascript
const formatRows = (rowsWithoutFieldNames, fields) => {
	// Index to fieldName
	const fieldsByIndex = new Map();

	// Load all fields by name and have their index in the array result as their key
	fields.forEach((field, index) => {
		fieldsByIndex.set(index, field.name);
	});

	// Map each row into an object whose keys are the field names.
	const rowsWithFieldNames = rowsWithoutFieldNames.map((row) => {
		// Each row arrives in the form { f: [{ v: ... }, { v: ... }] }.
		let newRow = {};
		row.f.forEach((cell, index) => {
			newRow = { ...newRow, [fieldsByIndex.get(index)]: cell.v };
		});
		return newRow;
	});

	return rowsWithFieldNames;
};
```
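You can now call `formatRows(queryResult.rows, queryResult.schema.fields)` on the response from Step 5 and feed the formatted rows into a Workers AI model. The sketch below assumes an [AI binding](/workers-ai/configuration/bindings/) named `AI` in your Wrangler configuration; the model and prompts are illustrative rather than this tutorial's exact choices:

```javascript
const generateTags = async (env, text) => {
	// Run an LLM over the row text via the Workers AI binding.
	return env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
		messages: [
			{ role: "system", content: "Respond only with three comma-separated tags." },
			{ role: "user", content: `Generate tags for this text: ${text}` },
		],
	});
};
```

A similar call with a different prompt can perform the sentiment analysis on each row.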
Once you access `http://localhost:8787`, you should see an output similar to the following:

```json
{
	...
}
```
The actual values and fields will depend mostly on the query made in Step 5, which is then fed into the LLM.
## Final result
All the code shown in the different steps is combined into the following code in `src/index.js`:
```javascript
import * as jose from "jose";

// ...the generateBQJWT, queryBQ, and formatRows functions from the
// previous steps, together with a fetch handler that queries BigQuery,
// formats the rows, runs the Workers AI models, and returns the combined
// results as JSON...
```

Once everything works as expected, deploy the Worker by running `npx wrangler deploy`. This will create a public endpoint that you can use to access the Worker globally.
In this tutorial, you have learnt how to integrate Google BigQuery and Cloudflare Workers by creating a GCP service account key and storing part of it as Worker secrets. You then read those secrets in your code and used the `jose` npm library to create a JSON Web Token to authenticate the API query to BigQuery.
Once you obtained the results, you formatted them and passed them to generative AI models via Workers AI to generate tags and perform sentiment analysis on the extracted data.
## Next steps
If, instead of displaying the results of ingesting the data into the AI model in a browser, your workflow requires fetching and storing data (for example, in [R2](/r2/) or [D1](/d1/)) at regular intervals, you may want to consider adding a [scheduled handler](/workers/runtime-apis/handlers/scheduled/) to this Worker. This enables you to trigger the Worker on a predefined cadence via a [Cron Trigger](/workers/configuration/cron-triggers/). Consider reviewing the Reference Architecture Diagrams on [Ingesting BigQuery Data into Workers AI](/reference-architecture/diagrams/ai/bigquery-workers-ai/).
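As a rough sketch of that pattern (the cron cadence and storage step are illustrative):

```javascript
export default {
	async fetch(request, env, ctx) {
		// ...the handler built in this tutorial...
		return new Response("OK");
	},
	// Runs on the cadence of the Cron Trigger defined in your Wrangler
	// configuration, for example "0 * * * *" for once an hour.
	async scheduled(event, env, ctx) {
		// Fetch from BigQuery, run the Workers AI models, and persist the
		// results to R2 or D1 here instead of returning them to a browser.
	},
};
```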
One use case for ingesting data from other sources, as you did in this tutorial, is to create a RAG system. If this sounds relevant to you, please check out the [Build a Retrieval Augmented Generation (RAG) AI tutorial](/workers-ai/guides/tutorials/build-a-retrieval-augmented-generation-ai/).
To learn more about the other AI models you can use at Cloudflare, please visit the [Workers AI](/workers-ai) section of our docs.