From ec2b54ed14fb209ab1aa8588736a395dcfc07db4 Mon Sep 17 00:00:00 2001
From: Alejandro Krumkamp
Date: Mon, 21 Oct 2024 21:55:16 +0100
Subject: [PATCH 01/36] PCX-13971 - Adding tutorial Using BigQuery with Workers AI

---
 .../using-bigquery-with-workers-ai.mdx | 669 ++++++++++++++++++
 1 file changed, 669 insertions(+)
 create mode 100644 src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx

diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx
new file mode 100644
index 00000000000000..5d6c46b94be0fa
--- /dev/null
+++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx
@@ -0,0 +1,669 @@
---
updated: 2024-10-21
difficulty: Beginner
content_type: 📝 Tutorial
pcx_content_type: tutorial
title: Using BigQuery with Workers AI
products:
  - Workers
  - Workers AI
tags:
  - AI
languages:
  - JavaScript
sidebar:
  order: 2
---
## TL;DR

You can skip this tutorial and check its [resulting code](#final-result).

## Introduction

The easiest way to get started with [Workers AI](/workers-ai/) is to try it out in the [Multi-modal Playground](https://multi-modal.ai.cloudflare.com/) and the [LLM playground](https://playground.ai.cloudflare.com/). If you decide that you want to integrate your code with Workers AI, you may then decide to use its [REST API endpoints](/workers-ai/get-started/rest-api/) or a [Worker binding](/workers-ai/configuration/bindings/).

But what about the data? What if the data you want these models to ingest is stored outside of Cloudflare?

In this tutorial, you will learn how to bring data from Google BigQuery to a Cloudflare Worker so that it can be used as input for Workers AI models.

## Prerequisites

You will be needing:
- A [Cloudflare Worker](/workers/) project running a [Hello World script](/workers/get-started/guide/).
- A Google Cloud Platform [service account](https://cloud.google.com/iam/docs/service-accounts-create#iam-service-accounts-create-console) with an [associated key](https://cloud.google.com/iam/docs/keys-create-delete#iam-service-account-keys-create-console) file downloaded that has read access to BigQuery.
- Access to a BigQuery table with some test data that allows you to create a [BigQuery Job Query](https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query). For this tutorial, it is recommended that you create your own table, as [sampled tables](https://cloud.google.com/bigquery/public-data#sample_tables), unless cloned to your own GCP namespace, won't allow you to run job queries against them. For this example, the [Hacker News Corpus](https://www.kaggle.com/datasets/hacker-news/hacker-news-corpus) was used under its MIT licence.

## Step 1 - Setting up your Cloudflare Worker

In order to ingest the data into Cloudflare and feed it into Workers AI, you will be using a [Cloudflare Worker](/workers/). If you haven’t created one yet, please feel free to review our [tutorial on how to get started](/workers/get-started/).
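If you have not created the Worker project yet, one way to scaffold it is with Cloudflare's `create-cloudflare` (C3) CLI; the exact prompts and option names can differ between versions:

```sh
npm create cloudflare@latest
```

Selecting the Hello World starter when prompted gives you the same starting point that is used in the rest of this tutorial.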
+ +After following the steps to create a Worker, you should have the following code in your new Worker project: + +```javascript +export default { + async fetch(request, env, ctx) { + return new Response('Hello World!'); + }, +}; +``` + +If the Worker project has successfully been created, you should also be able to run `npx wrangler dev` in a console to run the Worker locally: + +```sh +[wrangler:inf] Ready on http://localhost:8787 +``` + +Open a browser tab at `http://localhost:8787/` to see your deployed Worker. Please note that the port `8787` may be a different one in your case. + +You should be seeing ‘Hello World!’ in your browser: + +```sh +Hello World! +``` + +If you are running into any issues during this step, please make sure to review [Worker's Get Started Guide](/workers/get-started/guide/). + +## Step 2 - Import GCP Service key into the Worker as Secrets + +Now that you have verified that the Worker has been created successfully, you will need to reference the Google Cloud Platform service key created in the [Prerequisites](#prerequisites) section of this tutorial. + +Your downloaded key JSON file from Google Cloud Platform should have the following format: + +```json +{ + "type": "service_account", + "project_id": "", + "private_key_id": "", + "private_key": "", + "client_email": "@.iam.gserviceaccount.com", + "client_id": "", + "auth_uri": "https://accounts.google.com/o/oauth2/auth", + "token_uri": "https://oauth2.googleapis.com/token", + "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs", + "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/%40.iam.gserviceaccount.com", + "universe_domain": "googleapis.com" +} +``` + +For this tutorial, you will only be needing the values of the following fields: `client_email`, `private_key`, `private_key_id` and `project_id`. + +Instead of storing this information in plain text in the Worker, you will use [Secrets](/workers/configuration/secrets/) to make sure its unencrypted content is only accessible via the Worker itself. + +Import those three values from the JSON file into Secrets, starting with the field from the JSON key file called `client_email`, which we will now call `BQ_CLIENT_EMAIL` (you can use another variable name): + +```sh +npx wrangler secret put BQ_CLIENT_EMAIL +``` + +You will be asked to enter a secret value, which will be the value of the field `client_email` in the JSON key file. + +:::note + +Don’t include any double quotes in the Secret that you store, as it will be already interpreted as a string. + +::: + +If the Secret was uploaded successfully, the following message will be displayed: + +```sh +✨ Success! Uploaded secret BQ_CLIENT_EMAIL +``` + +Now import the Secrets for the three remaining fields; `private_key`, `private_key_id` and `project_id` as `BQ_PRIVATE_KEY`, `BQ_PRIVATE_KEY_ID` and `BQ_PROJECT_ID` respectively: + +```sh +npx wrangler secret put BQ_PRIVATE_KEY +``` + +```sh +npx wrangler secret put BQ_PRIVATE_KEY_ID +``` + +```sh +npx wrangler secret put BQ_PROJECT_ID +``` + + +At this point, you have successfully imported three fields from the JSON key file downloaded from Google Cloud Platform into Cloudflare Secrets to be used in a Worker. + +[Secrets](/workers/configuration/secrets/) are only made available to Workers once they are deployed. 
To make them available during development, [create a `.dev.vars`](/workers/configuration/secrets/#local-development-with-secrets) file to locally store these credentials and reference them as environment variables.

Your `.dev.vars` file should look like the following:
```
BQ_CLIENT_EMAIL="@.iam.gserviceaccount.com"
BQ_PRIVATE_KEY="-----BEGIN PRIVATE KEY----------END PRIVATE KEY-----\n"
BQ_PRIVATE_KEY_ID=""
BQ_PROJECT_ID=""
```

Make sure to add `.dev.vars` to the `.gitignore` file in your project to prevent your credentials from being uploaded to a repository if you are using a version control system.

Check that secrets are loaded correctly in `src/index.js` by logging their values into a console output:

```javascript
export default {
  async fetch(request, env, ctx) {
    console.log("BQ_CLIENT_EMAIL: ", env.BQ_CLIENT_EMAIL);
    console.log("BQ_PRIVATE_KEY: ", env.BQ_PRIVATE_KEY);
    console.log("BQ_PRIVATE_KEY_ID: ", env.BQ_PRIVATE_KEY_ID);
    console.log("BQ_PROJECT_ID: ", env.BQ_PROJECT_ID);
    return new Response('Hello World!');
  },
};
```

Restart the Worker and run `npx wrangler dev`. You should see that the server now mentions the newly added variables:

```
Using vars defined in .dev.vars
Your worker has access to the following bindings:
- Vars:
  - BQ_CLIENT_EMAIL: "(hidden)"
  - BQ_PRIVATE_KEY: "(hidden)"
  - BQ_PRIVATE_KEY_ID: "(hidden)"
  - BQ_PROJECT_ID: "(hidden)"
[wrangler:inf] Ready on http://localhost:8787
```

If you open `http://localhost:8787` in your browser, you should see the values of the variables show up in your console where the `npx wrangler dev` command is running, while still seeing only the `Hello World!` text in the browser window.

You now have access to the GCP credentials from a Worker. Next, you will install a library to help with the creation of the JSON Web Token needed to interact with GCP’s API.

## Step 3 - Install library to handle JWT operations

To interact with BigQuery’s REST API, you will need to generate a [JSON Web Token](https://jwt.io/introduction) to authenticate your requests using the credentials that you’ve loaded into Worker Secrets in the previous step.

For this tutorial, you will be using the [jose](https://www.npmjs.com/package/jose?activeTab=readme) library for JWT-related operations. Install it by running the following command in a console:

```sh
npm i jose
```

To verify that the installation succeeded, you can run `npm list`, which list list all the installed packages and see if the `jose` dependency has been added:

```sh
@0.0.0
//
├── @cloudflare/vitest-pool-workers@0.4.29
├── jose@5.9.2
├── vitest@1.5.0
└── wrangler@3.75.0
```

## Step 4 - Generate JSON Web Token

Now that you have installed the `jose` library, it’s time to import it and add a function to your code that generates a signed JWT:

```javascript
import * as jose from 'jose';
...
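// generateBQJWT (defined next) builds a JWT from the service account credentials stored as Secrets
// and signs it with RS256; Google accepts this self-signed token as a Bearer credential for the BigQuery REST API.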
+const generateBQJWT = async (aCryptoKey, env) => { +const algorithm = "RS256"; +const audience = "https://bigquery.googleapis.com/"; +const expiryAt = (new Date().valueOf() / 1000); + const privateKey = await jose.importPKCS8(env.BQ_PRIVATE_KEY, algorithm); + + // Generate signed JSON Web Token (JWT) + return new jose.SignJWT() + .setProtectedHeader({ + typ: 'JWT', + alg: algorithm, + kid: env.BQ_PRIVATE_KEY_ID + }) + .setIssuer(env.BQ_CLIENT_EMAIL) + .setSubject(env.BQ_CLIENT_EMAIL) + .setAudience(audience) + .setExpirationTime(expiryAt) + .setIssuedAt() + .sign(privateKey) +} + +export default { + async fetch(request, env, ctx) { + ... +// Create JWT to authenticate the BigQuery API call + let bqJWT; + try { + bqJWT = await generateBQJWT(env); + } catch (e) { + return new Response('An error has ocurred while generating the JWT', { status: 500 }) + } + }, + ... +}; + +``` + +Now that you have created a JWT, it’s time to do an API call to BigQuery to fetch some data. + +## Step 5 - Making authenticated requests to Google BigQuery + +With the JWT token created in the previous step, issue an API request to BigQuery’s API to retrieve data from a table. + +You will now query the table that you already have created in BigQuery as part of the prerequisites of this tutorial. This example uses a sampled version of the [Hacker News Corpus](https://www.kaggle.com/datasets/hacker-news/hacker-news-corpus) that was used under its MIT licence and uploaded to BigQuery. + +```javascript +const queryBQ = async (bqJWT, path) => { + const bqEndpoint = `https://bigquery.googleapis.com${path}` + // In this example, text is a field in the BigQuery table that is being queried (hn.news_sampled) + const query = 'SELECT text FROM hn.news_sampled LIMIT 3'; + const response = await fetch(bqEndpoint, { + method: "POST", + body: JSON.stringify({ + "query": query + }), + headers: { + Authorization: `Bearer ${bqJWT}` + } + }) + return response.json() +} +... +export default { + async fetch(request, env, ctx) { + ... + let ticketInfo; + try { + ticketInfo = await queryBQ(bqJWT); + } catch (e) { + return new Response('An error has occurred while querying BQ', { status: 500 }); + } + ... + }, +}; +``` + +Having the raw row data from BigQuery means that you can now format it in a JSON-like style up next. + +## Step 6 - Formatting results from the query + +Now that you have retrieved the data from BigQuery, it’s time to note that a BigQuery API response looks something like this: + +```json +{ + ... + "schema": { + "fields": [ + { + "name": "title", + "type": "STRING", + "mode": "NULLABLE" + }, + { + "name": "text", + "type": "STRING", + "mode": "NULLABLE" + } + ] + }, + ... + "rows": [ + { + "f": [ + { + "v": "" + }, + { + "v": "" + } + ] + }, + { + "f": [ + { + "v": "" + }, + { + "v": "" + } + ] + }, + { + "f": [ + { + "v": "" + }, + { + "v": "" + } + ] + } + ], + ... +} +``` + +This format may be a bit harder to read and to work with iterating through results later on in this tutorial. You will now implement a function that maps the schema into each individual value, so the results look as the following instead, each row corresponding to an object within an array: + +```javascript +[ + { + title: "", + text: "" + }, + { + title: "", + text: "" + }, + { + title: "", + text: "" + } +] +``` + +Create a `formatRows` function that takes a number of rows and fields returned from the BigQuery response body and returns an array of results as objects with named fields. 
+ +```javascript +const formatRows = (rowsWithoutFieldNames, fields) => { + // Depending on the position of each value, it is known what field you should assign to it. + const fieldsByIndex = new Map(); + + // Load all fields name and have their index in the array result as their key + fields.forEach((field, index) => { + fieldsByIndex.set(index, field.name) + }) + + // Iterate through rows + const rowsWithFieldNames = rowsWithoutFieldNames.map(row => { + // Per each row represented by an array f, iterate through the unnamed values and find their field names by searching them in the fieldsByIndex. + let newRow = {} + row.f.forEach((field, index) => { + const fieldName = fieldsByIndex.get(index); + if (fieldName) { + // For every field in a row, add them to newRow + newRow = ({ ...newRow, [fieldName]: field.v }); + } + }) + return newRow + }) + + return rowsWithFieldNames +} + +export default { + async fetch(request, env, ctx) { + ... + // Transform output format into array of objects with named fields + let formattedResults; + + if ('rows' in ticketInfo) { + formattedResults = formatRows(ticketInfo.rows, ticketInfo.schema.fields); + console.log(formattedResults) + } else if ('error' in ticketInfo) { + return new Response(ticketInfo.error.message, { status: 500 }) + } + ... + }, +}; +``` + +## Step 7 - Feeding data into Workers AI + +Now that you have converted the response from the BigQuery API into an array of results, generate some tags and attach an associated sentiment score using an LLM via [Workers AI](/workers-ai/): + +```javascript +const generateTags = (data, env) => { + return env.AI.run("@cf/meta/llama-3.1-8b-instruct", { + prompt: `Create three one-word tags for the following text. return only these three tags separated by a comma. don't return text that is not a category.Lowercase only. ${JSON.stringify(data)}`, + }); +} + +const generateSentimentScore = (data, env) => { + return env.AI.run("@cf/meta/llama-3.1-8b-instruct", { + prompt: `return a float number between 0 and 1 measuring the sentiment of the following text. 0 being negative and 1 positive. return only the number, no text. ${JSON.stringify(data)}`, + }); +} + +// Iterates through values, sends them to an AI handler and encapsulates all responses into a single Promise +const getAIGeneratedContent = (data, env, aiHandler) => { + let results = data?.map(dataPoint => { + return aiHandler(dataPoint, env) + }) + return Promise.all(results) +} +... +export default { + async fetch(request, env, ctx) { + ... 
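    // Ask the model for tags and a sentiment score for every row returned by BigQuery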
    let summaries, sentimentScores;
    try {
      summaries = await getAIGeneratedContent(formattedResults, env, generateTags);
      sentimentScores = await getAIGeneratedContent(formattedResults, env, generateSentimentScore)
    } catch {
      return new Response('There was an error while generating the text summaries or sentiment scores')
    }

    // Attach the generated tags and sentiment score to each result
    formattedResults = formattedResults?.map((formattedResult, i) => {
      if (sentimentScores[i].response && summaries[i].response) {
        return {
          ...formattedResult,
          'sentiment': parseFloat(sentimentScores[i].response).toFixed(2),
          'tags': summaries[i].response.split(',').map((result) => result.trim())
        }
      }
    })
    ...
  },
};
```

Uncomment the following lines from the `wrangler.toml` file in your project:

```toml
[ai]
binding = "AI"
```

Restart the Worker that is running locally, and then go to your application endpoint:

```sh
curl http://localhost:8787
```

It is likely that you will be asked to log in to your Cloudflare account and to grant temporary access to Wrangler (the Cloudflare CLI) to use your account when using Workers AI.

Once you access `http://localhost:8787`, you should see output similar to the following:

```sh
{
  "data": [
    {
      "text": "You can see a clear spike in submissions right around US Thanksgiving.",
      "sentiment": "0.61",
      "tags": [
        "trends",
        "submissions",
        "thanksgiving"
      ]
    },
    {
      "text": "I didn't test the changes before I published them. I basically did development on the running server. In fact for about 30 seconds the comments page was broken due to a bug.",
      "sentiment": "0.35",
      "tags": [
        "software",
        "deployment",
        "error"
      ]
    },
    {
      "text": "I second that. As I recall, it's a very enjoyable 700-page brain dump by someone who's really into his subject. The writing has a personal voice; there are lots of asides, dry wit, and typos that suggest restrained editing. The discussion is intelligent and often theoretical (and Bartle is not scared to use mathematical metaphors), but the tone is not academic.",
      "sentiment": "0.86",
      "tags": [
        "review",
        "game",
        "design"
      ]
    }
  ]
}
```

The actual values and fields will mostly depend on the query made in Step 5, whose results are then fed into the LLM models.
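Because the model replies with free-form text, the sentiment prompt can occasionally come back with extra characters around the number. A small guard like the following keeps malformed replies from turning into `NaN`; `parseSentiment` is a hypothetical helper, not part of the code above:

```javascript
// Sketch of a defensive parser for the sentiment reply (assumes the reply should contain a number between 0 and 1)
const parseSentiment = (response) => {
  const match = String(response).match(/\d+(\.\d+)?/); // grab the first number in the reply
  if (!match) return null; // the model returned something unexpected
  const score = Math.min(Math.max(parseFloat(match[0]), 0), 1); // clamp to the 0-1 range
  return score.toFixed(2);
};
```

You could then use `parseSentiment(sentimentScores[i].response)` in place of the `parseFloat(...).toFixed(2)` call when building `formattedResults`.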
+ +## Final result + +All the code shown in the different steps are combined into the following code in `src/index.js`: + +```javascript +import * as jose from 'jose' + +const generateBQJWT = async (env) => { + const algorithm = "RS256" + const audience = "https://bigquery.googleapis.com/" + const expiryAt = (new Date().valueOf() / 1000) + const privateKey = await jose.importPKCS8(env.BQ_PRIVATE_KEY, algorithm) + + // Generate signed JSON Web Token (JWT) + return new jose.SignJWT() + .setProtectedHeader({ + typ: 'JWT', + alg: algorithm, + kid: env.BQ_PRIVATE_KEY_ID + }) + .setIssuer(env.BQ_CLIENT_EMAIL) + .setSubject(env.BQ_CLIENT_EMAIL) + .setAudience(audience) + .setExpirationTime(expiryAt) + .setIssuedAt() + .sign(privateKey) +} + +const queryBQ = async (bgJWT, path) => { + const bqEndpoint = `https://bigquery.googleapis.com${path}` + const query = 'SELECT text FROM hn.news_sampled LIMIT 3'; + const response = await fetch(bqEndpoint, { + method: "POST", + body: JSON.stringify({ + "query": query + }), + headers: { + Authorization: `Bearer ${bgJWT}` + } + }) + return response.json() +} + +const formatRows = (rowsWithoutFieldNames, fields) => { + // Index to fieldName + const fieldsByIndex = new Map() + + fields.forEach((field, index) => { + fieldsByIndex.set(index, field.name) + }) + + const rowsWithFieldNames = rowsWithoutFieldNames.map(row => { + // Map rows into an array of objects with field names + let newRow = {} + row.f.forEach((field, index) => { + const fieldName = fieldsByIndex.get(index) + if (fieldName) { + newRow = ({ ...newRow, [fieldName]: field.v }) + } + }) + return newRow + }) + + return rowsWithFieldNames +} + +const generateTags = (data, env) => { + return env.AI.run("@cf/meta/llama-3.1-8b-instruct", { + prompt: `Create three one-word tags for the following text. return only these three tags separated by a comma. don't return text that is not a category.Lowercase only. ${JSON.stringify(data)}`, + }) +} + +const generateSentimentScore = (data, env) => { + return env.AI.run("@cf/meta/llama-3.1-8b-instruct", { + prompt: `return a float number between 0 and 1 measuring the sentiment of the following text. 0 being negative and 1 positive. return only the number, no text. 
${JSON.stringify(data)}`, + }) +} + +const getAIGeneratedContent = (data, env, aiHandler) => { + let results = data?.map(dataPoint => { + return aiHandler(dataPoint, env) + }) + return Promise.all(results) +} + +export default { + async fetch(request, env, ctx) { + + // Create JWT to authenticate the BigQuery API call + let bqJWT + try { + bqJWT = await generateBQJWT(env); + } catch (error) { + console.log(error) + return new Response('An error has ocurred while generating the JWT', { status: 500 }) + } + + // Fetch results from BigQuery + let ticketInfo + try { + ticketInfo = await queryBQ(bqJWT, `/bigquery/v2/projects/${env.BQ_PROJECT_ID}/queries`) + } catch (error) { + console.log(error) + return new Response('An error has occurred while querying BQ', { status: 500 }) + } + + // Transform output format into array of objects with named fields + let formattedResults + if ('rows' in ticketInfo) { + formattedResults = formatRows(ticketInfo.rows, ticketInfo.schema.fields) + } else if ('error' in ticketInfo) { + return new Response(ticketInfo.error.message, { status: 500 }) + } + + // Generate AI summaries and sentiment scores + let summaries, sentimentScores + try { + summaries = await getAIGeneratedContent(formattedResults, env, generateTags) + sentimentScores = await getAIGeneratedContent(formattedResults, env, generateSentimentScore) + } catch { + return new Response('There was an error while generating the text summaries or sentiment scores') + } + + // Add AI summaries and sentiment scores to previous results + formattedResults = formattedResults?.map((formattedResult, i) => { + if (sentimentScores[i].response && summaries[i].response) { + return { + ...formattedResult, + 'sentiment': parseFloat(sentimentScores[i].response).toFixed(2), + 'tags': summaries[i].response.split(',').map((result) => result.trim()) + } + } + }) + + const response = {data: formattedResults} + + return new Response(JSON.stringify(response), { headers: { "Content-Type": "application/json" } }) + }, +}; +``` + +If you wish to deploy this Worker, you can do so by running `npx wrangler deploy`: + +```sh +Total Upload: KiB / gzip: KiB +Uploaded (x sec) +Deployed triggers (x sec) + https:// +Current Version ID: +``` + +This will create a public endpoint that you can use to access the Worker globally. Please keep this in mind when using production data to make sure to include additional access controls in place. + +## Conclusion + +In this tutorial, you’ve learnt how to integrate Google BigQuery and Cloudflare Workers by creating a GCP service account key and storing part of it as Worker Secrets. This was later imported in the code and by using the `jose` npm library, you created a JSON Web Token to authenticate the API query to BigQuery. + +Once you obtained the results, some formatting was applied to them to later be passed to generative AI models via Workers AI to generate tags and to perform sentiment analysis on the extracted data. + +## Next Steps + +If instead of displaying the results of ingesting the data to the AI model in a browser, your workflow requires fetching and store data (for example in [R2](/r2/) or [D1](/d1/)) on regular intervals, you may want to consider adding a [scheduled handler](/workers/runtime-apis/handlers/scheduled/) for this Worker. It allows triggering the Worker with a predefined cadence via a [Cron Trigger](/workers/configuration/cron-triggers/). + +A use case to ingest data from other sources, like you did in this tutorial, is to create a RAG system. 
If this sounds relevant to you, please check out the tutorial [Build a Retrieval Augmented Generation (RAG) AI](/workers-ai/tutorials/build-a-retrieval-augmented-generation-ai/). + +To learn more about what other AI models you can use at Cloudflare, please consider visiting the [Workers AI](/workers-ai) section of our docs. From 57f391257a2736233c515fc394a62dd5ef116dea Mon Sep 17 00:00:00 2001 From: Alejandro Krumkamp Date: Mon, 21 Oct 2024 22:00:26 +0100 Subject: [PATCH 02/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: hyperlint-ai[bot] <154288675+hyperlint-ai[bot]@users.noreply.github.com> --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 5d6c46b94be0fa..c187905ef2412b 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -55,7 +55,7 @@ If the Worker project has successfully been created, you should also be able to Open a browser tab at `http://localhost:8787/` to see your deployed Worker. Please note that the port `8787` may be a different one in your case. -You should be seeing ‘Hello World!’ in your browser: +You should be seeing 'Hello World!' in your browser: ```sh Hello World! From 8b99f53657c541d48d6af146d6ef8cb5eb9ab143 Mon Sep 17 00:00:00 2001 From: Alejandro Krumkamp Date: Mon, 21 Oct 2024 22:00:36 +0100 Subject: [PATCH 03/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: hyperlint-ai[bot] <154288675+hyperlint-ai[bot]@users.noreply.github.com> --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index c187905ef2412b..e2ba052f44e3df 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -99,7 +99,7 @@ You will be asked to enter a secret value, which will be the value of the field :::note -Don’t include any double quotes in the Secret that you store, as it will be already interpreted as a string. +Don't include any double quotes in the Secret that you store, as it will be already interpreted as a string. 
::: From d6d25fc8e97e7f95a2a96c7e4ba15c734b2a6b27 Mon Sep 17 00:00:00 2001 From: Alejandro Krumkamp Date: Mon, 21 Oct 2024 22:00:54 +0100 Subject: [PATCH 04/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: hyperlint-ai[bot] <154288675+hyperlint-ai[bot]@users.noreply.github.com> --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index e2ba052f44e3df..037885879a0add 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -167,7 +167,7 @@ Your worker has access to the following bindings: If you open `http://localhost:8787` in your browser, you should see the values of the variables show up in your console where the `npx wrangler dev` command is running, while still seeing only the `Hello World!` text in the browser window. -You now have access to the GCP credentials from a Worker. Next, you will install a library to help with the creation of the JSON Web Token needed to interact with GCP’s API. +You now have access to the GCP credentials from a Worker. Next, you will install a library to help with the creation of the JSON Web Token needed to interact with GCP's API. ## Step 3 - Install library to handle JWT operations From a95737cc54be9a5b1e436a656fba5fe84dfff4c7 Mon Sep 17 00:00:00 2001 From: Alejandro Krumkamp Date: Mon, 21 Oct 2024 22:01:08 +0100 Subject: [PATCH 05/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: hyperlint-ai[bot] <154288675+hyperlint-ai[bot]@users.noreply.github.com> --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 037885879a0add..692418586d5ca0 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -171,7 +171,7 @@ You now have access to the GCP credentials from a Worker. Next, you will install ## Step 3 - Install library to handle JWT operations -To interact with BigQuery’s REST API, you will need to generate a [JSON Web Token](https://jwt.io/introduction) to authenticate your requests using the credentials that you’ve loaded into Worker Secrets in the previous step. +To interact with BigQuery's REST API, you will need to generate a [JSON Web Token](https://jwt.io/introduction) to authenticate your requests using the credentials that you've loaded into Worker Secrets in the previous step. For this tutorial, you will be using the [jose](https://www.npmjs.com/package/jose?activeTab=readme) library for JWT-related operations. 
Install it by running the following command in a console: From 19a03632010a293e5f596a58ff62814200cd1085 Mon Sep 17 00:00:00 2001 From: Alejandro Krumkamp Date: Mon, 21 Oct 2024 22:07:41 +0100 Subject: [PATCH 06/36] removing non-standard quotes from using-bigquery-with-workers-ai tutorial --- .../tutorials/using-bigquery-with-workers-ai.mdx | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 692418586d5ca0..671c412fb5fdac 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -35,7 +35,7 @@ You will be needing: ## Step 1 - Setting up your Cloudflare Worker -In order to ingest the data into Cloudflare and feed it into Workers AI, you will be using a [Cloudflare Worker](/workers/). If you haven’t created one yet, please feel free to review our [tutorial on how to get started](/workers/get-started/). +In order to ingest the data into Cloudflare and feed it into Workers AI, you will be using a [Cloudflare Worker](/workers/). If you haven't created one yet, please feel free to review our [tutorial on how to get started](/workers/get-started/). After following the steps to create a Worker, you should have the following code in your new Worker project: @@ -55,7 +55,7 @@ If the Worker project has successfully been created, you should also be able to Open a browser tab at `http://localhost:8787/` to see your deployed Worker. Please note that the port `8787` may be a different one in your case. -You should be seeing 'Hello World!' in your browser: +You should be seeing `Hello World!` in your browser: ```sh Hello World! @@ -192,7 +192,7 @@ To verify that the installation succeeded, you can run `npm list`, which list li ## Step 4 - Generate JSON Web Token -Now that you have installed the `jose` library, it’s time to import it and add a function to your code that generates a signed JWT: +Now that you have installed the `jose` library, it's time to import it and add a function to your code that generates a signed JWT: ```javascript import * as jose from 'jose'; @@ -234,11 +234,11 @@ export default { ``` -Now that you have created a JWT, it’s time to do an API call to BigQuery to fetch some data. +Now that you have created a JWT, it's time to do an API call to BigQuery to fetch some data. ## Step 5 - Making authenticated requests to Google BigQuery -With the JWT token created in the previous step, issue an API request to BigQuery’s API to retrieve data from a table. +With the JWT token created in the previous step, issue an API request to BigQuery's API to retrieve data from a table. You will now query the table that you already have created in BigQuery as part of the prerequisites of this tutorial. This example uses a sampled version of the [Hacker News Corpus](https://www.kaggle.com/datasets/hacker-news/hacker-news-corpus) that was used under its MIT licence and uploaded to BigQuery. 
@@ -277,7 +277,7 @@ Having the raw row data from BigQuery means that you can now format it in a JSON ## Step 6 - Formatting results from the query -Now that you have retrieved the data from BigQuery, it’s time to note that a BigQuery API response looks something like this: +Now that you have retrieved the data from BigQuery, it's time to note that a BigQuery API response looks something like this: ```json { @@ -656,7 +656,7 @@ This will create a public endpoint that you can use to access the Worker globall ## Conclusion -In this tutorial, you’ve learnt how to integrate Google BigQuery and Cloudflare Workers by creating a GCP service account key and storing part of it as Worker Secrets. This was later imported in the code and by using the `jose` npm library, you created a JSON Web Token to authenticate the API query to BigQuery. +In this tutorial, you've learnt how to integrate Google BigQuery and Cloudflare Workers by creating a GCP service account key and storing part of it as Worker Secrets. This was later imported in the code and by using the `jose` npm library, you created a JSON Web Token to authenticate the API query to BigQuery. Once you obtained the results, some formatting was applied to them to later be passed to generative AI models via Workers AI to generate tags and to perform sentiment analysis on the extracted data. From 26a216497b0b76736d5035d3a3507634a5a32c63 Mon Sep 17 00:00:00 2001 From: Alejandro Krumkamp Date: Mon, 21 Oct 2024 22:11:06 +0100 Subject: [PATCH 07/36] removing duplicated word from tutorial using-bigquery-with-workers-ai --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 671c412fb5fdac..b500773c4248a5 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -179,7 +179,7 @@ For this tutorial, you will be using the [jose](https://www.npmjs.com/package/jo npm i jose ``` -To verify that the installation succeeded, you can run `npm list`, which list list all the installed packages and see if the `jose` dependency has been added: +To verify that the installation succeeded, you can run `npm list`, which lists all the installed packages and see if the `jose` dependency has been added: ```sh @0.0.0 From a5b55e2b6a99ef3aa6db4b60de82cd06a93cd312 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 16:27:00 +0100 Subject: [PATCH 08/36] Changed steps details --- .../using-bigquery-with-workers-ai.mdx | 334 ++++++++++-------- 1 file changed, 177 insertions(+), 157 deletions(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index b500773c4248a5..9dc2463e45a10b 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -14,6 +14,7 @@ languages: sidebar: order: 2 --- + ## TL;DR You can skip this tutorial and check its [resulting code](#final-result). @@ -28,12 +29,13 @@ In this tutorial, you will learn how to bring data from Google BigQuery to a Clo ## Prerequisites -You will be needing: -- A [Cloudflare Worker](/workers/) project running a [Hello World script](/workers/get-started/guide/). 
+You will be needing: + +- A [Cloudflare Worker](/workers/) project running a [Hello World script](/workers/get-started/guide/). - A Google Cloud Platform [service account](https://cloud.google.com/iam/docs/service-accounts-create#iam-service-accounts-create-console) with an [associated key](https://cloud.google.com/iam/docs/keys-create-delete#iam-service-account-keys-create-console) file downloaded that has read access to BigQuery. - Access to a BigQuery table with some test data that allows you to create a [BigQuery Job Query](https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query). For this tutorial it is recommended you that you create your own table as [sampled tables](https://cloud.google.com/bigquery/public-data#sample_tables), unless cloned to your own GCP namespace, won't allow you to run job queries against them. For this example, the [Hacker News Corpus](https://www.kaggle.com/datasets/hacker-news/hacker-news-corpus) was used under its MIT licence. -## Step 1 - Setting up your Cloudflare Worker +## 1. Setting up your Cloudflare Worker In order to ingest the data into Cloudflare and feed it into Workers AI, you will be using a [Cloudflare Worker](/workers/). If you haven't created one yet, please feel free to review our [tutorial on how to get started](/workers/get-started/). @@ -42,7 +44,7 @@ After following the steps to create a Worker, you should have the following code ```javascript export default { async fetch(request, env, ctx) { - return new Response('Hello World!'); + return new Response("Hello World!"); }, }; ``` @@ -63,7 +65,7 @@ Hello World! If you are running into any issues during this step, please make sure to review [Worker's Get Started Guide](/workers/get-started/guide/). -## Step 2 - Import GCP Service key into the Worker as Secrets +## 2. Import GCP Service key into the Worker as Secrets Now that you have verified that the Worker has been created successfully, you will need to reference the Google Cloud Platform service key created in the [Prerequisites](#prerequisites) section of this tutorial. @@ -71,21 +73,21 @@ Your downloaded key JSON file from Google Cloud Platform should have the followi ```json { - "type": "service_account", - "project_id": "", - "private_key_id": "", - "private_key": "", - "client_email": "@.iam.gserviceaccount.com", - "client_id": "", - "auth_uri": "https://accounts.google.com/o/oauth2/auth", - "token_uri": "https://oauth2.googleapis.com/token", - "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs", - "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/%40.iam.gserviceaccount.com", - "universe_domain": "googleapis.com" + "type": "service_account", + "project_id": "", + "private_key_id": "", + "private_key": "", + "client_email": "@.iam.gserviceaccount.com", + "client_id": "", + "auth_uri": "https://accounts.google.com/o/oauth2/auth", + "token_uri": "https://oauth2.googleapis.com/token", + "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs", + "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/%40.iam.gserviceaccount.com", + "universe_domain": "googleapis.com" } ``` -For this tutorial, you will only be needing the values of the following fields: `client_email`, `private_key`, `private_key_id` and `project_id`. +For this tutorial, you will only be needing the values of the following fields: `client_email`, `private_key`, `private_key_id` and `project_id`. 
Instead of storing this information in plain text in the Worker, you will use [Secrets](/workers/configuration/secrets/) to make sure its unencrypted content is only accessible via the Worker itself. @@ -95,7 +97,7 @@ Import those three values from the JSON file into Secrets, starting with the fie npx wrangler secret put BQ_CLIENT_EMAIL ``` -You will be asked to enter a secret value, which will be the value of the field `client_email` in the JSON key file. +You will be asked to enter a secret value, which will be the value of the field `client_email` in the JSON key file. :::note @@ -123,12 +125,12 @@ npx wrangler secret put BQ_PRIVATE_KEY_ID npx wrangler secret put BQ_PROJECT_ID ``` - At this point, you have successfully imported three fields from the JSON key file downloaded from Google Cloud Platform into Cloudflare Secrets to be used in a Worker. -[Secrets](/workers/configuration/secrets/) are only made available to Workers once they are deployed. To make them available during development, [create a `.dev.vars`](/workers/configuration/secrets/#local-development-with-secrets) file to locally store these credentials and reference them as environment variables. +[Secrets](/workers/configuration/secrets/) are only made available to Workers once they are deployed. To make them available during development, [create a `.dev.vars`](/workers/configuration/secrets/#local-development-with-secrets) file to locally store these credentials and reference them as environment variables. Your `dev.vars` file should look like the following: + ``` BQ_CLIENT_EMAIL="@.iam.gserviceaccount.com" BQ_CLIENT_KEY="-----BEGIN PRIVATE KEY----------END PRIVATE KEY-----\n" @@ -143,11 +145,11 @@ Check that secrets are loaded correctly in `src/index.js` by logging their value ```javascript export default { async fetch(request, env, ctx) { - console.log("BQ_CLIENT_EMAIL: ", env.BQ_CLIENT_EMAIL); - console.log("BQ_PRIVATE_KEY: ", env.BQ_PRIVATE_KEY); - console.log("BQ_PRIVATE_KEY_ID: ", env.BQ_PRIVATE_KEY_ID); - console.log("BQ_PROJECT_ID: ", env.BQ_PROJECT_ID); - return new Response('Hello World!'); + console.log("BQ_CLIENT_EMAIL: ", env.BQ_CLIENT_EMAIL); + console.log("BQ_PRIVATE_KEY: ", env.BQ_PRIVATE_KEY); + console.log("BQ_PRIVATE_KEY_ID: ", env.BQ_PRIVATE_KEY_ID); + console.log("BQ_PROJECT_ID: ", env.BQ_PROJECT_ID); + return new Response("Hello World!"); }, }; ``` @@ -169,7 +171,7 @@ If you open `http://localhost:8787` in your browser, you should see the values o You now have access to the GCP credentials from a Worker. Next, you will install a library to help with the creation of the JSON Web Token needed to interact with GCP's API. -## Step 3 - Install library to handle JWT operations +## 3. Install library to handle JWT operations To interact with BigQuery's REST API, you will need to generate a [JSON Web Token](https://jwt.io/introduction) to authenticate your requests using the credentials that you've loaded into Worker Secrets in the previous step. 
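For reference, the token you will generate in the next step has roughly the following header and claim set; the values are placeholders and the timestamps are illustrative Unix times (Google expects `exp` to be within an hour of `iat`):

```json
{
  "header": { "alg": "RS256", "typ": "JWT", "kid": "<private_key_id>" },
  "claims": {
    "iss": "<client_email>",
    "sub": "<client_email>",
    "aud": "https://bigquery.googleapis.com/",
    "iat": 1729500000,
    "exp": 1729503600
  }
}
```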
@@ -179,10 +181,10 @@ For this tutorial, you will be using the [jose](https://www.npmjs.com/package/jo npm i jose ``` -To verify that the installation succeeded, you can run `npm list`, which lists all the installed packages and see if the `jose` dependency has been added: +To verify that the installation succeeded, you can run `npm list`, which lists all the installed packages and see if the `jose` dependency has been added: ```sh -@0.0.0 +@0.0.0 // ├── @cloudflare/vitest-pool-workers@0.4.29 ├── jose@5.9.2 @@ -190,7 +192,7 @@ To verify that the installation succeeded, you can run `npm list`, which lists a └── wrangler@3.75.0 ``` -## Step 4 - Generate JSON Web Token +## 4. Generate JSON Web Token Now that you have installed the `jose` library, it's time to import it and add a function to your code that generates a signed JWT: @@ -236,9 +238,9 @@ export default { Now that you have created a JWT, it's time to do an API call to BigQuery to fetch some data. -## Step 5 - Making authenticated requests to Google BigQuery +## 5. Making authenticated requests to Google BigQuery -With the JWT token created in the previous step, issue an API request to BigQuery's API to retrieve data from a table. +With the JWT token created in the previous step, issue an API request to BigQuery's API to retrieve data from a table. You will now query the table that you already have created in BigQuery as part of the prerequisites of this tutorial. This example uses a sampled version of the [Hacker News Corpus](https://www.kaggle.com/datasets/hacker-news/hacker-news-corpus) that was used under its MIT licence and uploaded to BigQuery. @@ -275,7 +277,7 @@ export default { Having the raw row data from BigQuery means that you can now format it in a JSON-like style up next. -## Step 6 - Formatting results from the query +## 6. Formatting results from the query Now that you have retrieved the data from BigQuery, it's time to note that a BigQuery API response looks something like this: @@ -337,26 +339,26 @@ This format may be a bit harder to read and to work with iterating through resul ```javascript [ - { - title: "", - text: "" - }, - { - title: "", - text: "" - }, - { - title: "", - text: "" - } -] + { + title: "", + text: "", + }, + { + title: "", + text: "", + }, + { + title: "", + text: "", + }, +]; ``` Create a `formatRows` function that takes a number of rows and fields returned from the BigQuery response body and returns an array of results as objects with named fields. ```javascript const formatRows = (rowsWithoutFieldNames, fields) => { - // Depending on the position of each value, it is known what field you should assign to it. + // Depending on the position of each value, it is known what field you should assign to it. const fieldsByIndex = new Map(); // Load all fields name and have their index in the array result as their key @@ -398,7 +400,7 @@ export default { }; ``` -## Step 7 - Feeding data into Workers AI +## 7. Feeding data into Workers AI Now that you have converted the response from the BigQuery API into an array of results, generate some tags and attach an associated sentiment score using an LLM via [Workers AI](/workers-ai/): @@ -424,7 +426,7 @@ const getAIGeneratedContent = (data, env, aiHandler) => { } ... export default { - async fetch(request, env, ctx) { + async fetch(request, env, ctx) { ... 
let summaries, sentimentScores; try { @@ -506,138 +508,156 @@ The actual values and fields will mostly depend on the query made in Step 5 that All the code shown in the different steps are combined into the following code in `src/index.js`: ```javascript -import * as jose from 'jose' +import * as jose from "jose"; const generateBQJWT = async (env) => { - const algorithm = "RS256" - const audience = "https://bigquery.googleapis.com/" - const expiryAt = (new Date().valueOf() / 1000) - const privateKey = await jose.importPKCS8(env.BQ_PRIVATE_KEY, algorithm) + const algorithm = "RS256"; + const audience = "https://bigquery.googleapis.com/"; + const expiryAt = new Date().valueOf() / 1000; + const privateKey = await jose.importPKCS8(env.BQ_PRIVATE_KEY, algorithm); // Generate signed JSON Web Token (JWT) return new jose.SignJWT() - .setProtectedHeader({ - typ: 'JWT', - alg: algorithm, - kid: env.BQ_PRIVATE_KEY_ID - }) - .setIssuer(env.BQ_CLIENT_EMAIL) - .setSubject(env.BQ_CLIENT_EMAIL) - .setAudience(audience) - .setExpirationTime(expiryAt) - .setIssuedAt() - .sign(privateKey) -} + .setProtectedHeader({ + typ: "JWT", + alg: algorithm, + kid: env.BQ_PRIVATE_KEY_ID, + }) + .setIssuer(env.BQ_CLIENT_EMAIL) + .setSubject(env.BQ_CLIENT_EMAIL) + .setAudience(audience) + .setExpirationTime(expiryAt) + .setIssuedAt() + .sign(privateKey); +}; const queryBQ = async (bgJWT, path) => { - const bqEndpoint = `https://bigquery.googleapis.com${path}` - const query = 'SELECT text FROM hn.news_sampled LIMIT 3'; + const bqEndpoint = `https://bigquery.googleapis.com${path}`; + const query = "SELECT text FROM hn.news_sampled LIMIT 3"; const response = await fetch(bqEndpoint, { - method: "POST", - body: JSON.stringify({ - "query": query - }), - headers: { - Authorization: `Bearer ${bgJWT}` - } - }) - return response.json() -} + method: "POST", + body: JSON.stringify({ + query: query, + }), + headers: { + Authorization: `Bearer ${bgJWT}`, + }, + }); + return response.json(); +}; const formatRows = (rowsWithoutFieldNames, fields) => { // Index to fieldName - const fieldsByIndex = new Map() + const fieldsByIndex = new Map(); fields.forEach((field, index) => { - fieldsByIndex.set(index, field.name) - }) + fieldsByIndex.set(index, field.name); + }); - const rowsWithFieldNames = rowsWithoutFieldNames.map(row => { - // Map rows into an array of objects with field names - let newRow = {} - row.f.forEach((field, index) => { - const fieldName = fieldsByIndex.get(index) - if (fieldName) { - newRow = ({ ...newRow, [fieldName]: field.v }) - } - }) - return newRow - }) + const rowsWithFieldNames = rowsWithoutFieldNames.map((row) => { + // Map rows into an array of objects with field names + let newRow = {}; + row.f.forEach((field, index) => { + const fieldName = fieldsByIndex.get(index); + if (fieldName) { + newRow = { ...newRow, [fieldName]: field.v }; + } + }); + return newRow; + }); - return rowsWithFieldNames -} + return rowsWithFieldNames; +}; const generateTags = (data, env) => { return env.AI.run("@cf/meta/llama-3.1-8b-instruct", { - prompt: `Create three one-word tags for the following text. return only these three tags separated by a comma. don't return text that is not a category.Lowercase only. ${JSON.stringify(data)}`, - }) -} + prompt: `Create three one-word tags for the following text. return only these three tags separated by a comma. don't return text that is not a category.Lowercase only. 
${JSON.stringify(data)}`, + }); +}; const generateSentimentScore = (data, env) => { return env.AI.run("@cf/meta/llama-3.1-8b-instruct", { - prompt: `return a float number between 0 and 1 measuring the sentiment of the following text. 0 being negative and 1 positive. return only the number, no text. ${JSON.stringify(data)}`, - }) -} + prompt: `return a float number between 0 and 1 measuring the sentiment of the following text. 0 being negative and 1 positive. return only the number, no text. ${JSON.stringify(data)}`, + }); +}; const getAIGeneratedContent = (data, env, aiHandler) => { - let results = data?.map(dataPoint => { - return aiHandler(dataPoint, env) - }) - return Promise.all(results) -} + let results = data?.map((dataPoint) => { + return aiHandler(dataPoint, env); + }); + return Promise.all(results); +}; export default { async fetch(request, env, ctx) { - - // Create JWT to authenticate the BigQuery API call - let bqJWT - try { - bqJWT = await generateBQJWT(env); - } catch (error) { - console.log(error) - return new Response('An error has ocurred while generating the JWT', { status: 500 }) - } - - // Fetch results from BigQuery - let ticketInfo - try { - ticketInfo = await queryBQ(bqJWT, `/bigquery/v2/projects/${env.BQ_PROJECT_ID}/queries`) - } catch (error) { - console.log(error) - return new Response('An error has occurred while querying BQ', { status: 500 }) - } - - // Transform output format into array of objects with named fields - let formattedResults - if ('rows' in ticketInfo) { - formattedResults = formatRows(ticketInfo.rows, ticketInfo.schema.fields) - } else if ('error' in ticketInfo) { - return new Response(ticketInfo.error.message, { status: 500 }) - } - - // Generate AI summaries and sentiment scores - let summaries, sentimentScores - try { - summaries = await getAIGeneratedContent(formattedResults, env, generateTags) - sentimentScores = await getAIGeneratedContent(formattedResults, env, generateSentimentScore) - } catch { - return new Response('There was an error while generating the text summaries or sentiment scores') - } - - // Add AI summaries and sentiment scores to previous results - formattedResults = formattedResults?.map((formattedResult, i) => { - if (sentimentScores[i].response && summaries[i].response) { - return { - ...formattedResult, - 'sentiment': parseFloat(sentimentScores[i].response).toFixed(2), - 'tags': summaries[i].response.split(',').map((result) => result.trim()) - } - } - }) - - const response = {data: formattedResults} - - return new Response(JSON.stringify(response), { headers: { "Content-Type": "application/json" } }) + // Create JWT to authenticate the BigQuery API call + let bqJWT; + try { + bqJWT = await generateBQJWT(env); + } catch (error) { + console.log(error); + return new Response("An error has ocurred while generating the JWT", { + status: 500, + }); + } + + // Fetch results from BigQuery + let ticketInfo; + try { + ticketInfo = await queryBQ( + bqJWT, + `/bigquery/v2/projects/${env.BQ_PROJECT_ID}/queries`, + ); + } catch (error) { + console.log(error); + return new Response("An error has occurred while querying BQ", { + status: 500, + }); + } + + // Transform output format into array of objects with named fields + let formattedResults; + if ("rows" in ticketInfo) { + formattedResults = formatRows(ticketInfo.rows, ticketInfo.schema.fields); + } else if ("error" in ticketInfo) { + return new Response(ticketInfo.error.message, { status: 500 }); + } + + // Generate AI summaries and sentiment scores + let summaries, sentimentScores; 
+ try { + summaries = await getAIGeneratedContent( + formattedResults, + env, + generateTags, + ); + sentimentScores = await getAIGeneratedContent( + formattedResults, + env, + generateSentimentScore, + ); + } catch { + return new Response( + "There was an error while generating the text summaries or sentiment scores", + ); + } + + // Add AI summaries and sentiment scores to previous results + formattedResults = formattedResults?.map((formattedResult, i) => { + if (sentimentScores[i].response && summaries[i].response) { + return { + ...formattedResult, + sentiment: parseFloat(sentimentScores[i].response).toFixed(2), + tags: summaries[i].response.split(",").map((result) => result.trim()), + }; + } + }); + + const response = { data: formattedResults }; + + return new Response(JSON.stringify(response), { + headers: { "Content-Type": "application/json" }, + }); }, }; ``` @@ -656,13 +676,13 @@ This will create a public endpoint that you can use to access the Worker globall ## Conclusion -In this tutorial, you've learnt how to integrate Google BigQuery and Cloudflare Workers by creating a GCP service account key and storing part of it as Worker Secrets. This was later imported in the code and by using the `jose` npm library, you created a JSON Web Token to authenticate the API query to BigQuery. +In this tutorial, you've learnt how to integrate Google BigQuery and Cloudflare Workers by creating a GCP service account key and storing part of it as Worker Secrets. This was later imported in the code and by using the `jose` npm library, you created a JSON Web Token to authenticate the API query to BigQuery. -Once you obtained the results, some formatting was applied to them to later be passed to generative AI models via Workers AI to generate tags and to perform sentiment analysis on the extracted data. +Once you obtained the results, some formatting was applied to them to later be passed to generative AI models via Workers AI to generate tags and to perform sentiment analysis on the extracted data. ## Next Steps -If instead of displaying the results of ingesting the data to the AI model in a browser, your workflow requires fetching and store data (for example in [R2](/r2/) or [D1](/d1/)) on regular intervals, you may want to consider adding a [scheduled handler](/workers/runtime-apis/handlers/scheduled/) for this Worker. It allows triggering the Worker with a predefined cadence via a [Cron Trigger](/workers/configuration/cron-triggers/). +If instead of displaying the results of ingesting the data to the AI model in a browser, your workflow requires fetching and store data (for example in [R2](/r2/) or [D1](/d1/)) on regular intervals, you may want to consider adding a [scheduled handler](/workers/runtime-apis/handlers/scheduled/) for this Worker. It allows triggering the Worker with a predefined cadence via a [Cron Trigger](/workers/configuration/cron-triggers/). A use case to ingest data from other sources, like you did in this tutorial, is to create a RAG system. If this sounds relevant to you, please check out the tutorial [Build a Retrieval Augmented Generation (RAG) AI](/workers-ai/tutorials/build-a-retrieval-augmented-generation-ai/). 
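If you explore the scheduled handler route mentioned above, a minimal sketch could look like this; the cron expression and the storage step are assumptions rather than part of this tutorial:

```javascript
// wrangler.toml would also need a trigger, for example:
// [triggers]
// crons = ["0 * * * *"]

export default {
  // Runs on the schedule above, with no incoming HTTP request involved
  async scheduled(event, env, ctx) {
    // Reuse the same pipeline: create the JWT, query BigQuery, enrich the rows with Workers AI,
    // and then persist the result (for example to R2 or D1) instead of returning a Response.
  },
};
```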
From 0752aed2909fe9177ffa1e3985bb5727ee232067 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 16:59:47 +0100 Subject: [PATCH 09/36] Changed steps details --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 4 ---- 1 file changed, 4 deletions(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 9dc2463e45a10b..dd0f4578bdcbc0 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -15,10 +15,6 @@ sidebar: order: 2 --- -## TL;DR - -You can skip this tutorial and check its [resulting code](#final-result). - ## Introduction The easiest way to get started with [Workers AI](/workers-ai/) is to try it out in the [Multi-modal Playground](https://multi-modal.ai.cloudflare.com/) and the [LLM playground](https://playground.ai.cloudflare.com/). If you decide that you want to integrate your code with Workers AI, you may then decide to then use its [REST API endpoints](/workers-ai/get-started/rest-api/) or via a [Worker binding](/workers-ai/configuration/bindings/). From 1d4a273dc0b02b327a7aeda0cbb90bc897428091 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:27:21 +0100 Subject: [PATCH 10/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Remove title Co-authored-by: Jun Lee --- .../docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 1 - 1 file changed, 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index dd0f4578bdcbc0..bed95be693a033 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -15,7 +15,6 @@ sidebar: order: 2 --- -## Introduction The easiest way to get started with [Workers AI](/workers-ai/) is to try it out in the [Multi-modal Playground](https://multi-modal.ai.cloudflare.com/) and the [LLM playground](https://playground.ai.cloudflare.com/). If you decide that you want to integrate your code with Workers AI, you may then decide to then use its [REST API endpoints](/workers-ai/get-started/rest-api/) or via a [Worker binding](/workers-ai/configuration/bindings/). From 3be5eac45249401827a4fb05794fbd0b44a41f2e Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:27:38 +0100 Subject: [PATCH 11/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx grammar Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index bed95be693a033..bfc7a705c5bd7c 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -24,7 +24,7 @@ In this tutorial, you will learn how to bring data from Google BigQuery to a Clo ## Prerequisites -You will be needing: +You will need: - A [Cloudflare Worker](/workers/) project running a [Hello World script](/workers/get-started/guide/). 
- A Google Cloud Platform [service account](https://cloud.google.com/iam/docs/service-accounts-create#iam-service-accounts-create-console) with an [associated key](https://cloud.google.com/iam/docs/keys-create-delete#iam-service-account-keys-create-console) file downloaded that has read access to BigQuery. From 72d4ed2ae6fa085f2049c010c44042686145f3f8 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:27:50 +0100 Subject: [PATCH 12/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx grammar Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index bfc7a705c5bd7c..a86ac7eb93b159 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -30,7 +30,7 @@ You will need: - A Google Cloud Platform [service account](https://cloud.google.com/iam/docs/service-accounts-create#iam-service-accounts-create-console) with an [associated key](https://cloud.google.com/iam/docs/keys-create-delete#iam-service-account-keys-create-console) file downloaded that has read access to BigQuery. - Access to a BigQuery table with some test data that allows you to create a [BigQuery Job Query](https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query). For this tutorial it is recommended you that you create your own table as [sampled tables](https://cloud.google.com/bigquery/public-data#sample_tables), unless cloned to your own GCP namespace, won't allow you to run job queries against them. For this example, the [Hacker News Corpus](https://www.kaggle.com/datasets/hacker-news/hacker-news-corpus) was used under its MIT licence. -## 1. Setting up your Cloudflare Worker +## 1. Set up your Cloudflare Worker In order to ingest the data into Cloudflare and feed it into Workers AI, you will be using a [Cloudflare Worker](/workers/). If you haven't created one yet, please feel free to review our [tutorial on how to get started](/workers/get-started/). From 09e0ad501ef376cbd81fbecdc9c6123782a9ac40 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:27:59 +0100 Subject: [PATCH 13/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx grammar Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index a86ac7eb93b159..94db33bc673097 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -82,7 +82,7 @@ Your downloaded key JSON file from Google Cloud Platform should have the followi } ``` -For this tutorial, you will only be needing the values of the following fields: `client_email`, `private_key`, `private_key_id` and `project_id`. +For this tutorial, you will only be needing the values of the following fields: `client_email`, `private_key`, `private_key_id`, and `project_id`. 
Instead of storing this information in plain text in the Worker, you will use [Secrets](/workers/configuration/secrets/) to make sure its unencrypted content is only accessible via the Worker itself. From fe72520e69d6899123be2862498b87e046d4d147 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:28:20 +0100 Subject: [PATCH 14/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx grammar Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 94db33bc673097..9fde7cfd592672 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -32,7 +32,7 @@ You will need: ## 1. Set up your Cloudflare Worker -In order to ingest the data into Cloudflare and feed it into Workers AI, you will be using a [Cloudflare Worker](/workers/). If you haven't created one yet, please feel free to review our [tutorial on how to get started](/workers/get-started/). +To ingest the data into Cloudflare and feed it into Workers AI, you will be using a [Cloudflare Worker](/workers/). If you have not created one yet, please feel free to review our [tutorial on how to get started](/workers/get-started/). After following the steps to create a Worker, you should have the following code in your new Worker project: From 779591c911a67ac5285d3950cd4f19d50a9d6bc0 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:28:33 +0100 Subject: [PATCH 15/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 9fde7cfd592672..fe989ed9e96dec 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -96,7 +96,7 @@ You will be asked to enter a secret value, which will be the value of the field :::note -Don't include any double quotes in the Secret that you store, as it will be already interpreted as a string. +Do not include any double quotes in the secret that you store, as the Secret will be already interpreted as a string. 
::: From 0557ba52cc4f059dd0956f09ecc9eddc716c0551 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:28:53 +0100 Subject: [PATCH 16/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index fe989ed9e96dec..e7f041f6b77ccf 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -84,7 +84,7 @@ Your downloaded key JSON file from Google Cloud Platform should have the followi For this tutorial, you will only be needing the values of the following fields: `client_email`, `private_key`, `private_key_id`, and `project_id`. -Instead of storing this information in plain text in the Worker, you will use [Secrets](/workers/configuration/secrets/) to make sure its unencrypted content is only accessible via the Worker itself. +Instead of storing this information in plain text in the Worker, you will use [secrets](/workers/configuration/secrets/) to make sure its unencrypted content is only accessible via the Worker itself. Import those three values from the JSON file into Secrets, starting with the field from the JSON key file called `client_email`, which we will now call `BQ_CLIENT_EMAIL` (you can use another variable name): From 6ee40bf3484def77e59cd94fe23667dcd720c5f2 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:29:03 +0100 Subject: [PATCH 17/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index e7f041f6b77ccf..c83d7e1994d700 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -100,7 +100,7 @@ Do not include any double quotes in the secret that you store, as the Secret wil ::: -If the Secret was uploaded successfully, the following message will be displayed: +If the secret was uploaded successfully, the following message will be displayed: ```sh ✨ Success! Uploaded secret BQ_CLIENT_EMAIL From d336c511f977203c9c45983c88a32c74988afef3 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:29:15 +0100 Subject: [PATCH 18/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index c83d7e1994d700..b745e18afd95ea 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -106,7 +106,7 @@ If the secret was uploaded successfully, the following message will be displayed ✨ Success! 
Uploaded secret BQ_CLIENT_EMAIL ``` -Now import the Secrets for the three remaining fields; `private_key`, `private_key_id` and `project_id` as `BQ_PRIVATE_KEY`, `BQ_PRIVATE_KEY_ID` and `BQ_PROJECT_ID` respectively: +Now import the secrets for the three remaining fields; `private_key`, `private_key_id`, and `project_id` as `BQ_PRIVATE_KEY`, `BQ_PRIVATE_KEY_ID`, and `BQ_PROJECT_ID` respectively: ```sh npx wrangler secret put BQ_PRIVATE_KEY From 22465b41265cf19716b8a580d7d384a23d672694 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:29:23 +0100 Subject: [PATCH 19/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index b745e18afd95ea..524913e1a4a59f 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -120,7 +120,7 @@ npx wrangler secret put BQ_PRIVATE_KEY_ID npx wrangler secret put BQ_PROJECT_ID ``` -At this point, you have successfully imported three fields from the JSON key file downloaded from Google Cloud Platform into Cloudflare Secrets to be used in a Worker. +At this point, you have successfully imported three fields from the JSON key file downloaded from Google Cloud Platform into Cloudflare secrets to be used in a Worker. [Secrets](/workers/configuration/secrets/) are only made available to Workers once they are deployed. To make them available during development, [create a `.dev.vars`](/workers/configuration/secrets/#local-development-with-secrets) file to locally store these credentials and reference them as environment variables. From 03956c1e9da673ab95847841ade4ad64d229d32f Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:29:32 +0100 Subject: [PATCH 20/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 524913e1a4a59f..46ebf313674fb2 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -168,7 +168,7 @@ You now have access to the GCP credentials from a Worker. Next, you will install ## 3. Install library to handle JWT operations -To interact with BigQuery's REST API, you will need to generate a [JSON Web Token](https://jwt.io/introduction) to authenticate your requests using the credentials that you've loaded into Worker Secrets in the previous step. +To interact with BigQuery's REST API, you will need to generate a [JSON Web Token](https://jwt.io/introduction) to authenticate your requests using the credentials that you have loaded into Worker secrets in the previous step. For this tutorial, you will be using the [jose](https://www.npmjs.com/package/jose?activeTab=readme) library for JWT-related operations. 
Install it by running the following command in a console: From b65c16b1bd1c121fc56c5a65ee9a7016d05b3d63 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:29:50 +0100 Subject: [PATCH 21/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 46ebf313674fb2..53e927058d599f 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -189,7 +189,7 @@ To verify that the installation succeeded, you can run `npm list`, which lists a ## 4. Generate JSON Web Token -Now that you have installed the `jose` library, it's time to import it and add a function to your code that generates a signed JWT: +Now that you have installed the `jose` library, it is time to import it and add a function to your code that generates a signed JWT: ```javascript import * as jose from 'jose'; From 8a4cde5b44818c22930c30f649b78ead743f7cba Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:50:46 +0100 Subject: [PATCH 22/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 53e927058d599f..a8a982cd63f19c 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -231,7 +231,7 @@ export default { ``` -Now that you have created a JWT, it's time to do an API call to BigQuery to fetch some data. +Now that you have created a JWT, it is time to do an API call to BigQuery to fetch some data. ## 5. Making authenticated requests to Google BigQuery From 12648bae8bebb2e3ee3d77e0a8f1bc69576f7f3c Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:51:00 +0100 Subject: [PATCH 23/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index a8a982cd63f19c..75c8a70caedbce 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -272,7 +272,7 @@ export default { Having the raw row data from BigQuery means that you can now format it in a JSON-like style up next. -## 6. Formatting results from the query +## 6. 
Format results from the query Now that you have retrieved the data from BigQuery, it's time to note that a BigQuery API response looks something like this: From 9c1c7a9f95fa80ac971041fa85ae8694002f4a11 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:51:09 +0100 Subject: [PATCH 24/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 75c8a70caedbce..4ba96e3814b364 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -233,7 +233,7 @@ export default { Now that you have created a JWT, it is time to do an API call to BigQuery to fetch some data. -## 5. Making authenticated requests to Google BigQuery +## 5. Make authenticated requests to Google BigQuery With the JWT token created in the previous step, issue an API request to BigQuery's API to retrieve data from a table. From ad9007042bf0f90f9a8c58042b0ea342cb74c77d Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:51:16 +0100 Subject: [PATCH 25/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 4ba96e3814b364..2282c49ef3f191 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -274,7 +274,7 @@ Having the raw row data from BigQuery means that you can now format it in a JSON ## 6. Format results from the query -Now that you have retrieved the data from BigQuery, it's time to note that a BigQuery API response looks something like this: +Now that you have retrieved the data from BigQuery, it is time to note that a BigQuery API response looks something like this: ```json { From ec98c7be8b855e0580827252778be952c2e8641b Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:51:25 +0100 Subject: [PATCH 26/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 2282c49ef3f191..839fafd259348e 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -330,7 +330,7 @@ Now that you have retrieved the data from BigQuery, it is time to note that a Bi } ``` -This format may be a bit harder to read and to work with iterating through results later on in this tutorial. 
You will now implement a function that maps the schema into each individual value, so the results look as the following instead, each row corresponding to an object within an array: +This format may be difficult to read and to work with when iterating through results, which will go on to do later in this tutorial. So you will now implement a function that maps the schema into each individual value, and the resulting output will be easier to read, as shown below. Each row corresponds to an object within an array. ```javascript [ From ac78c912f60ea022f43c2e7e380625d372a55bda Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:51:34 +0100 Subject: [PATCH 27/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 839fafd259348e..83a326b4ff945c 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -395,7 +395,7 @@ export default { }; ``` -## 7. Feeding data into Workers AI +## 7. Feed data into Workers AI Now that you have converted the response from the BigQuery API into an array of results, generate some tags and attach an associated sentiment score using an LLM via [Workers AI](/workers-ai/): From 27798173bf12b75b55464ad28d45dedd80538252 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:51:45 +0100 Subject: [PATCH 28/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 83a326b4ff945c..3d57613548bbe4 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -452,7 +452,7 @@ Uncomment the following lines from the `wrangler.toml` file in your project: binding = "AI" ``` -Restart the Worker that is running locally and after doing so, go to your application endpoint: +Restart the Worker that is running locally, and after doing so, go to your application endpoint: ```sh curl http://localhost:8787 From bd71fd1a4118e4151865c21c14c07fffb2aec6b7 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:51:54 +0100 Subject: [PATCH 29/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 3d57613548bbe4..fd7025212fa9f3 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -460,7 +460,7 @@ curl http://localhost:8787 It is likely that you will be asked to log in into your Cloudflare account and grant temporary access to 
Wrangler (the Cloudflare CLI) to use your account when using Worker AI. -Once you access `http://localhost:8787` you should see a similar output as the following one: +Once you access `http://localhost:8787` you should see an output similar to the following: ```sh { From 7d5ba629d56c7d25f34efc3749ea264d9255b4ae Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:52:03 +0100 Subject: [PATCH 30/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index fd7025212fa9f3..436f9d5d338994 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -458,7 +458,7 @@ Restart the Worker that is running locally, and after doing so, go to your appli curl http://localhost:8787 ``` -It is likely that you will be asked to log in into your Cloudflare account and grant temporary access to Wrangler (the Cloudflare CLI) to use your account when using Worker AI. +It is likely that you will be asked to log in to your Cloudflare account and grant temporary access to Wrangler (the Cloudflare CLI) to use your account when using Worker AI. Once you access `http://localhost:8787` you should see an output similar to the following: From 380dbf673a2b61b486ef6c7f368a06533c899f04 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:52:10 +0100 Subject: [PATCH 31/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 436f9d5d338994..c32b6b54875959 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -496,7 +496,7 @@ Once you access `http://localhost:8787` you should see an output similar to the } ``` -The actual values and fields will mostly depend on the query made in Step 5 that are then feed into the LLMs models. +The actual values and fields will mostly depend on the query made in Step 5 that are then fed into the LLMs models. 
## Final result From 85cff32a6de1bea59bbef8735983021e53a64142 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:52:16 +0100 Subject: [PATCH 32/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index c32b6b54875959..78f23c2d77a119 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -667,7 +667,7 @@ Deployed triggers (x sec) Current Version ID: ``` -This will create a public endpoint that you can use to access the Worker globally. Please keep this in mind when using production data to make sure to include additional access controls in place. +This will create a public endpoint that you can use to access the Worker globally. Please keep this in mind when using production data, and make sure to include additional access controls in place. ## Conclusion From ff898b1f8348f0a898b559e05192d24a81677d9d Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:52:22 +0100 Subject: [PATCH 33/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index 78f23c2d77a119..b7b729054524d1 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -671,7 +671,7 @@ This will create a public endpoint that you can use to access the Worker globall ## Conclusion -In this tutorial, you've learnt how to integrate Google BigQuery and Cloudflare Workers by creating a GCP service account key and storing part of it as Worker Secrets. This was later imported in the code and by using the `jose` npm library, you created a JSON Web Token to authenticate the API query to BigQuery. +In this tutorial, you have learnt how to integrate Google BigQuery and Cloudflare Workers by creating a GCP service account key and storing part of it as Worker secrets. This was later imported in the code, and by using the `jose` npm library, you created a JSON Web Token to authenticate the API query to BigQuery. Once you obtained the results, some formatting was applied to them to later be passed to generative AI models via Workers AI to generate tags and to perform sentiment analysis on the extracted data. 
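The change in [PATCH 32/36] above recommends putting additional access controls in place before relying on the deployed public endpoint. One minimal, hypothetical option is to reject requests that do not present a pre-shared token stored as a Worker secret; the `ACCESS_TOKEN` name and the bearer-token check below are assumptions, not part of the tutorial.

```javascript
// Hypothetical access check: require a pre-shared token before running the pipeline.
// ACCESS_TOKEN would be uploaded with `npx wrangler secret put ACCESS_TOKEN`.
export default {
	async fetch(request, env, ctx) {
		const authHeader = request.headers.get("Authorization");
		if (authHeader !== `Bearer ${env.ACCESS_TOKEN}`) {
			return new Response("Unauthorized", { status: 401 });
		}
		// ...continue with the BigQuery query and Workers AI calls from the tutorial...
		return new Response("OK");
	},
};
```

For production traffic, placing the Worker behind Cloudflare Access or restricting it to internal callers is a more robust option than a shared token.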
From 44a65c5206e1f4ddd37cbe2970fad5f4c6f40568 Mon Sep 17 00:00:00 2001
From: daisyfaithauma
Date: Wed, 23 Oct 2024 17:52:33 +0100
Subject: [PATCH 34/36] Update
 src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx

Co-authored-by: Jun Lee

---
 .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx
index b7b729054524d1..210ce492b9284d 100644
--- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx
+++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx
@@ -673,7 +673,7 @@ This will create a public endpoint that you can use to access the Worker globall

 In this tutorial, you have learnt how to integrate Google BigQuery and Cloudflare Workers by creating a GCP service account key and storing part of it as Worker secrets. This was later imported in the code, and by using the `jose` npm library, you created a JSON Web Token to authenticate the API query to BigQuery.

-Once you obtained the results, some formatting was applied to them to later be passed to generative AI models via Workers AI to generate tags and to perform sentiment analysis on the extracted data.
+Once you obtained the results, you formatted them to later be passed to generative AI models via Workers AI to generate tags and to perform sentiment analysis on the extracted data.

 ## Next Steps

From 5e52828e31a46da8ae27001e1241ca42e37b3fc2 Mon Sep 17 00:00:00 2001
From: daisyfaithauma
Date: Wed, 23 Oct 2024 17:52:42 +0100
Subject: [PATCH 35/36] Update
 src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx

Co-authored-by: Jun Lee

---
 .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx
index 210ce492b9284d..f077d91acc0a44 100644
--- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx
+++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx
@@ -677,7 +677,7 @@ Once you obtained the results, you formatted them to later be passed to generati

 ## Next Steps

-If instead of displaying the results of ingesting the data to the AI model in a browser, your workflow requires fetching and store data (for example in [R2](/r2/) or [D1](/d1/)) on regular intervals, you may want to consider adding a [scheduled handler](/workers/runtime-apis/handlers/scheduled/) for this Worker. It allows triggering the Worker with a predefined cadence via a [Cron Trigger](/workers/configuration/cron-triggers/).
+If, instead of displaying the results of ingesting the data to the AI model in a browser, your workflow requires fetching and storing data (for example in [R2](/r2/) or [D1](/d1/)) at regular intervals, you may want to consider adding a [scheduled handler](/workers/runtime-apis/handlers/scheduled/) for this Worker. It allows triggering the Worker with a predefined cadence via a [Cron Trigger](/workers/configuration/cron-triggers/).

 A use case to ingest data from other sources, like you did in this tutorial, is to create a RAG system. If this sounds relevant to you, please check out the tutorial [Build a Retrieval Augmented Generation (RAG) AI](/workers-ai/tutorials/build-a-retrieval-augmented-generation-ai/).
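A minimal sketch of the scheduled handler mentioned in the Next Steps hunk of [PATCH 35/36] above could look like the following; the `MY_BUCKET` R2 binding and the placeholder pipeline output are assumptions rather than code from the tutorial.

```javascript
// Hypothetical sketch of a scheduled handler that would sit next to the existing
// fetch handler. MY_BUCKET is an assumed R2 binding name, not part of the tutorial.
export default {
	async scheduled(event, env, ctx) {
		ctx.waitUntil(
			(async () => {
				// Run the BigQuery + Workers AI pipeline built in this tutorial here,
				// then persist the enriched rows under a timestamped key.
				const results = { data: [] }; // placeholder for the pipeline's output
				await env.MY_BUCKET.put(
					`bigquery-results/${new Date().toISOString()}.json`,
					JSON.stringify(results),
				);
			})(),
		);
	},
};
```

The cadence itself is configured with a Cron Trigger in `wrangler.toml`, for example `crons = ["0 * * * *"]` under `[triggers]` to run once per hour.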
From cec7a84d416d8a765aea96c8e2ab853c05a05b82 Mon Sep 17 00:00:00 2001 From: daisyfaithauma Date: Wed, 23 Oct 2024 17:52:53 +0100 Subject: [PATCH 36/36] Update src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx Co-authored-by: Jun Lee --- .../workers-ai/tutorials/using-bigquery-with-workers-ai.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx index f077d91acc0a44..c6125ee5608094 100644 --- a/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx +++ b/src/content/docs/workers-ai/tutorials/using-bigquery-with-workers-ai.mdx @@ -681,4 +681,4 @@ If, instead of displaying the results of ingesting the data to the AI model in a A use case to ingest data from other sources, like you did in this tutorial, is to create a RAG system. If this sounds relevant to you, please check out the tutorial [Build a Retrieval Augmented Generation (RAG) AI](/workers-ai/tutorials/build-a-retrieval-augmented-generation-ai/). -To learn more about what other AI models you can use at Cloudflare, please consider visiting the [Workers AI](/workers-ai) section of our docs. +To learn more about what other AI models you can use at Cloudflare, please visit the [Workers AI](/workers-ai) section of our docs.