Merged
Changes from 5 commits
2 changes: 2 additions & 0 deletions public/__redirects
@@ -2088,6 +2088,8 @@
/workers-ai/demos/* /workers-ai/guides/demos-architectures/:splat 301
/workers-ai/tutorials/* /workers-ai/guides/tutorials/:splat 301

# Workflows
/workflows/tutorials/ /workflows/examples 301

# Others
/logs/analytics-integrations/* /fundamentals/data-products/analytics-integrations/:splat 301
@@ -0,0 +1,9 @@
---
pcx_content_type: navigation
title: Overview
sidebar:
order: 4
group:
hideIndex: true
tableOfContents: false
---
@@ -0,0 +1,154 @@
---
pcx_content_type: navigation
title: Introduction to Workflows
sidebar:
order: 1
tableOfContents: false
description: |
Cloudflare Workflows provides durable execution capabilities, allowing developers to create reliable, repeatable workflows that run in the background. Workflows are designed to resume execution even if the underlying compute fails, ensuring that tasks complete eventually. They are built on top of Cloudflare Workers and handle scaling and provisioning automatically.
---

import { Render, Tabs, TabItem, Stream, Card } from "~/components";

<Tabs>
<TabItem label="Watch this episode">

Cloudflare Workflows provides durable execution capabilities, allowing developers to create reliable, repeatable workflows that run in the background. Workflows are designed to resume execution even if the underlying compute fails, ensuring that tasks complete eventually. They are built on top of Cloudflare Workers and handle scaling and provisioning automatically.

Workflows are triggered by events, such as Event Notifications consumed from a Queue, HTTP requests, another Worker, or even scheduled timers. Individual steps within a Workflow are designed as retriable units of work. The state is persisted between steps, allowing workflows to resume from the last successful step after failures. Workflows automatically emit metrics for each step, aiding in debugging and observability.

<Card>
<Stream
id="825b29fbf3c93d525735544f77aeb816"
title="Introduction to Workflows"
thumbnail="https://pub-d9bf66e086fb4b639107aa52105b49dd.r2.dev/Workflows-video-1.png"
showMoreVideos={false}
chapters={{
"Background": "0s",
"Workflows Introduction": "47s",
"Punderful, an app using all of the Cloudflare primitives": "1m10s",
"Vectorize": "2m35s",
"Workflow code in Action": "3m0s",
"Does it scale?": "7m0s",
"Conclusion and next video introduction": "7m15s"
}}
/>

**Related content**

If you want to dive into detail, refer to the following pages:

- [Source code for the Punderful repository](https://github.com/craigsdennis/punderful-workflows)
- [Cloudflare Workflows](/workflows/)
- [Cloudflare Workers AI](/workers-ai/)

</Card>
</TabItem>

<TabItem label="Step-by-step tutorial">

Punderful is a sample application that showcases the use of various Cloudflare primitives, including Workers, D1, Vectorize, Workers AI, and Workflows. The application displays a list of puns stored in a D1 database.

The homepage lists the latest puns stored in D1. The application also includes a semantic search feature powered by Vectorize. To perform a search:

1. Go to the Punderful search page.
2. Type a search query in the "Search for a pun..." input box.
3. Observe the search results appearing instantly below the search box.

To demonstrate adding a new pun:

1. Go to the Punderful creation page.
2. Enter a new pun in the "Enter your pun here..." textarea.
3. Observe the preview of the pun updating as you type.
4. Click the "Submit Pun" button.

When a new pun is submitted, it must be indexed in Vectorize before semantic search can find it. Indexing requires creating embeddings from the pun text, which is well suited to background processing with Cloudflare Workflows because the user does not have to wait for it in the request-response cycle.

### Implementing a Workflow to Process New Puns

A workflow is implemented to handle the background processing required when a new pun is submitted.

#### Triggering the Workflow

When a new pun is submitted via the `/api/puns` endpoint, the data is first inserted into the D1 database. Then, a new Workflow instance is created and triggered to perform the subsequent background tasks.

[See here](https://github.com/craigsdennis/punderful-workflows/blob/7cec7f4bd7d6b17085cb6d6cb3e56b6a4b5b7c9d/src/index.tsx#L165)

In this handler, `c.env.PUBLISH.create(crypto.randomUUID(), { punId, pun: payload.pun })` creates a new instance of the workflow bound as `PUBLISH`, assigns it a unique ID, and passes the `punId` and `pun` text as the payload.
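As a rough sketch of that hand-off (not the repository's code), the function below shows only the create call. `WorkflowBinding` is a local stand-in for the runtime's binding type, and `PublishParams`, `triggerPublish`, and the numeric `punId` are hypothetical. The options-object form `create({ id, params })` is an assumption based on the current Workflows API, so check which form your runtime version expects.

```typescript
// Hypothetical payload shape, taken from the description above.
interface PublishParams {
  punId: number;
  pun: string;
}

// Local stand-in for the Workflow binding exposed on `env`; only the
// one method used here is modelled.
interface WorkflowBinding<P> {
  create(options: { id: string; params: P }): Promise<{ id: string }>;
}

// Start one Workflow instance for a freshly inserted pun and return its ID.
async function triggerPublish(
  publish: WorkflowBinding<PublishParams>,
  newId: () => string,
  punId: number,
  pun: string,
): Promise<string> {
  const instance = await publish.create({ id: newId(), params: { punId, pun } });
  return instance.id;
}
```

In a Worker, `publish` would be `c.env.PUBLISH` and `newId` would be `() => crypto.randomUUID()`.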

#### Defining the Workflow Class

The workflow logic is defined in a class that extends `WorkflowEntrypoint`.

[See here](https://github.com/craigsdennis/punderful-workflows/blob/7cec7f4bd7d6b17085cb6d6cb3e56b6a4b5b7c9d/src/workflows/publish.ts#L12)

The `run` method is the entrypoint for the workflow execution. It receives the `event` containing the payload and a `step` object to define individual, durable steps.

#### Workflow Steps

Each discrete, retriable task in the workflow is defined using `await step.do()`.
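To illustrate the pattern without the Workers runtime, here is a minimal sketch of two chained steps. `WorkflowStep` is a local stand-in for the `step` object (the real one also accepts an optional retry and timeout config), and `runPipeline` and the step names are hypothetical.

```typescript
// Local stand-in for the `step` object passed to `run`.
interface WorkflowStep {
  do<T>(name: string, callback: () => Promise<T>): Promise<T>;
}

// Hypothetical two-step pipeline: each `step.do()` call is one retriable
// unit, and its return value is available to the steps after it.
async function runPipeline(step: WorkflowStep, pun: string): Promise<number> {
  const cleaned = await step.do("clean-pun", async () => pun.trim());
  const length = await step.do("measure-pun", async () => cleaned.length);
  return length;
}
```

Returning a value from the callback, rather than assigning to an outer variable, is what lets a later step use the result of an earlier one.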

##### Content Moderation

Optionally, the workflow can perform content moderation using an external service like OpenAI's moderation API if an API key is available in the environment.

[See here](https://github.com/craigsdennis/punderful-workflows/blob/7cec7f4bd7d6b17085cb6d6cb3e56b6a4b5b7c9d/src/workflows/publish.ts#L16)

This step calls the OpenAI moderation API. If the content is flagged as inappropriate, the pun's status is updated in the database, and a `NonRetryableError` is thrown. Throwing a `NonRetryableError` prevents the workflow from retrying this step, as the content is permanently deemed inappropriate.
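The difference between a retriable failure and a non-retriable one can be sketched with a toy retry loop. This is not the Workflows engine: `runWithRetries`, `isNonRetryable`, and `moderate` are hypothetical helpers, and the `NonRetryableError` class is declared locally as a stand-in for the one the runtime provides.

```typescript
// Local stand-in for the runtime's NonRetryableError class.
class NonRetryableError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "NonRetryableError";
  }
}

function isNonRetryable(error: unknown): boolean {
  return error instanceof Error && error.name === "NonRetryableError";
}

// Toy retry loop: retry ordinary failures for up to `limit` attempts in
// total (limit must be at least 1), but rethrow a NonRetryableError at once.
async function runWithRetries<T>(task: () => Promise<T>, limit: number): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= limit; attempt++) {
    try {
      return await task();
    } catch (error) {
      if (isNonRetryable(error)) throw error;
      lastError = error;
    }
  }
  throw lastError;
}

// Hypothetical moderation check: flagged content can never pass, so
// retrying would only waste attempts.
async function moderate(flagged: boolean): Promise<true> {
  if (flagged) throw new NonRetryableError("Pun failed moderation");
  return true;
}
```

With a limit of 5, a task that fails twice and then succeeds runs three times, while `moderate(true)` runs once and rejects immediately.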

##### Creating Embeddings

Next, create vector embeddings for the pun text using a Workers AI model.

[See here](https://github.com/craigsdennis/punderful-workflows/blob/7cec7f4bd7d6b17085cb6d6cb3e56b6a4b5b7c9d/src/workflows/publish.ts#L34)

This step uses the `@cf/baai/bge-large-en-v1.5` model from Workers AI to generate a vector embedding for the `pun` text. The result (the embedding vector) is returned by the step and can be used in subsequent steps. Because it runs inside `step.do()`, the step is retried if it fails, up to the step's retry limit, so a transient error does not abort the workflow.
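A minimal sketch of the embedding call, assuming the model accepts `{ text: string[] }` and returns one vector per input in `data`. `EmbeddingAi` is a local stand-in for the Workers AI binding and `embedPun` is a hypothetical helper, not the repository's code.

```typescript
// Local stand-in for the Workers AI binding, narrowed to the embedding call.
interface EmbeddingAi {
  run(model: string, input: { text: string[] }): Promise<{ data: number[][] }>;
}

// Embed one pun and return its vector (the first row of `data`).
async function embedPun(ai: EmbeddingAi, pun: string): Promise<number[]> {
  const result = await ai.run("@cf/baai/bge-large-en-v1.5", { text: [pun] });
  const vector = result.data[0];
  // Throwing here makes the surrounding step fail, so it can be retried.
  if (vector === undefined) throw new Error("Embedding model returned no vectors");
  return vector;
}
```

Inside the workflow this would be called as `embedPun(this.env.AI, pun)` from within a `step.do()` callback.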

##### Categorizing the Pun

Optionally, use a Workers AI language model to categorize the pun.

[See here](https://github.com/craigsdennis/punderful-workflows/blob/7cec7f4bd7d6b17085cb6d6cb3e56b6a4b5b7c9d/src/workflows/publish.ts#L41)

This step uses the `@cf/meta/llama-3.1-8b-instruct` model with a specific system prompt to generate categories for the pun. The generated categories string is returned by the step. This step also benefits from `step.do()`'s reliability.
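The categorization call might look roughly like the sketch below. `ChatAi` is a local stand-in for the Workers AI binding, `categorizePun` is hypothetical, and the system prompt is a placeholder rather than the repository's prompt. The `response` field on the result is an assumption about the model's text-generation output shape.

```typescript
// Local stand-in for the Workers AI binding, narrowed to a chat-style call.
interface ChatAi {
  run(
    model: string,
    input: { messages: { role: "system" | "user"; content: string }[] },
  ): Promise<{ response?: string }>;
}

// Ask the model for categories and return them as one trimmed string.
async function categorizePun(ai: ChatAi, pun: string): Promise<string> {
  const result = await ai.run("@cf/meta/llama-3.1-8b-instruct", {
    messages: [
      // Placeholder prompt, not the one used in the repository.
      { role: "system", content: "Reply with a comma-separated list of categories for the pun." },
      { role: "user", content: pun },
    ],
  });
  return (result.response ?? "").trim();
}
```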

##### Adding Embeddings to Vectorize

Insert the pun embedding and, if categories were generated, the categories embedding into the Vectorize index.

[See here](https://github.com/craigsdennis/punderful-workflows/blob/7cec7f4bd7d6b17085cb6d6cb3e56b6a4b5b7c9d/src/workflows/publish.ts#L78)

This step uses `this.env.VECTORIZE.upsert()` to add the generated embeddings and associated metadata to the Vectorize database. This makes the pun searchable semantically. `step.do()` ensures this critical indexing step is completed reliably.
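A sketch of the upsert, assuming each vector is an object with `id`, `values`, and optional `metadata`. `VectorIndex` and `VectorRecord` are local stand-ins for the Vectorize binding types, and the ID scheme (`content-…`, `categories-…`) and metadata keys are hypothetical.

```typescript
// Local stand-ins for the Vectorize binding types.
interface VectorRecord {
  id: string;
  values: number[];
  metadata?: Record<string, string | number | boolean>;
}
interface VectorIndex {
  upsert(vectors: VectorRecord[]): Promise<unknown>;
}

// Upsert the pun embedding and, when present, the categories embedding.
// Returns how many vectors were sent.
async function indexPun(
  index: VectorIndex,
  punId: number,
  punEmbedding: number[],
  categoriesEmbedding?: number[],
): Promise<number> {
  const vectors: VectorRecord[] = [
    { id: `content-${punId}`, values: punEmbedding, metadata: { punId, type: "content" } },
  ];
  if (categoriesEmbedding !== undefined) {
    vectors.push({
      id: `categories-${punId}`,
      values: categoriesEmbedding,
      metadata: { punId, type: "categories" },
    });
  }
  await index.upsert(vectors);
  return vectors.length;
}
```

Because `upsert` overwrites a vector that already has the same ID, repeating the step after a partial failure does not create duplicates.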

##### Updating Database Status

The final step updates the status of the pun in the D1 database to indicate that it has been published and processed by the workflow.

[See here](https://github.com/craigsdennis/punderful-workflows/blob/7cec7f4bd7d6b17085cb6d6cb3e56b6a4b5b7c9d/src/workflows/publish.ts#L104)

This step updates the `status` column in the D1 database to 'published' for the corresponding pun ID. Once this step is complete, the pun is considered fully processed and ready to be displayed on the homepage.
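The status update can be sketched with D1's prepare, bind, and run chain. `PunDatabase` and `PunStatement` are local stand-ins for the D1 types, and the table name `puns` and key column `id` are assumptions. Only the `status` column and the `'published'` value come from the description above.

```typescript
// Local stand-ins for the D1 database and prepared-statement types.
interface PunStatement {
  bind(...values: unknown[]): PunStatement;
  run(): Promise<unknown>;
}
interface PunDatabase {
  prepare(sql: string): PunStatement;
}

// Mark one pun as published, using bound parameters rather than string
// interpolation so the values are never parsed as SQL.
async function markPublished(db: PunDatabase, punId: number): Promise<void> {
  await db
    .prepare("UPDATE puns SET status = ? WHERE id = ?")
    .bind("published", punId)
    .run();
}
```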

#### Workflow Bindings

To make the `PublishWorkflow` class available to the main Worker and to provide access to necessary resources (like D1, AI, Vectorize), bindings are configured in the `wrangler.toml` file.

[See here](https://github.com/craigsdennis/punderful-workflows/blob/main/wrangler.toml)

This configuration defines a workflow named `publish`, exposes it to the Worker as the `PUBLISH` binding, and links it to the `PublishWorkflow` class, which the Worker's entry script must export. It also shows bindings for Workers AI (`AI`) and Vectorize (`VECTORIZE`), which are accessed via `this.env` within the workflow.

### Vectorize for Semantic Search

Vectorize is a vector database used in this application to enable semantic search for puns. It stores the vector embeddings created by Workers AI. The search functionality queries this Vectorize index to find puns similar in meaning to the user's query.

The homepage displays recently published puns (status 'published'). The detail page for a specific pun displays "Similar Puns", which are found by querying Vectorize with the embedding of the current pun.
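A "Similar Puns" lookup could be sketched as follows, assuming the index's `query` method takes a vector plus a `topK` option and returns scored `matches`. `QueryableIndex` and `VectorMatch` are local stand-ins for the Vectorize types, and `similarPunIds` is a hypothetical helper.

```typescript
// Local stand-ins for the Vectorize query types.
interface VectorMatch {
  id: string;
  score: number;
}
interface QueryableIndex {
  query(vector: number[], options: { topK: number }): Promise<{ matches: VectorMatch[] }>;
}

// Return up to `limit` IDs of the nearest neighbours, excluding the pun itself.
async function similarPunIds(
  index: QueryableIndex,
  vector: number[],
  currentId: string,
  limit: number,
): Promise<string[]> {
  // Ask for one extra match because the current pun is usually its own
  // nearest neighbour and gets filtered out below.
  const { matches } = await index.query(vector, { topK: limit + 1 });
  return matches
    .filter((match) => match.id !== currentId)
    .slice(0, limit)
    .map((match) => match.id);
}
```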

### Scalability

Cloudflare Workers and Workflows are designed to scale automatically based on demand, handling concurrent requests and background tasks efficiently without requiring manual provisioning.

</TabItem>

<TabItem label="Series overview">
<Render file="workflows-series-navigation" />
</TabItem>
</Tabs>
@@ -0,0 +1,137 @@
---
pcx_content_type: navigation
title: Monitor and batch your website data
sidebar:
order: 2
tableOfContents: false
description: |
Workflows can be used to process batches of data, ensuring each item in the batch goes through a defined process with reliable execution. This section demonstrates processing a batch of puns using the Punderful application as an example.

---

import { Render, Tabs, TabItem, Stream, Card } from "~/components";

<Tabs>
<TabItem label="Watch this episode">

Workflows can be used to process batches of data, ensuring each item in the batch goes through a defined process with reliable execution. This section demonstrates processing a batch of puns using the Punderful application as an example.

<Card>
<Stream
id="2c36852489758c056da930e8714b6e74"
title="Monitor and batch your website data"
thumbnail="https://pub-d9bf66e086fb4b639107aa52105b49dd.r2.dev/Workflows-video-2.png"
showMoreVideos={false}
chapters={{
"Introduction": "3s",
"Implementing Workflows with Puns Dataset": "1m29s",
"Deployment and Monitoring": "2m52s",
"Admin Dashboard and Further Insights": "4m0s"
}}
/>

**Related content**

If you want to dive into detail, refer to the following pages:

- [Source code for the Punderful repository](https://github.com/craigsdennis/punderful-workflows)
- [Cloudflare Workflows](/workflows/)
- [Cloudflare Workers AI](/workers-ai/)

</Card>
</TabItem>

<TabItem label="Step-by-step tutorial">

The Punderful application processes user-submitted puns by performing content moderation, creating embeddings, categorizing, and adding them to a vector store. This process is defined as a Workflow. To process a batch of existing puns (from an open-source dataset called OPun), a batch endpoint is created that iterates through the puns and triggers the defined Workflow for each one.

#### Batch Processing Code

The linked code shows the endpoint responsible for batch processing:

[See here](https://github.com/craigsdennis/punderful-workflows/tree/main/src/index.tsx#L291)

This code:

1. Fetches the list of puns from a JSON file (`puns.json`).
2. Logs the number of puns being processed.
3. Sets a user ID for tracking.
4. Loops through each pun.
5. Performs basic text cleaning on the pun.
6. Inserts the pun into the database (handled by `insertPun`).
7. Triggers the `PUBLISH` Workflow for each individual pun using `c.env.PUBLISH.create()`. The Workflow is given a unique ID using `crypto.randomUUID()`.
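The steps above can be sketched as a single loop with its dependencies injected, so it runs outside a Worker. Everything in `BatchDeps` is hypothetical, including the `insertPun` signature. The whitespace-collapsing rule and the skipping of empty entries are assumptions about what "basic text cleaning" means, not the repository's logic.

```typescript
// Hypothetical dependencies, injected so the loop can run anywhere.
interface BatchDeps {
  loadPuns(): Promise<string[]>; // for example, read puns.json
  insertPun(userId: string, pun: string): Promise<number>; // returns the new pun ID
  createWorkflow(id: string, params: { punId: number; pun: string }): Promise<void>;
  newId(): string;
}

// Insert every pun and start one Workflow instance per pun.
// Returns the number of instances started.
async function processBatch(deps: BatchDeps, userId: string): Promise<number> {
  const puns = await deps.loadPuns();
  console.log(`Processing ${puns.length} puns`);
  let started = 0;
  for (const raw of puns) {
    // Basic cleaning: collapse runs of whitespace and trim the ends.
    const pun = raw.replace(/\s+/g, " ").trim();
    if (pun === "") continue; // skip entries that are empty after cleaning
    const punId = await deps.insertPun(userId, pun);
    await deps.createWorkflow(deps.newId(), { punId, pun });
    started++;
  }
  return started;
}
```

Each pun gets its own Workflow instance, so one failing pun is retried or marked as errored on its own without blocking the rest of the batch.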

### Monitoring Workflow Instances via CLI

The Cloudflare Wrangler CLI provides commands to monitor and manage Workflows and their instances.

To list the available workflows associated with your account:

```bash
npx wrangler workflows list
```

To list the instances of a specific workflow (e.g., the `publish` workflow):

```bash
npx wrangler workflows instances list publish
```

This command will show a list of workflow instances, their status (Queued, Running, Completed, Errored), and timestamps.

To view the details of a specific workflow instance, including its steps and their status, duration, and output:

```bash
npx wrangler workflows instances describe publish <instance-id>
```

Replace `<instance-id>` with the actual ID of a running or completed instance from the `list` command output.

#### Example CLI Output (Describe Instance)

Describing a workflow instance provides a detailed breakdown of its execution:

```
Workflow Name: publish
Instance ID: oPun-batch-aea07d75-95fa-448f-9573-6e435388eff7
Version ID: 75665fce-24a1-4c83-a561-088aabc91e5f
Status: Completed
Trigger: API
Queued: 10/24/2024, 1:43:45 AM
Success: Yes
Start: 10/24/2024, 1:43:45 AM
End: 10/24/2024, 1:43:49 AM
Duration: 4 seconds
Last Successful Step: update-status-to-published-1
Steps:

Name: content-moderation-1
Type: Step
Start: 10/24/2024, 1:43:45 AM
End: 10/24/2024, 1:43:45 AM
Duration: 0 seconds
Success: Yes
Output: "true"
Config: {"retries":{"limit":5,"delay":1000,"backoff":"exponential"},"timeout":"10 minutes"}
Attempts:
Status: Completed
Start Time: Oct 23, 2024 6:44:57 PM
End Time: Oct 23, 2024 6:44:57 PM
Wall Time: 180 ms
... (additional steps like create-pun-embedding-1, categorize-pun-1, add-embeddings-to-vector-store-1, update-status-to-published-1)
```

This output shows the status, start/end times, duration, success status, and even the output and configuration for each step within the workflow instance.

### Monitoring Workflow Instances via Cloudflare Dashboard

You can also monitor Workflows and their instances directly in the Cloudflare dashboard. The dashboard provides a user-friendly way to observe the progress of your batch jobs, identify failed instances, and inspect the execution details of each step.

</TabItem>

<TabItem label="Series overview">
<Render file="workflows-series-navigation" />
</TabItem>
</Tabs>