Merged
Changes from 2 commits
44 changes: 42 additions & 2 deletions src/content/docs/autorag/configuration/metadata-filtering.mdx
@@ -34,13 +34,13 @@ const answer = await env.AI.autorag("my-autorag").search({

You can currently filter by the `folder` and `timestamp` of an R2 object. Custom metadata attributes are not yet supported.

### `folder`
### Folder

The directory that contains the object. For example, the `folder` of the object at `llama/logistics/llama-logistics.mdx` is `llama/logistics/`. Note that the `folder` does not include a leading `/`.

The `folder` filter only matches files located directly in that folder, so files in subdirectories are not included. For example, specifying `folder: "llama/"` matches files in `llama/` but does not match files in `llama/logistics/`.
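
As a rough illustration of the two rules above, the following sketch derives a `folder` value from an object key and applies an exact-match comparison. The helpers `folderOf` and `matchesFolderEq` are hypothetical names introduced here, not part of the AutoRAG API, and the key `llama/llama-facts.mdx` is made up for the example. Treating a root-level object as having an empty folder is an assumption.

```js
// Hypothetical helper: derives the `folder` attribute from an R2 object key.
function folderOf(objectKey) {
  const lastSlash = objectKey.lastIndexOf("/");
  // Assumption: an object at the bucket root has an empty folder.
  return lastSlash === -1 ? "" : objectKey.slice(0, lastSlash + 1);
}

// Hypothetical helper: mimics an exact-match (`eq`) comparison on `folder`.
function matchesFolderEq(objectKey, folder) {
  return folderOf(objectKey) === folder;
}

console.log(folderOf("llama/logistics/llama-logistics.mdx")); // "llama/logistics/"
console.log(matchesFolderEq("llama/llama-facts.mdx", "llama/")); // true
console.log(matchesFolderEq("llama/logistics/llama-logistics.mdx", "llama/")); // false
```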

### `timestamp`
### Timestamp

The timestamp indicating when the object was last modified. Comparisons are supported using a 13-digit Unix timestamp (milliseconds), but values will be rounded to 10 digits (seconds). For example, `1735689600999` or `2025-01-01 00:00:00.999 UTC` will be rounded down to `1735689600000`, corresponding to `2025-01-01 00:00:00 UTC`.
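
The rounding can be sketched as follows, assuming it is a plain floor to whole seconds, as the example above suggests. `roundToSeconds` is a hypothetical helper name used only for illustration.

```js
// Hypothetical helper: drops the millisecond part of a 13-digit Unix timestamp.
function roundToSeconds(timestampMs) {
  return Math.floor(timestampMs / 1000) * 1000;
}

const rounded = roundToSeconds(1735689600999);
console.log(rounded); // 1735689600000
console.log(new Date(rounded).toISOString()); // "2025-01-01T00:00:00.000Z"
```

When building a filter value from a date, `Date.UTC(2025, 0, 1)` or `new Date(...).getTime()` already returns milliseconds, so no conversion is needed before passing it as a filter value.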

@@ -91,6 +91,46 @@ Note the following limitations with the compound operators:
- Only the `eq` operator is allowed.
- All conditions must filter on the **same key** (for example, all on `folder`).

### "Starts with" filter for folders

You can use "starts with" filtering on the `folder` metadata attribute to search for all files and subfolders within a specific path.

For example, consider this file structure:
```
customer-a/profile.md
customer-a/contracts/property/contract-1.pdf
```

If you were to filter using an `eq` (equals) operator with `value: "customer-a/"`, it would only match files directly within that folder, like `profile.md`. It wouldn't include files in subfolders like `customer-a/contracts/`.

To recursively filter for all items starting with the path `customer-a/`, you can use the following compound filter:

```js
filters: {
  type: "and",
  filters: [
    {
      type: "gt",
      key: "folder",
      value: "customer-a//",
    },
    {
      type: "lte",
      key: "folder",
      value: "customer-a/z",
    },
  ],
},
```

This filter identifies paths starting with `customer-a/` by using:

- The `and` condition to combine the effects of the `gt` and `lte` conditions.
- The `gt` condition to include paths greater than the `/` ASCII character.
- The `lte` condition to include paths less than or equal to the lowercase `z` ASCII character.

Together, these conditions effectively select paths that begin with the provided path value.
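
If you need this filter for several paths, one option is to generate it from the path. The sketch below is a minimal example under that assumption: `startsWithFolderFilter` is a hypothetical helper, not part of the AutoRAG API, and it only reproduces the `gt` and `lte` bounds shown above for an arbitrary path.

```js
// Hypothetical helper: builds the "starts with" compound filter for a folder path.
function startsWithFolderFilter(path) {
  // Make sure the path ends with exactly one trailing "/".
  const base = path.endsWith("/") ? path : `${path}/`;
  return {
    type: "and",
    filters: [
      // Lower bound: the path followed by "/", as in "customer-a//".
      { type: "gt", key: "folder", value: `${base}/` },
      // Upper bound: the path followed by "z", as in "customer-a/z".
      { type: "lte", key: "folder", value: `${base}z` },
    ],
  };
}

const filters = startsWithFolderFilter("customer-a");
console.log(JSON.stringify(filters, null, 2));
```

The resulting object can be passed as the `filters` property of a `search()` call, in place of the literal shown earlier.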

## Response

You can see the metadata attributes of your retrieved data in the response under the property `attributes` for each retrieved chunk. For example:
37 changes: 37 additions & 0 deletions src/content/docs/autorag/how-to/multitenancy.mdx
@@ -39,3 +39,40 @@ const response = await env.AI.autorag("my-autorag").search({
```

To filter across multiple folders, or to add date-based filtering, you can use a compound filter with an array of [comparison filters](/autorag/configuration/metadata-filtering/#compound-filter).

## Tip: Use "Starts with" filter

While an `eq` filter targets only the files directly in a specific folder, you will often want to retrieve all documents belonging to a tenant, including files in its subfolders. For example, all files in `customer-a/` with a structure like:

```
customer-a/profile.md
customer-a/contracts/property/contract-1.pdf
```

To achieve this [starts with](/autorag/configuration/metadata-filtering/#starts-with-filter-for-folders) behavior, use a compound filter like:

```js
filters: {
  type: "and",
  filters: [
    {
      type: "gt",
      key: "folder",
      value: "customer-a//",
    },
    {
      type: "lte",
      key: "folder",
      value: "customer-a/z",
    },
  ],
},
```

This filter identifies paths starting with `customer-a/` by using:

- The `and` condition to combine the effects of the `gt` and `lte` conditions.
- The `gt` condition to include paths greater than the `/` ASCII character.
- The `lte` condition to include paths less than or equal to the lowercase `z` ASCII character.

With this filter you would capture both files `profile.md` and `contract-1.pdf`.