Skip to content

Commit 5b65fa9

Browse files
added nbviewer and a new guide for RAG pipelines
1 parent 55d4850 commit 5b65fa9

File tree

4 files changed

+102
-1
lines changed

4 files changed

+102
-1
lines changed
Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
1+
import React from "react";
2+
3+
interface JupyterNotebookViewerProps {
4+
fileUrl: string;
5+
}
6+
7+
const JupyterNotebookViewer: React.FC<JupyterNotebookViewerProps> = ({ fileUrl }) => {
8+
const nbviewerUrl = `https://nbviewer.org/github/${encodeURIComponent(fileUrl)}`;
9+
10+
return (
11+
<div className="p-4">
12+
<iframe
13+
src={nbviewerUrl}
14+
width="100%"
15+
height="800px"
16+
style={{ border: "none" }}
17+
/>
18+
</div>
19+
);
20+
};
21+
22+
export default JupyterNotebookViewer;

pages/spicedb/ops/_meta.json

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,5 +2,6 @@
22
"observability": "Observability Tooling",
33
"deploying-spicedb-operator": "Deploying the SpiceDB Operator",
44
"deploying-spicedb-on-eks": "Deploying SpiceDB on Amazon EKS",
5-
"bulk-operations": "Bulk Importing Relationships"
5+
"bulk-operations": "Bulk Importing Relationships",
6+
"secure-rag-pipelines": "Secure Your RAG Pipelines with Fine Grained Authorization"
67
}
Lines changed: 78 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,78 @@
1+
import JupyterNotebookViewer from "@/components/JupyterNotebookViewer";
2+
3+
# Secure Your RAG Pipelines With Fine Grained Authorization
4+
5+
Here's how you can use SpiceDB to safeguard sensitive data in RAG pipelines.
6+
You will learn how to pre-filter and post-filter vector database queries with a list of authorized object IDs to improve security and efficiency.
7+
8+
This guide uses OpenAI, Pinecone, Langchain, Jupyter Notebook and SpiceDB
9+
10+
## Why is this important?
11+
12+
Building enterprise-ready AI poses challenges around data security, accuracy, scalability, and integration, especially in compliance-regulated industries like healthcare and finance.
13+
Firms are increasing efforts to mitigate risks associated with LLMs, particularly regarding sensitive data exfiltration of personally identifiable information and/or sensitive company data.
14+
The primary mitigation strategy is to build guardrails around Retrieval-Augmented Generation (RAG) to safeguard data while also optimizing query response quality and efficiency.
15+
16+
To enable precise guardrails, one must implement permissions systems with advanced fine grained authorization capabilities such as returning lists of authorized subjects and accessible resources.
17+
Such systems ensure timely access to authorized data while preventing exfiltration of sensitive information, making RAGs more efficient and improving performance at scale.
18+
19+
## Setup and Prerequisites
20+
21+
- Access to a [SpiceDB](https://authzed.com/spicedb) instance. You can find instructions for installing SpiceDB [here](https://authzed.com/docs/spicedb/getting-started/install/macos)
22+
- A [Pinecone account](https://www.pinecone.io/) and API key
23+
- An [OpenAI Platform account](https://platform.openai.com/docs/overview) and API key
24+
- [Jupyter Notebook](https://jupyter.org/) running locally
25+
26+
#### Running SpiceDB
27+
28+
Once you've installed SpiceDB, run a local instance with this command in your terminal:
29+
30+
`spicedb serve --grpc-preshared-key rag-rebac-walkthrough`
31+
32+
and you should see something like this that indicates an instance of SpiceDB is running locally:
33+
34+
```
35+
8:28PM INF configured logging async=false format=auto log_level=inf
36+
o provider=zerolog
37+
8:28PM INF GOMEMLIMIT is updated GOMEMLIMIT=25769803776 package=git
38+
hub.com/KimMachineGun/automemlimit/memlimit
39+
8:28PM INF configured opentelemetry tracing endpoint= insecure=fals
40+
e provider=none sampleRatio=0.01 service=spicedb v=0
41+
8:28PM WRN this version of SpiceDB is out of date. See: https://git
42+
hub.com/authzed/spicedb/releases/tag/v1.39.1 latest-released-versio
43+
n=v1.39.1 this-version=v1.37.2
44+
8:28PM INF configuration ClusterDispatchCacheConfig.CacheKindForTes
45+
ting=(empty) ClusterDispatchCacheConfig.Enabled=true ClusterDispatc
46+
8:28PM INF using memory datastore engine
47+
8:28PM WRN in-memory datastore is not persistent and not feasible t
48+
8:28PM INF configured namespace cache defaultTTL=0 maxCost="32 MiB"
49+
8:28PM INF schema watch explicitly disabled
50+
8:28PM INF configured dispatch cache defaultTTL=20600 maxCost="164
51+
8:28PM INF configured dispatcher balancerconfig={"loadBalancingConfig":[{"consistent-hashring":{"replicationFactor":100,"spread":1}}]} concurrency-limit-check-permission=50 concurrency-limit-lookup-resources=50 concurrency-limit-lookup-subjects=50 concurrency-limit-reachable-resources=50
52+
8:28PM INF grpc server started serving addr=:50051 insecure=true network=tcp service=grpc workers=0
53+
8:28PM INF running server datastore=*schemacaching.definitionCachingProxy
54+
8:28PM INF http server started serving addr=:9090 insecure=true service=metrics
55+
8:28PM INF telemetry reporter scheduled endpoint=https://telemetry.authzed.com interval=1h0m0s next=5m14s
56+
```
57+
58+
#### Download the Jupyter Notebook
59+
60+
Clone the `workshops` [repository](https://github.com/authzed/workshops/) to your system and type `cd secure-rag-pipelines` to enter the working directory.
61+
62+
Start the `01-rag.ipynb` Notebook locally by typing `jupyter 01-rag.ipynb` (or `python3 -m notebook`) in your terminal.
63+
64+
## Add Fine Grained Authorization
65+
66+
Here's the Jupyter Notebook with step-by-step instructions
67+
68+
<JupyterNotebookViewer fileUrl="authzed/workshops/blob/main/secure-rag-pipelines/01-rag.ipynb" />
69+
70+
## Using DeepSeek or Google Colab
71+
72+
If you want to replace the OpenAI LLM with the DeepSeek (or any other) LLM, [check out this branch](https://github.com/authzed/workshops/tree/deepseek).
73+
It follows similar steps as the above guide, but uses the DeepSeek LLM via [OpenRouter](https://openrouter.ai/)
74+
75+
To run through this workshop on a cloud notebook, [here's a branch](https://github.com/authzed/workshops/tree/google-colab) that uses Google Colab.
76+
Note that this guide requires an instance of SpiceDB running on [AuthZed Serverless](https://app.authzed.com/) for which you can create a free account.
77+
78+

public/images/.DS_Store

6 KB
Binary file not shown.

0 commit comments

Comments
 (0)