Commit 28dea46: Update tutorials

1 parent fde7e52 commit 28dea46

File tree

4 files changed: +183 −199 lines changed

docs/docs/tutorials/deployment/deploy_dspy_model.md

Lines changed: 0 additions & 197 deletions
This file was deleted.
Lines changed: 175 additions & 0 deletions
@@ -0,0 +1,175 @@
# Tutorial: Deploying your DSPy program

This guide demonstrates two potential ways to deploy your DSPy program in production: FastAPI for lightweight deployments, and MLflow for more production-grade deployments with program versioning and management.

Below, we'll assume you have the following simple DSPy program that you want to deploy. You can replace this with something more sophisticated.

```python
import dspy

dspy.settings.configure(lm=dspy.LM("openai/gpt-4o-mini"))
dspy_program = dspy.ChainOfThought("question -> answer")
```

## Deploying with FastAPI

FastAPI offers a straightforward way to serve your DSPy program as a REST API. This is ideal when you have direct access to your program code and need a lightweight deployment solution.

```bash
> pip install fastapi uvicorn
> export OPENAI_API_KEY="your-openai-api-key"
```

Let's create a FastAPI application to serve your `dspy_program` defined above.

```python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

import dspy

app = FastAPI(
    title="DSPy Program API",
    description="A simple API serving a DSPy Chain of Thought program",
    version="1.0.0"
)

# Define request model for better documentation and validation
class Question(BaseModel):
    text: str

# Configure your language model and 'asyncify' your DSPy program.
lm = dspy.LM("openai/gpt-4o-mini")
dspy.settings.configure(lm=lm, async_max_workers=4)  # default is 8
dspy_program = dspy.ChainOfThought("question -> answer")
dspy_program = dspy.asyncify(dspy_program)

@app.post("/predict")
async def predict(question: Question):
    try:
        result = await dspy_program(question=question.text)
        return {
            "status": "success",
            "data": result.toDict()
        }
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
```

In the code above, we call `dspy.asyncify` to convert the DSPy program to run in async mode for high-throughput FastAPI deployments. Currently, this runs the DSPy program in a separate thread and awaits its result. By default, the limit of spawned threads is 8. Think of this like a worker pool: if you have 8 in-flight programs and call it once more, the 9th call will wait until one of the 8 returns. You can configure this capacity with the `async_max_workers` setting.
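This worker-pool behavior can be illustrated without any LM calls. The sketch below simulates the capacity limit with an `asyncio.Semaphore`; it is an analogy for how the limit behaves, not DSPy's actual implementation:

```python
import asyncio

async def run_pool(n_calls: int, max_workers: int) -> int:
    """Simulate n_calls concurrent requests through a pool of max_workers slots."""
    sem = asyncio.Semaphore(max_workers)
    in_flight = 0
    peak = 0

    async def call(i: int) -> None:
        nonlocal in_flight, peak
        async with sem:  # extra calls wait here until a slot frees up
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for the LM call
            in_flight -= 1

    await asyncio.gather(*(call(i) for i in range(n_calls)))
    return peak

peak = asyncio.run(run_pool(n_calls=10, max_workers=4))
print(peak)  # concurrency never exceeds 4
```

With 10 calls and 4 workers, the peak concurrency observed is exactly 4: the 5th through 10th calls block on the semaphore until earlier calls finish.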
Write your code to a file, e.g., `fastapi_dspy.py`. Then you can serve the app with:

```bash
> uvicorn fastapi_dspy:app --reload
```

This will start a local server at `http://127.0.0.1:8000/`. You can test it with the Python code below:

```python
import requests

response = requests.post(
    "http://127.0.0.1:8000/predict",
    json={"text": "What is the capital of France?"}
)
print(response.json())
```

You should see a response like the one below:

```json
{"status": "success", "data": {"reasoning": "The capital of France is a well-known fact, commonly taught in geography classes and referenced in various contexts. Paris is recognized globally as the capital city, serving as the political, cultural, and economic center of the country.", "answer": "The capital of France is Paris."}}
```

89+
## Deploying with MLflow

We recommend deploying with MLflow if you are looking to package your DSPy program and deploy it in an isolated environment. MLflow is a popular platform for managing machine learning workflows, including versioning, tracking, and deployment.

```bash
> pip install "mlflow>=2.18.0"
```

Let's spin up the MLflow tracking server, where we will store our DSPy program. The command below will start a local server at `http://127.0.0.1:5000/`.

```bash
> mlflow ui
```

Then we can define the DSPy program and log it to the MLflow server. In MLflow, "logging" a model means storing the program information, along with its environment requirements, on the MLflow server. See the code below:

```python
import dspy
import mlflow

mlflow.set_tracking_uri("http://127.0.0.1:5000/")
mlflow.set_experiment("deploy_dspy_program")

lm = dspy.LM("openai/gpt-4o-mini")
dspy.settings.configure(lm=lm)
dspy_program = dspy.ChainOfThought("question -> answer")

with mlflow.start_run():
    mlflow.dspy.log_model(
        dspy_program,
        "dspy_program",
        input_example={"messages": [{"role": "user", "content": "What is LLM agent?"}]},
        task="llm/v1/chat",
    )
```
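Before serving, you can sanity-check the logged artifact by loading it back. This is a sketch, not from the source: the `model_uri` helper is illustrative, and `mlflow.dspy.load_model` is assumed available in `mlflow>=2.18`. Note that the artifact path (`"dspy_program"` above) is the last segment of the `runs:/` URI.

```python
# Hypothetical helper: build the runs:/ URI that MLflow's loading and serving
# commands expect, from a run id and the artifact path used in log_model.
def model_uri(run_id: str, artifact_path: str = "dspy_program") -> str:
    """Build the runs:/ URI pointing at the logged artifact."""
    return f"runs:/{run_id}/{artifact_path}"

# Uncomment to load and invoke the program locally (requires a running
# tracking server and a real run id):
# import mlflow
# mlflow.set_tracking_uri("http://127.0.0.1:5000/")
# loaded = mlflow.dspy.load_model(model_uri("<your-run-id>"))
# print(loaded(question="What is LLM agent?"))
```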

We recommend setting `task="llm/v1/chat"` so that the deployed program automatically takes input and generates output in the same format as the OpenAI chat API, which is a common interface for LM applications. Write the code above into a file, e.g. `mlflow_dspy.py`, and run it.

After you have logged the program, you can view the saved information in the MLflow UI. Open `http://127.0.0.1:5000/` and select the `deploy_dspy_program` experiment, then select the run you just created. Under the `Artifacts` tab, you should see the logged program information, similar to the following screenshot:

![MLflow UI](./dspy_mlflow_ui.png)

Grab your run id from the UI (or from the console output when you execute `mlflow_dspy.py`); now you can deploy the logged program with the following command:

```bash
> mlflow models serve -m runs:/{run_id}/dspy_program -p 6000
```

After the program is deployed, you can test it with the following command:

```bash
> curl http://127.0.0.1:6000/invocations -H "Content-Type:application/json" --data '{"messages": [{"content": "what is 2 + 2?", "role": "user"}]}'
```

You should see a response like the one below:

```json
{"choices": [{"index": 0, "message": {"role": "assistant", "content": "{\"reasoning\": \"The question asks for the sum of 2 and 2. To find the answer, we simply add the two numbers together: 2 + 2 = 4.\", \"answer\": \"4\"}"}, "finish_reason": "stop"}]}
```
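The same request can be made from Python. The sketch below assumes the server started by `mlflow models serve` is running on port 6000; the two helper functions are illustrative, not part of MLflow. Note that for this program the assistant's `content` field is itself a JSON string holding the `reasoning` and `answer` fields, so it needs a second parse.

```python
import json

def build_chat_payload(content: str) -> dict:
    """Build an OpenAI-chat-style request body for the llm/v1/chat task."""
    return {"messages": [{"role": "user", "content": content}]}

def parse_chat_response(body: dict) -> dict:
    """Extract the program's output; the assistant content is a JSON string."""
    content = body["choices"][0]["message"]["content"]
    return json.loads(content)

# With the deployed server running on port 6000:
# import requests
# body = requests.post(
#     "http://127.0.0.1:6000/invocations",
#     json=build_chat_payload("what is 2 + 2?"),
# ).json()
# print(parse_chat_response(body)["answer"])
```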

For a complete guide on how to deploy a DSPy program with MLflow, and how to customize the deployment, please refer to the [MLflow documentation](https://mlflow.org/docs/latest/llms/dspy/index.html).

### Best Practices for MLflow Deployment

1. **Environment Management**: Always specify your Python dependencies in a `conda.yaml` or `requirements.txt` file.
2. **Versioning**: Use meaningful tags and descriptions for your model versions.
3. **Input Validation**: Define clear input schemas and examples.
4. **Monitoring**: Set up proper logging and monitoring for production deployments.
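For the first point, a minimal `conda.yaml` sketch might look like the following; the package names and version pins here are illustrative, not taken from the source:

```yaml
name: dspy-serving
channels:
  - conda-forge
dependencies:
  - python=3.10
  - pip
  - pip:
      - mlflow>=2.18.0
      - dspy
```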

For production deployments, consider using MLflow with containerization:

```bash
> mlflow models build-docker -m "runs:/{run_id}/dspy_program" -n "dspy-program"
> docker run -p 6000:8080 dspy-program
```

For a complete guide on production deployment options and best practices, refer to the [MLflow documentation](https://mlflow.org/docs/latest/llms/dspy/index.html).

docs/docs/tutorials/index.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
You can browse existing tutorials in the navigation bar to the left.

We are working on upgrading more tutorials and other examples to [DSPy 2.5](https://github.com/stanfordnlp/dspy/blob/main/examples/migration.ipynb) from earlier DSPy versions.

docs/mkdocs.yml

Lines changed: 5 additions & 2 deletions
```diff
@@ -43,7 +43,9 @@ nav:
 - WeaviateRM: deep-dive/retrieval_models_clients/WeaviateRM.md
 - YouRM: deep-dive/retrieval_models_clients/YouRM.md
 - Tutorials:
-  - Simple RAG: tutorials/rag/index.ipynb
+  - Tutorials Overview: tutorials/index.md
+  - Retrieval-Augmented Generation: tutorials/rag/index.ipynb
+  - Deployment: tutorials/deployment/index.md
 - Community:
 - Community Resources: community/community-resources.md
 - Use Cases: community/use-cases.md
```
```diff
@@ -101,7 +103,8 @@ plugins:
 - search
 - mkdocstrings
 - blog
-- mkdocs-jupyter
+- mkdocs-jupyter:
+    ignore_h1_titles: True
 - redirects:
     redirect_maps:
       # Redirect /intro/ to the main page
```
