Commit 908b186 (parent f731aba): add lambda web adapter streaming api backend ex

4 files changed, +223 −0 lines
# Serverless Streaming with Lambda Web Adapter and Bedrock

This example demonstrates how to set up a serverless streaming service using AWS Lambda, Lambda Web Adapter, and Amazon Bedrock. The service can be consumed by any frontend application through simple GET requests, with no need for websockets.

## Overview

This project showcases:

- Streaming responses from Amazon Bedrock (using the Anthropic Claude 3 Haiku model by default)
- Using FastAPI with AWS Lambda
- Implementing Lambda Web Adapter for response streaming
- Creating a Function URL that supports response streaming

The setup allows any frontend to consume the streaming service via GET requests to the Function URL.
## How It Works

1. A FastAPI application handles requests and interacts with Bedrock.
2. The application is packaged as a Docker image that includes the Lambda Web Adapter.
3. AWS SAM deploys the Lambda function with the necessary configuration.
4. A Function URL is created with response streaming enabled.
5. Frontends send GET requests to this URL to receive streamed responses.
## Key Components

### Dockerfile

```dockerfile
FROM public.ecr.aws/docker/library/python:3.12.0-slim-bullseye
COPY --from=public.ecr.aws/awsguru/aws-lambda-adapter:0.8.4 /lambda-adapter /opt/extensions/lambda-adapter

WORKDIR /app
ADD . .
RUN pip install -r requirements.txt

CMD ["python", "main.py"]
```
Notice that installing Lambda Web Adapter only requires adding the second line of the Dockerfile:

```dockerfile
COPY --from=public.ecr.aws/awsguru/aws-lambda-adapter:0.8.4 /lambda-adapter /opt/extensions/lambda-adapter
```
In the SAM template, the environment variable `AWS_LWA_INVOKE_MODE: RESPONSE_STREAM` configures Lambda Web Adapter in response streaming mode, and the Function URL is created with `InvokeMode: RESPONSE_STREAM`.

```yaml
  FastAPIFunction:
    Type: AWS::Serverless::Function
    Properties:
      PackageType: Image
      MemorySize: 512
      Environment:
        Variables:
          AWS_LWA_INVOKE_MODE: RESPONSE_STREAM
      FunctionUrlConfig:
        AuthType: NONE
        InvokeMode: RESPONSE_STREAM
      Policies:
      - Statement:
        - Sid: BedrockInvokePolicy
          Effect: Allow
          Action:
          - bedrock:InvokeModelWithResponseStream
          Resource: '*'
```
## Build and deploy

Run the following commands to build and deploy this example.

```bash
sam build --use-container
sam deploy --guided
```
## Test the example

After the deployment completes, use the `FastAPIFunctionUrl` shown in the output messages to send GET requests with your query to the `/api/stream` route.

```python
import json

import boto3
import requests
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

session = boto3.Session()
credentials = session.get_credentials()
region = 'us-east-1'

func_url = '<FastAPIFunctionUrl from the deployment output>'
query = 'What is Amazon Bedrock?'
payload = {"query": query}

request = AWSRequest(
    method='GET',
    url=f'{func_url}/api/stream',
    data=json.dumps(payload),
    headers={'Content-Type': 'application/json'}
)

# Sign the request with SigV4 (only needed if the Function URL uses IAM auth)
SigV4Auth(credentials, "lambda", region).add_auth(request)

response = requests.get(
    request.url,
    data=request.data,
    headers=dict(request.headers),
    stream=True
)

# Print the streamed response chunk by chunk as it arrives
for chunk in response.iter_content(chunk_size=64):
    print(chunk.decode('utf-8'), end='', flush=True)
```
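Without a deployed endpoint, the consumption pattern above can still be illustrated with a stand-in generator. This is a sketch: `fake_stream` is hypothetical and plays the role of `response.iter_content`.

```python
def fake_stream():
    # Hypothetical stand-in for response.iter_content(chunk_size=64):
    # yields the response body as a sequence of byte chunks.
    for piece in [b"Ser", b"verless ", b"streaming", b"\n"]:
        yield piece

received = []
for chunk in fake_stream():
    text = chunk.decode("utf-8")
    received.append(text)
    # flush=True makes each chunk appear as soon as it arrives
    print(text, end="", flush=True)

assert "".join(received) == "Serverless streaming\n"
```

The same loop works unchanged against the real Function URL, because the endpoint streams plain text rather than framed websocket messages.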
### main.py

```python
import asyncio
import json
import os

import boto3
import uvicorn
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

BEDROCK_MODEL = os.environ.get(
    "BEDROCK_MODEL", "anthropic.claude-3-haiku-20240307-v1:0"
)
SYSTEM = os.environ.get("SYSTEM", "You are a helpful assistant.")

app = FastAPI()
bedrock = boto3.Session().client("bedrock-runtime")


# Define the request model
class QueryRequest(BaseModel):
    query: str


@app.get("/api/stream")
async def api_stream(request: QueryRequest):
    if not request.query:
        return None

    return StreamingResponse(
        bedrock_stream(request.query),
        media_type="text/event-stream",
        headers={
            "Cache-Control": "no-cache",
            "Connection": "keep-alive",
        },
    )


async def bedrock_stream(query: str):
    instruction = f"""
    You are a helpful assistant. Please provide an answer to the user's query
    <query>{query}</query>.
    """
    body = json.dumps(
        {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "system": SYSTEM,
            "temperature": 0.1,
            "top_k": 10,
            "messages": [
                {
                    "role": "user",
                    "content": instruction,
                }
            ],
        }
    )

    response = bedrock.invoke_model_with_response_stream(
        modelId=BEDROCK_MODEL, body=body
    )

    stream = response.get("body")
    if stream:
        for event in stream:
            chunk = event.get("chunk")
            if chunk:
                message = json.loads(chunk.get("bytes").decode())
                if message["type"] == "content_block_delta":
                    yield message["delta"]["text"] or ""
                    await asyncio.sleep(0.01)
                elif message["type"] == "message_stop":
                    yield "\n"
                    await asyncio.sleep(0.01)


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=int(os.environ.get("PORT", "8080")))
```
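The event-parsing loop in `bedrock_stream` can be checked locally without calling Bedrock by feeding it simulated events. This is a sketch: `extract_text` is a hypothetical helper that mirrors the loop above, and the event dictionaries imitate the shape Bedrock's event stream returns for Claude models.

```python
import json

def extract_text(events):
    # Mirror the event handling in bedrock_stream: decode each chunk
    # and keep only the text deltas, ending with a newline on message_stop.
    parts = []
    for event in events:
        chunk = event.get("chunk")
        if not chunk:
            continue
        message = json.loads(chunk["bytes"].decode())
        if message["type"] == "content_block_delta":
            parts.append(message["delta"]["text"] or "")
        elif message["type"] == "message_stop":
            parts.append("\n")
    return "".join(parts)

# Simulated events imitating Bedrock's response stream for Claude models
events = [
    {"chunk": {"bytes": json.dumps(
        {"type": "content_block_delta", "delta": {"text": "Hello"}}).encode()}},
    {"chunk": {"bytes": json.dumps(
        {"type": "content_block_delta", "delta": {"text": ", world"}}).encode()}},
    {"chunk": {"bytes": json.dumps({"type": "message_stop"}).encode()}},
]
print(repr(extract_text(events)))  # 'Hello, world\n'
```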
### requirements.txt

```
annotated-types==0.5.0
anyio==3.7.1
boto3==1.28.61
botocore==1.31.61
click==8.1.7
exceptiongroup==1.1.3
fastapi==0.109.2
h11==0.14.0
idna==3.7
jmespath==1.0.1
pydantic==2.4.2
pydantic_core==2.10.1
python-dateutil==2.8.2
s3transfer==0.7.0
six==1.16.0
sniffio==1.3.0
starlette==0.36.3
typing_extensions==4.8.0
urllib3==1.26.19
uvicorn==0.23.2
```
