```python
if __name__ == "__main__":
    server.run(port=8000)
```
Now run the server anywhere (local or cloud) via the command-line.
### Run locally
```bash
lightning serve api server.py
```
You can also run the server directly in Python:
```bash
python server.py
```
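For reference, here is a hedged sketch of a hand-written client for this server, using only the Python standard library. It assumes the server above is running on `localhost:8000` and exposes a `/predict` route that accepts a JSON body shaped like `{"input": ...}`; adjust the URL and payload to match your own `server.py`.

```python
# Illustrative client sketch (assumed endpoint: POST /predict on localhost:8000).
import json
import urllib.request


def build_request(payload, url="http://127.0.0.1:8000/predict"):
    """Build a JSON POST request for the (assumed) LitServe endpoint."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )


req = build_request({"input": 4.0})

# Only attempt the call if a server is actually listening; otherwise explain.
try:
    with urllib.request.urlopen(req, timeout=2) as resp:
        print(json.loads(resp.read()))
except OSError:
    print("server not running; start it with `python server.py` first")
```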
### Run on the cloud
Deploy the server to Lightning AI for fully managed hosting (autoscaling, security, etc.):
```bash
lightning serve api server.py --cloud
```
Learn more about deployment options and cloud hosting [here](https://lightning.ai/docs/litserve/features/deploy-on-cloud).
### Test the server
Run the auto-generated test client:

```bash
python client.py
```
- LitAPI lets you easily build complex AI systems with one or more models ([docs](https://lightning.ai/docs/litserve/api-reference/litapi)).
- Use the setup method for one-time tasks like connecting models, DBs, and loading data ([docs](https://lightning.ai/docs/litserve/api-reference/litapi#setup)).
- Self host on your machines or create a fully managed deployment with Lightning ([learn more](https://lightning.ai/docs/litserve/features/deploy-on-cloud)).
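The `setup()`/`predict()` split described above can be sketched with plain-Python stand-ins (no `litserve` import; the class and names here are illustrative only, not the real `LitAPI` interface):

```python
# Schematic sketch: one-time setup vs. per-request prediction (illustrative).
class FakeAPI:
    def setup(self, device):
        # One-time work: load weights, open DB connections, warm caches.
        self.model = lambda x: x * 2
        self.ready = True

    def predict(self, x):
        # Per-request work: runs on every call, so keep it lean.
        return self.model(x)


api = FakeAPI()
api.setup(device="cpu")   # called once at server startup
print(api.predict(21))    # called once per request; prints 42
```

The point of the split is that expensive initialization happens exactly once, while the per-request path stays as small as possible.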
[Learn how to make this server 200x faster](https://lightning.ai/docs/litserve/home/speed-up-serving-by-200x).
Use LitServe to deploy any model or AI service (Compound AI, Gen AI, classic ML, ...).

# Hosting options
LitServe can be hosted independently on your own machines or fully managed via Lightning Studios.
Self-hosting is ideal for hackers, students, and DIY developers, while fully managed hosting suits enterprise teams that need autoscaling, security, release management, observability, and 99.995% uptime.

✅ [Deploy with Lightning AI](https://lightning.ai/docs/litserve/features/deploy-on-cloud)
✅ [Self-host on your machines](https://lightning.ai/docs/litserve/features/hosting-methods#host-on-your-own)
✅ [Host fully managed on Lightning AI](https://lightning.ai/docs/litserve/features/hosting-methods#host-on-lightning-studios)
✅ [Serve all models: (LLMs, vision, etc.)](https://lightning.ai/docs/litserve/examples)
These results are for image and text classification ML tasks. The performance relationships hold for other ML tasks (embedding, LLM serving, audio, segmentation, object detection, summarization, etc.).

***💡 Note on LLM serving:*** For high-performance LLM serving (like Ollama/vLLM), integrate [vLLM with LitServe](https://lightning.ai/lightning-ai/studios/deploy-a-private-llama-3-2-rag-api), use [LitGPT](https://github.com/Lightning-AI/litgpt?tab=readme-ov-file#deploy-an-llm), or build your custom vLLM-like server with LitServe. Optimizations like kv-caching, which can be done with LitServe, are needed to maximize LLM performance.
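To make the kv-caching remark concrete, here is a toy, framework-free sketch of the idea (illustrative only; not LitServe, vLLM, or real attention code): key/value results computed for earlier tokens are cached, so each decode step only does work for the newest token instead of recomputing the whole prefix.

```python
# Toy illustration of kv-caching: total work per generated sequence.

def total_work_no_cache(tokens):
    """Without caching, step n recomputes keys/values for all n prefix tokens."""
    work = 0
    for step in range(1, len(tokens) + 1):
        work += step  # step n touches n tokens -> quadratic total
    return work


def total_work_with_cache(tokens):
    """With a kv-cache, each step computes and appends one new (key, value) pair."""
    cache = []
    for tok in tokens:
        cache.append((f"k_{tok}", f"v_{tok}"))  # O(1) new work per step
    return len(cache)  # linear total


tokens = list(range(8))
print(total_work_no_cache(tokens))    # 1+2+...+8 = 36 units of work
print(total_work_with_cache(tokens))  # 8 units of work
```

The gap widens with sequence length, which is why kv-caching (and similar optimizations) is essential for production LLM serving.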
# Community
LitServe is a [community project accepting contributions](https://lightning.ai/docs/litserve/community). Let's build the world's most advanced AI inference engine together.