```python
if __name__ == "__main__":
    server.run(port=8000)
```

Now run the server anywhere (local or cloud) via the command-line.

### Run locally

```bash
lightning serve api server.py
```

You can also run the server directly with Python:

```bash
python server.py
```


### Run on the cloud

Deploy the server to Lightning AI for fully managed hosting (autoscaling, security, and more):

```bash
lightning serve api server.py --cloud
```

Learn more about deployment options and cloud hosting [here](https://lightning.ai/docs/litserve/features/deploy-on-cloud).

### Test the server
Run the auto-generated test client:
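The generated client boils down to posting JSON to the server's prediction endpoint. A hand-rolled equivalent (a sketch assuming the default `/predict` endpoint on port 8000; the `input` field name is hypothetical and depends on your `decode_request`):

```python
import requests

def query(payload, url="http://127.0.0.1:8000/predict"):
    """POST one prediction request to a running LitServe server."""
    response = requests.post(url, json=payload)
    response.raise_for_status()
    return response.json()

# Hypothetical payload; the field name must match your decode_request.
payload = {"input": 4.0}
```

With the server from above running, `query(payload)` returns the decoded JSON response.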
- LitAPI lets you easily build complex AI systems with one or more models ([docs](https://lightning.ai/docs/litserve/api-reference/litapi)).
- Use the setup method for one-time tasks like connecting models, DBs, and loading data ([docs](https://lightning.ai/docs/litserve/api-reference/litapi#setup)).
- LitServer handles optimizations like batching, GPU autoscaling, and streaming ([docs](https://lightning.ai/docs/litserve/api-reference/litserver)).
- Self-host on your machines or create a fully managed deployment with Lightning ([learn more](https://lightning.ai/docs/litserve/features/deploy-on-cloud)).
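One of those optimizations, dynamic batching, can be sketched in plain Python: collect requests until the batch is full or a short timeout expires, then run a single batched predict call. This is a toy illustration of the idea, not LitServe's actual implementation:

```python
import queue
import time

def collect_batch(q, max_batch_size=8, batch_timeout=0.05):
    """Gather up to max_batch_size requests, waiting at most
    batch_timeout seconds after the first request arrives."""
    batch = [q.get()]  # block until the first request shows up
    deadline = time.monotonic() + batch_timeout
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(q.get(timeout=remaining))
        except queue.Empty:
            break
    return batch
```

The timeout trades a little latency on the first request for much higher GPU utilization, since one batched forward pass replaces many single-item calls.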

[Learn how to make this server 200x faster](https://lightning.ai/docs/litserve/home/speed-up-serving-by-200x).

Use LitServe to deploy any model or AI service (compound AI, Gen AI, classic ML, and more).

 


# Hosting options
LitServe can be hosted independently on your own machines or fully managed via Lightning Studios.

Self-hosting is ideal for hackers, students, and DIY developers, while fully managed hosting suits enterprise teams that need autoscaling, security, release management, observability, and 99.995% uptime.

 

<div align="center">
<a target="_blank" href="https://lightning.ai/docs/litserve/features/deploy-on-cloud">
<img src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/host-on-lightning.svg" alt="Host on Lightning"/>
</a>
</div>

&nbsp;

<div align='center'>

| Feature | Self Managed | Fully Managed on Lightning |
|----------------------------------|-----------------------------------|----------------------------|
| Deployment                       | ✅ Do-it-yourself deployment       | ✅ One-button cloud deploy |
| Load balancing | ❌ | ✅ |
| Autoscaling | ❌ | ✅ |
| Scale to zero | ❌ | ✅ |
| Multi-machine inference | ❌ | ✅ |
| Authentication | ❌ | ✅ |
| Own VPC | ❌ | ✅ |
| AWS, GCP | ❌ | ✅ |
| Use your own cloud commits | ❌ | ✅ |

</div>

&nbsp;


# Features
State-of-the-art features:

✅ [Batching](https://lightning.ai/docs/litserve/features/batching)
✅ [Streaming](https://lightning.ai/docs/litserve/features/streaming)
✅ [Worker autoscaling](https://lightning.ai/docs/litserve/features/autoscaling)
✅ [Deploy with Lightning AI](https://lightning.ai/docs/litserve/features/deploy-on-cloud)
✅ [Self-host on your machines](https://lightning.ai/docs/litserve/features/hosting-methods#host-on-your-own)
✅ [Host fully managed on Lightning AI](https://lightning.ai/docs/litserve/features/hosting-methods#host-on-lightning-studios)
✅ [Serve all models: (LLMs, vision, etc.)](https://lightning.ai/docs/litserve/examples)
These results are for image and text classification ML tasks.

***💡 Note on LLM serving:*** For high-performance LLM serving (like Ollama/vLLM), integrate [vLLM with LitServe](https://lightning.ai/lightning-ai/studios/deploy-a-private-llama-3-2-rag-api), use [LitGPT](https://github.com/Lightning-AI/litgpt?tab=readme-ov-file#deploy-an-llm), or build your custom vLLM-like server with LitServe. Optimizations like kv-caching, which can be done with LitServe, are needed to maximize LLM performance.
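To make the kv-caching point concrete, here is a toy single-head illustration of the idea (cache past attention keys/values so each decoding step attends over stored state instead of recomputing it for the whole sequence; this is not LitServe or vLLM code):

```python
import numpy as np

class KVCache:
    """Toy single-head KV cache for autoregressive decoding."""

    def __init__(self):
        self.keys, self.values = [], []

    def step(self, q, k, v):
        # Append only this step's key/value, then attend over the cache.
        self.keys.append(k)
        self.values.append(v)
        K = np.stack(self.keys)             # (t, d)
        V = np.stack(self.values)           # (t, d)
        scores = K @ q / np.sqrt(q.size)    # (t,)
        w = np.exp(scores - scores.max())   # softmax over cached positions
        w /= w.sum()
        return w @ V                        # (d,)
```

Without the cache, step `t` would recompute keys and values for all `t` tokens; with it, each step does O(t) attention against stored tensors, which is the core of fast LLM decoding.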

&nbsp;

# Community
LitServe is a [community project accepting contributions](https://lightning.ai/docs/litserve/community). Let's build the world's most advanced AI inference engine together.