```python
if __name__ == "__main__":
    server.run(port=8000)
```

Now run the server anywhere (local or cloud) via the command-line.

### Run locally

```bash
lightning serve api server.py
```

You can also run the server directly with Python:

```bash
python server.py
```


### Run on the cloud

Deploy the server to Lightning AI for fully managed hosting (autoscaling, security, and more):

```bash
lightning serve api server.py --cloud
```

Learn more about deployment options and cloud hosting [here](https://lightning.ai/docs/litserve/features/deploy-on-cloud).

### Test the server
Run the auto-generated test client:
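The generated client boils down to posting JSON to the server's prediction endpoint. A hand-rolled equivalent (a sketch assuming the default `/predict` endpoint on port 8000; the `input` field name is hypothetical and depends on your `decode_request`):

```python
import requests

def query(payload, url="http://127.0.0.1:8000/predict"):
    """POST one prediction request to a running LitServe server."""
    response = requests.post(url, json=payload)
    response.raise_for_status()
    return response.json()

# Hypothetical payload; the field name must match your decode_request.
payload = {"input": 4.0}
```

With the server from above running, `query(payload)` returns the decoded JSON response.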
- LitAPI lets you easily build complex AI systems with one or more models ([docs](https://lightning.ai/docs/litserve/api-reference/litapi)).
- Use the setup method for one-time tasks like connecting models, DBs, and loading data ([docs](https://lightning.ai/docs/litserve/api-reference/litapi#setup)).
- LitServer handles optimizations like batching, GPU autoscaling, and streaming ([docs](https://lightning.ai/docs/litserve/api-reference/litserver)).
- Self-host on your machines or create a fully managed deployment with Lightning ([learn more](https://lightning.ai/docs/litserve/features/deploy-on-cloud)).
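One of those optimizations, dynamic batching, can be sketched in plain Python: collect requests until the batch is full or a short timeout expires, then run a single batched predict call. This is a toy illustration of the idea, not LitServe's actual implementation:

```python
import queue
import time

def collect_batch(q, max_batch_size=8, batch_timeout=0.05):
    """Gather up to max_batch_size requests, waiting at most
    batch_timeout seconds after the first request arrives."""
    batch = [q.get()]  # block until the first request shows up
    deadline = time.monotonic() + batch_timeout
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(q.get(timeout=remaining))
        except queue.Empty:
            break
    return batch
```

The timeout trades a little latency on the first request for much higher GPU utilization, since one batched forward pass replaces many single-item calls.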

[Learn how to make this server 200x faster](https://lightning.ai/docs/litserve/home/speed-up-serving-by-200x).

Use LitServe to deploy any model or AI service (compound AI, Gen AI, classic ML, and more).

 


# Hosting options
LitServe can be hosted independently on your own machines or fully managed via Lightning Studios.

Self-hosting is ideal for hackers, students, and DIY developers, while fully managed hosting suits enterprise teams that need autoscaling, security, release management, observability, and 99.995% uptime.

 

<div align="center">
<a target="_blank" href="https://lightning.ai/docs/litserve/features/deploy-on-cloud">
<img src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/host-on-lightning.svg" alt="Host on Lightning"/>
</a>
</div>

&nbsp;

<div align='center'>

| Feature | Self Managed | Fully Managed on Lightning |
|----------------------------------|-----------------------------------|----------------------------|
| Deployment                       | ✅ Do-it-yourself deployment       | ✅ One-button cloud deploy |
| Load balancing | ❌ | ✅ |
| Autoscaling | ❌ | ✅ |
| Scale to zero | ❌ | ✅ |
| Multi-machine inference | ❌ | ✅ |
| Authentication | ❌ | ✅ |
| Own VPC | ❌ | ✅ |
| AWS, GCP | ❌ | ✅ |
| Use your own cloud commits | ❌ | ✅ |

</div>

&nbsp;


# Features
State-of-the-art features:

✅ [Batching](https://lightning.ai/docs/litserve/features/batching)
✅ [Streaming](https://lightning.ai/docs/litserve/features/streaming)
✅ [Worker autoscaling](https://lightning.ai/docs/litserve/features/autoscaling)
✅ [Deploy with Lightning AI](https://lightning.ai/docs/litserve/features/deploy-on-cloud)
✅ [Self-host on your machines](https://lightning.ai/docs/litserve/features/hosting-methods#host-on-your-own)
✅ [Host fully managed on Lightning AI](https://lightning.ai/docs/litserve/features/hosting-methods#host-on-lightning-studios)
✅ [Serve all models: (LLMs, vision, etc.)](https://lightning.ai/docs/litserve/examples)
These results are for image and text classification ML tasks.

***💡 Note on LLM serving:*** For high-performance LLM serving (like Ollama/vLLM), integrate [vLLM with LitServe](https://lightning.ai/lightning-ai/studios/deploy-a-private-llama-3-2-rag-api), use [LitGPT](https://github.com/Lightning-AI/litgpt?tab=readme-ov-file#deploy-an-llm), or build your custom vLLM-like server with LitServe. Optimizations like kv-caching, which can be done with LitServe, are needed to maximize LLM performance.
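To make the kv-caching point concrete, here is a toy single-head illustration of the idea (cache past attention keys/values so each decoding step attends over stored state instead of recomputing it for the whole sequence; this is not LitServe or vLLM code):

```python
import numpy as np

class KVCache:
    """Toy single-head KV cache for autoregressive decoding."""

    def __init__(self):
        self.keys, self.values = [], []

    def step(self, q, k, v):
        # Append only this step's key/value, then attend over the cache.
        self.keys.append(k)
        self.values.append(v)
        K = np.stack(self.keys)             # (t, d)
        V = np.stack(self.values)           # (t, d)
        scores = K @ q / np.sqrt(q.size)    # (t,)
        w = np.exp(scores - scores.max())   # softmax over cached positions
        w /= w.sum()
        return w @ V                        # (d,)
```

Without the cache, step `t` would recompute keys and values for all `t` tokens; with it, each step does O(t) attention against stored tensors, which is the core of fast LLM decoding.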

&nbsp;

# Community
LitServe is a [community project accepting contributions](https://lightning.ai/docs/litserve/community). Let's build the world's most advanced AI inference engine together.