tags:
- opensource
- LLM
---

## Using structured outputs in vLLM
Generating predictable and reliable outputs from large language models (LLMs) can be challenging, especially when those outputs need to integrate seamlessly with downstream systems. Structured outputs solve this problem by enforcing specific formats, such as JSON, regex patterns, or even grammars. vLLM has supported this for some time, but there was no documentation on how to use it, which is why I decided to contribute the Structured Outputs documentation page (https://docs.vllm.ai/en/latest/usage/structured_outputs.html).

### Why Structured Outputs?

LLMs are incredibly powerful, but their outputs can be inconsistent when a specific format is required. Structured outputs address this issue by restricting the model’s generated text so that it adheres to predefined rules or formats.
Imagine we have an external system which receives a JSON with all the details.
How do these tools work? The idea is to filter the list of possible next tokens at each generation step, so the model is always forced to produce a token that is valid for the desired output format.
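
To make that concrete, here is a minimal sketch of the token-filtering idea; it is not vLLM’s actual implementation, and the toy vocabulary and allowed-token set are invented for illustration:

```python
import math

def mask_logits(logits: list[float], allowed_token_ids: set[int]) -> list[float]:
    """Set every disallowed token's logit to -inf so it can never be sampled."""
    return [
        logit if token_id in allowed_token_ids else -math.inf
        for token_id, logit in enumerate(logits)
    ]

# Toy example: suppose only token ids 2 and 5 would keep the output valid
# (say, the tokens for '{' and '"' while generating JSON).
logits = [0.3, 1.2, 0.9, -0.5, 2.1, 0.4]
print(mask_logits(logits, allowed_token_ids={2, 5}))
# [-inf, -inf, 0.9, -inf, -inf, 0.4]
```

In practice, the allowed set is recomputed at every step by a guided-decoding backend that tracks how much of the JSON schema, regex, or grammar has been satisfied so far.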

### What is vLLM?

vLLM is a state-of-the-art, open-source inference and serving engine for LLMs. It’s built for performance and simplicity.
Its optimizations make vLLM one of the fastest and most versatile engines for production environments.
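
To give a feel for the API, here is a minimal offline-inference sketch using vLLM’s Python interface; the model name is a placeholder assumption, not something prescribed by this post:

```python
from vllm import LLM, SamplingParams

# Load a model (the name is a placeholder; any HF model vLLM supports works).
llm = LLM(model="Qwen/Qwen2.5-1.5B-Instruct")
params = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(["What are structured outputs?"], params)
print(outputs[0].outputs[0].text)
```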

### Structured outputs on vLLM

vLLM extends the OpenAI API with additional parameters to enable structured outputs. These include:

- `guided_choice`: the output is restricted to exactly one of a set of choices.
- `guided_regex`: the output must match a regular expression.
- `guided_json`: the output must follow a given JSON schema.
- `guided_grammar`: the output must follow a context-free grammar.
Here’s how each works, along with example outputs:

#### **1. Guided Choice**

The simplest form of structured output, ensuring the response is exactly one of a set of predefined options.
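
A sketch of how this looks against a vLLM OpenAI-compatible server; the base URL, API key, and model name below are placeholder assumptions:

```python
from openai import OpenAI

# Point the standard OpenAI client at a locally running vLLM server
# (base URL, API key, and model name are placeholder assumptions).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="-")

completion = client.chat.completions.create(
    model="Qwen/Qwen2.5-3B-Instruct",
    messages=[
        {"role": "user", "content": "Classify this sentiment: vLLM is wonderful!"}
    ],
    # vLLM-specific extension: the reply must be one of these choices.
    extra_body={"guided_choice": ["positive", "negative"]},
)

print(completion.choices[0].message.content)  # e.g. "positive"
```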