Commit dd6a3a0

[Doc] Convert docs to use colon fences (vllm-project#12471)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
1 parent a7e3eba commit dd6a3a0


68 files changed (+2091, -2080 lines)

docs/requirements-docs.txt

Lines changed: 2 additions & 2 deletions
@@ -1,10 +1,10 @@
 sphinx==6.2.1
+sphinx-argparse==0.4.0
 sphinx-book-theme==1.0.1
 sphinx-copybutton==0.5.2
-myst-parser==3.0.1
-sphinx-argparse==0.4.0
 sphinx-design==0.6.1
 sphinx-togglebutton==0.3.2
+myst-parser==3.0.1
 msgspec
 cloudpickle

docs/source/api/engine/index.md

Lines changed: 2 additions & 2 deletions
@@ -8,10 +8,10 @@
 .. currentmodule:: vllm.engine
 ```
 
-```{toctree}
+:::{toctree}
 :caption: Engines
 :maxdepth: 2
 
 llm_engine
 async_llm_engine
-```
+:::
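The change repeated across these files is mechanical: each backtick directive fence (three backticks followed by `{toctree}`, `{note}`, and so on, closed by three backticks) becomes a colon fence opened by `:::{...}` and closed by `:::`. A bulk edit like this can be scripted; the helper below is a hypothetical sketch, not the tool (if any) used for this commit, and it only handles simple, non-nested directive blocks.

```python
import re

# Matches a simple backtick directive fence: ```{name} ... ```
# Non-greedy body so each block is converted independently.
FENCE = re.compile(r"```\{([\w-]+)\}(.*?)\n```", re.DOTALL)

def to_colon_fences(text: str) -> str:
    # Rewrite ```{note}\nbody\n``` as :::{note}\nbody\n:::
    return FENCE.sub(lambda m: f":::{{{m.group(1)}}}{m.group(2)}\n:::", text)

print(to_colon_fences("```{note}\nBe careful!\n```"))
```

Running this prints the colon-fenced form of the note block. Nested fences (like the tab-sets later in this commit) need depth-aware handling that this sketch deliberately omits.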

docs/source/api/model/index.md

Lines changed: 2 additions & 2 deletions
@@ -2,10 +2,10 @@
 
 ## Submodules
 
-```{toctree}
+:::{toctree}
 :maxdepth: 1
 
 interfaces_base
 interfaces
 adapters
-```
+:::

docs/source/api/multimodal/index.md

Lines changed: 2 additions & 2 deletions
@@ -17,12 +17,12 @@ Looking to add your own multi-modal model? Please follow the instructions listed
 
 ## Submodules
 
-```{toctree}
+:::{toctree}
 :maxdepth: 1
 
 inputs
 parse
 processing
 profiling
 registry
-```
+:::
Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
 # Offline Inference
 
-```{toctree}
+:::{toctree}
 :caption: Contents
 :maxdepth: 1
 
 llm
 llm_inputs
-```
+:::

docs/source/contributing/dockerfile/dockerfile.md

Lines changed: 2 additions & 2 deletions
@@ -17,11 +17,11 @@ The edges of the build graph represent:
 
 - `RUN --mount=(.\*)from=...` dependencies (with a dotted line and an empty diamond arrow head)
 
-> ```{figure} /assets/contributing/dockerfile-stages-dependency.png
+> :::{figure} /assets/contributing/dockerfile-stages-dependency.png
 > :align: center
 > :alt: query
 > :width: 100%
-> ```
+> :::
 >
 > Made using: <https://github.com/patrickhoefler/dockerfilegraph>
 >

docs/source/contributing/model/basic.md

Lines changed: 4 additions & 4 deletions
@@ -10,9 +10,9 @@ First, clone the PyTorch model code from the source repository.
 For instance, vLLM's [OPT model](gh-file:vllm/model_executor/models/opt.py) was adapted from
 HuggingFace's [modeling_opt.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/opt/modeling_opt.py) file.
 
-```{warning}
+:::{warning}
 Make sure to review and adhere to the original code's copyright and licensing terms!
-```
+:::
 
 ## 2. Make your code compatible with vLLM
 
@@ -80,10 +80,10 @@ def forward(
     ...
 ```
 
-```{note}
+:::{note}
 Currently, vLLM supports the basic multi-head attention mechanism and its variant with rotary positional embeddings.
 If your model employs a different attention mechanism, you will need to implement a new attention layer in vLLM.
-```
+:::
 
 For reference, check out our [Llama implementation](gh-file:vllm/model_executor/models/llama.py). vLLM already supports a large number of models. It is recommended to find a model similar to yours and adapt it to your model's architecture. Check out <gh-dir:vllm/model_executor/models> for more examples.
 

docs/source/contributing/model/index.md

Lines changed: 6 additions & 6 deletions
@@ -4,24 +4,24 @@
 
 This section provides more information on how to integrate a [PyTorch](https://pytorch.org/) model into vLLM.
 
-```{toctree}
+:::{toctree}
 :caption: Contents
 :maxdepth: 1
 
 basic
 registration
 tests
 multimodal
-```
+:::
 
-```{note}
+:::{note}
 The complexity of adding a new model depends heavily on the model's architecture.
 The process is considerably straightforward if the model shares a similar architecture with an existing model in vLLM.
 However, for models that include new operators (e.g., a new attention mechanism), the process can be a bit more complex.
-```
+:::
 
-```{tip}
+:::{tip}
 If you are encountering issues while integrating your model into vLLM, feel free to open a [GitHub issue](https://github.com/vllm-project/vllm/issues)
 or ask on our [developer slack](https://slack.vllm.ai).
 We will be happy to help you out!
-```
+:::

docs/source/contributing/model/multimodal.md

Lines changed: 16 additions & 16 deletions
@@ -48,9 +48,9 @@ Further update the model as follows:
     return vision_embeddings
 ```
 
-```{important}
+:::{important}
 The returned `multimodal_embeddings` must be either a **3D {class}`torch.Tensor`** of shape `(num_items, feature_size, hidden_size)`, or a **list / tuple of 2D {class}`torch.Tensor`'s** of shape `(feature_size, hidden_size)`, so that `multimodal_embeddings[i]` retrieves the embeddings generated from the `i`-th multimodal data item (e.g, image) of the request.
-```
+:::
 
 - Implement {meth}`~vllm.model_executor.models.interfaces.SupportsMultiModal.get_input_embeddings` to merge `multimodal_embeddings` with text embeddings from the `input_ids`. If input processing for the model is implemented correctly (see sections below), then you can leverage the utility function we provide to easily merge the embeddings.
 
@@ -89,10 +89,10 @@ Further update the model as follows:
 + class YourModelForImage2Seq(nn.Module, SupportsMultiModal):
 ```
 
-```{note}
+:::{note}
 The model class does not have to be named {code}`*ForCausalLM`.
 Check out [the HuggingFace Transformers documentation](https://huggingface.co/docs/transformers/model_doc/auto#multimodal) for some examples.
-```
+:::
 
 ## 2. Specify processing information
 
@@ -120,8 +120,8 @@ When calling the model, the output embeddings from the visual encoder are assign
 containing placeholder feature tokens. Therefore, the number of placeholder feature tokens should be equal
 to the size of the output embeddings.
 
-::::{tab-set}
-:::{tab-item} Basic example: LLaVA
+:::::{tab-set}
+::::{tab-item} Basic example: LLaVA
 :sync: llava
 
 Looking at the code of HF's `LlavaForConditionalGeneration`:
@@ -254,12 +254,12 @@ def get_mm_max_tokens_per_item(self, seq_len: int) -> Mapping[str, int]:
     return {"image": self.get_max_image_tokens()}
 ```
 
-```{note}
+:::{note}
 Our [actual code](gh-file:vllm/model_executor/models/llava.py) is more abstracted to support vision encoders other than CLIP.
-```
-
 :::
+
 ::::
+:::::
 
 ## 3. Specify dummy inputs
 
@@ -315,17 +315,17 @@ def get_dummy_processor_inputs(
 Afterwards, create a subclass of {class}`~vllm.multimodal.processing.BaseMultiModalProcessor`
 to fill in the missing details about HF processing.
 
-```{seealso}
+:::{seealso}
 [Multi-Modal Data Processing](#mm-processing)
-```
+:::
 
 ### Multi-modal fields
 
 Override {class}`~vllm.multimodal.processing.BaseMultiModalProcessor._get_mm_fields_config` to
 return a schema of the tensors outputted by the HF processor that are related to the input multi-modal items.
 
-::::{tab-set}
-:::{tab-item} Basic example: LLaVA
+:::::{tab-set}
+::::{tab-item} Basic example: LLaVA
 :sync: llava
 
 Looking at the model's `forward` method:
@@ -367,13 +367,13 @@ def _get_mm_fields_config(
     )
 ```
 
-```{note}
+:::{note}
 Our [actual code](gh-file:vllm/model_executor/models/llava.py) additionally supports
 pre-computed image embeddings, which can be passed to be model via the `image_embeds` argument.
-```
-
 :::
+
 ::::
+:::::
 
 ### Prompt replacements
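The hunks in this file also show why nesting matters for colon fences: an enclosing fence must use more colons than any fence inside it, which is why `tab-set` grows to five colons once `tab-item` takes four and the innermost `note` keeps three. The snippet below is a toy illustration of that rule, not vLLM or MyST code; the directive names and body text are just placeholders.

```python
# Build nested colon fences: each enclosing fence needs at least one
# more colon than any fence it contains, e.g. :::{note} inside
# ::::{tab-item} inside :::::{tab-set}.
def colon_fence(directive: str, body: str, depth: int = 3) -> str:
    colons = ":" * depth
    return f"{colons}{{{directive}}}\n{body}\n{colons}"

note = colon_fence("note", "Supports vision encoders other than CLIP.")
item = colon_fence("tab-item", note, depth=4)
tabs = colon_fence("tab-set", item, depth=5)
print(tabs.splitlines()[0])  # -> :::::{tab-set}
```

With backtick fences this nesting was impossible to express cleanly, since a run of backticks inside a backtick fence ends the block; colon fences of differing lengths nest unambiguously.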

docs/source/contributing/model/registration.md

Lines changed: 8 additions & 8 deletions
@@ -17,17 +17,17 @@ After you have implemented your model (see [tutorial](#new-model-basic)), put it
 Then, add your model class to `_VLLM_MODELS` in <gh-file:vllm/model_executor/models/registry.py> so that it is automatically registered upon importing vLLM.
 Finally, update our [list of supported models](#supported-models) to promote your model!
 
-```{important}
+:::{important}
 The list of models in each section should be maintained in alphabetical order.
-```
+:::
 
 ## Out-of-tree models
 
 You can load an external model using a plugin without modifying the vLLM codebase.
 
-```{seealso}
+:::{seealso}
 [vLLM's Plugin System](#plugin-system)
-```
+:::
 
 To register the model, use the following code:
 
@@ -45,11 +45,11 @@ from vllm import ModelRegistry
 ModelRegistry.register_model("YourModelForCausalLM", "your_code:YourModelForCausalLM")
 ```
 
-```{important}
+:::{important}
 If your model is a multimodal model, ensure the model class implements the {class}`~vllm.model_executor.models.interfaces.SupportsMultiModal` interface.
 Read more about that [here](#supports-multimodal).
-```
+:::
 
-```{note}
+:::{note}
 Although you can directly put these code snippets in your script using `vllm.LLM`, the recommended way is to place these snippets in a vLLM plugin. This ensures compatibility with various vLLM features like distributed inference and the API server.
-```
+:::
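A conversion touching 68 files invites a quick sanity check that every colon fence opened was closed at the same depth. The checker below is a hypothetical toy linter written for this note, not part of vLLM's tooling or a full MyST parser: it ignores backtick code blocks and only pairs fences by colon count.

```python
import re

def unbalanced_colon_fences(md: str) -> list:
    """Return 1-based line numbers of colon fences that are unmatched.
    Toy sanity check for fence conversions: an opener looks like
    :::{name} (optionally with a title), a closer is bare colons."""
    stack = []     # (colon_count, line_no) of currently open fences
    problems = []  # closers with no matching opener
    for no, raw in enumerate(md.splitlines(), 1):
        m = re.match(r"^(:{3,})(\{[\w-]+\}.*)?$", raw.strip())
        if not m:
            continue
        depth = len(m.group(1))
        if m.group(2):                         # opener, e.g. :::{note}
            stack.append((depth, no))
        elif stack and stack[-1][0] == depth:  # closer at matching depth
            stack.pop()
        else:                                  # stray or mismatched closer
            problems.append(no)
    problems.extend(no for _, no in stack)     # openers never closed
    return problems

print(unbalanced_colon_fences(":::{note}\nAll good.\n:::"))  # -> []
```

Run against a converted file, a non-empty result flags a fence that was rewritten on one end but not the other.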
