Commit ddcdc64

Merge pull request #96403 from rh-tokeefe/OLS-1790A
OLS-1790: Document the supported vLLM version in OLS
2 parents 1563963 + 96422d8 commit ddcdc64

File tree

1 file changed (+8, -5)
modules/ols-large-language-model-requirements.adoc

Lines changed: 8 additions & 5 deletions
@@ -4,12 +4,12 @@
 
 :_mod-docs-content-type: CONCEPT
 [id="ols-large-language-model-requirements"]
-= Large Language Model (LLM) requirements
+= Large language model (LLM) requirements
 :context: ols-large-language-model-requirements
 
-A large language model (LLM) is a type of machine learning model that can interpret and generate human-like language. When an LLM is used with a virtual assistant the LLM can interpret questions accurately and provide helpful answers in a conversational manner.
+A large language model (LLM) is a type of machine learning model that interprets and generates human-like language. When an LLM is used with a virtual assistant, the LLM can accurately interpret questions and provide helpful answers in a conversational manner.
 
-The {ols-long} service must have access to an LLM provider. The service does not provide an LLM for you, so the LLM must be configured prior to installing the {ols-long} Operator.
+The {ols-long} service must have access to an LLM provider. The service does not provide an LLM for you, so you must configure the LLM prior to installing the {ols-long} Operator.
 
 The {ols-long} service can rely on the following Software as a Service (SaaS) LLM providers:
 
@@ -41,14 +41,17 @@ To use {azure-official} with {ols-official}, you need access to link:https://azu
 
 {rhelai} is OpenAI API-compatible, and is configured in a similar manner as the OpenAI provider.
 
-You can configure {rhelai} as the (Large Language Model) LLM provider.
+You can configure {rhelai} as the LLM provider.
 
 Because the {rhel} is in a different environment than the {ols-long} deployment, the model deployment must allow access using a secure connection. For more information, see link:https://docs.redhat.com/en/documentation/red_hat_enterprise_linux_ai/1.2/html-single/building_your_rhel_ai_environment/index#creating_secure_endpoint[Optional: Allowing access to a model from a secure endpoint].
 
+{ols-long} version 1.0 and later supports vLLM Server version 0.8.4 and later. When self-hosting an LLM with {rhelai}, you can use vLLM Server as the inference engine.
 
 [id="rhoai_{context}"]
 == {rhoai}
 
 {rhoai} is OpenAI API-compatible, and is configured largely the same as the OpenAI provider.
 
-You need a Large Language Model (LLM) deployed on the single model-serving platform of {rhoai} using the Virtual Large Language Model (vLLM) runtime. If the model deployment is in a different {ocp-short-name} environment than the {ols-long} deployment, the model deployment must include a route to expose it outside the cluster. For more information, see link:https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed/2-latest/html/serving_models/serving-large-models_serving-large-models#about-the-single-model-serving-platform_serving-large-models[About the single-model serving platform].
+You must deploy an LLM on the {rhoai} single-model serving platform that uses the Virtual Large Language Model (vLLM) runtime. If the model deployment resides in a different {ocp-short-name} environment than the {ols-long} deployment, include a route to expose the model deployment outside the cluster. For more information, see link:https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed/2-latest/html/serving_models/serving-large-models_serving-large-models#about-the-single-model-serving-platform_serving-large-models[About the single-model serving platform].
+
+{ols-long} version 1.0 and later supports vLLM Server version 0.8.4 and later. When self-hosting an LLM with {rhoai}, you can use vLLM Server as the inference engine.
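Since both {rhelai} and {rhoai} are described above as OpenAI API-compatible, the provider configuration for {ols-long} follows the same shape in either case: point the service at the model's secure endpoint. The following OLSConfig custom resource is a minimal sketch only; the provider name, endpoint URL, secret name, and model name are hypothetical placeholders, and the field names are assumptions about the Operator's OLSConfig API rather than anything stated in this commit. Verify them against the official {ols-long} documentation.

apiVersion: ols.openshift.io/v1alpha1        # assumed API group/version for the OLSConfig CR
kind: OLSConfig
metadata:
  name: cluster
spec:
  llm:
    providers:
      - name: my-vllm-provider               # hypothetical provider name
        type: openai                         # vLLM Server exposes an OpenAI-compatible API
        url: https://my-model.example.com/v1 # hypothetical secure route to the model endpoint
        credentialsSecretRef:
          name: my-llm-credentials           # hypothetical secret holding the API token
        models:
          - name: my-model                   # hypothetical model name served by vLLM
  ols:
    defaultProvider: my-vllm-provider
    defaultModel: my-model

Whichever provider you use, the endpoint in url must be reachable from the {ols-long} cluster over a secure connection, which is why the diff points to the secure-endpoint procedure for {rhelai} and the external route requirement for {rhoai}.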
