
Deploying an LLM as a service in an {ocp-name} AI cluster

The code suggestions that {mta-dl-full} generates depend on the large language model (LLM) that you use. Therefore, you might want to use an LLM that caters to your specific requirements.

{mta-dl-plugin} integrates with LLMs that are deployed as a scalable service on {ocp-name} AI clusters. These deployments provide you with granular control over resources such as compute, cluster nodes, and auto-scaling graphics processing units (GPUs) while enabling you to use LLMs to resolve code issues at a large scale.

An example workflow for configuring an LLM service on {ocp-name} AI broadly consists of the following steps:

  • Installing and configuring the following infrastructure resources:

    • Install a Red Hat {ocp-name} cluster and the {ocp-name} AI Operator

    • Configure a GPU machine set

    • (Optional) Configure an autoscaler custom resource (CR) and a machine autoscaler CR

  • Configuring the {ocp-name} AI platform:

    • Configure a data science project

    • Configure a serving runtime

    • Configure an accelerator profile

  • Deploying the LLM through {ocp-name} AI

    • Upload your model to an S3-compatible bucket

    • Add a data connection

    • Deploy the LLM in your {ocp-name} AI data science project

    • Export the SSL certificate, the OPENAI_API_BASE URL, and other environment variables required to access the LLM

  • Preparing the LLM for analysis

    • Configure an OpenAI API key

    • Update the OpenAI API key and the base URL in the provider-settings.yaml file
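The export step above can be sketched as follows. All of the values, and the certificate variable name, are hypothetical placeholders; substitute the SSL certificate path, the inference route URL, and the token from your own {ocp-name} AI deployment:

```shell
# Hypothetical values -- replace them with the details of your own
# {ocp-name} AI deployment before running an analysis.
export SSL_CERT_FILE="$HOME/certs/openshift-ai-cert.pem"   # exported SSL certificate
export OPENAI_API_BASE="https://llm-route.example.com/v1"  # inference route of the deployed LLM
export OPENAI_API_KEY="changeme-token"                     # token for the serving runtime
```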

See Configuring LLM provider settings to configure the base URL and the LLM API key in the {mta-dl-plugin} Visual Studio Code extension.
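After the environment variables are exported, one way to confirm that the LLM service is reachable is to query its models endpoint. This is a sketch under two assumptions: the check_llm helper name is invented here, and your serving runtime is assumed to expose an OpenAI-compatible REST API:

```shell
# Sketch of a connectivity check against the deployed LLM service.
# Assumes OPENAI_API_BASE, OPENAI_API_KEY, and SSL_CERT_FILE are already
# exported, and that the serving runtime exposes an OpenAI-compatible API.
check_llm() {
  curl -sf \
    --cacert "$SSL_CERT_FILE" \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    "$OPENAI_API_BASE/models"
}
```

If the call succeeds, the server returns a JSON list of the models it serves; a certificate or authorization error at this point usually means that the exported SSL certificate or API key does not match the deployment.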