Welcome to the vLLM Function Calling Quickstart!
Use this to quickly get a vLLM runtime with Function Calling enabled in your OpenShift AI environment, loading models directly from ModelCar containers.
To see how it's done, jump straight to installation.
The vLLM Function Calling Quickstart is a template for deploying vLLM with Function Calling enabled, integrated with ModelCar containerized models, within Red Hat OpenShift AI.
It’s designed for environments where you want to:
- Enable LLMs to call external tools (Tool/Function Calling).
- Serve LLMs (like Granite3, Llama3) directly from a container.
- Easily customize your model deployments without needing cluster admin privileges.
Use this project to quickly spin up a powerful vLLM instance ready for function-enabled Agents or AI applications.
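To make the idea concrete, here is a hedged sketch of what a tool-calling request looks like once a model from this quickstart is serving traffic. The route URL, the model name, and the get_weather function are all placeholders for illustration, not values defined by this project:

curl -k "https://<your-model-route>/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "granite3-2-8b",
        "messages": [{"role": "user", "content": "What is the weather like in Paris today?"}],
        "tools": [{
          "type": "function",
          "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
              "type": "object",
              "properties": {"city": {"type": "string"}},
              "required": ["city"]
            }
          }
        }]
      }'

If function calling is working, the response contains a tool_calls entry naming get_weather with the arguments extracted from the prompt; executing the call is left to your agent or application. The -k flag is only needed if the route uses a self-signed certificate.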
Red Hat uses Arcade software to create interactive demos. Check out the Function Calling Quickstart Example to see it live.
- The vLLM ServingRuntime for KServe serving runtime is available out of the box in RHOAI.
- A detailed guide and documentation are available in this article.
- Code for testing Function Calling in OpenShift AI is available at github.com/rh-aiservices-bu/llm-on-openshift.
NOTE: To find more patterns and pre-built ModelCar images, take a look at the Red Hat AI Services ModelCar Catalog repo on GitHub and the ModelCar Catalog registry on Quay.
- 8+ vCPUs with 4th Gen Intel® Xeon® Scalable processors or newer
- 24+ GiB RAM
- Storage: 30Gi minimum in PVC (larger models may require more)
- 1 GPU (NVIDIA L40, A10, or similar) or 1 Intel® Gaudi® AI Accelerator, for the gpu and hpu deployment options respectively
- Red Hat OpenShift
- Red Hat OpenShift AI 2.16+
- Dependencies for single-model serving:
- Red Hat OpenShift Service Mesh
- Red Hat OpenShift Serverless
- Standard user access; no elevated cluster permissions are required.
Please note before you start: this example was tested on Red Hat OpenShift 4.16.24 and Red Hat OpenShift AI v2.16.2.
git clone https://github.com/rh-ai-quickstart/vllm-tool-calling.git && \
cd vllm-tool-calling/
PROJECT can be set to any value. This will also be used as the namespace.
export PROJECT="vllm-tool-calling-demo"
oc new-project $PROJECT
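If you want to confirm the namespace exists before deploying anything, a quick check with standard oc commands:

oc get project $PROJECT   # should be listed with status Active
oc project $PROJECT       # make it the current project if it is not already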
Specify your LLM and device:

- MODEL: select from [granite3.2-8b, llama3.2-1b, llama3.2-3b]
- DEVICE: select from [cpu, gpu, hpu]
Set the variables to your selected options. An example is shown below.
export MODEL="granite3.2-8b"
export DEVICE="gpu"

Deploy the LLM on the target hardware:
oc apply -n $PROJECT -k vllm-tool-calling/$MODEL/$DEVICE
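The rollout can take a few minutes while the ModelCar image is pulled and vLLM starts up. Assuming the manifests create a KServe InferenceService (resource names will vary), progress can be watched from the CLI:

oc get pods -n $PROJECT -w            # wait for the predictor pod to become Running and Ready
oc get inferenceservice -n $PROJECT   # READY turns to True once the model is being served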
- From the OpenShift Console, open the App Switcher (waffle menu) and go to the Red Hat OpenShift AI Console.
- Once inside the dashboard, navigate to Data Science Projects -> vllm-tool-calling-demo (or the value of ${PROJECT} if you changed it from the default).
- Check the deployed model and wait until its Status shows a green check mark, meaning the model was deployed successfully. A CLI alternative is sketched below.
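If you prefer the CLI over the dashboard, the same check can be done with oc. The jsonpath below simply picks the first InferenceService in the namespace, so adjust it if you deployed more than one model:

oc get inferenceservice -n $PROJECT
oc get inferenceservice -n $PROJECT -o jsonpath='{.items[0].status.url}'   # endpoint URL for the served model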
To remove all deployed components:
oc delete -n $PROJECT -k vllm-tool-calling/$MODEL/$DEVICE

Delete the project:
oc delete project $PROJECT
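Project deletion is asynchronous; to confirm cleanup has finished:

oc get project $PROJECT   # shows Terminating at first, then reports that the project is not found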

