---
title: Deploying a Small Language Model in HPE Private Cloud AI using a Jupyter Notebook
date: 2025-02-20T20:03:50.971Z
author: Dave Wright and Elias Alagna
authorimage: /img/Avatar1.svg
disable: false
tags:
  - AI
  - PCAI
  - vllm
  - SLM
---
Deploying new language models for users to interact with can be challenging for beginners. HPE developed Private Cloud AI to help users set up and implement AI solutions quickly and easily.

In this post, we will show how to use the HPE Machine Learning Inference Service (MLIS), part of HPE Private Cloud AI, to add a new packaged model from a Hugging Face repository and create an endpoint to query the model. This is done using a Jupyter Notebook.

### Prerequisites

This tutorial uses the [HPE Private Cloud AI](https://www.hpe.com/us/en/private-cloud-ai.html) (PCAI) platform. A PCAI system is required for these steps to work. It is assumed that the PCAI system is physically installed, patched, and running, with user accounts provisioned.
### Steps to deploy

First, you will need to choose a model to deploy. In this case, we've chosen a model hosted on Hugging Face called [SmolLM2 1.7B](https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B-Instruct). This compact model can solve a wide range of problems even though it is relatively diminutive at 1.7B parameters.

### Launching the interface

From the HPE Private Cloud AI user interface, select the HPE MLIS tile.

![Computer screen showing the HPE Private Cloud AI user interface with the HPE MLIS tile highlighted.](/img/mlis.png)
Next, select "Add new model".

![Computer screen showing packaged AI models and a selection to add a new model.](/img/new-model.png)

This brings up the "Add new packaged model" dialog box. Fill in the name of the model, the storage requirements, and the resources. We have reduced the default resources, given that this is a small model.

![Dialog box for defining a new packaged model.](/img/define-parameters.png)
Once the package is set up, you will receive a confirmation.

![Running packaged model.](/img/package-running.png)

With the new packaged model complete, you will need to deploy it for use. Select "Create new deployment" from the HPE MLIS "Deployments" tab, then select "Submit" once all the tabs are filled out as shown below. This will create an endpoint for use in the notebook and provide an API token.

![New deployment for AI model.](/img/new-deployment.png)

When the process is complete, an endpoint will be provided.

![Endpoint provided by MLIS system.](/img/endpoint.png)
Next, let's take the now-deployed model, which is ready for inference, and connect to and interact with it from a Jupyter Notebook.

### Building the Jupyter Notebook

First, install `openai` if you do not already have it, then import it.

```python
# vLLM chat via the OpenAI-compatible API
# !pip install openai
from openai import OpenAI
```
Then, enter the endpoint URL and the API key generated by HPE MLIS into your Jupyter Notebook. Be sure to append `/v1` to the URL.

```python
# Grab the endpoint URL and API key from MLIS; remember to append "/v1"
# to the URL to reach the OpenAI-compatible API.
model = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
openai_api_base = "https://smollm2-1-7b-vllm-predictor-dave-wright-hpe-1073f7cd.hpepcai-ingress.pcai.hpecic.net/v1"
openai_api_key = "<your-MLIS-API-token>"  # paste the token generated by MLIS
```
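Forgetting the `/v1` suffix is an easy mistake, so you may want a small helper to normalize the base URL. The sketch below is purely illustrative; `ensure_v1` is our own name, not part of MLIS or the `openai` package.

```python
def ensure_v1(base_url: str) -> str:
    """Return the base URL with exactly one trailing /v1 segment."""
    url = base_url.rstrip("/")
    if not url.endswith("/v1"):
        url += "/v1"
    return url

print(ensure_v1("https://example.pcai.local"))      # appends /v1
print(ensure_v1("https://example.pcai.local/v1/"))  # keeps a single /v1
```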
You will now need to create an OpenAI client interface.

```python
# Create the OpenAI client, pointed at the MLIS endpoint
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
```
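If you want the model to follow a persona or standing instructions, you can seed the conversation history with a system message before any user turns. This is a minimal sketch under our own assumptions: `new_conversation` and the example prompt text are ours, not part of MLIS or the `openai` package.

```python
def new_conversation(system_prompt=None):
    """Start a chat history, optionally seeded with a system prompt."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    return messages

# Example: start a history that asks the model to keep answers short
messages = new_conversation("You are a concise assistant.")
print(messages[0]["role"])  # system
```

The resulting list can be passed directly as the `messages` argument of `client.chat.completions.create`.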
To interact with the model, you will need to create a chat function. For the purposes of our example, let's give it a message-history feature as well as basic chat.

```python
# Interactive chat function with message history.
def chat():
    # Initialize conversation history
    messages = []

    print("Chat with " + model + "! Type 'quit' to exit.")

    while True:
        # Get user input
        user_input = input("\nYou: ").strip()

        # Check for quit command
        if user_input.lower() == 'quit':
            print("Goodbye!")
            break

        # Add user message to history
        messages.append({"role": "user", "content": user_input})

        try:
            # Get model response using chat completion
            response = client.chat.completions.create(
                model=model,
                messages=messages
            )

            # Extract assistant's message
            assistant_message = response.choices[0].message.content

            # Add assistant's response to history
            messages.append({"role": "assistant", "content": assistant_message})

            # Print the response
            print("\nAssistant:", assistant_message)

        except Exception as e:
            print(f"\nError: {str(e)}")
```
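Because the full `messages` history is resent on every request, a long conversation will eventually exceed the model's context window. One simple mitigation is to keep only the most recent turns; the helper below is an illustrative sketch (the name `trim_history` and the cutoff of 20 messages are our own choices):

```python
def trim_history(messages, max_messages=20):
    """Keep at most the last `max_messages` entries of the chat history."""
    return messages[-max_messages:]

# Example: a 30-message history is cut down to the newest 20
history = [{"role": "user", "content": f"message {i}"} for i in range(30)]
print(len(trim_history(history)))  # 20
```

In the chat loop above, you could call `messages = trim_history(messages)` before each request.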
![Jupyter Notebook showing the imported model endpoint and API key.](/img/jupyter.png)

Once this is done, you can interact with the model through a simple chat.

![Interaction with the SmolLM2 small language model in a Jupyter Notebook.](/img/chat-interface.png)

You can watch [a recorded demonstration](https://www.youtube.com/watch?v=oqjc-2c1Vtk) that shows this process in real time.
### Summary

With HPE Private Cloud AI, loading new models into the system and providing endpoints takes just a few clicks, and the result integrates easily with popular tools like Jupyter Notebooks. To learn more about HPE Private Cloud AI, please visit: <https://www.hpe.com/us/en/private-cloud-ai.html>
