Commit 94f03fa

Update rtp-llm-chatbot.md
1 parent ae9814e commit 94f03fa

File tree

1 file changed (+10 -7 lines)


content/learning-paths/servers-and-cloud-computing/rtp-llm/rtp-llm-chatbot.md

Lines changed: 10 additions & 7 deletions
@@ -7,11 +7,11 @@ layout: learningpathall
 ---
 
 ## Before you begin
-The instructions in this Learning Path are for any Arm server running Ubuntu 22.04 LTS. You need an Arm server instance with at least four cores and 16GB of RAM to run this example. Configure disk storage up to at least 32 GB. The instructions have been tested on an Alibaba Cloud g8y.8xlarge instance.
+The instructions in this Learning Path are for any Arm Neoverse N2 or Neoverse V2 based server running Ubuntu 22.04 LTS. You need an Arm server instance with at least four cores and 16GB of RAM to run this example. Configure at least 32 GB of disk storage. The instructions have been tested on an Alibaba Cloud g8y.8xlarge instance and an AWS Graviton4 r8g.8xlarge instance.
 
 ## Overview
 
-Arm CPUs are widely used in traditional ML and AI use cases. In this Learning Path, you learn how to run generative AI inference-based use cases like an LLM chatbot on Arm-based CPUs. You do this by deploying the [Qwen2-0.5B-Instruct model](https://huggingface.co/Qwen/Qwen2-0.5B-Instruct) on your Arm-based CPU using `rtp-llm`.
+Arm CPUs are widely used in traditional ML and AI use cases. In this Learning Path, you will learn how to run generative AI inference-based use cases like an LLM chatbot on Arm-based CPUs. You do this by deploying the [Qwen2-0.5B-Instruct model](https://huggingface.co/Qwen/Qwen2-0.5B-Instruct) on your Arm-based CPU using `rtp-llm`.
 
 [rtp-llm](https://github.com/alibaba/rtp-llm) is an open source C/C++ project developed by Alibaba that enables efficient LLM inference on a variety of hardware.
 
@@ -41,7 +41,7 @@ sudo apt install git -y
 sudo apt install build-essential -y
 ```
 
-Install `openblas` develop package and fix up header path in ubuntu system:
+Install the `openblas` development package and fix the header paths:
 
 ```bash
 sudo apt install libopenblas-dev
@@ -61,13 +61,13 @@ cd rtp-llm
 git checkout 4656265
 ```
 
-Comment out deps/requirements_lock_torch_arm.txt line 7-10, due to some host not accessible from the Internet.
+Comment out lines 7-10 in `deps/requirements_lock_torch_arm.txt`, as some hosts are not accessible from the Internet:
 
 ```bash
 sed -i '7,10 s/^/#/' deps/requirements_lock_torch_arm.txt
 ```
 
-By default, `rtp-llm` builds for GPU only on Linux. You need to provide extra config `--config=arm` to build it for the Arm CPU that you run it on.
+By default, `rtp-llm` builds for GPU only on Linux. You need to provide the extra config `--config=arm` to build it for the Arm CPU that you will run it on.
 
 Configure and build:
 
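The `sed` address-range edit above can be sketched in Python for readers less familiar with `sed` syntax. This is a minimal illustration: the file name and its contents are made up, standing in for `deps/requirements_lock_torch_arm.txt`.

```python
from pathlib import Path

# Hypothetical stand-in for deps/requirements_lock_torch_arm.txt
path = Path("requirements_sample.txt")
path.write_text("".join(f"package{i}==1.0.{i}\n" for i in range(1, 13)))

# Mirror `sed -i '7,10 s/^/#/'`: prefix '#' to lines 7-10 (1-indexed)
lines = path.read_text().splitlines(keepends=True)
for i in range(6, 10):  # 0-indexed positions of lines 7-10
    lines[i] = "#" + lines[i]
path.write_text("".join(lines))
```

Lines 7-10 are now commented out while every other line is left untouched, which is exactly what the `7,10` address range restricts the `s/^/#/` substitution to.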
@@ -81,13 +81,13 @@ INFO: 10094 processes: 8717 internal, 1377 local.
 INFO: Build completed successfully, 10094 total actions
 ```
 
-Install built wheel package:
+Install the built wheel package:
 
 ```bash
 pip install bazel-bin/maga_transformer/maga_transformer-0.2.0-cp310-cp310-linux_aarch64.whl
 ```
 
-Create python-test.py that `rtp-llm` running the help command:
+Create a file named `python-test.py` in your `/tmp` directory with the contents below:
 
 ```python
 from maga_transformer.pipeline import Pipeline
@@ -132,6 +132,9 @@ async def main():
 
 if __name__ == '__main__':
     asyncio.run(main())
+```
+
+Now run this file:
 
 ```bash
 python /tmp/python-test.py
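The entry-point pattern the commit closes off above (`if __name__ == '__main__': asyncio.run(main())`) can be sketched on its own. Here `main` is a trivial stand-in for the pipeline code in `python-test.py`, not the actual `rtp-llm` calls:

```python
import asyncio

async def main():
    # Stand-in for the Pipeline calls in python-test.py
    await asyncio.sleep(0)
    return "done"

if __name__ == '__main__':
    # asyncio.run creates an event loop, runs main() to
    # completion, closes the loop, and returns main's result
    result = asyncio.run(main())
    print(result)
```

`asyncio.run` is the standard top-level entry point for async scripts like this one; it should be called once, from synchronous code, rather than from inside another running event loop.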
