
Commit 2db6b2f

Merge branch 'ArmDeveloperEcosystem:main' into main
2 parents 7279432 + d9d42f2 commit 2db6b2f

File tree

5 files changed: +227 additions, -95 deletions


content/learning-paths/servers-and-cloud-computing/ai-agent-on-cpu/_index.md

Lines changed: 6 additions & 6 deletions
@@ -1,18 +1,18 @@
 ---
-title: How to run AI Agent Application on CPU with llama.cpp and llama-cpp-agent using KleidiAI
+title: Run an AI Agent Application with llama.cpp and llama-cpp-agent using KleidiAI on Arm servers.

 minutes_to_complete: 45

-who_is_this_for: This Learning Path is for software developers, ML engineers, and those looking to run AI Agent Application locally.
+who_is_this_for: This is an introductory topic for software developers and ML engineers looking to run an AI Agent Application.

 learning_objectives:
 - Set up llama-cpp-python optimised for Arm servers.
-- Learn how to optimise LLM models to run locally.
-- Learn how to create custom tools for ML models.
+- Learn how to run optimized LLM models.
+- Learn how to create custom functions for LLMs.
 - Learn how to use AI Agents for applications.

 prerequisites:
-- An AWS Gravition instance (m7g.xlarge)
+- An [Arm-based instance](/learning-paths/servers-and-cloud-computing/csp/) from a cloud service provider or an on-premise Arm server.
 - Basic understanding of Python and Prompt Engineering
 - Understanding of LLM fundamentals.

@@ -25,7 +25,7 @@ armips:
 - Neoverse
 tools_software_languages:
 - Python
-- AWS Gravition
+- AWS Graviton
 operatingsystems:
 - Linux

Lines changed: 146 additions & 31 deletions
@@ -1,54 +1,101 @@
 ---
-title: AI Agent Overview and Test Results
+title: Understand and test the AI Agent
 weight: 5

 ### FIXED, DO NOT MODIFY
 layout: learningpathall
 ---

-## Explain how LLM which function to use
+## AI Agent Function Calls

-Below is a brief explanation of how LLM can be configured and used to execute Agent tasks.
+An AI agent, powered by a Large Language Model (LLM), decides which function to use by analyzing the prompt or input it receives, identifying the relevant intent or task, and then matching that intent to the most appropriate function from a pre-defined set of available functions based on its understanding of the language and context.

-- This code creates an instance of the quantized `llama3.1` model for more efficient inference on Arm-based systems.
-```
+Lets look at how this is implemented in the python script `agent.py`.
+
+- This code section of `agent.py` shown below creates an instance of the quantized `llama3.1 8B` model for more efficient inference on Arm-based systems.
+```output
 llama_model = Llama(
-    model_path="./models/llama3.1-8b-instruct.Q4_0_arm.gguf",
+    model_path="./models/dolphin-2.9.4-llama3.1-8b-Q4_0.gguf",
     n_batch=2048,
     n_ctx=10000,
     n_threads=64,
     n_threads_batch=64,
 )
 ```

-- Here, you define a provider that leverages the llama.cpp Python bindings.
-```
+- Next, you define a provider that leverages the `llama.cpp` Python bindings.
+```output
 provider = LlamaCppPythonProvider(llama_model)
 ```

-- The function’s docstring guides the LLM on when and how to invoke it.
-```
-def function(a,b):
-    """
-    Description about when the function should be called
+- The LLM has access to certain tools or functions and can take a general user input and decide which functions to call. The function’s docstring guides the LLM on when and how to invoke it. In `agent.py` three such tools or functions are defined `open_webpage`, `get_current_time` and `calculator`
+
+```output
+def open_webpage():
+    """
+    Open Learning Path Website when user asks the agent regarding Arm Learning Path
+    """
+    import webbrowser
+
+    url = "https://learn.arm.com/"
+    webbrowser.open(url, new=0, autoraise=True)
+
+
+def get_current_time():
+    """
+    Returns the current time in H:MM AM/PM format.
+    """
+    import datetime # Import datetime module to get current time
+
+    now = datetime.datetime.now() # Get current time
+    return now.strftime("%I:%M %p") # Format time in H:MM AM/PM format
+
+
+class MathOperation(Enum):
+    ADD = "add"
+    SUBTRACT = "subtract"
+    MULTIPLY = "multiply"
+    DIVIDE = "divide"
+def calculator(
+    number_one: Union[int, float],
+    number_two: Union[int, float],
+    operation: MathOperation,
+) -> Union[int, float]:
+    """
+    Perform a math operation on two numbers.

     Args:
-        a: description of the argument a
-        b: description of the argument b
+        number_one: First number
+        number_two: Second number
+        operation: Math operation to perform

     Returns:
-        Description about the function's output
-    """
+        Result of the mathematical operation

-    # ... body of your function goes here
+    Raises:
+        ValueError: If the operation is not recognized
+    """
+    if operation == MathOperation.ADD:
+        return number_one + number_two
+    elif operation == MathOperation.SUBTRACT:
+        return number_one - number_two
+    elif operation == MathOperation.MULTIPLY:
+        return number_one * number_two
+    elif operation == MathOperation.DIVIDE:
+        return number_one / number_two
+    else:
+        raise ValueError("Unknown operation.")
 ```

 - `from_functions` creates an instance of `LlmStructuredOutputSettings` by passing in a list of callable Python functions. The LLM can then decide if and when to use these functions based on user queries.
-```
-LlmStructuredOutputSettings.from_functions([function1, function2, etc])
-```

-- With this, the user’s prompt is collected and processed through `LlamaCppAgent`. The agent decides whether to call any defined functions based on the request.
+```output
+output_settings = LlmStructuredOutputSettings.from_functions(
+    [get_current_time, open_webpage, calculator], allow_parallel_function_calling=True
+)
+
+```
+- The user's prompt is then collected and processed through `LlamaCppAgent`. The agent decides whether to call any defined functions based on the request.
 ```
 user = input("Please write your prompt here: ")

@@ -64,27 +111,95 @@ result = llama_cpp_agent.get_chat_response(
 )
 ```

+## Test the AI Agent

-## Example
+You are now ready to test and execute the AI Agent python script. Start the application:
+
+```bash
+python3 agent.py
+```
+
+You will see lots of interesting statistics being printed from `llama.cpp` about the model and the system, followed by the prompt as shown:
+
+```output
+llama_kv_cache_init: CPU KV buffer size = 1252.00 MiB
+llama_init_from_model: KV self size = 1252.00 MiB, K (f16): 626.00 MiB, V (f16): 626.00 MiB
+llama_init_from_model: CPU output buffer size = 0.49 MiB
+llama_init_from_model: CPU compute buffer size = 677.57 MiB
+llama_init_from_model: graph nodes = 1030
+llama_init_from_model: graph splits = 1
+CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | MATMUL_INT8 = 1 | SVE = 1 | DOTPROD = 1 | MATMUL_INT8 = 1 | SVE_CNT = 32 | OPENMP = 1 | AARCH64_REPACK = 1 |
+Model metadata: {'tokenizer.chat_template': "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}", 'tokenizer.ggml.eos_token_id': '128256', 'general.quantization_version': '2', 'tokenizer.ggml.model': 'gpt2', 'llama.vocab_size': '128258', 'general.file_type': '2', 'llama.attention.layer_norm_rms_epsilon': '0.000010', 'llama.rope.freq_base': '500000.000000', 'tokenizer.ggml.bos_token_id': '128000', 'llama.attention.head_count': '32', 'llama.feed_forward_length': '14336', 'general.architecture': 'llama', 'llama.attention.head_count_kv': '8', 'llama.block_count': '32', 'tokenizer.ggml.padding_token_id': '128004', 'general.basename': 'Meta-Llama-3.1', 'llama.embedding_length': '4096', 'general.base_model.0.organization': 'Meta Llama', 'tokenizer.ggml.pre': 'llama-bpe', 'llama.context_length': '131072', 'general.name': 'Meta Llama 3.1 8B', 'llama.rope.dimension_count': '128', 'general.base_model.0.name': 'Meta Llama 3.1 8B', 'general.organization': 'Meta Llama', 'general.type': 'model', 'general.size_label': '8B', 'general.base_model.0.repo_url': 'https://huggingface.co/meta-llama/Meta-Llama-3.1-8B', 'general.license': 'llama3.1', 'general.base_model.count': '1'}
+Available chat formats from metadata: chat_template.default
+Using gguf chat template: {% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '
+' + message['content'] + '<|im_end|>' + '
+'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant
+' }}{% endif %}
+Using chat eos_token: <|im_end|>
+Using chat bos_token: <|begin_of_text|>
+Please write your prompt here:
+```

-- If the user asks, “What is the current time?”, the AI Agent will choose to call the `get_current_time()` function, returning a result in **H:MM AM/PM** format.
+## Test the AI agent

-![Prompt asking for the current time](test_prompt.png)
+When you are presented with "Please write your prompt here:" test it with an input prompt. Enter "What is the current time?"

 - As part of the prompt, a list of executable functions is sent to the LLM, allowing the agent to select the appropriate function:

-![Display of available functions in the terminal](test_functions.png)
+```output
+Read and follow the instructions below:

-- After the user prompt, the AI Agent decides to invoke the function and return thre result:
+<system_instructions>
+You're a helpful assistant to answer User query.
+</system_instructions>

-![get_current_time function execution](test_output.png)

+You can call functions to help you with your tasks and user queries. The available functions are:

+<function_list>
+Function: get_current_time
+Description: Returns the current time in H:MM AM/PM format.
+Parameters:
+none
+
+Function: open_webpage
+Description: Open Learning Path Website when user asks the agent regarding Arm Learning Path
+Parameters:
+none
+
+Function: calculator
+Description: Perform a math operation on two numbers.
+Parameters:
+number_one (int or float): First number
+number_two (int or float): Second number
+operation (enum): Math operation to perform Can be one of the following values: 'add' or 'subtract' or 'multiply' or 'divide'
+</function_list>
+
+To call a function, respond with a JSON object (to call one function) or a list of JSON objects (to call multiple functions), with each object containing these fields:
+
+- "function": Put the name of the function to call here.
+- "arguments": Put the arguments to pass to the function here.
+```
+
+The AI Agent then decides to invoke the appropriate function and return the result as shown:
+
+```output
+[
+{
+"function":
+"get_current_time",
+"arguments": {}
+}
+]
+----------------------------------------------------------------
+Response from AI Agent:
+[{'function': 'get_current_time', 'arguments': {}, 'return_value': '07:58 PM'}]
+----------------------------------------------------------------
+```

+You have now tested when you enter, "What is the current time?", the AI Agent will choose to call the `get_current_time()` function, and return a result in **H:MM AM/PM** format.

-## Next Steps
-- You can ask different questions to trigger and execute other functions.
-- Extend your AI agent by defining custom functions so it can handle specific tasks. You can also re-enable the `TaviliySearchResults` function to unlock search capabilities within your environment.
+You have successfully run an AI agent. You can ask different questions to trigger and execute other functions. You can extend your AI agent by defining custom functions so it can handle specific tasks.


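For readers following the function-calling flow added in this file, the sketch below shows how a function-call object in the JSON shape the agent prints (a list of `{"function": ..., "arguments": ...}` entries) could be dispatched to the local Python tools by hand. This is an illustrative sketch only: llama-cpp-agent performs this dispatch internally once the functions are registered through `LlmStructuredOutputSettings.from_functions`, and the `TOOLS` registry and `dispatch_calls` helper are hypothetical names, not part of `agent.py` or the library.

```python
import datetime
import json
from typing import Any, Callable, Dict, List


def get_current_time() -> str:
    """Returns the current time in H:MM AM/PM format (same tool as in agent.py)."""
    return datetime.datetime.now().strftime("%I:%M %p")


# Hypothetical registry mapping the tool names the LLM may emit to local callables.
TOOLS: Dict[str, Callable[..., Any]] = {"get_current_time": get_current_time}


def dispatch_calls(raw_json: str) -> List[Dict[str, Any]]:
    """Run each {"function": ..., "arguments": ...} entry and attach its return value."""
    results = []
    for call in json.loads(raw_json):
        func = TOOLS[call["function"]]             # look up the named tool
        value = func(**call.get("arguments", {}))  # pass the arguments as keyword args
        results.append({**call, "return_value": value})
    return results


# The same shape the agent prints for "What is the current time?"
print(dispatch_calls('[{"function": "get_current_time", "arguments": {}}]'))
```

Running this prints a list like `[{'function': 'get_current_time', 'arguments': {}, 'return_value': ...}]`, mirroring the "Response from AI Agent" line shown in the diff above.
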
content/learning-paths/servers-and-cloud-computing/ai-agent-on-cpu/ai-agent-backend.md

Lines changed: 6 additions & 31 deletions
@@ -1,29 +1,15 @@
 ---
-title: Python Script to Execute the AI Agent Application
+title: AI Agent Application
 weight: 4

 ### FIXED, DO NOT MODIFY
 layout: learningpathall
 ---

-## Python Script for AI Agent Application
-Once you set up the environment, create a Python script which will execute the AI Agent Applicaion:
+## Python Script for executing an AI Agent application
+With `llama.cpp` built and the Llama3.1 8B model downloaded, you are now ready to create a Python script to execute an AI Agent Application:

-### Option A
-- Clone the repository
-```bash
-cd ~
-git clone https://github.com/jc2409/ai-agent.git
-```
-
-### Option B
-- Creat a Python file:
-```bash
-cd ~
-touch agent.py
-```
-
-- Copy and paste the following code:
+Create a Python file named `agent.py` with the content shown below:
 ```bash
 from enum import Enum
 from typing import Union
@@ -47,7 +33,7 @@ from llama_cpp import Llama
 # os.environ.get("TAVILY_API_KEY")

 llama_model = Llama(
-    model_path="./models/llama3.1-8b-instruct.Q4_0_arm.gguf", # make sure you use the correct path for the quantized model
+    model_path="./models/dolphin-2.9.4-llama3.1-8b-Q4_0.gguf", # make sure you use the correct path for the quantized model
     n_batch=2048,
     n_ctx=10000,
     n_threads=64,
@@ -170,16 +156,5 @@ def run_web_search_agent():
 if __name__ == '__main__':
     run_web_search_agent()
 ```
+In the next section, you will inspect this script to understand how the LLM is configured and used to execute Agent tasks using this script. You will then proceed to executing and testing the AI Agent.

-## Run the Python Script
-
-You are now ready to test the AI Agent. Use the following command in a terminal to start the application:
-```bash
-python3 agent.py
-```
-
-{{% notice Note %}}
-
-If it takes too long to process, try to terminate the application and try again.
-
-{{% /notice %}}

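A short aside on the `Llama(...)` constructor used in `agent.py` above: `n_threads` and `n_threads_batch` are set to 64, which assumes a machine with at least 64 cores. A minimal sketch, assuming `llama-cpp-python` is installed and the quantized GGUF file from the earlier download step is present, of sizing the thread counts to whatever machine actually runs the script:

```python
import os

from llama_cpp import Llama  # llama-cpp-python bindings built earlier in this Learning Path

# Path used by agent.py; adjust it if your model file lives elsewhere.
model_path = "./models/dolphin-2.9.4-llama3.1-8b-Q4_0.gguf"

threads = os.cpu_count() or 1  # use every available core instead of hard-coding 64

llama_model = Llama(
    model_path=model_path,
    n_batch=2048,             # prompt batch size, as in agent.py
    n_ctx=10000,              # context window in tokens, as in agent.py
    n_threads=threads,        # generation threads
    n_threads_batch=threads,  # prompt-processing threads
)
```

The rest of `agent.py` stays the same; only the thread counts change with the instance size.
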
content/learning-paths/servers-and-cloud-computing/ai-agent-on-cpu/ai-agent.md

Lines changed: 7 additions & 8 deletions
@@ -6,27 +6,27 @@ weight: 2
 layout: learningpathall
 ---

-## Defining AI Agents
+## Overview of AI Agents

 An AI Agent is best understood as an integrated system that goes beyond standard text generation by equipping Large Language Models (LLMs) with tools and domain knowledge. Here’s a closer look at the underlying elements:

 - **System**: Each AI Agent functions as an interconnected ecosystem of components.
 - **Environment**: The domain in which the AI Agent operates. For instance, in a system that books travel itineraries, the relevant environment might include airline reservation systems and hotel booking tools.
-- **Sensors**: Methods the AI Agent uses to observe its surroundings. In the travel scenario, these could be APIs that inform the agent about seat availability on flights or room occupancy in hotels.
+- **Sensors**: Methods the AI Agent uses to observe its surroundings. For a travel agent, these could be APIs that inform the agent about seat availability on flights or room occupancy in hotels.
 - **Actuators**: Ways the AI Agent exerts influence within that environment. In the example of a travel agent, placing a booking or modifying an existing reservation serves as the agent’s “actuators.”

 - **Large Language Models**: While the notion of agents is not new, LLMs bring powerful language comprehension and data-processing capabilities to agent setups.
 - **Performing Actions**: Rather than just produce text, LLMs within an agent context interpret user instructions and interact with tools to achieve specific objectives.
 - **Tools**: The agent’s available toolkit depends on the software environment and developer-defined boundaries. In the travel agent example, these tools might be limited to flight and hotel reservation APIs.
-- **Knowledge**: Beyond immediate data sources, the agent can fetch additional details—perhaps from databases or web services—to enhance decision-making.
+- **Knowledge**: Beyond immediate data sources, the agent can fetch additional details—perhaps from databases or web services—to enhance decision making.
+

----

-## Varieties of AI Agents
+## Types of AI Agents

-AI Agents come in multiple forms. The table below provides an overview of some agent types and examples illustrating their roles in a travel-booking system:
+AI Agents come in multiple forms. The table below provides an overview of some agent types and examples illustrating their roles in a travel booking system:

-| **Agent Category** | **Key Characteristics** | **Example in Travel** |
+| **Agent Category** | **Key Characteristics** | **Example usage in a Travel system** |
 |--------------------------|--------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------|
 | **Simple Reflex Agents** | Act directly based on set rules or conditions. | Filters incoming messages and forwards travel-related emails to a service center. |
 | **Model-Based Agents** | Maintain an internal representation of the world and update it based on new inputs. | Monitors flight prices and flags dramatic fluctuations, guided by historical data. |
@@ -36,7 +36,6 @@ AI Agents come in multiple forms. The table below provides an overview of some a
 | **Hierarchical Agents** | Split tasks into sub-tasks and delegate smaller pieces of work to subordinate agents.| Cancels a trip by breaking down the process into individual steps, such as canceling a flight, a hotel, and a car rental. |
 | **Multi-Agent Systems** | Involve multiple agents that may cooperate or compete to complete tasks. | Cooperative: Different agents each manage flights, accommodations, and excursions. Competitive: Several agents vie for limited rooms. |

----

 ## Ideal Applications for AI Agents
