Commit c3e4ed9

Qualcomm AI Engine Direct - Improve GA Qwen 2.5 (#14047)
## Summary:
- Change the default export setting to 16a4w_block
  - Qwen 2.5 0.5B: Quant Config: 16a8w -> 16a4w_block; PPL: 12.05 -> 13.81; Token Rate: 131 -> 164
  - Qwen 2.5 1.5B: Quant Config: 16a8w -> 16a4w_block; PPL: 9.33 -> 9.83; Token Rate: 34 -> 50

## Commands

#### QWEN2.5 0.5B
Default example using hybrid mode
```bash
python examples/qualcomm/oss_scripts/llama/llama.py -b build-android -s ${SERIAL_NUM} -m ${SOC_MODEL} \
  --temperature 0 --model_mode hybrid --max_seq_len 1024 --prefill_ar_len 128 \
  --decoder_model qwen2_5-0_5b \
  --prompt "I would like to learn python, could you teach me with a simple example?" \
  --tasks wikitext --limit 1
```

#### QWEN2.5 1.5B
Default example using kv mode
```bash
python examples/qualcomm/oss_scripts/llama/llama.py -b build-android -s ${SERIAL_NUM} -m ${SOC_MODEL} \
  --temperature 0 --model_mode kv --max_seq_len 1024 \
  --decoder_model qwen2_5-1_5b \
  --prompt "I would like to learn python, could you teach me with a simple example?" \
  --tasks wikitext --limit 1
```

## Test Results:

### Qwen 2.5 0.5B

#### prompt = "I would like to learn python, could you teach me with one simple program?"
~~~
I would like to learn python, could you teach me with one simple program? I know how to use the print function but I don't know how to use the input function. Can you teach me with a simple program?

Sure! Let's create a simple program that asks the user for their name and then prints a greeting message. Here's a step-by-step guide on how to do it:

1. First, we need to import the `input()` function from the `sys` module. This will allow us to use the input function without having to worry about the syntax.

```python
import sys
```

2. Then, we can use the `print()` function to display the greeting message. The greeting message is enclosed in curly braces `{}` and the name is enclosed in single quotes `''`. The `print()` function takes two arguments: the message to be displayed and the name to be printed.

```python
print("Hello, " + input("What is your name? "))
```

3. Finally, we can run the program by simply pressing the "Run" button in the Python interpreter.

Now, let's see what the output will be:

```
Hello, John Doe
```

This is a simple program that asks the user for their name and then prints a greeting message. You can run this program and see the output. Feel free to ask if you have any questions or if you want to learn more about Python!<|endoftext|>
~~~

#### prompt = "请你替我产生一段简单的C++程式码,并从中解释物件导向的概念" (English: "Please generate a simple piece of C++ code for me and use it to explain the concept of object orientation")
~~~
Please generate a simple piece of C++ code for me and use it to explain the concept of object orientation.

Of course! Here is a simple C++ program example that shows how to use object orientation (OOP) to create and manage objects:

```cpp
#include <iostream>
#include <string>

class Person {
    std::string name;
    int age;
    double salary;

    Person(const std::string& name, int age, double salary) {
        this->name = name;
        this->age = age;
        this->salary = salary;
    }

    void displayInfo() {
        std::cout << "Name: " << name << ", Age: " << age << ", Salary: $" << salary << std::endl;
    }
};

int main() {
    Person person("Alice", 30, 50000.50);
    person.displayInfo();
    return 0;
}
```

In this example, we define a class named `Person` that contains the member variables `name`, `age`, and `salary`. We also define a `displayInfo` method that displays the object's information. Finally, we create a `main` function that instantiates the `Person` class and calls the `displayInfo` method to display the object's information.

### Detailed explanation

1. **Define the class**:
   - `class Person`: defines a class named `Person` that contains three member variables: `name`, `age`, and `salary`.
   - `const`: `const` is a keyword indicating that member variables cannot be modified, which makes the `Person` class a constant class, that is, an object's value does not change over time.

2. **Define the member variables**:
   - `std::string name`: defines a member variable named `name` that stores the object's name.
   - `int age`: defines a member variable named `age` that stores the object's age.
   - `double salary`: defines a member variable named `salary` that stores the object's salary.

3. **Define the member function**:
   - `void displayInfo()`: defines a member function named `displayInfo` that displays the object's information. This function takes a parameter `const Person&`, meaning that a reference to the object is passed rather than the object itself.

4. **Define the main function**:
   - `int main()`: defines a main function named `main` that instantiates the `Person` class and calls the `displayInfo` method to display the object's information.

5. **Call the main function**:
   - `Person person("Alice", 30, 50000.50)`: defines an object named `person` that contains the member variables `name`, `age`, and `salary`.
   - `person.displayInfo()`: takes the instance of the `person` class and calls the `displayInfo` method to display the object's information.

In this way, we have shown how to use object orientation (OOP) to create and manage objects. In practice, we could extend this example further, for example by adding more member variables, implementing more complex object behavior, and handling exceptional cases.<|endoftext|>
~~~

### Qwen 2.5 1.5B

#### prompt = "I would like to learn python, could you teach me with one simple program?"
~~~
I would like to learn python, could you teach me with one simple program? Sure, I'd be happy to help you learn Python. Let's start with a simple program that prints "Hello, World!" to the console. This will give you a basic understanding of how to write and run Python code.

Here's the code:

```python
print("Hello, World!")
```

To run this code, you need to have Python installed on your computer. You can download it from the official website: https://www.python.org/downloads/

Once you have Python installed, you can save this code in a file with a `.py` extension, for example, `hello_world.py`. Then, you can run the file using the command prompt or terminal:

```bash
python hello_world.py
```

This will execute the code and print "Hello, World!" to the console.

If you want to learn more about Python, there are many resources available online. Some popular ones include:

- [Python.org](https://www.python.org/)
- [Codecademy](https://www.codecademy.com/learn/python)
- [Khan Academy](https://www.khanacademy.org/learn/programming)
- [W3Schools](https://www.w3schools.com/python/)

I hope this helps you get started with Python!<|endoftext|>
~~~

#### prompt = "请你替我产生一段简单的C++程式码,并从中解释物件导向的概念" (English: "Please generate a simple piece of C++ code for me and use it to explain the concept of object orientation")
~~~
Please generate a simple piece of C++ code for me and use it to explain the concept of object orientation.

Of course! Here is a simple C++ program that shows how to use object-oriented programming (OOP) to create and use classes and objects.

```cpp
#include <iostream>
#include <string>

class Person {
    std::string name;
    int age;

public:
    // Constructor
    Person() : name("Unknown"), age(0) {}

    // Constructor
    Person(const std::string& n, int a) : name(n), age(a) {}

    // Set the name
    void setName(const std::string& n) { name = n; }

    // Get the name
    std::string getName() const { return name; }

    // Set the age
    void setAge(int a) { age = a; }

    // Get the age
    int getAge() const { return age; }

    // Print the personal information
    void printInfo() const {
        std::cout << "Name: " << getName() << ", Age: " << getAge() << std::endl;
    }
};

int main() {
    // Create Person objects
    Person person1("Alice", 25);
    Person person2("Bob", 30);

    // Print Person1's information
    person1.printInfo();

    // Modify Person1's age
    person1.setAge(26);

    // Print Person1's information after the modification
    person1.printInfo();

    // Create the Person2 object
    Person person3("Charlie", 35);

    // Print Person2's information
    person3.printInfo();

    return 0;
}
```

### Explanation of object-oriented concepts

1. **Class**: A class is the blueprint or template for an object; it defines the object's attributes and behavior. A class can contain multiple attributes (such as `name` and `age`) and methods (such as `setName`, `setAge`, `printInfo`, and so on).

2. **Object**: An object is a concrete instance of a class; it contains all the attributes and methods that the class defines. Each object has its own state and behavior.

3. **Encapsulation**: Encapsulation is one of the core features of OOP. It binds data (attributes) together with the methods that operate on that data, so that the data and the methods that operate on it cannot be accessed directly, which protects the privacy and security of the data.

4. **Inheritance**: Inheritance allows one class to inherit the attributes and methods of another class, enabling code reuse and extension. A subclass can add to the parent class's attributes and methods, and can also modify them.

5. **Polymorphism**: Polymorphism allows a subclass to override the parent class's methods, so that the same code can be used on different objects. This makes the code more flexible and maintainable.

By using classes and objects, we can organize and manage code better and improve its readability, maintainability, extensibility, and reusability.<|endoftext|>
~~~
1 parent 16aac24 commit c3e4ed9
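
The trade-off reported in the commit summary (higher perplexity in exchange for a higher token rate) can be put in relative terms with a little arithmetic. The sketch below only reuses the four before/after pairs from the summary; the helper names `percent_change` and `summarize` are illustrative and do not exist in the repository.

```python
# Before/after figures copied from the commit summary (16a8w -> 16a4w_block).
RESULTS = {
    "Qwen 2.5 0.5B": {"ppl": (12.05, 13.81), "token_rate": (131, 164)},
    "Qwen 2.5 1.5B": {"ppl": (9.33, 9.83), "token_rate": (34, 50)},
}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def summarize(results):
    """Map each model to the percent change of every metric, rounded to 0.1."""
    return {
        model: {name: round(percent_change(*pair), 1) for name, pair in metrics.items()}
        for model, metrics in results.items()
    }


if __name__ == "__main__":
    for model, row in summarize(RESULTS).items():
        print(f"{model}: PPL {row['ppl']:+.1f}%, token rate {row['token_rate']:+.1f}%")
```

By this measure the 1.5B model gains proportionally more speed for a smaller perplexity increase than the 0.5B model does.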

2 files changed: +7 -7 lines changed

backends/qualcomm/tests/test_qnn_delegate.py

Lines changed: 2 additions & 2 deletions

```diff
@@ -4979,9 +4979,9 @@ def test_static_qwen2_5(self):
                 if "Error" in msg:
                     self.fail(msg["Error"])
                 else:
-                    inference_speed_ref = {"SM8650": 110, "SM8750": 130}
+                    inference_speed_ref = {"SM8650": 115, "SM8750": 155}
                     self.assertLessEqual(msg["wiki_ppl"], 15)
-                    self.assertLessEqual(msg["pte_size"], 800000000)  # 800mb
+                    self.assertLessEqual(msg["pte_size"], 600000000)  # 600mb
                     if self.model in inference_speed_ref:
                         self.assertGreaterEqual(
                             msg["inference_speed"], inference_speed_ref[self.model]
```
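
The acceptance criteria in this test can be read as three independent limits: perplexity at most 15, `.pte` size at most 600 MB, and a per-SoC minimum inference speed that applies only when the SoC has a reference entry. A minimal standalone sketch of that logic follows; `check_qwen2_5_result` is a hypothetical helper written for illustration, not code from the repository, and the sample `msg` values in the usage lines are made up rather than measured.

```python
# Thresholds copied from the updated test in this commit.
INFERENCE_SPEED_REF = {"SM8650": 115, "SM8750": 155}
MAX_WIKI_PPL = 15
MAX_PTE_SIZE = 600_000_000  # 600mb


def check_qwen2_5_result(msg, soc_model):
    """Return a list of failure descriptions; an empty list means the run passes.

    Hypothetical helper mirroring the assertions in test_static_qwen2_5.
    """
    if "Error" in msg:
        return [msg["Error"]]

    failures = []
    if msg["wiki_ppl"] > MAX_WIKI_PPL:
        failures.append(f"wiki_ppl {msg['wiki_ppl']} > {MAX_WIKI_PPL}")
    if msg["pte_size"] > MAX_PTE_SIZE:
        failures.append(f"pte_size {msg['pte_size']} > {MAX_PTE_SIZE}")

    # The speed check is skipped for SoCs without a reference value.
    ref = INFERENCE_SPEED_REF.get(soc_model)
    if ref is not None and msg["inference_speed"] < ref:
        failures.append(f"inference_speed {msg['inference_speed']} < {ref}")
    return failures


if __name__ == "__main__":
    # Illustrative values only, not measurements from a device.
    sample = {"wiki_ppl": 13.8, "pte_size": 500_000_000, "inference_speed": 160}
    print(check_qwen2_5_result(sample, "SM8750"))
```

Returning a list instead of asserting makes it possible to report every violated limit at once, whereas the `unittest` assertions in the diff stop at the first failure.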

examples/qualcomm/oss_scripts/llama/__init__.py

Lines changed: 5 additions & 5 deletions

```diff
@@ -211,8 +211,8 @@ class Qwen2_5_0_5B(LLMModelConfig):
 
     num_sharding = 1
     # quant config
-    ptq = QuantDtype.use_16a8w
-    group_size = None
+    ptq = QuantDtype.use_16a4w_block
+    group_size = 16
     masked_softmax = True
     r1 = False
     r2 = False
@@ -233,13 +233,13 @@ class Qwen2_5_1_5B(LLMModelConfig):
 
     num_sharding = 1
     # quant config
-    ptq = QuantDtype.use_16a8w
-    group_size = None
+    ptq = QuantDtype.use_16a4w_block
+    group_size = 16
     masked_softmax = True
     r1 = False
     r2 = False
     r3 = True
-    custom_annotation = ()
+    custom_annotation = (annotate_output_16a8w,)
 
 
 @register_llm_model("qwen3-0_6b")
```
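
To give a feel for what a block-wise 4-bit weight setting with `group_size = 16` could mean, here is a generic sketch of symmetric block quantization in plain Python. It assumes that every run of 16 consecutive weights shares one scale and that values map to the signed range [-7, 7]; the diff does not show how the Qualcomm backend actually implements `use_16a4w_block`, so the function names and the exact scheme below are assumptions for illustration only.

```python
def quantize_blockwise_int4(weights, group_size=16):
    """Symmetric 4-bit quantization with one scale per block of `group_size` values.

    Generic illustration; not the QNN implementation.
    """
    if len(weights) % group_size != 0:
        raise ValueError("len(weights) must be a multiple of group_size")

    qmax = 7  # symmetric signed 4-bit range assumed here: [-7, 7]
    codes, scales = [], []
    for start in range(0, len(weights), group_size):
        block = weights[start:start + group_size]
        max_abs = max(abs(w) for w in block)
        # An all-zero block gets a dummy scale so the division below is safe.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        scales.append(scale)
        codes.extend(max(-qmax, min(qmax, round(w / scale))) for w in block)
    return codes, scales


def dequantize_blockwise(codes, scales, group_size=16):
    """Reconstruct approximate weights from integer codes and per-block scales."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]


if __name__ == "__main__":
    weights = [i / 10 for i in range(-16, 16)]  # 32 values -> 2 blocks of 16
    codes, scales = quantize_blockwise_int4(weights)
    restored = dequantize_blockwise(codes, scales)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print(f"blocks: {len(scales)}, worst absolute error: {worst:.4f}")
```

Under these assumptions each weight costs 4 bits plus a share of its block's scale. That is broadly consistent with the smaller `pte_size` limit in the test, while the rounding error bounded by half a scale step per weight is one plausible source of the higher perplexity.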
