📝

jiangyangcreate · jiangyangcreate · commit f76291b043b3 · 2024-11-28T16:20:10.000+08:00
diff --git a/docs/docs/机器学习/大语言模型部署/Agent智能体.md b/docs/docs/机器学习/大语言模型部署/Agent智能体.md
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 3
+sidebar_position: 6
 title: 🚧AI Agents
 ---
 
diff --git a/docs/docs/机器学习/大语言模型部署/大语言模型获取.md b/docs/docs/机器学习/大语言模型部署/大语言模型获取.md
diff --git a/docs/docs/机器学习/大语言模型部署/提示词工程.md b/docs/docs/机器学习/大语言模型部署/提示词工程.md
@@ -0,0 +1,12 @@
+---
+sidebar_position: 3
+title: 🚧Prompt Engineering
+---
+
+
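+## Prompt Engineering
+
+Prompt engineering is a relatively new discipline concerned with developing and optimizing prompts so that users can apply large language models (LLMs) across a wide range of scenarios and research areas. Prompt engineering skills also help users better understand the capabilities and limitations of LLMs.
+
+Recommended reading: [https://www.promptingguide.ai/zh](https://www.promptingguide.ai/zh)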
+
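Few-shot prompting, one of the basic techniques in prompt engineering, can be sketched as a list of chat messages in which worked examples precede the real question. This is a minimal illustration and not part of the original commit: the helper name `build_few_shot_messages` and the sample term pairs are hypothetical, and the layout assumes the common `role`/`content` chat message convention.

```python
def build_few_shot_messages(system_prompt, examples, query):
    """Assemble a chat message list with worked examples placed before the real query."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in examples:
        # Each example is shown to the model as a completed user/assistant exchange.
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The real question comes last, so the model continues the established pattern.
    messages.append({"role": "user", "content": query})
    return messages


messages = build_few_shot_messages(
    system_prompt="Translate computer science terms from English to Chinese.",
    examples=[("thread", "线程"), ("heap", "堆")],
    query="mutex",
)
for message in messages:
    print(f"{message['role']}: {message['content']}")
```

A list shaped like this is what the inference examples in 模型获取.md pass to `tokenizer.apply_chat_template`.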
diff --git a/docs/docs/机器学习/大语言模型部署/模型微调.md b/docs/docs/机器学习/大语言模型部署/模型微调.md
@@ -0,0 +1,25 @@
+---
+sidebar_position: 4
+title: 🚧Model Fine-Tuning
+---
+
+
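+Fine-tuning means further training a pretrained base model on domain-specific or task-specific data so that it performs better on that task. For example, fine-tuning on translations of computer science terminology can improve translation accuracy in that field.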
+
+## LLaMA-Factory
+
+Repository: [https://github.com/hiyouga/LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)
+
+Install:
+
+```bash
+git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
+cd LLaMA-Factory
+pip install -e ".[torch,metrics]"
+```
+
+Launch the WebUI:
+
+```bash
+llamafactory-cli webui
+```
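Fine-tuning starts from task-specific data. Below is a minimal sketch of preparing such data for the terminology-translation example mentioned above. It assumes the Alpaca-style `instruction`/`input`/`output` record layout described in LLaMA-Factory's data documentation; the helper name `to_alpaca_records` and the sample term pairs are hypothetical.

```python
import json


def to_alpaca_records(term_pairs, instruction):
    """Turn (source, target) term pairs into Alpaca-style training records."""
    return [
        {"instruction": instruction, "input": source, "output": target}
        for source, target in term_pairs
    ]


# Hypothetical sample data: English computer science terms and their Chinese translations.
term_pairs = [("thread", "线程"), ("heap", "堆"), ("garbage collection", "垃圾回收")]
records = to_alpaca_records(term_pairs, "Translate the computer science term into Chinese.")

# ensure_ascii=False keeps the Chinese characters readable instead of escaping them.
dataset_json = json.dumps(records, ensure_ascii=False, indent=2)
print(dataset_json)
```

Saving this JSON to a file and registering it in the project's dataset index is the likely next step, but the exact fields should be checked against the repository's data README.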
diff --git a/docs/docs/机器学习/大语言模型部署/模型获取.md b/docs/docs/机器学习/大语言模型部署/模型获取.md
@@ -0,0 +1,112 @@
+---
+sidebar_position: 1
+title: 🚧Obtaining Models
+---
+
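+## Open-Source Communities
+
+A large-model community is an open, collaborative platform and ecosystem built around large deep learning models (such as the GPT series, BERT, and T5). These communities bring together researchers, developers, data scientists, engineers, and enthusiasts who work jointly on researching, developing, optimizing, and applying large models.
+
+Many models are available today, each with its own strengths, and they iterate very quickly. The table below lists some companies and their main model families: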
+
+| **Company**                                       | **Model family**         |
+| ------------------------------------------------- | ------------------------ |
+| **OpenAI**                                        | GPT                      |
+| **Meta**                                          | Llama                    |
+| **Anthropic (founded by former OpenAI members)**  | Claude                   |
+| **xAI**                                           | Grok                     |
+| **Google**                                        | Gemini                   |
+| **Microsoft**                                     | Phi                      |
+| **Baidu**                                         | Wenxin (ERNIE)           |
+| **Alibaba**                                       | Tongyi Qianwen (Qwen), M6 |
+| **Tencent**                                       | Hunyuan                  |
+| **ByteDance**                                     | Doubao                   |
+| **Huawei**                                        | Pangu                    |
+
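+These communities show a pronounced Matthew effect: the leading models attract the most resources, the newest techniques, and the most users. Two influential communities, one international and one in China, are described below.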
+
+### Hugging Face
+
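+Community site: [https://huggingface.co/](https://huggingface.co/)
+
+Taking the Qwen model as an example, the code below shows how to run inference with Hugging Face's transformers library. Here `model_name` is the model's repository path on the Hub.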
+
+```python showLineNumbers
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_size = "3B"  # available sizes include 3B, 7B, 14B, 32B
+model_name = f"Qwen/Qwen2.5-{model_size}-Instruct"
+
+# Load the weights and tokenizer; dtype and device placement are chosen automatically.
+model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+while True:
+    prompt = input("Enter your question: ")
+    if prompt == "exit":
+        break
+
+    messages = [
+        {
+            "role": "system",
+            "content": "You are an AI assistant created by Alibaba Cloud. You are a helpful assistant. You always answer in Chinese.",
+        },
+        {"role": "user", "content": prompt},
+    ]
+    # Render the conversation into the model's chat format as a plain string.
+    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+    model_input = tokenizer([text], return_tensors="pt").to(model.device)
+
+    generated_ids = model.generate(**model_input, max_new_tokens=512)
+    # Drop the prompt tokens so that only the newly generated reply is decoded.
+    generated_ids = [output[len(input_ids):] for input_ids, output in zip(model_input.input_ids, generated_ids)]
+
+    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+    print(response)
+```
+
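+### ModelScope (Alibaba DAMO Academy)
+
+Community site: [https://www.modelscope.cn/](https://www.modelscope.cn/)
+
+As an alternative to Hugging Face's transformers library, ModelScope provides the modelscope library, which is suited to network conditions in China and makes inference convenient there. The code is almost identical to the Hugging Face version.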
+
+```python showLineNumbers
+from modelscope import AutoModelForCausalLM, AutoTokenizer
+
+model_size = "0.5B"  # other sizes include 3B, 7B, 14B, 32B
+model_name = f"Qwen/Qwen2.5-{model_size}-Instruct"
+
+# Load the weights and tokenizer; dtype and device placement are chosen automatically.
+model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+while True:
+    prompt = input("Enter your question: ")
+    if prompt == "exit":
+        break
+
+    messages = [
+        {
+            "role": "system",
+            "content": "You are an AI assistant created by Alibaba Cloud. You are a helpful assistant. You always answer in Chinese.",
+        },
+        {"role": "user", "content": prompt},
+    ]
+    # Render the conversation into the model's chat format as a plain string.
+    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+    model_input = tokenizer([text], return_tensors="pt").to(model.device)
+
+    generated_ids = model.generate(**model_input, max_new_tokens=512)
+    # Drop the prompt tokens so that only the newly generated reply is decoded.
+    generated_ids = [output[len(input_ids):] for input_ids, output in zip(model_input.input_ids, generated_ids)]
+
+    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+    print(response)
+```
+
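+## Commercial APIs
+
+Commercial APIs are broadly similar to one another. One Chinese provider and one international provider are listed here as examples.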
+
+### OpenAI
+
+Website: [https://openai.com/](https://openai.com/)
+
+### Baidu
+
+Website: [https://cloud.baidu.com/](https://cloud.baidu.com/)
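The commit lists the providers without showing a call. As a rough sketch of what a commercial chat API request looks like, the snippet below builds (but does not send) a request to OpenAI's chat completions endpoint using only the standard library. The helper name `build_chat_request`, the model name `gpt-4o-mini`, and the placeholder key are assumptions for illustration; consult the provider's own documentation for current models and parameters.

```python
import json
import os
import urllib.request

OPENAI_CHAT_URL = "https://api.openai.com/v1/chat/completions"


def build_chat_request(api_key, model, user_text):
    """Build an HTTP POST request for a single-turn chat completion."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        OPENAI_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# The key is read from the environment; the fallback is a non-working placeholder.
api_key = os.environ.get("OPENAI_API_KEY", "sk-placeholder")
request = build_chat_request(api_key, "gpt-4o-mini", "Hello")
print(request.get_method(), request.full_url)

# Sending the request needs a valid key and network access:
# with urllib.request.urlopen(request) as response:
#     reply = json.load(response)
#     print(reply["choices"][0]["message"]["content"])
```

The request is built without being sent so that the payload shape can be inspected offline; the `messages` list has the same structure as in the local inference examples.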
diff --git a/docs/docs/机器学习/大语言模型部署/模型部署.md b/docs/docs/机器学习/大语言模型部署/模型部署.md
@@ -1,20 +1,10 @@
 ---
-sidebar_position: 1
-title: 🚧Model Fine-Tuning and Deployment
+sidebar_position: 5
+title: 🚧Model Deployment
 ---
 
-## Model Fine-Tuning
-
-### LLaMA-Factory
-
-## Prompt Engineering
-
-### Prompt Engineering Guide
-
-[https://www.promptingguide.ai/zh](https://www.promptingguide.ai/zh)
-
-## Model Deployment
 
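+Model deployment has three main requirements: high concurrency, low latency, and a small resource footprint. Tools that address these requirements include Ollama, vLLM, and Llama-Cpp-Python.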
 
 | Dimension              | Ollama                                             | vLLM                                            | Llama-Cpp-Python                                |
 |------------------------|---------------------------------------------------|------------------------------------------------|------------------------------------------------|
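@@ -35,17 +25,17 @@ title: 🚧Model Fine-Tuning and Deployment
 | **Maturity**           | Emerging tool whose features are still being filled in | Industrial-grade project focused on high-performance inference | Mature project widely used in lightweight LLM applications |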
 
 
-### Ollama
+## Ollama
 
 GitHub: https://github.com/ollama/ollama
 
 Official site: https://ollama.com/
 
-### vLLM
+## vLLM
 
 
 Official documentation: https://docs.vllm.ai/en/latest/getting_started/quickstart.html
 
 
-### llama-cpp-python
+## llama-cpp-python
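Ollama, vLLM, and llama-cpp-python can each serve a model over HTTP with an OpenAI-compatible chat endpoint, so one client can target any of them by switching the base URL. The sketch below assumes each server runs locally on its default port (11434 for Ollama, 8000 for `vllm serve` and for `python -m llama_cpp.server`). The dictionary and helper names are hypothetical, and the ports should be checked against your own setup.

```python
# Assumed default local base URLs for each tool's OpenAI-compatible server.
DEFAULT_BASE_URLS = {
    "ollama": "http://localhost:11434/v1",
    "vllm": "http://localhost:8000/v1",
    "llama-cpp-python": "http://localhost:8000/v1",
}


def chat_completions_url(backend):
    """Return the chat completions URL for a known local backend."""
    try:
        base_url = DEFAULT_BASE_URLS[backend]
    except KeyError:
        raise ValueError(f"unknown backend: {backend!r}") from None
    return f"{base_url}/chat/completions"


for backend in DEFAULT_BASE_URLS:
    print(backend, "->", chat_completions_url(backend))
```

Because vLLM and llama-cpp-python share the same default port in this sketch, running both on one machine would require changing one of them.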