|
93 | 93 | |llama2-7b-aqlm-2bit-1x16|[AI-ModelScope/Llama-2-7b-AQLM-2Bit-1x16-hf](https://modelscope.cn/models/AI-ModelScope/Llama-2-7b-AQLM-2Bit-1x16-hf/summary)|q_proj, k_proj, v_proj|default-generation-bos|✔|✘|transformers>=4.38, aqlm, torch>=2.2.0|-|[ISTA-DASLab/Llama-2-7b-AQLM-2Bit-1x16-hf](https://huggingface.co/ISTA-DASLab/Llama-2-7b-AQLM-2Bit-1x16-hf)| |
94 | 94 | |llama3-8b|[LLM-Research/Meta-Llama-3-8B](https://modelscope.cn/models/LLM-Research/Meta-Llama-3-8B/summary)|q_proj, k_proj, v_proj|default-generation|✔|✔||-|[meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)| |
95 | 95 | |llama3-8b-instruct|[LLM-Research/Meta-Llama-3-8B-Instruct](https://modelscope.cn/models/LLM-Research/Meta-Llama-3-8B-Instruct/summary)|q_proj, k_proj, v_proj|llama3|✔|✔||-|[meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)| |
96 | | -|llama3-8b-instruct-int4|[huangjintao/Meta-Llama-3-8B-Instruct-GPTQ-Int4](https://modelscope.cn/models/huangjintao/Meta-Llama-3-8B-Instruct-GPTQ-Int4/summary)|q_proj, k_proj, v_proj|llama3|✔|✔|auto_gptq|-|-| |
97 | | -|llama3-8b-instruct-int8|[huangjintao/Meta-Llama-3-8B-Instruct-GPTQ-Int8](https://modelscope.cn/models/huangjintao/Meta-Llama-3-8B-Instruct-GPTQ-Int8/summary)|q_proj, k_proj, v_proj|llama3|✔|✔|auto_gptq|-|-| |
98 | | -|llama3-8b-instruct-awq|[huangjintao/Meta-Llama-3-8B-Instruct-AWQ](https://modelscope.cn/models/huangjintao/Meta-Llama-3-8B-Instruct-AWQ/summary)|q_proj, k_proj, v_proj|llama3|✔|✔|autoawq|-|-| |
| 96 | +|llama3-8b-instruct-int4|[huangjintao/Meta-Llama-3-8B-Instruct-GPTQ-Int4](https://modelscope.cn/models/huangjintao/Meta-Llama-3-8B-Instruct-GPTQ-Int4/summary)|q_proj, k_proj, v_proj|llama3|✔|✔|auto_gptq|-|[study-hjt/Meta-Llama-3-8B-Instruct-GPTQ-Int4](https://huggingface.co/study-hjt/Meta-Llama-3-8B-Instruct-GPTQ-Int4)| |
| 97 | +|llama3-8b-instruct-int8|[huangjintao/Meta-Llama-3-8B-Instruct-GPTQ-Int8](https://modelscope.cn/models/huangjintao/Meta-Llama-3-8B-Instruct-GPTQ-Int8/summary)|q_proj, k_proj, v_proj|llama3|✔|✔|auto_gptq|-|[study-hjt/Meta-Llama-3-8B-Instruct-GPTQ-Int8](https://huggingface.co/study-hjt/Meta-Llama-3-8B-Instruct-GPTQ-Int8)| |
| 98 | +|llama3-8b-instruct-awq|[huangjintao/Meta-Llama-3-8B-Instruct-AWQ](https://modelscope.cn/models/huangjintao/Meta-Llama-3-8B-Instruct-AWQ/summary)|q_proj, k_proj, v_proj|llama3|✔|✔|autoawq|-|[study-hjt/Meta-Llama-3-8B-Instruct-AWQ](https://huggingface.co/study-hjt/Meta-Llama-3-8B-Instruct-AWQ)| |
99 | 99 | |llama3-70b|[LLM-Research/Meta-Llama-3-70B](https://modelscope.cn/models/LLM-Research/Meta-Llama-3-70B/summary)|q_proj, k_proj, v_proj|default-generation|✔|✔||-|[meta-llama/Meta-Llama-3-70B](https://huggingface.co/meta-llama/Meta-Llama-3-70B)| |
100 | 100 | |llama3-70b-instruct|[LLM-Research/Meta-Llama-3-70B-Instruct](https://modelscope.cn/models/LLM-Research/Meta-Llama-3-70B-Instruct/summary)|q_proj, k_proj, v_proj|llama3|✔|✔||-|[meta-llama/Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)| |
101 | | -|llama3-70b-instruct-int4|[huangjintao/Meta-Llama-3-70B-Instruct-GPTQ-Int4](https://modelscope.cn/models/huangjintao/Meta-Llama-3-70B-Instruct-GPTQ-Int4/summary)|q_proj, k_proj, v_proj|llama3|✔|✔|auto_gptq|-|-| |
102 | | -|llama3-70b-instruct-int8|[huangjintao/Meta-Llama-3-70b-Instruct-GPTQ-Int8](https://modelscope.cn/models/huangjintao/Meta-Llama-3-70b-Instruct-GPTQ-Int8/summary)|q_proj, k_proj, v_proj|llama3|✔|✔|auto_gptq|-|-| |
103 | | -|llama3-70b-instruct-awq|[huangjintao/Meta-Llama-3-70B-Instruct-AWQ](https://modelscope.cn/models/huangjintao/Meta-Llama-3-70B-Instruct-AWQ/summary)|q_proj, k_proj, v_proj|llama3|✔|✔|autoawq|-|-| |
| 101 | +|llama3-70b-instruct-int4|[huangjintao/Meta-Llama-3-70B-Instruct-GPTQ-Int4](https://modelscope.cn/models/huangjintao/Meta-Llama-3-70B-Instruct-GPTQ-Int4/summary)|q_proj, k_proj, v_proj|llama3|✔|✔|auto_gptq|-|[study-hjt/Meta-Llama-3-70B-Instruct-GPTQ-Int4](https://huggingface.co/study-hjt/Meta-Llama-3-70B-Instruct-GPTQ-Int4)| |
| 102 | +|llama3-70b-instruct-int8|[huangjintao/Meta-Llama-3-70b-Instruct-GPTQ-Int8](https://modelscope.cn/models/huangjintao/Meta-Llama-3-70b-Instruct-GPTQ-Int8/summary)|q_proj, k_proj, v_proj|llama3|✔|✔|auto_gptq|-|[study-hjt/Meta-Llama-3-70B-Instruct-GPTQ-Int8](https://huggingface.co/study-hjt/Meta-Llama-3-70B-Instruct-GPTQ-Int8)| |
| 103 | +|llama3-70b-instruct-awq|[huangjintao/Meta-Llama-3-70B-Instruct-AWQ](https://modelscope.cn/models/huangjintao/Meta-Llama-3-70B-Instruct-AWQ/summary)|q_proj, k_proj, v_proj|llama3|✔|✔|autoawq|-|[study-hjt/Meta-Llama-3-70B-Instruct-AWQ](https://huggingface.co/study-hjt/Meta-Llama-3-70B-Instruct-AWQ)| |
104 | 104 | |chinese-llama-2-1_3b|[AI-ModelScope/chinese-llama-2-1.3b](https://modelscope.cn/models/AI-ModelScope/chinese-llama-2-1.3b/summary)|q_proj, k_proj, v_proj|default-generation|✔|✔||-|[hfl/chinese-llama-2-1.3b](https://huggingface.co/hfl/chinese-llama-2-1.3b)| |
105 | 105 | |chinese-llama-2-7b|[AI-ModelScope/chinese-llama-2-7b](https://modelscope.cn/models/AI-ModelScope/chinese-llama-2-7b/summary)|q_proj, k_proj, v_proj|default-generation|✔|✔||-|[hfl/chinese-llama-2-7b](https://huggingface.co/hfl/chinese-llama-2-7b)| |
106 | 106 | |chinese-llama-2-7b-16k|[AI-ModelScope/chinese-llama-2-7b-16k](https://modelscope.cn/models/AI-ModelScope/chinese-llama-2-7b-16k/summary)|q_proj, k_proj, v_proj|default-generation|✔|✔||-|[hfl/chinese-llama-2-7b-16k](https://huggingface.co/hfl/chinese-llama-2-7b-16k)| |
|
243 | 243 | |codefuse-codegeex2-6b-chat|[codefuse-ai/CodeFuse-CodeGeeX2-6B](https://modelscope.cn/models/codefuse-ai/CodeFuse-CodeGeeX2-6B/summary)|query_key_value|codefuse|✘|✔|transformers<4.34|coding|[codefuse-ai/CodeFuse-CodeGeeX2-6B](https://huggingface.co/codefuse-ai/CodeFuse-CodeGeeX2-6B)| |
244 | 244 | |codefuse-qwen-14b-chat|[codefuse-ai/CodeFuse-QWen-14B](https://modelscope.cn/models/codefuse-ai/CodeFuse-QWen-14B/summary)|c_attn|codefuse|✔|✔||coding|[codefuse-ai/CodeFuse-QWen-14B](https://huggingface.co/codefuse-ai/CodeFuse-QWen-14B)| |
245 | 245 | |phi2-3b|[AI-ModelScope/phi-2](https://modelscope.cn/models/AI-ModelScope/phi-2/summary)|Wqkv|default-generation|✔|✔||coding|[microsoft/phi-2](https://huggingface.co/microsoft/phi-2)| |
| 246 | +|phi3-4b-4k-instruct|[LLM-Research/Phi-3-mini-4k-instruct](https://modelscope.cn/models/LLM-Research/Phi-3-mini-4k-instruct/summary)|qkv_proj|phi3|✔|✘|transformers>=4.36|general|[microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct)| |
| 247 | +|phi3-4b-128k-instruct|[LLM-Research/Phi-3-mini-128k-instruct](https://modelscope.cn/models/LLM-Research/Phi-3-mini-128k-instruct/summary)|qkv_proj|phi3|✔|✘|transformers>=4.36|general|[microsoft/Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct)| |
246 | 248 | |cogvlm-17b-instruct|[ZhipuAI/cogvlm-chat](https://modelscope.cn/models/ZhipuAI/cogvlm-chat/summary)|vision_expert_query_key_value, vision_expert_dense, language_expert_query_key_value, language_expert_dense|cogvlm-instruct|✘|✘||multi-modal, vision|[THUDM/cogvlm-chat-hf](https://huggingface.co/THUDM/cogvlm-chat-hf)| |
247 | 249 | |cogagent-18b-chat|[ZhipuAI/cogagent-chat](https://modelscope.cn/models/ZhipuAI/cogagent-chat/summary)|vision_expert_query_key_value, vision_expert_dense, language_expert_query_key_value, language_expert_dense, query, key_value, dense|cogagent-chat|✘|✘||multi-modal, vision|[THUDM/cogagent-chat-hf](https://huggingface.co/THUDM/cogagent-chat-hf)| |
248 | 250 | |cogagent-18b-instruct|[ZhipuAI/cogagent-vqa](https://modelscope.cn/models/ZhipuAI/cogagent-vqa/summary)|vision_expert_query_key_value, vision_expert_dense, language_expert_query_key_value, language_expert_dense, query, key_value, dense|cogagent-instruct|✘|✘||multi-modal, vision|[THUDM/cogagent-vqa-hf](https://huggingface.co/THUDM/cogagent-vqa-hf)| |
|
291 | 293 | |open-orca-gpt4|[AI-ModelScope/OpenOrca](https://modelscope.cn/datasets/AI-ModelScope/OpenOrca/summary)|994896|0|382.3±417.4, min=31, max=8740|chat, multilingual, general|-| |
292 | 294 | |sharegpt-gpt4|[AI-ModelScope/sharegpt_gpt4](https://modelscope.cn/datasets/AI-ModelScope/sharegpt_gpt4/summary)|103063|0|1286.2±2089.4, min=22, max=221080|chat, multilingual, general, multi-round|-| |
293 | 295 | |🔥sharegpt-gpt4-mini|[AI-ModelScope/sharegpt_gpt4](https://modelscope.cn/datasets/AI-ModelScope/sharegpt_gpt4/summary)|6205|0|3511.6±6068.5, min=33, max=116018|chat, multilingual, general, multi-round, gpt4|-| |
| 296 | +|deepctrl-sft-zh|[AI-ModelScope/deepctrl-sft-data](https://modelscope.cn/datasets/AI-ModelScope/deepctrl-sft-data/summary)|11381621|0|274.2±311.5, min=21, max=9835|chat, general, sft, multi-round|-| |
| 297 | +|deepctrl-sft-en|[AI-ModelScope/deepctrl-sft-data](https://modelscope.cn/datasets/AI-ModelScope/deepctrl-sft-data/summary)|2767403|0|865.2±1157.6, min=22, max=626237|chat, general, sft, multi-round|-| |
294 | 298 | |🔥ms-agent|[iic/ms_agent](https://modelscope.cn/datasets/iic/ms_agent/summary)|30000|0|647.7±217.1, min=199, max=2722|chat, agent, multi-round|-| |
295 | 299 | |ms-agent-for-agentfabric-default|[AI-ModelScope/ms_agent_for_agentfabric](https://modelscope.cn/datasets/AI-ModelScope/ms_agent_for_agentfabric/summary)|30000|0|617.8±199.1, min=251, max=2657|chat, agent, multi-round|-| |
296 | 300 | |ms-agent-for-agentfabric-addition|[AI-ModelScope/ms_agent_for_agentfabric](https://modelscope.cn/datasets/AI-ModelScope/ms_agent_for_agentfabric/summary)|488|0|2084.9±1514.8, min=489, max=7354|chat, agent, multi-round|-| |
|
0 commit comments