@@ -10,9 +10,57 @@ import DeepseekIcon from '@site/static/img/icons/models/Deepseek-logo-icon.svg';
 import GPTIcon from '@site/static/img/icons/models/GPT-logo.svg';
 import QwenIcon from '@site/static/img/icons/models/Qwen_logo.svg';
 import ZaiIcon from '@site/static/img/icons/models/zai-logo.svg';
+import CodeBlock from '@theme/CodeBlock';
 
-NEAR AI Cloud offers a curated catalog of high-performance models spanning reasoning, tool use,
-and long-context understanding. Pricing is listed per million tokens for easy comparison.
+NEAR AI Cloud provides access to leading AI models, each optimized for different use cases — from advanced reasoning and tool calling to long-context processing and multilingual tasks. All models run in secure TEE environments with transparent, pay-per-use pricing.
+
+## Quick Reference
+
+<table>
+  <thead>
+    <tr>
+      <th>Model ID</th>
+      <th>Context</th>
+      <th>Input Price</th>
+      <th>Output Price</th>
+      <th>Best For</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><CodeBlock language="text">deepseek-ai/DeepSeek-V3.1</CodeBlock></td>
+      <td>128K</td>
+      <td>$1.00/M</td>
+      <td>$2.50/M</td>
+      <td>Hybrid thinking mode, tool calling, agent tasks</td>
+    </tr>
+    <tr>
+      <td><CodeBlock language="text">openai/gpt-oss-120b</CodeBlock></td>
+      <td>131K</td>
+      <td>$0.20/M</td>
+      <td>$0.60/M</td>
+      <td>Open-weight, high-reasoning, agentic workflows, configurable depth</td>
+    </tr>
+    <tr>
+      <td><CodeBlock language="text">Qwen/Qwen3-30B-A3B-Instruct-2507</CodeBlock></td>
+      <td>262K</td>
+      <td>$0.15/M</td>
+      <td>$0.45/M</td>
+      <td>Ultra-long context (262K), reasoning, instruction following, multilingual</td>
+    </tr>
+    <tr>
+      <td><CodeBlock language="text">zai-org/GLM-4.6-FP8</CodeBlock></td>
+      <td>200K</td>
+      <td>$0.75/M</td>
+      <td>$2.00/M</td>
+      <td>Agentic applications, advanced coding, tool use, refined writing</td>
+    </tr>
+  </tbody>
+</table>
+
+---
+
+## Model Details
 
 <div className="doc-model-grid">
   <div className="doc-model-card">
@@ -22,7 +70,6 @@ and long-context understanding. Pricing is listed per million tokens for easy co
       </div>
       <div>
         <h3>DeepSeek V3.1</h3>
-        <p className="doc-model-provider">deepseek-ai/DeepSeek-V3.1</p>
       </div>
     </div>
     <p>
@@ -39,6 +86,8 @@ and long-context understanding. Pricing is listed per million tokens for easy co
       <span>$1.00/M input tokens</span>
       <span>$2.50/M output tokens</span>
     </div>
+    <p><strong>Model ID:</strong></p>
+    <CodeBlock language="text">deepseek-ai/DeepSeek-V3.1</CodeBlock>
   </div>
 
   <div className="doc-model-card">
@@ -48,23 +97,24 @@ and long-context understanding. Pricing is listed per million tokens for easy co
       </div>
       <div>
         <h3>GPT OSS 120B</h3>
-        <p className="doc-model-provider">openai/gpt-oss-120b</p>
       </div>
     </div>
     <p>
-      GPT OSS 120B is OpenAI&rsquo;s 117B-parameter Mixture-of-Experts model for production reasoning and
-      agentic workflows. It activates just 5.1B parameters per pass, runs efficiently on a single H100
-      via native MXFP4 quantization, and supports configurable reasoning depth.
+      GPT OSS 120B is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI
+      designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B
+      parameters per forward pass and is optimized to run on a single H100 GPU with native MXFP4 quantization.
     </p>
     <p>
-      You get full chain-of-thought visibility, native tool use (function calling, browsing, structured
-      outputs), and high reliability for complex pipelines.
+      The model supports configurable reasoning depth, full chain-of-thought access, and native tool use,
+      including function calling, browsing, and structured output generation.
     </p>
     <div className="doc-model-meta">
       <span>131K context</span>
       <span>$0.20/M input tokens</span>
       <span>$0.60/M output tokens</span>
     </div>
+    <p><strong>Model ID:</strong></p>
+    <CodeBlock language="text">openai/gpt-oss-120b</CodeBlock>
   </div>
 
   <div className="doc-model-card">
@@ -74,23 +124,22 @@ and long-context understanding. Pricing is listed per million tokens for easy co
       </div>
       <div>
         <h3>Qwen3 30B A3B Instruct 2507</h3>
-        <p className="doc-model-provider">Qwen/Qwen3-30B-A3B-Instruct-2507</p>
       </div>
     </div>
     <p>
-      Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter MoE model (3.3B active per inference) with an
-      ultra-long 262K context window. It excels at instruction following, logical reasoning, coding,
-      mathematics, multilingual tasks, and preference alignment&mdash;all in non-thinking mode.
-    </p>
-    <p>
-      Use it when you need multilingual comprehension and strong instruction adherence without the
-      overhead of a full reasoning model.
+      Qwen3-30B-A3B-Instruct-2507 is a mixture-of-experts (MoE) causal language model featuring 30.5 billion
+      total parameters and 3.3 billion activated parameters per inference. It supports ultra-long context up
+      to 262K tokens and operates exclusively in non-thinking mode, delivering strong enhancements in
+      instruction following, reasoning, logical comprehension, mathematics, coding, multilingual understanding,
+      and alignment with user preferences.
     </p>
     <div className="doc-model-meta">
       <span>262K context</span>
       <span>$0.15/M input tokens</span>
       <span>$0.45/M output tokens</span>
     </div>
+    <p><strong>Model ID:</strong></p>
+    <CodeBlock language="text">Qwen/Qwen3-30B-A3B-Instruct-2507</CodeBlock>
   </div>
 
   <div className="doc-model-card">
@@ -100,22 +149,25 @@ and long-context understanding. Pricing is listed per million tokens for easy co
       </div>
       <div>
         <h3>GLM-4.6 FP8</h3>
-        <p className="doc-model-provider">zai-org/GLM-4.6-FP8</p>
       </div>
     </div>
     <p>
-      GLM-4.6 FP8 from Zhipu AI packs 358B parameters into an FP8-quantized deployment with a 128K
-      context window. It shines in advanced coding, multi-step reasoning, and tool calling while
-      boosting token efficiency by up to 15% versus GLM-4.5.
+      GLM-4.6 is the latest flagship model in the GLM (General Language Model) series by Z.ai (formerly Zhipu AI).
+      It is oriented toward agentic applications: reasoning, tool usage, coding/engineering workflows, and long-context tasks.
+      The FP8 quantized version maintains full performance while optimizing for efficient deployment.
     </p>
     <p>
-      Positioned as a competitor to Claude Sonnet 4 and DeepSeek-V3.1-Terminus, it delivers premium
-      writing quality and robust agentic workflow support for production environments.
+      Compared with GLM-4.5, GLM-4.6 brings several key improvements: a longer 200K context window (expanded from 128K),
+      superior coding performance with better real-world results in applications like Claude Code and Cline, advanced
+      reasoning with tool use during inference, more capable search-based agents, and refined writing that better aligns
+      with human preferences in style and readability.
     </p>
     <div className="doc-model-meta">
-      <span>131K context</span>
+      <span>200K context</span>
       <span>$0.75/M input tokens</span>
       <span>$2.00/M output tokens</span>
     </div>
+    <p><strong>Model ID:</strong></p>
+    <CodeBlock language="text">zai-org/GLM-4.6-FP8</CodeBlock>
   </div>
 </div>
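The Model ID strings this diff adds are what a caller passes as the `model` field of a request. Assuming an OpenAI-compatible chat-completions interface (this page shows no endpoint, so the helper below only builds the JSON body; the function name is illustrative), a minimal sketch:

```python
import json

# Model IDs copied verbatim from the Quick Reference table.
MODEL_IDS = {
    "deepseek-ai/DeepSeek-V3.1",
    "openai/gpt-oss-120b",
    "Qwen/Qwen3-30B-A3B-Instruct-2507",
    "zai-org/GLM-4.6-FP8",
}

def build_chat_request(model_id: str, prompt: str) -> str:
    """Serialize a minimal OpenAI-style chat-completions request body."""
    if model_id not in MODEL_IDS:
        raise ValueError(f"unknown model ID: {model_id}")
    return json.dumps({
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
    })

payload = build_chat_request("openai/gpt-oss-120b", "Summarize TEEs in one sentence.")
```

Rejecting unknown IDs up front catches typos before a request ever leaves the client.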
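Since every price in the catalog is quoted per million tokens, estimated spend is a simple linear calculation; a small sketch using the table's prices (the helper name is illustrative):

```python
# Per-million-token prices in USD from the Quick Reference table: (input, output).
PRICES = {
    "deepseek-ai/DeepSeek-V3.1": (1.00, 2.50),
    "openai/gpt-oss-120b": (0.20, 0.60),
    "Qwen/Qwen3-30B-A3B-Instruct-2507": (0.15, 0.45),
    "zai-org/GLM-4.6-FP8": (0.75, 2.00),
}

def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD spend: token counts scaled by the per-million prices."""
    input_price, output_price = PRICES[model_id]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# One million tokens in and one million out on GPT OSS 120B: 0.20 + 0.60 USD.
cost = estimate_cost("openai/gpt-oss-120b", 1_000_000, 1_000_000)
```

The same arithmetic makes the models easy to compare: at identical traffic, Qwen3 30B A3B is the cheapest option in the catalog and DeepSeek V3.1 the most expensive.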