Commit aa3c4c3

fix: all previous-model output multipliers → 3 (gpt-4o-mini, o3-mini, o1)
gpt-4o-mini: verified against the example table on the latency page of the official MS docs (RPM=1000, prompt=5000, completion=50 → 140 PTUs, which only works with 3×).
o3-mini, o1: same 'previous model' category; the docs state 'older models use a different ratio' without giving a value, so these are aligned with the verified models.
gpt-4o was already fixed in the prior commit (verified via the Foundry calculator).
1 parent 9047ad5 commit aa3c4c3

1 file changed (+7 -4 lines changed)

src/azure_pricing_mcp/services/ptu_models.py

Lines changed: 7 additions & 4 deletions
@@ -58,7 +58,10 @@
 # - gpt-5 family: explicitly documented as 8× (1 output = 8 input tokens)
 # - gpt-4.1 family: explicitly documented as 4× (1 output = 4 input tokens)
 # - Llama-3.3-70B-Instruct: explicitly documented as 4× (exception to pricing ratio)
-# - Older models / others: inferred from pricing ratios where not explicitly stated
+# - Previous Azure OpenAI models (gpt-4o, gpt-4o-mini): 3× (verified via
+#   Foundry calculator and official MS docs example tables)
+# - o3-mini, o1: assumed 3× (same "previous model" category; docs say
+#   "older models use a different ratio" without specifying)
 # ---------------------------------------------------------------------------
 
 PTU_MODEL_TABLE: dict[str, dict] = {
@@ -162,23 +165,23 @@
     },
     "gpt-4o-mini": {
         "input_tpm_per_ptu": 37_000,
-        "output_multiplier": 4,
+        "output_multiplier": 3, # Verified via official MS docs example table (latency page)
         "global_min_ptus": 15,
         "global_increment": 5,
         "regional_min_ptus": 25,
         "regional_increment": 25,
     },
     "o3-mini": {
         "input_tpm_per_ptu": 2_500,
-        "output_multiplier": 4,
+        "output_multiplier": 3, # Previous model; docs: "older models use a different ratio"
         "global_min_ptus": 15,
         "global_increment": 5,
         "regional_min_ptus": 25,
         "regional_increment": 25,
     },
     "o1": {
         "input_tpm_per_ptu": 230,
-        "output_multiplier": 4,
+        "output_multiplier": 3, # Previous model; docs: "older models use a different ratio"
         "global_min_ptus": 15,
         "global_increment": 5,
         "regional_min_ptus": 25,
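
The gpt-4o-mini check cited in the commit message (RPM=1000, prompt=5000, completion=50 → 140 PTUs) can be reproduced with a short sketch. The sizing formula here is an assumption, not code from this repository: output tokens are weighted by the multiplier, added to input tokens, divided by `input_tpm_per_ptu`, and rounded up to the next global increment. `estimate_ptus` is a hypothetical helper name; the constants are the gpt-4o-mini values from the diff.

```python
import math

# gpt-4o-mini values as they appear in PTU_MODEL_TABLE in the diff.
INPUT_TPM_PER_PTU = 37_000
GLOBAL_MIN_PTUS = 15
GLOBAL_INCREMENT = 5


def estimate_ptus(rpm, prompt_tokens, completion_tokens, output_multiplier):
    """Hypothetical helper using an ASSUMED sizing formula (not the repo's code).

    Returns (raw_ptus, purchasable_ptus), where purchasable_ptus is the raw
    value rounded up to GLOBAL_MIN_PTUS plus a whole number of increments.
    """
    input_tpm = rpm * prompt_tokens
    output_tpm = rpm * completion_tokens
    # Each output token is counted as `output_multiplier` input tokens.
    effective_tpm = input_tpm + output_multiplier * output_tpm
    raw = effective_tpm / INPUT_TPM_PER_PTU
    steps = max(0, math.ceil((raw - GLOBAL_MIN_PTUS) / GLOBAL_INCREMENT))
    return raw, GLOBAL_MIN_PTUS + steps * GLOBAL_INCREMENT


if __name__ == "__main__":
    for multiplier in (3, 4):
        raw, sized = estimate_ptus(1000, 5000, 50, multiplier)
        print(f"multiplier={multiplier}: raw={raw:.2f}, sized={sized}")
```

Under this assumed formula, a multiplier of 3 gives about 139.19 raw PTUs, which rounds up to 140, while a multiplier of 4 gives about 140.54, which already exceeds 140 and rounds up to 145. That is consistent with the commit's claim that the documented 140-PTU example only works with 3×.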
