
Commit 83301f3

Merge pull request #29 from msftnadavbh/fix/ptu-multiplier-and-docs

fix: correct output multipliers for all previous Azure OpenAI models + docs 7d→30d

2 parents: 188a4fb + aa3c4c3

2 files changed: +14 −11 lines

docs/USAGE_EXAMPLES.md (6 additions, 6 deletions)

````diff
@@ -439,24 +439,24 @@ The tool needs three required inputs: **RPM** (requests per minute), **avg input
 
 #### Option A — Azure CLI (no Log Analytics)
 
-Copy-paste this script — it queries the last 7 days and prints your three inputs:
+Copy-paste this script — it queries the last 30 days and prints your three inputs:
 
 ```bash
 # Replace {sub}, {rg}, {name} with your values
 RES="/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.CognitiveServices/accounts/{name}"
-START=$(date -u -d "7 days ago" +%Y-%m-%dT%H:%M:%SZ)
+START=$(date -u -d "30 days ago" +%Y-%m-%dT%H:%M:%SZ)
 END=$(date -u +%Y-%m-%dT%H:%M:%SZ)
 
 REQS=$(az monitor metrics list --resource "$RES" --metric AzureOpenAIRequests \
-  --aggregation Total --interval P7D --start-time "$START" --end-time "$END" \
+  --aggregation Total --interval P30D --start-time "$START" --end-time "$END" \
   --query "value[0].timeseries[0].data[0].total" -o tsv)
 
 INPUT=$(az monitor metrics list --resource "$RES" --metric ProcessedPromptTokens \
-  --aggregation Total --interval P7D --start-time "$START" --end-time "$END" \
+  --aggregation Total --interval P30D --start-time "$START" --end-time "$END" \
   --query "value[0].timeseries[0].data[0].total" -o tsv)
 
 OUTPUT=$(az monitor metrics list --resource "$RES" --metric GeneratedTokens \
-  --aggregation Total --interval P7D --start-time "$START" --end-time "$END" \
+  --aggregation Total --interval P30D --start-time "$START" --end-time "$END" \
   --query "value[0].timeseries[0].data[0].total" -o tsv)
 
 PEAK=$(az monitor metrics list --resource "$RES" --metric AzureOpenAIRequests \
@@ -476,7 +476,7 @@ Enable diagnostic settings on your OpenAI resource → send to Log Analytics, th
 ```kql
 AzureMetrics
 | where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
-| where TimeGenerated >= ago(7d)
+| where TimeGenerated >= ago(30d)
 | summarize
     TotalRequests = sumif(Total, MetricName == "AzureOpenAIRequests"),
     TotalInputTok = sumif(Total, MetricName == "ProcessedPromptTokens"),
````
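As a sketch of the arithmetic the script's totals imply (the helper name and sample numbers are illustrative, not part of the repo), the 30-day totals can be converted into the calculator's three inputs in a few lines of Python; a 30-day window is assumed to contain 30 × 24 × 60 = 43,200 minutes:

```python
# Hypothetical post-processing of the script's printed totals; the
# parameters (reqs, input_tok, output_tok) mirror REQS/INPUT/OUTPUT above.
def derive_inputs(reqs: float, input_tok: float, output_tok: float,
                  window_days: int = 30) -> dict[str, float]:
    """Convert window totals into RPM and average tokens per request."""
    minutes = window_days * 24 * 60  # 43,200 minutes in a 30-day window
    return {
        "rpm": reqs / minutes,
        "avg_input_tokens": input_tok / reqs,
        "avg_output_tokens": output_tok / reqs,
    }

# Example: 1,296,000 requests with ~650 input / ~250 output tokens each
print(derive_inputs(1_296_000, 842_400_000, 324_000_000))
# → {'rpm': 30.0, 'avg_input_tokens': 650.0, 'avg_output_tokens': 250.0}
```

Note that this yields the *average* RPM over the window; the `PEAK` query in the script exists precisely because sizing should also account for the busiest interval, not just the mean.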

src/azure_pricing_mcp/services/ptu_models.py (8 additions, 5 deletions)

```diff
@@ -58,7 +58,10 @@
 # - gpt-5 family: explicitly documented as 8× (1 output = 8 input tokens)
 # - gpt-4.1 family: explicitly documented as 4× (1 output = 4 input tokens)
 # - Llama-3.3-70B-Instruct: explicitly documented as 4× (exception to pricing ratio)
-# - Older models / others: inferred from pricing ratios where not explicitly stated
+# - Previous Azure OpenAI models (gpt-4o, gpt-4o-mini): 3× (verified via
+#   Foundry calculator and official MS docs example tables)
+# - o3-mini, o1: assumed 3× (same "previous model" category; docs say
+#   "older models use a different ratio" without specifying)
 # ---------------------------------------------------------------------------
 
 PTU_MODEL_TABLE: dict[str, dict] = {
@@ -154,31 +157,31 @@
     # ── Previous Azure OpenAI models ────────────────────────────────────
     "gpt-4o": {
         "input_tpm_per_ptu": 2_500,
-        "output_multiplier": 4,
+        "output_multiplier": 3,  # Verified via Foundry calculator; older model, different ratio
         "global_min_ptus": 15,
         "global_increment": 5,
         "regional_min_ptus": 50,
         "regional_increment": 50,
     },
     "gpt-4o-mini": {
         "input_tpm_per_ptu": 37_000,
-        "output_multiplier": 4,
+        "output_multiplier": 3,  # Verified via official MS docs example table (latency page)
         "global_min_ptus": 15,
         "global_increment": 5,
         "regional_min_ptus": 25,
         "regional_increment": 25,
     },
     "o3-mini": {
         "input_tpm_per_ptu": 2_500,
-        "output_multiplier": 4,
+        "output_multiplier": 3,  # Previous model; docs: "older models use a different ratio"
         "global_min_ptus": 15,
         "global_increment": 5,
         "regional_min_ptus": 25,
         "regional_increment": 25,
     },
     "o1": {
         "input_tpm_per_ptu": 230,
-        "output_multiplier": 4,
+        "output_multiplier": 3,  # Previous model; docs: "older models use a different ratio"
         "global_min_ptus": 15,
         "global_increment": 5,
         "regional_min_ptus": 25,
```
