Description
Bug Report: JsonCTemplatePolicy causes template size inflation leading to 4MB limit errors
Summary
The JsonCTemplatePolicy in Azure CLI has a critical bug that causes ARM templates to be inflated in size due to improper handling of escaped JSON strings. This can cause templates that are under 4MB to fail with "template too large" errors when deployed via az deployment group create.
Environment
- Azure CLI Version: [Current main branch]
- Operating System: Linux/Windows/macOS (affects all platforms)
- Command:
az deployment group create --template-file <large-template.json>
Steps to Reproduce
- Create an ARM template that is close to but under 4MB in size (e.g., 3.8MB)
- Run:
az deployment group create --resource-group <rg> --template-file <template> --debug
- Observe the error: "The template size exceeds the maximum allowed size of 4MB"
- Verify that the same template works when calling the REST API directly
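For the first reproduction step, a test template of a chosen size can be generated rather than hand-built. The following is a minimal sketch, not part of Azure CLI: the helper name make_padded_template and the "padding" variable are hypothetical, and the padding consists of many short quoted strings so that escaping has a visible effect.

```python
import json


def make_padded_template(target_bytes: int) -> str:
    """Return pretty-printed template text no larger than target_bytes.

    Hypothetical helper for reproduction only. If target_bytes is smaller
    than the minimal template, the minimal template is returned instead.
    """
    def render(item_count: int) -> str:
        template = {
            "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
            "contentVersion": "1.0.0.0",
            # Padding made of short quoted strings (assumption: any valid
            # filler would do; quotes and newlines maximize escaping).
            "variables": {"padding": ["x"] * item_count},
            "resources": [],
        }
        return json.dumps(template, indent=2)

    base = len(render(1))
    per_item = len(render(2)) - base  # bytes added by one more list element
    extra_items = max(0, (target_bytes - base) // per_item)
    return render(1 + extra_items)


if __name__ == "__main__":
    text = make_padded_template(3_800_000)
    print(len(text.encode("utf-8")), "bytes")
```

Writing the returned text to a file gives a candidate for --template-file; whether it triggers the error depends on the actual service limit and the escaping overhead.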
Expected Behavior
- Templates under 4MB should deploy successfully via Azure CLI
- The request payload size should match the original template size
- --debug logs should show accurate request sizes
Actual Behavior
- Templates under 4MB fail with "template too large" errors
- The actual HTTP request contains escaped JSON that inflates the template size
- Direct REST API calls with the same template succeed
Root Cause Analysis
Location of Bug
File: src/azure-cli/azure/cli/command_modules/resource/custom.py
Class: JsonCTemplatePolicy
Method: on_request()
Lines: ~468-481
The Problem
The policy extracts the template from the serialized request data but fails to unescape it:
# Line 468: template contains ESCAPED JSON string from SDK serialization
template = modified_data["properties"]["template"]
# Line 481: Escaped string is concatenated directly (BUG!)
json_data = partial_request[:-2] + ", template:" + template + r"}}"
What Should Happen vs What Actually Happens
Expected flow:
- Template content (JSON string) → SDK serializes → escaped JSON string
- Policy extracts escaped string → unescapes it → concatenates unescaped JSON
- Final payload contains original-size template
Actual (buggy) flow:
- Template content (JSON string) → SDK serializes → escaped JSON string
- Policy extracts escaped string → uses it directly → concatenates escaped JSON
- Final payload contains inflated template with escape characters
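The two flows can be compared outside the CLI with a small simulation. This is a sketch under stated assumptions: build_payload is a hypothetical stand-in that only mimics the concatenation pattern quoted above, and the "mode" property is a placeholder for whatever the real request carries.

```python
import json


def build_payload(properties: dict, template_fragment: str) -> bytes:
    """Mimic the quoted concatenation pattern (hypothetical helper)."""
    partial_request = json.dumps({"properties": properties})
    # Drop the trailing "}}" and splice the template text in before it.
    spliced = partial_request[:-2] + ", template:" + template_fragment + "}}"
    return spliced.encode("utf-8")


raw_template = '{\n  "contentVersion": "1.0.0.0"\n}'
# What the report says the policy receives: the template as an escaped
# JSON string literal.
escaped_template = json.dumps(raw_template)

expected_payload = build_payload({"mode": "Incremental"}, raw_template)
actual_payload = build_payload({"mode": "Incremental"}, escaped_template)

print(len(expected_payload), len(actual_payload))
```

The difference between the two lengths is exactly the number of escape characters plus the two surrounding quotes added by the string literal.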
Size Impact Example
Original template snippet:
{
"$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0"
}
After SDK serialization (what template variable contains):
"{\n \"$schema\": \"https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#\",\n \"contentVersion\": \"1.0.0.0\"\n}"
Size inflation:
- Every " becomes \" (+1 character each)
- Every newline becomes \n (+1 character each)
- Every \ becomes \\ (+1 character each)
For large templates, this can easily add 10-20% to the payload size.
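The overhead for a specific template can be measured instead of estimated. The sketch below assumes that Python's json.dumps is a fair approximation of the SDK's string escaping; the function name escape_overhead is hypothetical.

```python
import json


def escape_overhead(template_text: str) -> dict:
    """Report how much JSON string escaping adds to template_text."""
    # json.dumps escapes quotes, backslashes and control characters; with
    # the default ensure_ascii=True it also escapes non-ASCII characters.
    escaped = json.dumps(template_text)
    raw_bytes = len(template_text.encode("utf-8"))
    escaped_bytes = len(escaped.encode("utf-8"))
    percent = 100.0 * (escaped_bytes - raw_bytes) / raw_bytes if raw_bytes else 0.0
    return {
        "raw_bytes": raw_bytes,
        "escaped_bytes": escaped_bytes,
        "quotes": template_text.count('"'),
        "newlines": template_text.count("\n"),
        "backslashes": template_text.count("\\"),
        "overhead_percent": percent,
    }


sample = '{\n  "a": "b\\c"\n}'
report = escape_overhead(sample)
print(report)
```

For ASCII text without other control characters, the added bytes equal quotes + newlines + backslashes, plus 2 for the surrounding quotes of the string literal.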
Evidence
Code Comments Confirm the Issue
The existing code comments actually acknowledge this problem:
# Line 485-487 in JsonCTemplatePolicy.on_request():
# "This caused a very difficult-to-debug issue, because AzCLI's debug logs are written before this transformation.
# This means the logs do not accurately represent the bytes being sent to the server."
Direct REST API Works
Users can successfully deploy the same template using direct REST API calls, proving the template itself is under 4MB and valid.
Proposed Fix
def on_request(self, request):
    # ... existing code ...
    if modified_data.get('properties', {}).get('template'):
        template = modified_data["properties"]["template"]
        del modified_data["properties"]["template"]
        # FIX: Unescape the template to restore original size
        try:
            # Parse the escaped string back to the original JSON content
            unescaped_template = json.loads(template)
            if isinstance(unescaped_template, str):
                # An escaped JSON string literal decodes to the raw template text;
                # use it as-is (json.dumps on a str would escape it again)
                template_content = unescaped_template
            else:
                # Otherwise convert the parsed object back to a compact JSON string
                template_content = json.dumps(unescaped_template, separators=(',', ':'))
        except (json.JSONDecodeError, TypeError) as e:
            # Fallback to original escaped version if parsing fails
            logger.warning("Failed to unescape template, using original: %s", e)
            template_content = template
        partial_request = json.dumps(modified_data)
        json_data = partial_request[:-2] + ", template:" + template_content + r"}}"
        http_request.data = json_data.encode('utf-8')
Impact
- Severity: High - Prevents deployment of large (but valid) templates
- Scope: All Azure CLI users deploying templates near the 4MB limit
- Workaround: Use direct REST API calls or Azure PowerShell
Additional Context
This bug explains why some users report inconsistent behavior where:
- Templates work via REST API but fail via Azure CLI
- The same template sometimes works and sometimes doesn't (depending on formatting/whitespace)
- --debug logs show different payload sizes than expected
The policy was intended to preserve template whitespace by using JSONC format, but the implementation has this escaping bug that defeats the purpose and creates larger payloads instead.