| 1 | +# Normalized App Compose |
| 2 | + |
| 3 | +In the dstack project, the `app-compose.json` file defines application composition and deployment settings. To track changes and ensure data integrity across different environments, dstack needs to generate a deterministic SHA256 compose hash from this file. |
| 4 | + |
| 5 | +A compose hash is a SHA256 cryptographic hash computed from the `app-compose.json` content. This hash acts as a unique fingerprint for each application composition. When dstack processes the same `app-compose.json` file across different components - some built in Go, others in Python or JavaScript - they must all produce the exact same compose hash. This consistency is critical for dstack's distributed architecture and change detection system. |
| 6 | + |
| 7 | +The main problem is that standard JSON libraries in different languages often produce slightly different output from the same data. Small differences in key order, whitespace, number formatting, or character escaping lead to different JSON strings, and therefore to different compose hashes, which breaks dstack's integrity checks. |
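The effect is easy to reproduce within a single language. The sketch below is a minimal illustration, not dstack code: it serializes the same made-up data twice with Python's `json.dumps`, once with default settings and once sorted and compact, and compares the SHA-256 digests. The helper name `sha256_hex` is an assumption introduced for this example.

```python
import hashlib
import json

data = {"b": 1, "a": [1, 2]}

# Two valid JSON serializations of the same data.
default_json = json.dumps(data)  # insertion order, spaces after ',' and ':'
compact_json = json.dumps(data, sort_keys=True, separators=(",", ":"))


def sha256_hex(text: str) -> str:
    """Hex-encoded SHA-256 of the UTF-8 bytes of `text` (illustrative helper)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


hashes_differ = sha256_hex(default_json) != sha256_hex(compact_json)

print(default_json)   # {"b": 1, "a": [1, 2]}
print(compact_json)   # {"a":[1,2],"b":1}
print(hashes_differ)  # True
```

Both strings decode to the same data, yet a byte-level hash treats them as unrelated inputs.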
| 8 | + |
| 9 | +This document explains the rules for JSON serialization in Go, Python, and JavaScript to achieve deterministic output. Following these rules ensures the same `app-compose.json` file always produces the same SHA256 compose hash across all dstack components. |
| 10 | + |
| 11 | +## Core Rules for Deterministic JSON |
| 12 | + |
| 13 | +For dstack to generate consistent SHA256 compose hashes, JSON serialization must follow these strict rules: |
| 14 | + |
| 15 | +- **Sort Keys**: All keys in JSON objects must be sorted alphabetically, at every nesting level |
| 16 | +- **Compact Output**: The JSON string must contain no whitespace between tokens |
| 17 | +- **Handle Special Values**: NaN and Infinity must be serialized as null |
| 18 | +- **UTF-8 Encoding**: Non-ASCII characters must be output directly as UTF-8, not as escape sequences |
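As a rough illustration of these rules, the sketch below serializes a small sample and checks it against the string the rules imply. It then hashes the result. The `compose_hash` helper and the choice of a hex digest over the UTF-8 bytes are assumptions made for this example, not a statement of how dstack computes the hash.

```python
import hashlib
import json

# Sample data; the NaN special value has already been replaced by None.
sample = {
    "text": "你好世界",
    "b_number": 123,
    "a_status": True,
    "nested": {"gamma": 3.14, "alpha": "first"},
    "special_value": None,
}

canonical = json.dumps(
    sample,
    sort_keys=True,          # rule: sorted keys
    separators=(",", ":"),   # rule: compact output
    ensure_ascii=False,      # rule: raw UTF-8, no \uXXXX escapes
    allow_nan=False,         # rule: reject NaN/Infinity tokens
)

# The string every implementation should produce for this sample.
expected = (
    '{"a_status":true,"b_number":123,'
    '"nested":{"alpha":"first","gamma":3.14},'
    '"special_value":null,"text":"你好世界"}'
)


def compose_hash(canonical_json: str) -> str:
    """Hypothetical helper: hex SHA-256 over the UTF-8 bytes of the JSON string."""
    return hashlib.sha256(canonical_json.encode("utf-8")).hexdigest()


print(canonical == expected)
print(compose_hash(canonical))
```

A fixed sample and expected string like this can serve as a shared test vector: each language implementation should reproduce the same string byte for byte before any hashing takes place.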
| 19 | + |
| 20 | +## Go: encoding/json |
| 21 | + |
| 22 | +Go's standard `encoding/json` package provides JSON encoding and decoding. It produces compact output and sorts map keys by default, but you need to watch struct field order, special value handling, and HTML escaping. |
| 23 | + |
| 24 | +**Key Setup:** |
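| 25 | +- **Key Order**: Go serializes structs in field definition order, so declare struct fields in alphabetical order of their JSON names. Maps with string keys, such as `map[string]interface{}`, are serialized with their keys sorted, so they need no custom handling. |
| 26 | +- **Compact Output**: `json.Marshal()` creates compact JSON by default |
| 27 | +- **Special Values**: `json.Marshal()` returns an `UnsupportedValueError` for NaN and Infinity. Replace these values with nil (for example through a `*float64` field) before marshaling so they serialize as null. |
| 28 | +- **UTF-8**: Outputs UTF-8 characters by default, but escapes `<`, `>`, and `&` as `\u003c`, `\u003e`, and `\u0026`. To disable this, use a `json.Encoder` with `SetEscapeHTML(false)`, and note that `Encode` appends a trailing newline. |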
| 29 | + |
| 30 | +**Example (Go):** |
| 31 | + |
| 32 | +```go |
| 33 | +package main |
| 34 | + |
| 35 | +import ( |
| 36 | +    "encoding/json" |
| 37 | +    "fmt" |
| 38 | +    "sort" |
| 39 | +) |
| 40 | + |
| 41 | +// AppComposeData mirrors app-compose.json; fields follow the alphabetical order of their JSON keys |
| 42 | +type AppComposeData struct { |
| 43 | +    AStatus      bool      `json:"a_status"` |
| 44 | +    BNumber      int       `json:"b_number"` |
| 45 | +    ID           string    `json:"id"` |
| 46 | +    Nested       CustomMap `json:"nested"` |
| 47 | +    SpecialValue *float64  `json:"special_value"` |
| 48 | +    Text         string    `json:"text"` |
| 49 | +    ZItems       []int     `json:"z_items"` |
| 50 | +} |
| 51 | + |
| 52 | +// CustomMap sorts keys explicitly; encoding/json already sorts map keys, so this type is optional |
| 53 | +type CustomMap map[string]interface{} |
| 54 | + |
| 55 | +func (cm CustomMap) MarshalJSON() ([]byte, error) { |
| 56 | +    keys := make([]string, 0, len(cm)) |
| 57 | +    for k := range cm { |
| 58 | +        keys = append(keys, k) |
| 59 | +    } |
| 60 | +    sort.Strings(keys) // Sort keys alphabetically |
| 61 | + |
| 62 | +    var buf []byte |
| 63 | +    buf = append(buf, '{') |
| 64 | +    for i, k := range keys { |
| 65 | +        if i > 0 { |
| 66 | +            buf = append(buf, ',') |
| 67 | +        } |
| 68 | +        keyBytes, err := json.Marshal(k) |
| 69 | +        if err != nil { |
| 70 | +            return nil, err |
| 71 | +        } |
| 72 | +        buf = append(buf, keyBytes...) |
| 73 | +        buf = append(buf, ':') |
| 74 | +        valBytes, err := json.Marshal(cm[k]) |
| 75 | +        if err != nil { |
| 76 | +            return nil, err |
| 77 | +        } |
| 78 | +        buf = append(buf, valBytes...) |
| 79 | +    } |
| 80 | +    buf = append(buf, '}') |
| 81 | +    return buf, nil |
| 82 | +} |
| 83 | + |
| 84 | +func main() { |
| 85 | +    // Example app-compose.json data |
| 86 | +    nestedMap := CustomMap{ |
| 87 | +        "gamma": 3.14, |
| 88 | +        "alpha": "first", |
| 89 | +    } |
| 90 | + |
| 91 | +    var nanVal *float64 // json.Marshal rejects NaN/Inf, so represent them as a nil pointer (null) |
| 92 | + |
| 93 | +    composeData := AppComposeData{ |
| 94 | +        AStatus:      true, |
| 95 | +        BNumber:      123, |
| 96 | +        ID:           "c73a3a4e-ce71-4c12-a1b7-78be1a2e48e0", |
| 97 | +        Nested:       nestedMap, |
| 98 | +        SpecialValue: nanVal, |
| 99 | +        Text:         "你好世界", |
| 100 | +        ZItems:       []int{3, 1, 2}, |
| 101 | +    } |
| 102 | + |
| 103 | +    // Generate deterministic JSON for compose hash |
| 104 | +    jsonBytes, err := json.Marshal(composeData) |
| 105 | +    if err != nil { |
| 106 | +        fmt.Println("Error:", err) |
| 107 | +        return |
| 108 | +    } |
| 109 | +    fmt.Println("Deterministic JSON:", string(jsonBytes)) |
| 110 | + |
| 111 | +    // This JSON string can now be used to generate a compose hash |
| 112 | +} |
| 113 | +``` |
| 114 | + |
| 115 | +**Go Notes:** |
| 116 | +- **Struct Field Order**: Go serializes structs by field definition order. Arrange struct fields alphabetically by their JSON key names for consistency |
| 117 | +- **Map Key Order**: Go randomizes map iteration order, but `encoding/json` sorts map keys when marshaling. A custom `json.Marshaler` is only needed if you want a different ordering |
| 118 | +- **NaN/Infinity**: `json.Marshal` fails with `UnsupportedValueError` on these values. Convert them to nil before marshaling |
| 119 | + |
| 120 | +## Python: json.dumps |
| 121 | + |
| 122 | +Python's `json.dumps` has parameters to achieve deterministic output, but you must set them explicitly. |
| 123 | + |
| 124 | +**Setup:** |
| 125 | +- `sort_keys=True`: Sorts dictionary keys alphabetically |
| 126 | +- `separators=(',', ':')`: Creates compact output by removing spaces |
| 127 | +- `ensure_ascii=False`: Outputs non-ASCII characters as UTF-8 |
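| 128 | +- `allow_nan=False`: Raises `ValueError` on NaN/Infinity instead of emitting the non-standard `NaN` and `Infinity` tokens. The `default` hook is not called for floats, so replace these values with `None` before serializing |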
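The interaction between `allow_nan` and `default` is easy to get wrong, so here is a small check of what the standard library actually does. The hook name `default_hook` and the `calls` list are illustrative only. The hook records every object it receives, which shows that floats never reach it.

```python
import json

calls = []  # objects passed to the default hook, if any


def default_hook(obj):
    """Illustrative hook: records its argument, then reports it as unsupported."""
    calls.append(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# With allow_nan=False, a NaN float raises ValueError; the hook is bypassed.
try:
    json.dumps({"x": float("nan")}, allow_nan=False, default=default_hook)
    outcome = "serialized"
except ValueError:
    outcome = "ValueError"

# With the default allow_nan=True, Python emits a token that is not valid JSON.
permissive = json.dumps({"x": float("nan")})

print(outcome)     # ValueError
print(calls)       # []
print(permissive)  # {"x": NaN}
```

Because of this, NaN and Infinity have to be replaced with `None` in the data itself before calling `json.dumps`. Keeping `allow_nan=False` then acts as a safety net if one slips through.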
| 129 | + |
| 130 | +**Example (Python):** |
| 131 | + |
| 132 | +```python |
| 133 | +import json |
| 134 | +import math |
| 135 | + |
| 136 | +def replace_nan_inf(obj): |
| 137 | +    if isinstance(obj, float) and not math.isfinite(obj): |
| 138 | +        return None  # Convert NaN, Inf, -Inf to None (serializes to null) |
| 139 | +    if isinstance(obj, dict): |
| 140 | +        return {k: replace_nan_inf(v) for k, v in obj.items()} |
| 141 | +    if isinstance(obj, (list, tuple)): |
| 142 | +        return [replace_nan_inf(v) for v in obj] |
| 143 | +    return obj |
| 144 | + |
| 145 | +# Example app-compose.json data |
| 146 | +compose_data = { |
| 147 | +    "text": "你好世界", |
| 148 | +    "id": "c73a3a4e-ce71-4c12-a1b7-78be1a2e48e0", |
| 149 | +    "b_number": 123, |
| 150 | +    "a_status": True, |
| 151 | +    "z_items": [3, 1, 2], |
| 152 | +    "nested": {"gamma": 3.14, "alpha": "first"}, |
| 153 | +    "special_value": float('nan') |
| 154 | +} |
| 155 | + |
| 156 | +# Generate deterministic JSON; floats never reach `default`, so sanitize NaN/Inf first |
| 157 | +deterministic_json = json.dumps( |
| 158 | +    replace_nan_inf(compose_data), |
| 159 | +    sort_keys=True, |
| 160 | +    separators=(",", ":"), |
| 161 | +    ensure_ascii=False, |
| 162 | +    allow_nan=False,  # fail loudly if a NaN/Inf slips through |
| 163 | +) |
| 164 | + |
| 165 | +print("Deterministic JSON:", deterministic_json) |
| 166 | +# This JSON string can now be used to generate a compose hash |
| 167 | +``` |
| 168 | + |
| 169 | +## JavaScript: JSON.stringify |
| 170 | + |
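| 171 | +JavaScript's `JSON.stringify` is the hardest to make deterministic because it has no built-in option to sort keys. It emits keys in property order: integer-like keys (such as `"2"` or `"10"`) first in ascending numeric order, then all other string keys in insertion order. Neither is alphabetical, so keys must be sorted explicitly, and integer-like keys should be avoided because rebuilding the object cannot change their position. |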
| 172 | + |
| 173 | +**Approach:** |
| 174 | +- **Sort Object Keys**: Before calling `JSON.stringify`, recursively sort all object keys alphabetically |
| 175 | +- **Compact Output**: Call `JSON.stringify` without the space argument |
| 176 | +- **Special Values**: `JSON.stringify` already serializes NaN and Infinity as null; a replacer function only makes the conversion explicit |
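One subtlety of "alphabetical" sorting is worth checking when keys may contain non-ASCII characters. JavaScript's default `Array.prototype.sort()` compares strings by UTF-16 code units, while Python compares by Unicode code point, which gives the same order as comparing UTF-8 bytes (the order Go's byte-wise string comparison yields). The sketch below uses Python to emulate the three orderings on two made-up keys. The UTF-16 emulation via big-endian bytes is an approximation chosen for this example.

```python
# Two illustrative keys: U+FF5E (inside the BMP) and U+1F600 (outside the BMP).
keys = ["\uff5e", "\U0001f600"]

# Python's native string comparison: by Unicode code point.
code_point_order = sorted(keys)

# Emulates JavaScript's default sort: big-endian UTF-16 bytes compare
# the same way as UTF-16 code units.
utf16_order = sorted(keys, key=lambda s: s.encode("utf-16-be"))

# Byte-wise comparison of UTF-8, as a byte-oriented language would do.
utf8_order = sorted(keys, key=lambda s: s.encode("utf-8"))

print(code_point_order == utf8_order)   # True
print(code_point_order == utf16_order)  # False
```

For ASCII-only keys, which is what the examples in this document use, all three orderings agree. The difference only appears when a key containing a character above U+FFFF is compared with one containing a character in the U+E000 to U+FFFF range.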
| 177 | + |
| 178 | +**Example (JavaScript):** |
| 179 | + |
| 180 | +```javascript |
| 181 | +/** |
| 182 | + * Recursively sorts object keys alphabetically. |
| 183 | + * This is crucial for deterministic JSON.stringify in JavaScript. |
| 184 | + */ |
| 185 | +function sortObjectKeys(obj) { |
| 186 | +  if (typeof obj !== 'object' || obj === null) { |
| 187 | +    return obj; |
| 188 | +  } |
| 189 | +  if (Array.isArray(obj)) { |
| 190 | +    return obj.map(sortObjectKeys); |
| 191 | +  } |
| 192 | +  // Sort object keys and create new object |
| 193 | +  return Object.keys(obj).sort().reduce((result, key) => { |
| 194 | +    result[key] = sortObjectKeys(obj[key]); |
| 195 | +    return result; |
| 196 | +  }, {}); |
| 197 | +} |
| 198 | + |
| 199 | +// Example app-compose.json data |
| 200 | +const composeData = { |
| 201 | +  text: "你好世界", |
| 202 | +  id: "c73a3a4e-ce71-4c12-a1b7-78be1a2e48e0", |
| 203 | +  b_number: 123, |
| 204 | +  a_status: true, |
| 205 | +  z_items: [3, 1, 2], |
| 206 | +  nested: { |
| 207 | +    gamma: 3.14, |
| 208 | +    alpha: "first" |
| 209 | +  }, |
| 210 | +  special_value: NaN |
| 211 | +}; |
| 212 | + |
| 213 | +// Step 1: Sort object keys |
| 214 | +const sortedData = sortObjectKeys(composeData); |
| 215 | + |
| 216 | +// Step 2: Generate deterministic JSON for compose hash |
| 217 | +const deterministicJson = JSON.stringify(sortedData, (key, value) => { |
| 218 | +  // JSON.stringify already emits null for NaN and Infinity; this makes it explicit |
| 219 | +  if (typeof value === 'number' && !Number.isFinite(value)) { |
| 220 | +    return null; |
| 221 | +  } |
| 222 | +  return value; |
| 223 | +}); |
| 224 | + |
| 225 | +console.log("Deterministic JSON:", deterministicJson); |
| 226 | +// This JSON string can now be used to generate a compose hash |
| 227 | +``` |
| 228 | + |
| 229 | +## Language Comparison |
| 230 | + |
| 231 | +Here's how each language handles deterministic JSON serialization for compose hash generation: |
| 232 | + |
| 233 | +| Feature | Go encoding/json | Python json.dumps | JavaScript JSON.stringify | |
| 234 | +|:---|:---|:---|:---| |
| 235 | +| Key Order | Structs in field definition order; map keys sorted automatically | Insertion order by default; must set `sort_keys=True` | Integer-like keys first, then insertion order; must sort keys manually | |
| 236 | +| Whitespace | Compact by default | Adds a space after `,` and `:` by default; must set `separators=(',', ':')` | Compact by default; omit the space argument | |
| 237 | +| NaN/Inf | Returns an error; convert to nil before marshaling | Emits non-standard `NaN`/`Infinity` tokens by default; set `allow_nan=False` and convert to `None` first | Serializes to null by default | |
| 238 | +| Non-ASCII | Outputs UTF-8 by default, but escapes `<`, `>`, `&` unless `SetEscapeHTML(false)` is used | Escapes by default; must set `ensure_ascii=False` | Outputs non-ASCII characters unescaped by default | |
| 239 | +| Custom Types | Use `json.Marshaler` interface | Use `default` parameter | Use replacer function | |
| 240 | + |
| 241 | +## Summary |
| 242 | + |
| 243 | +Deterministic JSON serialization across different languages is not the default behavior; it needs careful setup. Go produces compact output with sorted map keys by default, but it rejects NaN and Infinity and escapes HTML characters unless configured otherwise. Python needs explicit options for key sorting, compact output, and unescaped non-ASCII output, plus a pre-processing step for NaN and Infinity. JavaScript is compact by default and maps NaN and Infinity to null, but it requires manual recursive sorting of object keys. |
| 244 | + |
| 245 | +By following these recommendations, dstack can ensure that the same `app-compose.json` file produces the same SHA256 compose hash across all its Go, Python, and JavaScript components. This provides a reliable foundation for the project's distributed architecture and change detection system. |
| 246 | + |