Commit 79380af

docs(readme): document security features and parsing limits
- Add production-safe parser and security hardening feature descriptions
- Document configurable safety limits for depth, array size, string length, and map size
- Add security architecture section explaining iterative parsing and memory safety
- Update API overview with PackWithLimits and ParseLimits documentation
- Expand test suite and benchmark documentation with comprehensive details
- Fix error naming from MsGPackError to MsgPackError for consistency
- Improve markdown table formatting and clean up trailing whitespace
1 parent 190b5f3 commit 79380af

2 files changed: +185 -24 lines changed

README.md

Lines changed: 92 additions & 11 deletions
@@ -10,6 +10,8 @@ An article introducing it: [Zig Msgpack](https://blog.nvimer.org/2025/09/20/zig-
1010

1111
- **Full MessagePack Support:** Implements all MessagePack types including the timestamp extension.
1212
- **Timestamp Support:** Complete implementation of MessagePack timestamp extension type (-1) with support for all three formats (32-bit, 64-bit, and 96-bit).
13+
- **Production-Safe Parser:** Iterative parser prevents stack overflow on deeply nested or malicious input.
14+
- **Security Hardened:** Configurable limits protect against DoS attacks (depth bombs, size bombs, etc.).
1315
- **Efficient:** Designed for high performance with minimal memory overhead.
1416
- **Type-Safe:** Leverages Zig's type system to ensure safety during serialization and deserialization.
1517
- **Simple API:** Offers a straightforward and easy-to-use API for encoding and decoding.
@@ -18,12 +20,12 @@ An article introducing it: [Zig Msgpack](https://blog.nvimer.org/2025/09/20/zig-
1820

1921
### Version Compatibility
2022

21-
| Zig Version | Library Version | Status |
22-
|-------------|----------------|---------|
23-
| 0.13 and older | 0.0.6 | Legacy support |
24-
| 0.14.0 | Current | ✅ Fully supported |
25-
| 0.15.x | Current | ✅ Fully supported |
26-
| 0.16.0-dev (nightly) | Current | ✅ Supported with compatibility layer |
23+
| Zig Version | Library Version | Status |
24+
| -------------------- | --------------- | ------------------------------------- |
25+
| 0.13 and older | 0.0.6 | Legacy support |
26+
| 0.14.0 | Current | ✅ Fully supported |
27+
| 0.15.x | Current | ✅ Fully supported |
28+
| 0.16.0-dev (nightly) | Current | ✅ Supported with compatibility layer |
2729

2830
> **Note:** For Zig 0.13 and older versions, please use version `0.0.6` of this library.
2931
> **Note:** Zig 0.16+ removes `std.io.FixedBufferStream`, but this library provides a compatibility layer to maintain the same API across all supported versions.
@@ -76,7 +78,7 @@ const msgpack = @import("msgpack");
7678
pub fn main() !void {
7779
const allocator = std.heap.page_allocator;
7880
var buffer: [1024]u8 = undefined;
79-
81+
8082
// Use the compatibility layer for cross-version support
8183
const compat = msgpack.compat;
8284
var write_buffer = compat.fixedBufferStream(&buffer);
@@ -100,7 +102,7 @@ pub fn main() !void {
100102
read_buffer.pos = 0;
101103
const decoded = try packer.read(allocator);
102104
defer decoded.free(allocator);
103-
105+
104106
const name = (try decoded.mapGet("name")).?.str.value();
105107
const age = (try decoded.mapGet("age")).?.uint;
106108
std.debug.print("Name: {s}, Age: {d}\n", .{ name, age });
@@ -143,7 +145,7 @@ read_buffer.pos = 0;
143145
const decoded_ts = try packer.read(allocator);
144146
defer decoded_ts.free(allocator);
145147
146-
std.debug.print("Timestamp: {}s + {}ns\n",
148+
std.debug.print("Timestamp: {}s + {}ns\n",
147149
.{ decoded_ts.timestamp.seconds, decoded_ts.timestamp.nanoseconds });
148150
std.debug.print("As float: {d}\n", .{ decoded_ts.timestamp.toFloat() });
149151
```
@@ -154,21 +156,91 @@ std.debug.print("As float: {d}\n", .{ decoded_ts.timestamp.toFloat() });
154156
// Type conversion with error handling
155157
const int_payload = msgpack.Payload.intToPayload(-42);
156158
const uint_result = int_payload.getUint() catch |err| switch (err) {
157-
msgpack.MsGPackError.INVALID_TYPE => {
159+
msgpack.MsgPackError.InvalidType => {
158160
std.debug.print("Cannot convert negative to unsigned\n");
159161
return;
160162
},
161163
else => return err,
162164
};
163165
```
164166
167+
### Security Features (Parsing Untrusted Data)
168+
169+
The library includes configurable safety limits to protect against malicious or corrupted MessagePack data:
170+
171+
```zig
172+
// Default limits (recommended for most use cases)
173+
const Packer = msgpack.Pack(
174+
*Writer, *Reader,
175+
Writer.Error, Reader.Error,
176+
Writer.write, Reader.read,
177+
);
178+
// Automatically protected against:
179+
// - Deep nesting attacks (max 1000 layers)
180+
// - Large array/map attacks (max 1M elements)
181+
// - Memory exhaustion (max 100MB strings)
182+
183+
// Custom limits for specific environments
184+
const StrictPacker = msgpack.PackWithLimits(
185+
*Writer, *Reader,
186+
Writer.Error, Reader.Error,
187+
Writer.write, Reader.read,
188+
.{
189+
.max_depth = 50, // Limit nesting to 50 layers
190+
.max_array_length = 10_000, // Max 10K array elements
191+
.max_map_size = 10_000, // Max 10K map pairs
192+
.max_string_length = 1024 * 1024, // Max 1MB strings
193+
},
194+
);
195+
```
196+
197+
**Security Guarantees:**
198+
199+
- ✅ **Never crashes** on malformed or malicious input
200+
- ✅ **No stack overflow** regardless of nesting depth (iterative parser)
201+
- ✅ **Bounded memory usage** with configurable limits
202+
- ✅ **Fast rejection** of invalid data (no resource exhaustion)
203+
204+
Possible security errors:
205+
206+
```zig
207+
msgpack.MsgPackError.MaxDepthExceeded // Nesting too deep
208+
msgpack.MsgPackError.ArrayTooLarge // Array claims too many elements
209+
msgpack.MsgPackError.MapTooLarge // Map claims too many pairs
210+
msgpack.MsgPackError.StringTooLong // String/binary data too large
211+
```
212+
165213
## API Overview
166214
167-
- **`msgpack.Pack`**: The main struct for packing and unpacking MessagePack data. It is initialized with read and write contexts.
215+
- **`msgpack.Pack`**: The main struct for packing and unpacking MessagePack data with default safety limits.
216+
- **`msgpack.PackWithLimits`**: Create a packer with custom safety limits for specific security requirements.
168217
- **`msgpack.Payload`**: A union that represents any MessagePack type. It provides methods for creating and interacting with different data types (e.g., `mapPayload`, `strToPayload`, `mapGet`).
218+
- **`msgpack.ParseLimits`**: Configuration struct for parser safety limits.
169219
170220
## Implementation Notes
171221
222+
### Security Architecture
223+
224+
This library uses an **iterative parser** (not recursive) to provide strong security guarantees:
225+
226+
**Iterative Parsing:**
227+
228+
- Parser uses an explicit stack (on heap) instead of recursive function calls
229+
- Stack depth remains constant regardless of input nesting depth
230+
- Prevents stack overflow attacks completely
231+
232+
**Safety Limits:**
233+
234+
- All limits are enforced **before** memory allocation
235+
- Invalid input is rejected immediately without resource consumption
236+
- Configurable limits allow tuning for specific environments (embedded, server, etc.)
237+
238+
**Memory Safety:**
239+
240+
- All error paths include complete cleanup (`errdefer` + `cleanupParseStack`)
241+
- Zero memory leaks verified by GPA (General Purpose Allocator) in tests
242+
- Safe to parse untrusted data from network, files, or user input
243+
172244
### Zig 0.16 Compatibility
173245
174246
Starting from Zig 0.16, the standard library underwent significant changes to the I/O subsystem. The `std.io.FixedBufferStream` was removed as part of a broader redesign. This library includes a compatibility layer (`src/compat.zig`) that:
@@ -190,6 +262,14 @@ zig build test
190262
zig build test --summary all
191263
```
192264
265+
The comprehensive test suite includes:
266+
267+
- **87 tests** covering all functionality
268+
- **Malicious data tests:** Verify protection against crafted attacks (billion-element arrays, extreme nesting, etc.)
269+
- **Fuzz tests:** Random input validation ensures no crashes on arbitrary data
270+
- **Large data tests:** Arrays with 1000+ elements, maps with 500+ pairs
271+
- **Memory safety:** Zero leaks verified by strict allocator testing
272+
193273
## Benchmarks
194274
195275
To run performance benchmarks:
@@ -203,6 +283,7 @@ zig build bench -Doptimize=ReleaseFast
203283
```
204284
205285
The benchmark suite includes:
286+
206287
- Basic types (nil, bool, integers, floats)
207288
- Strings and binary data of various sizes
208289
- Arrays and maps (small, medium, large)
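
The "Security Architecture" text added in this diff describes an iterative parser with an explicit heap stack and limits that are checked before any allocation for the container. The following is a minimal sketch of that idea, not the library's actual implementation: `Limits`, `ScanError`, and `scan` are hypothetical names, the stack is allocated once up front and sized by `max_depth` for simplicity, and only four MessagePack formats are handled (nil, positive fixint, fixarray, and array 32). The error names `MaxDepthExceeded` and `ArrayTooLarge` mirror the ones listed in the README.

```zig
const std = @import("std");

// Hypothetical limits struct for this sketch (not the library's ParseLimits).
const Limits = struct {
    max_depth: usize = 1000,
    max_array_length: u32 = 1_000_000,
};

const ScanError = error{
    MaxDepthExceeded,
    ArrayTooLarge,
    UnexpectedEnd,
    Unsupported,
    OutOfMemory,
};

/// Walks one MessagePack value without recursion and returns the number of
/// scalar values (nil or positive fixint) it contains.
fn scan(allocator: std.mem.Allocator, bytes: []const u8, limits: Limits) ScanError!usize {
    // Explicit stack on the heap: one "elements remaining" counter per open array.
    const stack = try allocator.alloc(u32, limits.max_depth);
    defer allocator.free(stack);

    var depth: usize = 0;
    var pos: usize = 0;
    var scalars: usize = 0;

    while (true) {
        if (pos >= bytes.len) return error.UnexpectedEnd;
        const tag = bytes[pos];
        pos += 1;

        var opened: ?u32 = null; // element count if this tag opens an array
        if (tag <= 0x7f or tag == 0xc0) {
            scalars += 1; // positive fixint or nil
        } else if (tag >= 0x90 and tag <= 0x9f) {
            opened = @as(u32, tag & 0x0f); // fixarray: count in the low 4 bits
        } else if (tag == 0xdd) {
            // array 32: big-endian u32 element count follows the tag
            if (bytes.len - pos < 4) return error.UnexpectedEnd;
            opened = (@as(u32, bytes[pos]) << 24) | (@as(u32, bytes[pos + 1]) << 16) |
                (@as(u32, bytes[pos + 2]) << 8) | @as(u32, bytes[pos + 3]);
            pos += 4;
        } else {
            return error.Unsupported;
        }

        if (opened) |count| {
            // Reject the declared size before doing any work for the container.
            if (count > limits.max_array_length) return error.ArrayTooLarge;
            if (count > 0) {
                if (depth == limits.max_depth) return error.MaxDepthExceeded;
                stack[depth] = count;
                depth += 1;
                continue; // next byte is the first element
            }
            // An empty array is already a complete value; fall through.
        }

        // A value just completed: close every array that is now full.
        while (depth > 0) {
            stack[depth - 1] -= 1;
            if (stack[depth - 1] > 0) break;
            depth -= 1;
        }
        if (depth == 0) return scalars;
    }
}

test "scan accepts small input and rejects hostile headers" {
    const a = std.testing.allocator;
    try std.testing.expectEqual(@as(usize, 3), try scan(a, &[_]u8{ 0x93, 1, 2, 3 }, .{}));
    // A 5-byte header claiming 2^32 - 1 elements is rejected without reading further.
    try std.testing.expectError(error.ArrayTooLarge, scan(a, &[_]u8{ 0xdd, 0xff, 0xff, 0xff, 0xff }, .{}));
    // Four nested arrays exceed a depth limit of three.
    try std.testing.expectError(error.MaxDepthExceeded, scan(a, &[_]u8{ 0x91, 0x91, 0x91, 0x91, 0xc0 }, .{ .max_depth = 3 }));
}
```

Saved to a file, the sketch can be exercised with `zig test`. The call stack stays flat however deep the input nests, because nesting only advances `depth` inside a buffer whose size is fixed by `max_depth`.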

README_CN.md

Lines changed: 93 additions & 13 deletions
@@ -10,6 +10,8 @@ Zig 编程语言的 MessagePack 实现。此库提供了一种简单高效的方
1010

1111
- **完整的 MessagePack 支持**: 实现了所有 MessagePack 类型,包括时间戳扩展类型。
1212
- **时间戳支持**: 完整实现 MessagePack 时间戳扩展类型 (-1),支持所有三种格式(32位、64位和96位)。
13+
- **生产级安全解析器**: 迭代式解析器防止深度嵌套或恶意输入导致的栈溢出。
14+
- **安全加固**: 可配置的限制保护,防御 DoS 攻击(深度炸弹、大小炸弹等)。
1315
- **高效**: 设计追求高性能,内存开销最小。
1416
- **类型安全**: 利用 Zig 的类型系统确保序列化和反序列化期间的安全性。
1517
- **简单的 API**: 提供直观易用的编码和解码 API。
@@ -18,11 +20,11 @@ Zig 编程语言的 MessagePack 实现。此库提供了一种简单高效的方
1820

1921
### 版本兼容性
2022

21-
| Zig 版本 | 库版本 | 状态 |
22-
|-------------|----------------|---------|
23-
| 0.13 及更早版本 | 0.0.6 | 旧版支持 |
24-
| 0.14.0 | 当前版本 | ✅ 完全支持 |
25-
| 0.15.x | 当前版本 | ✅ 完全支持 |
23+
| Zig 版本 | 库版本 | 状态 |
24+
| -------------------- | -------- | ----------------- |
25+
| 0.13 及更早版本 | 0.0.6 | 旧版支持 |
26+
| 0.14.0 | 当前版本 | ✅ 完全支持 |
27+
| 0.15.x | 当前版本 | ✅ 完全支持 |
2628
| 0.16.0-dev (nightly) | 当前版本 | ✅ 通过兼容层支持 |
2729

2830
> **注意**: 对于 Zig 0.13 及更早版本,请使用本库的 `0.0.6` 版本。
@@ -76,7 +78,7 @@ const msgpack = @import("msgpack");
7678
pub fn main() !void {
7779
const allocator = std.heap.page_allocator;
7880
var buffer: [1024]u8 = undefined;
79-
81+
8082
// 使用兼容层实现跨版本支持
8183
const compat = msgpack.compat;
8284
var write_buffer = compat.fixedBufferStream(&buffer);
@@ -100,7 +102,7 @@ pub fn main() !void {
100102
read_buffer.pos = 0;
101103
const decoded = try packer.read(allocator);
102104
defer decoded.free(allocator);
103-
105+
104106
const name = (try decoded.mapGet("姓名")).?.str.value();
105107
const age = (try decoded.mapGet("年龄")).?.uint;
106108
std.debug.print("姓名: {s}, 年龄: {d}\n", .{ name, age });
@@ -143,7 +145,7 @@ read_buffer.pos = 0;
143145
const decoded_ts = try packer.read(allocator);
144146
defer decoded_ts.free(allocator);
145147
146-
std.debug.print("时间戳: {}秒 + {}纳秒\n",
148+
std.debug.print("时间戳: {}秒 + {}纳秒\n",
147149
.{ decoded_ts.timestamp.seconds, decoded_ts.timestamp.nanoseconds });
148150
std.debug.print("浮点数形式: {d}\n", .{ decoded_ts.timestamp.toFloat() });
149151
```
@@ -154,21 +156,91 @@ std.debug.print("浮点数形式: {d}\n", .{ decoded_ts.timestamp.toFloat() });
154156
// 类型转换与错误处理
155157
const int_payload = msgpack.Payload.intToPayload(-42);
156158
const uint_result = int_payload.getUint() catch |err| switch (err) {
157-
msgpack.MsGPackError.INVALID_TYPE => {
158-
std.debug.print("无法将负数转换为无符号整数\n");
159+
msgpack.MsgPackError.InvalidType => {
160+
std.debug.print("无法将负数转换为无符号整数\n", .{});
159161
return;
160162
},
161163
else => return err,
162164
};
163165
```
164166

167+
### 安全特性(解析不可信数据)
168+
169+
本库包含可配置的安全限制,用于防护恶意或损坏的 MessagePack 数据:
170+
171+
```zig
172+
// 默认限制(推荐用于大多数场景)
173+
const Packer = msgpack.Pack(
174+
*Writer, *Reader,
175+
Writer.Error, Reader.Error,
176+
Writer.write, Reader.read,
177+
);
178+
// 自动防护:
179+
// - 深度嵌套攻击(最大 1000 层)
180+
// - 大数组/Map 攻击(最大 100 万元素)
181+
// - 内存耗尽(最大 100MB 字符串)
182+
183+
// 针对特定环境的自定义限制
184+
const StrictPacker = msgpack.PackWithLimits(
185+
*Writer, *Reader,
186+
Writer.Error, Reader.Error,
187+
Writer.write, Reader.read,
188+
.{
189+
.max_depth = 50, // 限制嵌套到 50 层
190+
.max_array_length = 10_000, // 最大 1 万个数组元素
191+
.max_map_size = 10_000, // 最大 1 万个 map 键值对
192+
.max_string_length = 1024 * 1024, // 最大 1MB 字符串
193+
},
194+
);
195+
```
196+
197+
**安全保证**:
198+
199+
- ✅ **永不崩溃** - 任何畸形或恶意输入都不会导致崩溃
200+
- ✅ **无栈溢出** - 无论嵌套深度如何(迭代式解析器)
201+
- ✅ **内存可控** - 通过可配置限制控制内存使用
202+
- ✅ **快速拒绝** - 无效数据被立即拒绝(无资源耗尽)
203+
204+
可能的安全错误:
205+
206+
```zig
207+
msgpack.MsgPackError.MaxDepthExceeded // 嵌套过深
208+
msgpack.MsgPackError.ArrayTooLarge // 数组声称过多元素
209+
msgpack.MsgPackError.MapTooLarge // Map 声称过多键值对
210+
msgpack.MsgPackError.StringTooLong // 字符串/二进制数据过大
211+
```
212+
165213
## API 概览
166214

167-
- **`msgpack.Pack`**: 用于打包和解包 MessagePack 数据的主要结构体。使用读写上下文进行初始化。
168-
- **`msgpack.Payload`**: 表示任何 MessagePack 类型的联合体。提供创建和与不同数据类型交互的方法(例如,`mapPayload`、`strToPayload`、`mapGet`)。
215+
- **`msgpack.Pack`**: 用于打包和解包 MessagePack 数据的主要结构体,带默认安全限制。
216+
- **`msgpack.PackWithLimits`**: 创建带自定义安全限制的 packer,满足特定安全需求。
217+
- **`msgpack.Payload`**: 表示任何 MessagePack 类型的联合体。提供创建和与不同数据类型交互的方法(例如 `mapPayload`、`strToPayload`、`mapGet`)。
218+
- **`msgpack.ParseLimits`**: 解析器安全限制的配置结构体。
169219

170220
## 实现说明
171221

222+
### 安全架构
223+
224+
本库使用**迭代式解析器**(非递归)提供强大的安全保证:
225+
226+
**迭代式解析**
227+
228+
- 解析器使用显式栈(堆上)而非递归函数调用
229+
- 栈深度恒定,与输入嵌套深度无关
230+
- 完全防止栈溢出攻击
231+
232+
**安全限制**
233+
234+
- 所有限制在内存分配**之前**强制执行
235+
- 无效输入被立即拒绝,不消耗资源
236+
- 可配置限制允许针对特定环境调整(嵌入式、服务器等)
237+
238+
**内存安全**
239+
240+
- 所有错误路径包含完整清理(`errdefer` + `cleanupParseStack`)
241+
- 零内存泄漏(测试中由 GPA 验证)
242+
- 可安全解析来自网络、文件或用户输入的不可信数据
243+
172244
### Zig 0.16 兼容性
173245

174246
从 Zig 0.16 开始,标准库的 I/O 子系统经历了重大变更。作为更广泛重新设计的一部分,`std.io.FixedBufferStream` 被移除。本库包含一个兼容层(`src/compat.zig`),它:
@@ -190,6 +262,14 @@ zig build test
190262
zig build test --summary all
191263
```
192264

265+
综合测试套件包括:
266+
267+
- **87 个测试** 覆盖所有功能
268+
- **恶意数据测试**:验证针对精心构造的攻击(数十亿元素数组、极端嵌套等)的防护
269+
- **模糊测试**:随机输入验证,确保任意数据都不会崩溃
270+
- **大数据测试**:1000+ 元素的数组、500+ 键值对的 map
271+
- **内存安全**:严格分配器测试验证零泄漏
272+
193273
## 性能基准测试
194274

195275
运行性能基准测试:
@@ -203,6 +283,7 @@ zig build bench -Doptimize=ReleaseFast
203283
```
204284

205285
基准测试套件包括:
286+
206287
- 基本类型(nil、bool、整数、浮点数)
207288
- 不同大小的字符串和二进制数据
208289
- 数组和映射表(小型、中型、大型)
@@ -231,4 +312,3 @@ zig build docs
231312
## 许可证
232313

233314
此项目在 MIT 许可证下许可。详情请参阅 [LICENSE](LICENSE) 文件。
234-
