Commit 7e1b121

adamsitnik and gewarren authored
Extend the NRBF docs and focus on safety (#44214)
* add a warning about not using NrbfDecoder to determine whether it's safe to call BinaryFormatter
* warn the users to not negate these safeguards of NrbfDecoder by doing things like unbound recursion
* let the users know about cycles possibility to use SerializationRecordId to detect these, extend one of the samples with simple cycle detection
* extend TypeNameMatches sample to include derived type hierarchy and also show that users need to be defensive and reject records of unexpected types
* recommend to check the total length of the array record before calling GetArray
* correct the statement that is no longer true
* deadly trailing whitespace
* Apply suggestions from code review

Co-authored-by: Genevieve Warren <[email protected]>

* Update docs/standard/serialization/binaryformatter-migration-guide/read-nrbf-payloads.md

Co-authored-by: Genevieve Warren <[email protected]>

---------

Co-authored-by: Genevieve Warren <[email protected]>
1 parent 7d4debd commit 7e1b121

1 file changed: +62 -17 lines changed

docs/standard/serialization/binaryformatter-migration-guide/read-nrbf-payloads.md

Lines changed: 62 additions & 17 deletions
@@ -18,6 +18,9 @@ helpviewer_keywords:
 
 As part of .NET 9, a new [NrbfDecoder] class was introduced to decode NRBF payloads without performing _deserialization_ of the payload. This API can safely be used to decode trusted or untrusted payloads without any of the risks that [BinaryFormatter] deserialization carries. However, [NrbfDecoder] merely decodes the data into structures an application can further process. Care must be taken when using [NrbfDecoder] to safely load the data into the appropriate instances.
 
+> [!CAUTION]
+> [NrbfDecoder] is _an_ implementation of an NRBF reader, but its behaviors don't strictly follow [BinaryFormatter]'s implementation. Thus you shouldn't use the output of [NrbfDecoder] to determine whether a call to [BinaryFormatter] would be safe.
+
 You can think of <xref:System.Formats.Nrbf.NrbfDecoder> as being the equivalent of using a JSON/XML reader without the deserializer.
 
 ## NrbfDecoder
@@ -33,7 +36,8 @@ You can think of <xref:System.Formats.Nrbf.NrbfDecoder> as being the equivalent
 - Use collision-resistant randomized hashing to store records referenced by other records (to avoid running out of memory for dictionary backed by an array whose size depends on the number of hash-code collisions).
 - Only primitive types can be instantiated in an implicit way. Arrays can be instantiated on demand. Other types are never instantiated.
 
-When using [NrbfDecoder], it is important not to reintroduce those capabilities in general-purpose code as doing so would negate these safeguards.
+> [!CAUTION]
+> When using [NrbfDecoder], it's important not to reintroduce those capabilities in general-purpose code, as doing so would negate these safeguards.
 
 ### Deserialize a closed set of types
 
@@ -82,17 +86,38 @@ internal static T LoadFromFile<T>(string path)
 
 The NRBF payload consists of serialization records that represent the serialized objects and their metadata. To read the whole payload and get the root object, you need to call the <xref:System.Formats.Nrbf.NrbfDecoder.Decode*> method.
 
-The <xref:System.Formats.Nrbf.NrbfDecoder.Decode*> method returns a <xref:System.Formats.Nrbf.SerializationRecord> instance. <xref:System.Formats.Nrbf.SerializationRecord> is an abstract class that represents the serialization record and provides three self-describing properties: <xref:System.Formats.Nrbf.SerializationRecord.Id>, <xref:System.Formats.Nrbf.SerializationRecord.RecordType>, and <xref:System.Formats.Nrbf.SerializationRecord.TypeName>. It exposes one method, <xref:System.Formats.Nrbf.SerializationRecord.TypeNameMatches*>, which compares the type name read from the payload (and exposed via <xref:System.Formats.Nrbf.SerializationRecord.TypeName> property) against the specified type. This method ignores assembly names, so users don't need to worry about type forwarding and assembly versioning. It also does not consider member names or their types (because getting this information would require type loading).
+The <xref:System.Formats.Nrbf.NrbfDecoder.Decode*> method returns a <xref:System.Formats.Nrbf.SerializationRecord> instance. <xref:System.Formats.Nrbf.SerializationRecord> is an abstract class that represents the serialization record and provides three self-describing properties: <xref:System.Formats.Nrbf.SerializationRecord.Id>, <xref:System.Formats.Nrbf.SerializationRecord.RecordType>, and <xref:System.Formats.Nrbf.SerializationRecord.TypeName>.
+
+> [!NOTE]
+> An attacker could create a payload with cycles (example: class or an array of objects with a reference to itself). The <xref:System.Formats.Nrbf.SerializationRecord.Id> returns an instance of <xref:System.Formats.Nrbf.SerializationRecordId> which implements <xref:System.IEquatable%601> and amongst other things, it can be used to detect cycles in decoded records.
+
+<xref:System.Formats.Nrbf.SerializationRecord> exposes one method, <xref:System.Formats.Nrbf.SerializationRecord.TypeNameMatches*>, which compares the type name read from the payload (and exposed via <xref:System.Formats.Nrbf.SerializationRecord.TypeName> property) against the specified type. This method ignores assembly names, so users don't need to worry about type forwarding and assembly versioning. It also does not consider member names or their types (because getting this information would require type loading).
 
 ```csharp
 using System.Formats.Nrbf;
 
-static T Pseudocode<T>(Stream payload)
+static Animal Pseudocode(Stream payload)
 {
     SerializationRecord record = NrbfDecoder.Decode(payload);
-    if (!record.TypeNameMatches(typeof(T))
+    if (record.TypeNameMatches(typeof(Cat)) && record is ClassRecord catRecord)
+    {
+        return new Cat()
+        {
+            Name = catRecord.GetString("Name"),
+            WorshippersCount = catRecord.GetInt32("WorshippersCount")
+        };
+    }
+    else if (record.TypeNameMatches(typeof(Dog)) && record is ClassRecord dogRecord)
     {
-        throw new Exception($"Expected the record to match type name `{typeof(T).AssemblyQualifiedName}`, but got `{record.TypeName.AssemblyQualifiedName}`."
+        return new Dog()
+        {
+            Name = dogRecord.GetString("Name"),
+            FriendsCount = dogRecord.GetInt32("FriendsCount")
+        };
+    }
+    else
+    {
+        throw new Exception($"Unexpected record: `{record.TypeName.AssemblyQualifiedName}`.");
     }
 }
 ```
@@ -104,7 +129,7 @@ There are more than a dozen different serialization [record types](/openspecs/wi
 - <xref:System.Formats.Nrbf.PrimitiveTypeRecord%601> derives from the non-generic <xref:System.Formats.Nrbf.PrimitiveTypeRecord>, which also exposes a <xref:System.Formats.Nrbf.PrimitiveTypeRecord.Value> property. But on the base class, the value is returned as `object` (which introduces boxing for value types).
 - <xref:System.Formats.Nrbf.ClassRecord>: describes all `class` and `struct` besides the aforementioned primitive types.
 - <xref:System.Formats.Nrbf.ArrayRecord>: describes all array records, including jagged and multi-dimensional arrays.
-- <xref:System.Formats.Nrbf.SZArrayRecord%601>: describes single-dimensional, zero-indexed array records, where `T` can be either a primitive type or a <xref:System.Formats.Nrbf.ClassRecord>.
+- <xref:System.Formats.Nrbf.SZArrayRecord%601>: describes single-dimensional, zero-indexed array records, where `T` can be either a primitive type or a <xref:System.Formats.Nrbf.SerializationRecord>.
 
 ```csharp
 SerializationRecord rootObject = NrbfDecoder.Decode(payload); // payload is a Stream
@@ -134,7 +159,8 @@ The API it provides:
 - <xref:System.Formats.Nrbf.ClassRecord.MemberNames> property that gets the names of serialized members.
 - <xref:System.Formats.Nrbf.ClassRecord.HasMember*> method that checks if member of given name was present in the payload. It was designed for handling versioning scenarios where given member could have been renamed.
 - A set of dedicated methods for retrieving primitive values of the provided member name: <xref:System.Formats.Nrbf.ClassRecord.GetString*>, <xref:System.Formats.Nrbf.ClassRecord.GetBoolean*>, <xref:System.Formats.Nrbf.ClassRecord.GetByte*>, <xref:System.Formats.Nrbf.ClassRecord.GetSByte*>, <xref:System.Formats.Nrbf.ClassRecord.GetChar*>, <xref:System.Formats.Nrbf.ClassRecord.GetInt16*>, <xref:System.Formats.Nrbf.ClassRecord.GetUInt16*>, <xref:System.Formats.Nrbf.ClassRecord.GetInt32*>, <xref:System.Formats.Nrbf.ClassRecord.GetUInt32*>, <xref:System.Formats.Nrbf.ClassRecord.GetInt64*>, <xref:System.Formats.Nrbf.ClassRecord.GetUInt64*>, <xref:System.Formats.Nrbf.ClassRecord.GetSingle*>, <xref:System.Formats.Nrbf.ClassRecord.GetDouble*>, <xref:System.Formats.Nrbf.ClassRecord.GetDecimal*>, <xref:System.Formats.Nrbf.ClassRecord.GetTimeSpan*>, and <xref:System.Formats.Nrbf.ClassRecord.GetDateTime*>.
-- <xref:System.Formats.Nrbf.ClassRecord.GetClassRecord*> and <xref:System.Formats.Nrbf.ClassRecord.GetArrayRecord*> methods to retrieve instance of given record types.
+- <xref:System.Formats.Nrbf.ClassRecord.GetClassRecord*> retrieves an instance of [ClassRecord]. In case of a cycle, it's the same instance of the current [ClassRecord] with the same <xref:System.Formats.Nrbf.SerializationRecord.Id>.
+- <xref:System.Formats.Nrbf.ClassRecord.GetArrayRecord*> retrieves an instance of [ArrayRecord].
 - <xref:System.Formats.Nrbf.ClassRecord.GetSerializationRecord*> to retrieve any serialization record and <xref:System.Formats.Nrbf.ClassRecord.GetRawValue*> to retrieve any serialization record or a raw primitive value.
 
 The following code snippet shows <xref:System.Formats.Nrbf.ClassRecord> in action:
@@ -157,34 +183,49 @@ Sample output = new()
     Text = rootRecord.GetString(nameof(Sample.Text)),
     // using dedicated method to read an array of bytes
     ArrayOfBytes = ((SZArrayRecord<byte>)rootRecord.GetArrayRecord(nameof(Sample.ArrayOfBytes))).GetArray(),
-    // using GetClassRecord to read a class record
-    ClassInstance = new()
+};
+
+// using GetClassRecord to read a class record
+ClassRecord? referenced = rootRecord.GetClassRecord(nameof(Sample.ClassInstance));
+if (referenced is not null)
+{
+    if (referenced.Id.Equals(rootRecord.Id))
     {
-        Text = rootRecord
-            .GetClassRecord(nameof(Sample.ClassInstance))!
-            .GetString(nameof(Sample.Text))
+        throw new Exception("Unexpected cycle detected!");
     }
-};
+
+    output.ClassInstance = new()
+    {
+        Text = referenced.GetString(nameof(Sample.Text))
+    };
+}
 ```
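
The cycle check in the sample above only covers a record that references the root directly. A longer cycle (A refers to B, B refers back to A) needs the IDs of every record on the current path. The following is a minimal sketch of that idea, not an API of `System.Formats.Nrbf`: `GraphGuards`, `ThrowIfCyclicOrTooDeep`, and the `getId`, `getChildren`, and `maxDepth` parameters are hypothetical names, and the walk is written generically so that it can run without a payload.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of System.Formats.Nrbf).
static class GraphGuards
{
    // Depth-first walk that throws on the first cycle and never recurses deeper than maxDepth.
    public static void ThrowIfCyclicOrTooDeep<TNode, TId>(
        TNode root,
        Func<TNode, TId> getId,
        Func<TNode, IEnumerable<TNode>> getChildren,
        int maxDepth)
        where TId : IEquatable<TId>
    {
        var onPath = new HashSet<TId>(); // IDs between the root and the current node
        var done = new HashSet<TId>();   // IDs whose whole subtree was already checked

        Visit(root, depth: 0);

        void Visit(TNode node, int depth)
        {
            TId id = getId(node);
            if (done.Contains(id))
            {
                return; // shared, non-cyclic reference that was verified earlier
            }
            if (depth > maxDepth)
            {
                throw new InvalidOperationException("The record graph is nested too deeply.");
            }
            if (!onPath.Add(id))
            {
                throw new InvalidOperationException("Unexpected cycle detected!");
            }
            foreach (TNode child in getChildren(node))
            {
                Visit(child, depth + 1);
            }
            onPath.Remove(id);
            done.Add(id);
        }
    }
}
```

With the decoder's types, `getId` could be `record => record.Id` and `getChildren` could be `record => record.MemberNames.Select(record.GetRawValue).OfType<ClassRecord>()`. That projection follows only class-typed members, so members that hold arrays of records would need the same treatment. The `done` set keeps a payload with many shared references from being walked repeatedly.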
 
 #### ArrayRecord
 
 <xref:System.Formats.Nrbf.ArrayRecord> defines the core behavior for NRBF array records and provides a base for derived classes. It provides two properties:
 
-- <xref:System.Formats.Nrbf.ArrayRecord.Rank> which gets the rank of the array.
-- <xref:System.Formats.Nrbf.ArrayRecord.Lengths> which get a buffer of integers that represent the number of elements in every dimension.
+- <xref:System.Formats.Nrbf.ArrayRecord.Rank>, which gets the rank of the array.
+- <xref:System.Formats.Nrbf.ArrayRecord.Lengths>, which gets a buffer of integers that represent the number of elements in every dimension. It's recommended to **check the total length of the provided array record** before calling <xref:System.Formats.Nrbf.ArrayRecord.GetArray*>.
 
 It also provides one method: <xref:System.Formats.Nrbf.ArrayRecord.GetArray*>. When used for the first time, it allocates an array and fills it with the data provided in the serialized records (in case of the natively supported primitive types like `string` or `int`) or the serialized records themselves (in case of arrays of complex types).
 
 <xref:System.Formats.Nrbf.ArrayRecord.GetArray*> requires a mandatory argument that specifies the type of the expected array. For example, if the record should be a 2D array of integers, the `expectedArrayType` must be provided as `typeof(int[,])` and the returned array is also `int[,]`:
 
 ```csharp
 ArrayRecord arrayRecord = (ArrayRecord)NrbfDecoder.Decode(stream);
+if (arrayRecord.Rank != 2 || (long)arrayRecord.Lengths[0] * arrayRecord.Lengths[1] > 10_000)
+{
+    throw new Exception("The array had unexpected rank or length!");
+}
 int[,] array2d = (int[,])arrayRecord.GetArray(typeof(int[,]));
 ```
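
Multiplying dimension lengths as `int` values can wrap around: two dimensions of 65,536 elements each multiply to 0 in C#'s default unchecked 32-bit arithmetic, which would slip past a `> 10_000` comparison. A minimal sketch of a rank-agnostic check that does the arithmetic in 64 bits follows. `ArrayGuards` and `IsWithinLimit` are hypothetical names, not part of `System.Formats.Nrbf`.

```csharp
using System;

// Hypothetical helper (not part of System.Formats.Nrbf).
static class ArrayGuards
{
    // Returns false as soon as the running product of the lengths exceeds maxTotalLength.
    // This is deliberately conservative: a later zero-length dimension
    // doesn't rescue an oversized prefix.
    public static bool IsWithinLimit(ReadOnlySpan<int> lengths, int maxTotalLength)
    {
        if (maxTotalLength < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(maxTotalLength));
        }

        long total = 1;
        foreach (int length in lengths)
        {
            if (length < 0)
            {
                return false; // malformed length
            }

            // total <= maxTotalLength <= int.MaxValue at this point,
            // so the 64-bit product below can't overflow.
            total *= length;
            if (total > maxTotalLength)
            {
                return false;
            }
        }
        return true;
    }
}
```

In the 2D sample above, the guard could then read `arrayRecord.Rank != 2 || !ArrayGuards.IsWithinLimit(arrayRecord.Lengths, 10_000)`, and the same helper would work for any rank.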
 
 If there is a type mismatch (example: the attacker has provided a payload with an array of two billion strings), the method throws <xref:System.InvalidOperationException>.
 
+> [!CAUTION]
+> Unfortunately, the NRBF format makes it easy for an attacker to compress a large number of null array items. That's why it's recommended to always check the total length of the array before calling <xref:System.Formats.Nrbf.ArrayRecord.GetArray*>. Moreover, <xref:System.Formats.Nrbf.ArrayRecord.GetArray*> accepts an optional `allowNulls` Boolean argument, which, when set to `false`, will throw for nulls.
+
 [NrbfDecoder] does not load or instantiate any custom types, so in case of arrays of complex types, it returns an array of <xref:System.Formats.Nrbf.SerializationRecord>.
 
 ```csharp
@@ -195,14 +236,18 @@ public class ComplexType3D
 }
 
 ArrayRecord arrayRecord = (ArrayRecord)NrbfDecoder.Decode(payload);
-SerializationRecord[] records = (SerializationRecord[])arrayRecord.GetArray(expectedArrayType: typeof(ComplexType3D[]));
+if (arrayRecord.Rank != 1 || arrayRecord.Lengths[0] > 10_000)
+{
+    throw new Exception("The array had unexpected rank or length!");
+}
+
+SerializationRecord[] records = (SerializationRecord[])arrayRecord.GetArray(expectedArrayType: typeof(ComplexType3D[]), allowNulls: false);
 ComplexType3D[] output = records.OfType<ClassRecord>().Select(classRecord => new ComplexType3D()
 {
     I = classRecord.GetInt32(nameof(ComplexType3D.I)),
     J = classRecord.GetInt32(nameof(ComplexType3D.J)),
     K = classRecord.GetInt32(nameof(ComplexType3D.K)),
 }).ToArray();
-
 ```
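
`Enumerable.OfType<T>` skips elements that aren't of the requested type instead of failing, so a record of an unexpected kind would be dropped silently from `output`. Where rejecting the whole payload is preferable, a stricter conversion can be sketched as follows. `StrictRecords` and `AllOrThrow` are hypothetical names, and the helper is written against `object` so that it can run without a payload.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of System.Formats.Nrbf).
static class StrictRecords
{
    // Like Enumerable.OfType<T>, but throws on the first null or
    // mismatched element instead of skipping it.
    public static T[] AllOrThrow<T>(IReadOnlyList<object?> items) where T : class
    {
        var result = new T[items.Count];
        for (int i = 0; i < items.Count; i++)
        {
            result[i] = items[i] as T
                ?? throw new InvalidOperationException(
                    $"Element {i} is not a {typeof(T).Name}.");
        }
        return result;
    }
}
```

In the sample above, `records.OfType<ClassRecord>()` could then be replaced with `StrictRecords.AllOrThrow<ClassRecord>(records)`, which relies on array covariance to pass the `SerializationRecord[]` as a list of objects.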
 
 .NET Framework supported non-zero indexed arrays within NRBF payloads, but this support was never ported to .NET (Core). [NrbfDecoder] therefore does not support decoding non-zero indexed arrays.
