Commit 13af0f4

Only read bytes required to complete CBOR string
In CBOR, a string data item's initial byte encodes the string's length in bytes, not in code points, and the string is always encoded as UTF-8. The following comment in the code therefore does not apply: surrogate pairs are a UTF-16 concept, and the string's length is already given in bytes.

```
// 29-Jan-2021, tatu: as per [dataformats-binary#238] must keep in mind that
//    the longest individual unit is 4 bytes (surrogate pair) so we
//    actually need len+3 bytes to avoid bounds checks
```
1 parent 3eb1d43 commit 13af0f4
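
The byte-versus-code-point distinction in the commit message can be illustrated with a minimal sketch. It is not code from Jackson: the class `CborTextLength` and its two helper methods are hypothetical, and it only handles the short form of a CBOR text string (major type 3, length 0 to 23 stored in the low five bits of the initial byte).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not part of CBORParser.
class CborTextLength {
    // Encode a short text string as a CBOR data item: the initial byte carries
    // major type 3 (0x60) plus the UTF-8 byte length in its low five bits.
    static byte[] encodeShortText(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        if (utf8.length >= 24) {
            throw new IllegalArgumentException("sketch only handles lengths 0..23");
        }
        byte[] out = new byte[1 + utf8.length];
        out[0] = (byte) (0x60 | utf8.length);
        System.arraycopy(utf8, 0, out, 1, utf8.length);
        return out;
    }

    // Read the declared payload length (in bytes) back from the initial byte.
    static int declaredByteLength(byte[] item) {
        return item[0] & 0x1F;
    }

    public static void main(String[] args) {
        // 'a' followed by U+1F600, which Java stores as a surrogate pair.
        String s = "a\uD83D\uDE00";
        byte[] item = encodeShortText(s);
        System.out.println("UTF-16 units: " + s.length());
        System.out.println("code points:  " + s.codePointCount(0, s.length()));
        System.out.println("declared len: " + declaredByteLength(item));
        System.out.println("payload size: " + (item.length - 1));
    }
}
```

The declared length matches the payload size exactly, so a reader that has `len` bytes available has the whole string; the surrogate pair only appears after decoding into Java's UTF-16 `String`.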

1 file changed: +6 -14 lines changed


cbor/src/main/java/com/fasterxml/jackson/dataformat/cbor/CBORParser.java

Lines changed: 6 additions & 14 deletions
@@ -2281,18 +2281,14 @@ protected void _finishToken() throws IOException
                 }
                 return;
             }
-            // 29-Jan-2021, tatu: as per [dataformats-binary#238] must keep in mind that
-            //    the longest individual unit is 4 bytes (surrogate pair) so we
-            //    actually need len+3 bytes to avoid bounds checks
             // 18-Jan-2024, tatu: For malicious input / Fuzzers, need to worry about overflow
             //    like Integer.MAX_VALUE
-            final int needed = Math.max(len, len + 3);
             final int available = _inputEnd - _inputPtr;
 
-            if ((available >= needed)
+            if ((available >= len)
                     // if not, could we read? NOTE: we do not require it, just attempt to read
-                    || ((_inputBuffer.length >= needed)
-                            && _tryToLoadToHaveAtLeast(needed))) {
+                    || ((_inputBuffer.length >= len)
+                            && _tryToLoadToHaveAtLeast(len))) {
                 _finishShortText(len);
                 return;
             }
@@ -2326,22 +2322,18 @@ protected String _finishTextToken(int ch) throws IOException
             _finishChunkedText();
             return _textBuffer.contentsAsString();
         }
-        // 29-Jan-2021, tatu: as per [dataformats-binary#238] must keep in mind that
-        //    the longest individual unit is 4 bytes (surrogate pair) so we
-        //    actually need len+3 bytes to avoid bounds checks
 
         // 19-Mar-2021, tatu: [dataformats-binary#259] shows the case where length
         //    we get is Integer.MAX_VALUE, leading to overflow. Could change values
         //    to longs but simpler to truncate "needed" (will never pass following test
         //    due to inputBuffer never being even close to that big).
 
-        final int needed = Math.max(len + 3, len);
         final int available = _inputEnd - _inputPtr;
 
-        if ((available >= needed)
+        if ((available >= len)
                 // if not, could we read? NOTE: we do not require it, just attempt to read
-                || ((_inputBuffer.length >= needed)
-                        && _tryToLoadToHaveAtLeast(needed))) {
+                || ((_inputBuffer.length >= len)
+                        && _tryToLoadToHaveAtLeast(len))) {
             return _finishShortText(len);
         }
         // If not enough space, need handling similar to chunked
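
The overflow that the remaining comments mention can be reproduced in isolation. The sketch below is a hypothetical stand-alone class, not parser code: `paddedNeeded` copies the removed `Math.max(len, len + 3)` expression, and `fitsInBuffer` is an assumed stand-in for the simplified `available >= len` test.

```java
// Hypothetical illustration, not part of CBORParser.
class LengthOverflowDemo {
    // The removed expression: pad by 3 bytes, but fall back to len itself
    // when len + 3 wraps around to a negative int.
    static int paddedNeeded(int len) {
        return Math.max(len, len + 3);
    }

    // Assumed stand-in for the simplified check: no padding, so no addition
    // that could overflow.
    static boolean fitsInBuffer(int len, int available) {
        return available >= len;
    }

    public static void main(String[] args) {
        int huge = Integer.MAX_VALUE;
        System.out.println("huge + 3 wraps negative: " + ((huge + 3) < 0));
        System.out.println("padded, huge length:     " + paddedNeeded(huge));
        System.out.println("padded, small length:    " + paddedNeeded(10));
        System.out.println("huge fits in 8000 bytes: " + fitsInBuffer(huge, 8000));
    }
}
```

With the padding gone, a declared length of `Integer.MAX_VALUE` simply fails the comparison against the available byte count, which is the behavior the overflow comments describe.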
