Implement document length tracking for DataInput-backed JSON parsers via subclass by pjfanning · Pull Request #1575 · FasterXML/jackson-core

pjfanning · 2026-03-11T23:58:59Z

One possible way to support max document len on DataInput use case.
The count and validation overhead only kicks in if you set a non-negative max document len - which is not the default case.

…fast from JsonFactory Co-authored-by: pjfanning <11783444+pjfanning@users.noreply.github.com>

…c-length tracking Co-authored-by: pjfanning <11783444+pjfanning@users.noreply.github.com>

…via subclass Co-authored-by: pjfanning <11783444+pjfanning@users.noreply.github.com>

cowtowncoder · 2026-03-12T00:03:24Z

I would want to see performance testing (esp. for non-validating case) since this seems it could wreck performance (overrides and indirection) -- f.ex with https://github.com/FasterXML/jackson-benchmarks/ .

pjfanning · 2026-03-12T11:28:43Z

@cowtowncoder I'll put together a benchmark but would expect it to have quite an impact. The parser itself is already not very efficient becuase it reads byte by byte and I would recommend that users try to get an InputStream instead of a DataInput instance to parse if it at all possible.

The normal case of no max doc len will not be affect noticeably by this change
This change will allow users to at least parse DataInput inputs again if they do set a max doc len (because recent code changes cause this code path to throw an exception - deliberately)
One option would be to add a JsonReadFeature that controls whether the max doc len is set case fails deliberately or uses this inefficient approach.

cowtowncoder · 2026-03-12T15:41:17Z

The normal case of no max doc len will not be affect noticeably by this change

I would not assume that without measurements, due to inlining effects.
Depends on how JVM handles optimizations I suppose & benchmarking can give answer.

As to failing, configurability -- while possible, I don't think it's necessary at this point.

pjfanning · 2026-03-13T11:50:14Z

@cowtowncoder I created a benchmark suite and compared the v2.21.1 parser with one that has the overridable shared readUnsignedByte method and a 3rd parser that applies the size limit.

The results for all 3 parsers are almost the same. So close that I nearly suspect that I have a mistake in the test suite.

I tested with different JSON sizes and Java versions and in no run was there any significant diff in perf between the 3 parsers.

I'll double check my test suite logic over the next few days.

https://github.com/pjfanning/jackson-bench/blob/main/src/jmh/java/org/example/jackson/DataInputBench.java

pjfanning · 2026-03-16T19:17:47Z

@cowtowncoder this change seems to work ok. I see a small overhead when the max doc len is enabled but it is only a few percent.

One example run (Limited is the benchmark with the check enabled)

Benchmark                              Mode  Cnt     Score     Error  Units
DataInputBench.benchDataInput         thrpt    5  5176.015 ± 574.007  ops/s
DataInputBench.benchDataInputLimited  thrpt    5  5027.761 ± 319.194  ops/s
DataInputBench.benchDataInputNew      thrpt    5  5299.451 ± 130.896  ops/s

cowtowncoder · 2026-03-16T20:08:09Z

@pjfanning Sounds good so far. Could you also run test against unchanged build from 3.x (or official 3.1.0 which should be about the same)?

pjfanning · 2026-03-16T20:30:53Z

@cowtowncoder You want to create a new benchmark that uses Jackson 3 instead of Jackson 2?

cowtowncoder · 2026-03-16T20:33:41Z

@cowtowncoder You want to create a new benchmark that uses Jackson 3 instead of Jackson 2?

I thought this PR is for 3.x? Wasn't thinking we implement this functionality for 2.x, only 3.x

But if we are talking about changes 2.x, comparison to unmodified version (2.21.1 would be fine) vs new versions.

pjfanning · 2026-03-18T09:12:35Z

The jackson3 based benchmarks behave similarly to the jackson2 ones.

Benchmark                                      Mode  Cnt     Score     Error  Units
DataInputJackson3Bench.benchDataInput         thrpt    5  1969.188 ± 297.992  ops/s
DataInputJackson3Bench.benchDataInputLimited  thrpt    5  1908.798 ± 106.280  ops/s
DataInputJackson3Bench.benchDataInputNew      thrpt    5  2037.195 ±  66.618  ops/s

cowtowncoder · 2026-03-18T16:52:53Z

@pjfanning Does "benchDataInput" refer to pre-changes version from 3.x, and "benchDataInputLimited" / "benchDataInputNew" to modified version with and withou limitations?

If so, differences do indeed look insignificant.

pjfanning · 2026-03-18T16:54:27Z

@pjfanning Does "benchDataInput" refer to pre-changes version from 3.x, and "benchDataInputLimited" / "benchDataInputNew" to modified version with and withou limitations?

If so, differences do indeed look insignificant.

@cowtowncoder your interpretation of the names is correct

cowtowncoder · 2026-03-22T18:15:26Z

@pjfanning I confirmed your findings wrt baseline using

https://github.com/FasterXML/jackson-benchmarks/

and so I'm confident there is no performance degradation on JDK 17.

The only odd part that I now recall -- unrelated to this change -- is that JDK 8 runs DataInput test almost 3 (!) times faster (throughput of 320k vs 110k). That does not really matter here, just thought it odd how DataInput performance is so bad for post-Java-8.

I'll now review this PR and hopefully get it merged.
Thank you for your patience @pjfanning !

…`nextToken()`

github-actions · 2026-03-22T18:48:53Z

📈 Overall Code Coverage

Metric	Coverage	Change
Instructions	83.08%	📈 +0.010%
Branches	75.73%	📈 +0.000%

Overall project coverage from JaCoCo test results. Change values compare against the latest base branch build.

cowtowncoder · 2026-03-22T19:11:10Z

@pjfanning Thank you again! It's good that performance for the use case of no constraints use is unchanged -- and likely even use with constraints is not drastically worse.
This will be in 3.2.0; earlier LTSes can just use #1570 to block use of constraints for unlikely case of DataInput as source.

Copilot AI and others added 4 commits March 12, 2026 00:57

Initial plan

8130661

Add document length tracking to UTF8DataInputJsonParser, remove fail-…

46ab19c

…fast from JsonFactory Co-authored-by: pjfanning <11783444+pjfanning@users.noreply.github.com>

Refactor: create UTF8DataInputWithDocLengthJsonParser subclass for do…

990935c

…c-length tracking Co-authored-by: pjfanning <11783444+pjfanning@users.noreply.github.com>

Implement document length tracking for DataInput-backed JSON parsers …

cf2be62

…via subclass Co-authored-by: pjfanning <11783444+pjfanning@users.noreply.github.com>

Merge branch '3.x' into copilot/update-utf8datainputjsonparser

1067137

Merge branch '3.x' into copilot/update-utf8datainputjsonparser

66a3f42

cowtowncoder added 4 commits March 22, 2026 11:24

Add release notes, @SInCE, change visibility

1280b1c

Remove redundant override as nextName() in base class delegates to …

fd85c43

…`nextToken()`

Comment improvements

e4bc5d0

Remove release entry wrt now obsolete FasterXML#1570 change

c64f295

cowtowncoder mentioned this pull request Mar 22, 2026

Fail parsing from DataInput if StreamReadConstraints.getMaxDocumentLength() set [GHSA-2m67-wjpj-xhg9] #1570

Closed

cowtowncoder approved these changes Mar 22, 2026

View reviewed changes

cowtowncoder merged commit 701cd6a into FasterXML:3.x Mar 22, 2026
6 checks passed

Uh oh!

Conversation

pjfanning commented Mar 11, 2026

Uh oh!

cowtowncoder commented Mar 12, 2026

Uh oh!

pjfanning commented Mar 12, 2026

Uh oh!

cowtowncoder commented Mar 12, 2026

Uh oh!

pjfanning commented Mar 13, 2026

Uh oh!

pjfanning commented Mar 16, 2026

Uh oh!

cowtowncoder commented Mar 16, 2026

Uh oh!

pjfanning commented Mar 16, 2026

Uh oh!

cowtowncoder commented Mar 16, 2026

Uh oh!

pjfanning commented Mar 18, 2026

Uh oh!

cowtowncoder commented Mar 18, 2026

Uh oh!

pjfanning commented Mar 18, 2026

Uh oh!

cowtowncoder commented Mar 22, 2026

Uh oh!

github-actions bot commented Mar 22, 2026

📈 Overall Code Coverage

Uh oh!

Uh oh!

cowtowncoder commented Mar 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants