jaq

Build status Crates.io Documentation Rust 1.69+

jaq (pronounced /ʒaːk/, like Jacques [1]) is a clone of the JSON data processing tool jq. It has a few features not present in jq, such as support for the data formats YAML, CBOR, TOML, and XML. jaq has its own manual. You can try jaq on the playground.

jaq is two things at once:

  • A command-line program, jaq, that can be used as a drop-in replacement for jq.
  • A library, jaq-core, that can be used to compile and run jq programs inside of Rust programs. Compared to the jq API, jaq-core can be safely used in multi-threaded environments and supports arbitrary data types beyond JSON.

jaq focuses on three goals:

  • Correctness: jaq aims to provide a more correct and predictable implementation of jq, while preserving compatibility with jq in most cases.
  • Performance: I originally created jaq because I was bothered by the long start-up time of jq 1.6, which amounts to about 50ms on my machine. This is particularly noticeable when processing a large number of small files. Although the start-up time has been vastly improved in jq 1.7, jaq is still faster than jq on many other benchmarks.
  • Simplicity: jaq aims to have a simple and small implementation, in order to reduce the potential for bugs and to facilitate contributions.
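
One rough way to observe the start-up cost mentioned above is to launch the tool once per file over many tiny inputs. The following is only a sketch and not the project's benchmark: `make_small_files` and `run_per_file` are hypothetical helpers, and the file contents and counts are arbitrary.

```shell
# Hypothetical helper: create COUNT tiny JSON files in DIR.
make_small_files() {
    local dir="$1" count="$2" i=1
    while [ "$i" -le "$count" ]; do
        printf '{"id": %d}\n' "$i" > "$dir/$i.json"
        i=$((i + 1))
    done
}

# Hypothetical helper: run COMMAND once per file in DIR, discarding output.
# Every iteration launches a new process, so start-up time dominates.
run_per_file() {
    local dir="$1" f
    shift
    for f in "$dir"/*.json; do
        "$@" "$f" > /dev/null || return 1
    done
}

# Example (not run here; assumes bash and that jaq and jq are installed):
#   dir=$(mktemp -d); make_small_files "$dir" 1000
#   time run_per_file "$dir" jaq .
#   time run_per_file "$dir" jq .
```

Absolute numbers from such a loop depend heavily on the machine, so only the relative difference between tools is meaningful.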

Installation

Binaries

You can download binaries for Linux, Mac, and Windows on the releases page. On a Linux system, you can download jaq using the following command:

$ curl -fsSL https://github.com/01mf02/jaq/releases/latest/download/jaq-$(uname -m)-unknown-linux-musl -o jaq && chmod +x jaq

You may also install jaq using homebrew on macOS or Linux:

$ brew install jaq
$ brew install --HEAD jaq # latest development version

From Source

To compile jaq, you need a Rust toolchain. See https://rustup.rs/ for instructions.

Either of the following commands installs jaq:

$ cargo install --locked jaq
$ cargo install --locked --git https://github.com/01mf02/jaq # latest development version

On my system, both commands place the executable at ~/.cargo/bin/jaq.

If you have cloned this repository, you can also build jaq by executing one of the following commands inside the clone:

$ cargo build --release # places binary into target/release/jaq
$ cargo install --locked --path jaq # installs binary

jaq should work on any system supported by Rust. If it does not, please file an issue.

Performance

The following evaluation consists of several benchmarks that compare the performance of jaq, jq, and gojq. The empty benchmark runs the filter empty n times with null input in order to measure the start-up time. The bf-fib benchmark runs a Brainfuck interpreter written in jq, interpreting a Brainfuck script that produces n Fibonacci numbers. The other benchmarks evaluate various filters with n as input; see bench.sh for details.

I generated the benchmark data with bench.sh target/release/jaq jq-1.8.1 gojq-0.12.18 | tee bench.json on a Linux system with an AMD Ryzen 5 5500U [2]. I then processed the results with a "one-liner" (stretching the term and the line a bit):

jq -rs '.[] | "|`\(.name)`|\(.n)|" + ([.time[] | min | (.*1000|round)? // "N/A"] | min as $total_min | map(if . == $total_min then "**\(.)**" else "\(.)" end) | join("|"))' bench.json

(Of course, you can also use jaq here instead of jq.) Finally, I concatenated the table header with the output and piped it through pandoc -t gfm.
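
The filter in the one-liner above is easier to follow when spread over several lines. The sketch below writes a commented copy of it to a file; the file name `bench-table.jq` is hypothetical, and the comments about the shape of `.time` (one list of timings in seconds per tool) are an assumption inferred from the filter and the table caption, not from bench.sh itself.

```shell
# Write a commented, multi-line copy of the filter to bench-table.jq
# (hypothetical file name). The quoted 'EOF' keeps backticks and \( ... ) literal.
cat > bench-table.jq <<'EOF'
# With -s, all records in bench.json are slurped into one array.
.[] |
# Row prefix: benchmark name (in backticks) and input size n.
"|`\(.name)`|\(.n)|" + (
  [ .time[]                          # assumed: one list of timings per tool
    | min                            # keep the fastest run
    | (. * 1000 | round)? // "N/A"   # scale by 1000 and round; N/A if that fails
  ]
  # Numbers sort before strings, so N/A is only the minimum if every tool failed.
  | min as $total_min
  # Mark the fastest entry in bold.
  | map(if . == $total_min then "**\(.)**" else "\(.)" end)
  | join("|")
)
EOF
```

The file can then be used in place of the inline filter, for example with jq -rs -f bench-table.jq bench.json.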

Table: Evaluation results in milliseconds ("N/A" if error or more than 10 seconds).

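| Benchmark     | n       | jaq-3.0 | jq-1.8.1 | gojq-0.12.18 |
|---------------|---------|---------|----------|--------------|
| empty         | 512     | 410     | 430      | 270          |
| bf-fib        | 13      | 530     | 1030     | 530          |
| defs          | 100000  | 50      | N/A      | 960          |
| upto          | 8192    | 0       | 440      | 450          |
| reduce-update | 16384   | 0       | 720      | 1320         |
| reverse       | 1048576 | 30      | 440      | 300          |
| sort          | 1048576 | 90      | 430      | 550          |
| group-by      | 1048576 | 260     | 1790     | 1580         |
| min-max       | 1048576 | 180     | 200      | 300          |
| add           | 1048576 | 390     | 520      | 1290         |
| kv            | 131072  | 110     | 120      | 280          |
| kv-update     | 131072  | 130     | 440      | 550          |
| kv-entries    | 131072  | 520     | 980      | 870          |
| ex-implode    | 1048576 | 600     | 890      | 560          |
| reduce        | 1048576 | 740     | 700      | N/A          |
| try-catch     | 1048576 | 220     | 200      | 390          |
| repeat        | 1048576 | 170     | 610      | 500          |
| from          | 1048576 | 340     | 730      | 550          |
| last          | 1048576 | 20      | 150      | 170          |
| pyramid       | 524288  | 300     | 250      | 470          |
| tree-contains | 23      | 90      | 830      | 230          |
| tree-flatten  | 17      | 700     | 330      | 10           |
| tree-update   | 17      | 430     | 910      | 1830         |
| tree-paths    | 17      | 140     | 230      | 780          |
| to-fromjson   | 65536   | 50      | 350      | 100          |
| ack           | 7       | 570     | 490      | 540          |
| range-prop    | 128     | 370     | 250      | 210          |
| cumsum        | 1048576 | 240     | 250      | 510          |
| cumsum-xy     | 1048576 | 380     | 350      | 750          |
| str-slice     | 8192    | 170     | 650      | 120          |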

The results show that jaq-3.0 is fastest on 20 benchmarks, whereas jq-1.8.1 is fastest on 5 benchmarks and gojq-0.12.18 is fastest on 6 benchmarks. (jaq and gojq tie on bf-fib, which is counted for both, so the counts add up to 31 for 30 benchmarks.) gojq is much faster on tree-flatten because it implements the filter flatten natively rather than as a jq definition.

Security

jaq tries to guarantee that:

  • It does not panic (except in cases of resource exhaustion, see below).
  • It does not corrupt memory, i.e. it is memory-safe.
  • It does not allow input data and jq filters to initiate I/O operations (except for reading files by jq filters before filter execution).

Any case where such a guarantee is broken is a bug and should be reported.

On the other hand, jaq does not take countermeasures against any kind of resource exhaustion. That means that jaq may take unlimited time, memory, or stack space. For example, this may lead to stack overflows when:

  • Reading input data: jaq -nr 'repeat("[")' | jaq
  • Running jq filters: jaq -n 'def f: 1+f; f'
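
Because jaq itself imposes no limits, a caller that runs untrusted filters or inputs may want to bound them from the outside. The following is a minimal sketch, assuming a Linux shell with GNU coreutils timeout and a ulimit -v that is actually enforced; `run_limited` is a hypothetical helper, and the limit values are arbitrary.

```shell
# Hypothetical helper: run a command with a wall-clock limit and a cap on
# virtual memory. Usage: run_limited SECONDS MEM_KIB COMMAND [ARGS...]
run_limited() {
    local secs="$1" mem_kib="$2"
    shift 2
    (
        # The subshell keeps the memory limit from affecting the calling shell.
        ulimit -v "$mem_kib" || exit 125
        # timeout exits with status 124 if the time limit is hit.
        exec timeout "$secs" "$@"
    )
}

# Example (not run here): at most 5 seconds and 1 GiB for a runaway filter.
#   run_limited 5 1048576 jaq -n 'def f: 1+f; f'
```

This does not prevent a stack overflow inside jaq; it only bounds how much time and memory the process can consume before it is stopped.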

jaq's core has been audited by Radically Open Security as part of an NLnet grant; thanks to both organisations for their support! The security audit found one low-severity issue and three issues that are likely not exploitable at all. As a result of this audit, all issues were addressed and several fuzzing targets for jaq were added at jaq-core/fuzz. Before that, jaq's JSON parser hifijson already had a fuzzing target. Finally, jaq has a carefully crafted test suite of more than 500 tests that is run on every commit.

User Testimonials

jaq is a well-built library that gave me a massive leg up compared to implementing jq support on my own. Extensibility through the ValT trait made adding jq support to my own types a breeze.

@jobarr-amzn (amazon-ion/ion-cli#193 (review), #355 (comment))

My Rust program [using jaq] can execute all queries over all files three times while Python is busy executing one query across all files using the jq PyPI crate and a Python loop.

@I-Al-Istannen (#323 (comment))

jaq is very impressive! Running my wsjq interpreter with it is significantly faster than with any other jq implementation and its emphasis on correctness is very admirable. [On wsjq benchmarks, jaq is between 5 and 10 times faster than jq and between 15 and 196 times faster than gojq.]

@thaliaarchi (#355 (comment))

I had been parsing data from certificate transparency logs using certstream-server. It gives a lot of data and piping it into jq was causing me issues. I switched to jaq and the faster startup time meant it could easily keep up on the low end VM I was using. Thank you for your work.

Oliver (via e-mail)

Add your own testimonials via #355.

Acknowledgements

This project was funded through the NGI0 Entrust and NGI0 Commons funds established by NLnet with financial support from the European Commission's Next Generation Internet programme, under the aegis of DG Communications Networks, Content and Technology under grant agreements № 101069594 and № 101135429. Additional funding is made available by the Swiss State Secretariat for Education, Research and Innovation (SERI).

Footnotes

  1. I wanted to create a tool that is discreet and obliging, like a good waiter. And when I think of a typical name for a (French) waiter, the name that comes to my mind is "Jacques". Later, I found out about the old French word jacquet, meaning "squirrel", which makes for a nice ex post inspiration for the name. And finally, the Jacquard machine was an important predecessor of the modern computer, automating weaving with punched cards as early as 1804.

  2. jq-1.8.1 was installed from the official Arch Linux package and gojq-0.12.18 was retrieved from its GitHub release page.
