High-performance, fast-compiling, span-preserving TOML parsing for Rust.
Originally forked from toml-span to add TOML 1.1.0 support, toml-spanner
has since received significant performance improvements and reductions in compile time.
Like the original toml-span, toml-spanner does not support temporal values such as timestamps or local times.
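To make "span preserving" concrete, here is a minimal, std-only sketch of what a span buys you: a byte range into the source that can be turned into a line and column for error messages. `ByteSpan` and `line_col` are hypothetical names used for illustration only; they are not toml-spanner's own types or functions.

```rust
/// Hypothetical span type for illustration; NOT toml-spanner's own definition.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ByteSpan {
    start: usize,
    end: usize,
}

/// Converts a byte offset into a 1-based (line, column) pair.
/// Columns are counted in characters, not bytes.
fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let before = &source[..offset];
    let line = before.matches('\n').count() + 1;
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let col = source[line_start..offset].chars().count() + 1;
    (line, col)
}

fn main() {
    let source = "name = \"hammer\"\nvalue = \"43\"\n";
    // Pretend a parser recorded where the value of `value` lives in the source.
    let start = source.find("\"43\"").expect("literal is present");
    let span = ByteSpan { start, end: start + 4 };

    let (line, col) = line_col(source, span.start);
    println!(
        "{line}:{col}: expected an integer, found {}",
        &source[span.start..span.end]
    );
}
```

Because the span points back into the original text, a config error can be reported at the exact location of the offending value rather than just by key name.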
use toml_spanner::{Deserialize, Error, Item};
#[derive(Debug)]
struct Things {
    name: String,
    value: u32,
    color: Option<String>,
}
impl<'de> Deserialize<'de> for Things {
    fn deserialize(value: &mut Item<'de>) -> Result<Self, Error> {
        let table = value.expect_table()?;
        Ok(Things {
            name: table.required("name")?,
            value: table.required("value")?,
            color: table.optional("color")?,
        })
    }
}
struct Config {
    things: Vec<Things>,
    dev_mode: bool,
}
impl Config {
    pub fn parse(content: &str) -> Result<Config, Error> {
        let arena = toml_spanner::Arena::new();
        let mut table = toml_spanner::parse(content, &arena)?;
        let config = Config {
            things: table.required("things")?,
            dev_mode: table.optional("dev-mode")?.unwrap_or(false),
        };
        // Report unexpected fields
        table.expect_empty()?;
        Ok(config)
    }
}
fn main() {
    let content = r#"
dev-mode = true
[[things]]
name = "hammer"
value = 43
[[things]]
name = "drill"
value = 300
color = "green"
"#;
    match Config::parse(content) {
        Ok(config) => {
            println!("dev_mode: {}", config.dev_mode);
            for thing in config.things {
                println!("thing: {:?}", thing);
            }
        }
        Err(e) => eprintln!("Error parsing config: {e}"),
    }
}

Measured on an AMD Ryzen 9 5950X, 64GB RAM, Linux 6.18, rustc 1.93.0. Relative parse time across real-world TOML files (lower is better):
Crate Versions: toml-spanner = 0.3.0, toml = 1.0.2+spec-1.1.0, toml-span = 0.7.0
                  time(μs)  cycles(K)  instr(K)  branch(K)
zed/Cargo.toml
  toml-spanner        25.1        119       441         92
  toml               257.2       1220      3084        607
  toml-span          381.6       1816      5048       1046
extask.toml
  toml-spanner         8.9         42       148         29
  toml                78.7        376      1002        192
  toml-span          105.0        500      1335        263
devsm.toml
  toml-spanner         3.6         17        68         15
  toml                32.3        155       422         80
  toml-span           55.0        262       713        141
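To read the table as speedups rather than absolute times, divide each baseline time by the toml-spanner time for the same file. The sketch below does only that arithmetic; the numbers are copied from the time(μs) column above, and `speedup` is a hypothetical helper, not part of any crate.

```rust
/// Parse times in microseconds, copied from the time(μs) column above:
/// (file, toml-spanner, toml, toml-span).
const TIMES_US: [(&str, f64, f64, f64); 3] = [
    ("zed/Cargo.toml", 25.1, 257.2, 381.6),
    ("extask.toml", 8.9, 78.7, 105.0),
    ("devsm.toml", 3.6, 32.3, 55.0),
];

/// How many times faster the candidate is than the baseline.
fn speedup(baseline_us: f64, candidate_us: f64) -> f64 {
    baseline_us / candidate_us
}

fn main() {
    for (file, spanner, toml, toml_span) in TIMES_US {
        println!(
            "{file}: {:.1}x vs toml, {:.1}x vs toml-span",
            speedup(toml, spanner),
            speedup(toml_span, spanner)
        );
    }
}
```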
Extra cargo build --release time for binaries using the respective crates (lower is better):
              median(ms)  added(ms)
null                  99
toml-spanner         655       +556
toml-span           1375      +1276
toml                3027      +2928
toml+serde          5037      +4938
Check out ./benchmark for more details. These numbers should approximate the additional
build time users would experience during source-based installs, such as via cargo install.
While toml-spanner started as a fork of toml-span, it has since undergone
extensive changes:
- 10x faster than toml-span, and 5-8x faster than toml across real-world workloads.
- Preserved index order: tables retain their insertion order by default, unlike toml_span and the default mode of toml.
- Compact Value type (on 64-bit platforms):

  | Crate                 | Value/Item | TableEntry |
  |-----------------------|------------|------------|
  | toml-spanner          | 24 bytes   | 48 bytes   |
  | toml-span             | 48 bytes   | 88 bytes   |
  | toml                  | 32 bytes   | 56 bytes   |
  | toml (preserve_order) | 80 bytes   | 104 bytes  |

  Note that the toml crate's Value type doesn't contain any span information and that toml-span doesn't support table entry order preservation.
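Size figures like the ones above can be checked with `std::mem::size_of`. The sketch below is an assumption-laden illustration, not toml-spanner's actual layout: `NaiveItem` and `CompactItem` are hypothetical types showing how replacing owned strings and `usize` spans with 32-bit fields can bring a span-carrying value down to 24 bytes on a 64-bit target.

```rust
use std::mem::size_of;
use std::ops::Range;

/// A straightforward value: owned strings plus a `usize`-based span.
#[allow(dead_code)]
enum NaiveValue {
    String(String),
    Integer(i64),
    Boolean(bool),
}

#[allow(dead_code)]
struct NaiveItem {
    value: NaiveValue,
    span: Range<usize>,
}

/// Hypothetical compact layout (NOT the crate's real definition):
/// an 8-byte payload plus four 32-bit fields, 24 bytes in total.
#[allow(dead_code)]
#[repr(C)]
struct CompactItem {
    payload: u64,    // integer/float bits, or an offset into borrowed storage
    len: u32,        // length of string/array data, when applicable
    tag: u32,        // which kind of value this is
    span_start: u32, // byte offset where the value starts in the source
    span_end: u32,   // byte offset where the value ends in the source
}

fn main() {
    println!("NaiveItem:   {} bytes", size_of::<NaiveItem>());
    println!("CompactItem: {} bytes", size_of::<CompactItem>());
}
```

The trade-off in such a layout is that 32-bit offsets cap the addressable input at 4 GiB, which is rarely a constraint for configuration files.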
toml-spanner makes extensive use of unsafe code to achieve its performance
and size goals. This is mitigated by fuzzing and running the test suite under
Miri.
The unsafe in this crate demands thorough testing. The full suite includes
Miri for detecting undefined behavior,
fuzzing against the reference toml crate, and snapshot-based integration
tests.
cargo test --workspace # all tests
cargo test -p snapshot-tests # integration tests only
cargo +nightly miri nextest run # undefined behavior checks
cargo +nightly fuzz run parse_compare_toml # fuzz against the toml crate
cargo +nightly fuzz run parse_value # fuzz the parser directly
# Test 32-bit support under Miri
cargo +nightly miri nextest run -p toml-spanner --target i686-unknown-linux-gnu

Integration tests use insta for snapshot assertions.
Run cargo insta test -p snapshot-tests and cargo insta review to review
changes.
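Fuzzing against a reference implementation, as the parse_compare_toml target does against the toml crate, is a form of differential testing: feed the same input to both implementations and flag any disagreement. The following is a toy, std-only sketch of that idea and not the project's actual fuzz target; `parse_u32_fast`, `parse_u32_reference`, and the tiny xorshift generator are all hypothetical stand-ins.

```rust
/// Hand-rolled parser under test (stand-in for an optimized parser).
fn parse_u32_fast(input: &str) -> Option<u32> {
    if input.is_empty() {
        return None;
    }
    let mut acc: u32 = 0;
    for byte in input.bytes() {
        if !byte.is_ascii_digit() {
            return None;
        }
        // Checked arithmetic so overflow is reported as a parse failure.
        acc = acc.checked_mul(10)?.checked_add(u32::from(byte - b'0'))?;
    }
    Some(acc)
}

/// Trusted reference implementation (stand-in for the reference crate).
fn parse_u32_reference(input: &str) -> Option<u32> {
    input.parse::<u32>().ok()
}

/// Minimal xorshift64 generator; the seed must be non-zero.
fn next_random(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

/// Builds a short random string of digits and a few invalid characters.
fn random_input(state: &mut u64) -> String {
    const ALPHABET: &[u8] = b"0123456789x ";
    let len = (next_random(state) % 13) as usize;
    let mut out = String::with_capacity(len);
    for _ in 0..len {
        let index = (next_random(state) % (ALPHABET.len() as u64)) as usize;
        out.push(ALPHABET[index] as char);
    }
    out
}

/// Returns the first input on which the two parsers disagree, if any.
fn find_divergence(seed: u64, iterations: usize) -> Option<String> {
    let mut state = seed;
    for _ in 0..iterations {
        let input = random_input(&mut state);
        if parse_u32_fast(&input) != parse_u32_reference(&input) {
            return Some(input);
        }
    }
    None
}

fn main() {
    match find_divergence(0x9E37_79B9_7F4A_7C15, 100_000) {
        Some(input) => println!("parsers disagree on {input:?}"),
        None => println!("no divergence found"),
    }
}
```

A real fuzzer such as cargo fuzz replaces the random generator with coverage-guided input mutation, but the comparison step is the same.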
Code coverage:
cargo +nightly llvm-cov --branch --show-missing-lines -- -q

First off, I just want to be up front and clear about the differences/limitations of this crate versus toml:
- No serde support for deserialization. There is a serde feature, but it only enables serialization of the Value and Spanned types.
- No TOML serialization. This crate is only intended to be a span-preserving deserializer; there is no intention to provide serialization to TOML, especially the advanced format-preserving kind provided by toml-edit.
- No datetime deserialization. It would be trivial to add support for this (behind an optional feature), I just have no use for it at the moment. PRs welcome.
This contribution is dual licensed under EITHER OF
- Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.