This repository was archived by the owner on Apr 24, 2025. It is now read-only.

Commit 0f58a98

Migrate to csv crate rewrite.
This commit resists the urge to refactor/rewrite xsv and ports it over to the new CSV API. It made a lot of things cleaner and even improved the performance of core commands like `count`, `sample`, `search`, `select` and `slice`.

This also removes the last remaining (dubious) uses of `unsafe` within xsv.

Benchmarks before/after:

    benchmark               before                    after
    count                   0.26s   175.05 MB/sec     0.11s   413.76 MB/sec
    flatten                 4.53s    10.04 MB/sec     4.54s    10.02 MB/sec
    flatten_condensed       4.72s     9.64 MB/sec     4.45s    10.22 MB/sec
    frequency               1.91s    23.82 MB/sec     1.82s    25.00 MB/sec
    index                   0.28s   162.54 MB/sec     0.12s   379.28 MB/sec
    sample_10               0.43s   105.84 MB/sec     0.18s   252.85 MB/sec
    sample_1000             0.44s   103.44 MB/sec     0.18s   252.85 MB/sec
    sample_100000           0.50s    91.02 MB/sec     0.29s   156.94 MB/sec
    search                  0.59s    77.14 MB/sec     0.27s   168.56 MB/sec
    select                  0.41s   111.00 MB/sec     0.14s   325.09 MB/sec
    sort                    2.59s    17.57 MB/sec     2.18s    20.87 MB/sec
    slice_one_middle        0.22s   206.88 MB/sec     0.08s   568.92 MB/sec
    slice_one_middle_index  0.01s  4551.36 MB/sec     0.01s  4551.36 MB/sec
    stats                   1.26s    36.12 MB/sec     1.09s    41.75 MB/sec
    stats_index             0.19s   239.54 MB/sec     0.15s   303.42 MB/sec
    stats_everything        2.13s    21.36 MB/sec     1.94s    23.46 MB/sec
    stats_everything_index  1.00s    45.51 MB/sec     0.93s    48.93 MB/sec
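The throughput figures in the commit message are the input size divided by the elapsed time. The sketch below recomputes a few of them and derives the speedup. It assumes an input size of about 45.51 MB, which is inferred here from the `slice_one_middle_index` row (0.01s at 4551.36 MB/sec) and is not stated in the commit. The `Row` type and both helper functions are hypothetical names used only for illustration. Because the published times are rounded to two decimals, recomputed values can differ from the published ones in the last digit.

```rust
/// One row of the before/after benchmark comparison (times in seconds).
struct Row {
    name: &'static str,
    before_secs: f64,
    after_secs: f64,
}

/// Throughput in MB/sec for an input of `size_mb` processed in `secs`.
fn throughput(size_mb: f64, secs: f64) -> f64 {
    size_mb / secs
}

/// How many times faster the "after" run is than the "before" run.
fn speedup(row: &Row) -> f64 {
    row.before_secs / row.after_secs
}

fn main() {
    // Assumption: input size inferred from the table, not given in the commit.
    let size_mb = 4551.36 * 0.01;

    let rows = [
        Row { name: "count", before_secs: 0.26, after_secs: 0.11 },
        Row { name: "select", before_secs: 0.41, after_secs: 0.14 },
        Row { name: "slice_one_middle", before_secs: 0.22, after_secs: 0.08 },
    ];

    for row in &rows {
        println!(
            "{:<18} {:>7.2} MB/sec -> {:>7.2} MB/sec ({:.2}x faster)",
            row.name,
            throughput(size_mb, row.before_secs),
            throughput(size_mb, row.after_secs),
            speedup(row)
        );
    }
}
```

Read this way, the commands that gained the most (`count`, `select`, `slice`) are the ones dominated by raw CSV parsing, while `flatten` and `sort` barely moved.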
1 parent bc5f456 commit 0f58a98

32 files changed: +919 -703 lines changed

.travis.yml

Lines changed: 7 additions & 14 deletions
@@ -1,17 +1,3 @@
-#language: rust
-#rust:
-# - 1.9.0
-# - stable
-# - beta
-# - nightly
-#script:
-# - cargo build --verbose
-# - cargo doc
-# - cargo test --verbose
-# - if [ "$TRAVIS_RUST_VERSION" = "nightly" ]; then
-# cargo bench --verbose;
-# fi
-
 language: rust
 cache: cargo
 
@@ -33,6 +19,13 @@ matrix:
 - os: linux
   rust: stable
   env: TARGET=x86_64-unknown-linux-musl
+# Minimum Rust supported channel.
+- os: linux
+  rust: 1.15.0
+  env: TARGET=x86_64-unknown-linux-gnu
+- os: linux
+  rust: 1.15.0
+  env: TARGET=x86_64-unknown-linux-musl
 
 before_install:
 - export PATH="$PATH:$HOME/.cargo/bin"

BENCHMARKS.md

Lines changed: 19 additions & 20 deletions
@@ -6,27 +6,27 @@ These benchmarks were run with
 which is a random 1,000,000 row subset of the world city population dataset
 from the [Data Science Toolkit](https://github.com/petewarden/dstkdata).
 
-These benchmarks were run on an Intel i3930K (6 CPUs, 12 threads) with 32GB of
-memory.
+These benchmarks were run on an Intel i7-6900K (8 CPUs, 16 threads) with 64GB
+of memory.
 
 ```
-count                   0.28 seconds   162.54 MB/sec
-flatten                 5.31 seconds   8.57 MB/sec
-flatten_condensed       5.39 seconds   8.44 MB/sec
-frequency               2.54 seconds   17.91 MB/sec
-index                   0.27 seconds   168.56 MB/sec
-sample_10               0.47 seconds   96.83 MB/sec
-sample_1000             0.49 seconds   92.88 MB/sec
-sample_100000           0.62 seconds   73.40 MB/sec
-search                  0.71 seconds   64.10 MB/sec
-select                  0.47 seconds   96.83 MB/sec
-sort                    3.36 seconds   13.54 MB/sec
-slice_one_middle        0.22 seconds   206.88 MB/sec
-slice_one_middle_index  0.01 seconds   4551.36 MB/sec
-stats                   1.37 seconds   33.22 MB/sec
-stats_index             0.23 seconds   197.88 MB/sec
-stats_everything        3.90 seconds   11.67 MB/sec
-stats_everything_index  2.58 seconds   17.64 MB/sec
+count                   0.11 seconds   413.76 MB/sec
+flatten                 4.54 seconds   10.02 MB/sec
+flatten_condensed       4.45 seconds   10.22 MB/sec
+frequency               1.82 seconds   25.00 MB/sec
+index                   0.12 seconds   379.28 MB/sec
+sample_10               0.18 seconds   252.85 MB/sec
+sample_1000             0.18 seconds   252.85 MB/sec
+sample_100000           0.29 seconds   156.94 MB/sec
+search                  0.27 seconds   168.56 MB/sec
+select                  0.14 seconds   325.09 MB/sec
+sort                    2.18 seconds   20.87 MB/sec
+slice_one_middle        0.08 seconds   568.92 MB/sec
+slice_one_middle_index  0.01 seconds   4551.36 MB/sec
+stats                   1.09 seconds   41.75 MB/sec
+stats_index             0.15 seconds   303.42 MB/sec
+stats_everything        1.94 seconds   23.46 MB/sec
+stats_everything_index  0.93 seconds   48.93 MB/sec
 ```
 
 ### Details
@@ -39,4 +39,3 @@ The `count` command can be viewed as a sort of baseline of the fastest possible
 command that parses every record in CSV data.
 
 The benchmarks that end with `_index` are run with indexing enabled.
-
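To illustrate why the `_index` benchmarks can be so much faster (compare `slice_one_middle` with `slice_one_middle_index`), here is a minimal sketch of the general idea behind a record index: store the byte offset of each record once, then seek straight to the record you want instead of parsing everything before it. This is not xsv's index format or implementation. The functions `build_index` and `read_record` are hypothetical, and the sketch assumes every `\n` ends a record, which is not true for CSV fields that contain quoted newlines.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom};

/// Record the byte offset at which each record starts.
/// Simplifying assumption: every '\n' terminates a record (no quoted newlines).
fn build_index(data: &[u8]) -> Vec<u64> {
    let mut offsets = Vec::new();
    let mut start = 0usize;
    for (i, &b) in data.iter().enumerate() {
        if b == b'\n' {
            offsets.push(start as u64);
            start = i + 1;
        }
    }
    // A final record without a trailing newline still counts.
    if start < data.len() {
        offsets.push(start as u64);
    }
    offsets
}

/// Read record `n` by seeking to its offset instead of scanning from the start.
fn read_record<R: Read + Seek>(
    rdr: &mut R,
    index: &[u64],
    n: usize,
) -> io::Result<Option<String>> {
    let start = match index.get(n) {
        Some(&s) => s,
        None => return Ok(None),
    };
    rdr.seek(SeekFrom::Start(start))?;
    let mut buf = Vec::new();
    match index.get(n + 1) {
        // The next record's offset marks where this record ends.
        Some(&end) => {
            buf.resize((end - start) as usize, 0);
            rdr.read_exact(&mut buf)?;
        }
        // Last record: read through to the end of the input.
        None => {
            rdr.read_to_end(&mut buf)?;
        }
    }
    Ok(Some(String::from_utf8_lossy(&buf).trim_end().to_string()))
}

fn main() -> io::Result<()> {
    let data = b"city,pop\nOslo,700000\nLima,9700000\n";
    let index = build_index(data);
    let mut rdr = Cursor::new(&data[..]);
    println!("{:?}", read_record(&mut rdr, &index, 2)?);
    Ok(())
}
```

Building the index costs one full pass over the data, which is why it only pays off when the same file is queried repeatedly.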