
Commit d3e572d

Update CSV library blog post with new benchmarks
Revised performance benchmarks and descriptions to reflect updated test results and improved clarity. Enhanced language to better highlight speed improvements and the focus of the new CSV library for database import workflows.
1 parent d05497d commit d3e572d


1 file changed, +15 -15 lines changed


content/post/new-csv-library.md

Lines changed: 15 additions & 15 deletions
@@ -9,7 +9,7 @@ draft: true
 images: ["https://dataplat.dbatools.io/csv-social.png"]
 ---

-This post is about a pretty big update to the CSV import (and now export!) capabilities in dbatools. If you've used [Import-DbaCsv](https://dbatools.io/Import-DbaCsv), you've been using the LumenWorks CSV library under the hood for years. It's been rock solid and I've sung its praises many times. But LumenWorks was last updated [7-8 years ago](https://github.com/phatcher/CsvReader), and .NET has come a *long* way since then.
+This post is about a huge update to the CSV import (and now export!) capabilities in dbatools. If you've used [Import-DbaCsv](https://dbatools.io/Import-DbaCsv), you've been using the LumenWorks CSV library under the hood for years. It's been rock solid and I've sung its praises many times. But LumenWorks was last updated [7-8 years ago](https://github.com/phatcher/CsvReader), and .NET has come a *long* way since then.

 I've been using [Claude Code](https://claude.ai/code) for various projects and had a Max 20x account when Anthropic announced they'd be pretty much giving away Opus 4.5 for a week. Opus is known for its exceptional quality when it comes to software architecture so this is a PERFECT time to use its ultra big brain to rewrite the CSV library!

@@ -27,7 +27,7 @@ I asked Claude to create a replacement for LumenWorks that takes advantage of mo

 > Create a replacement for LumenWorks.Framework.IO.dll PLUS the additional functionality requested in dbatools issues on GitHub. This library was written over a decade ago. Considering the advances in .NET and SqlClient, please add a CSV reader of better quality (more functionality often seen in paid systems, faster) using recent .NET and Microsoft Data best practices.

-What came back was fast as heck and used several patterns (apparently `Span<T>`, `ArrayPool`, along with proper async) that simply didn't exist when LumenWorks was written. I'm a PowerShell developer so that doesn't mean much to me other than I love the speed.
+What came back was fast as 🔥 and used several patterns (apparently `Span<T>`, `ArrayPool`, along with proper async) that simply didn't exist when LumenWorks was written. I'm a PowerShell developer so that doesn't mean much to me other than I love the speed.

 ## The results

@@ -41,27 +41,27 @@ Here's the interesting thing: performance varies dramatically depending on how y

 | Library | Time (ms) | vs Dataplat |
 |---------|-----------|-------------|
-| Sep | 19 ms | 3.8x faster |
-| Sylvan | 29 ms | 2.5x faster |
-| **Dataplat** | **74 ms** | **baseline** |
-| CsvHelper | 76 ms | ~same |
-| LumenWorks | 433 ms | **5.9x slower** |
+| Sep | 18 ms | 3.7x faster |
+| Sylvan | 27 ms | 2.5x faster |
+| **Dataplat** | **67 ms** | **baseline** |
+| CsvHelper | 76 ms | 1.1x slower |
+| LumenWorks | 395 ms | **5.9x slower** |

 **All columns read (full row processing):**

 | Library | Time (ms) | vs Dataplat |
 |---------|-----------|-------------|
-| Sep | 35 ms | 2.1x faster |
-| Sylvan | 37 ms | 2.0x faster |
-| **Dataplat** | **73 ms** | **baseline** |
-| CsvHelper | 101 ms | 1.4x slower |
-| LumenWorks | 100 ms | 1.4x slower |
+| Sep | 30 ms | 1.8x faster |
+| Sylvan | 35 ms | 1.6x faster |
+| **Dataplat** | **55 ms** | **baseline** |
+| CsvHelper | 97 ms | 1.8x slower |
+| LumenWorks | 102 ms | 1.9x slower |

-For the single-column pattern (which is how SqlBulkCopy typically reads data), Dataplat is **~6x faster** than LumenWorks! For full row processing, we're still **~1.4x faster**.
+For the single-column pattern (which is how SqlBulkCopy typically reads data), Dataplat is **~6x faster** than LumenWorks! For full row processing, we're **~1.9x faster**.

 ### Where we stand in 2025

-Being honest: if pure parsing speed is your only concern, [Sep](https://github.com/nietras/Sep/) is faster. Sep can hit 21 GB/s with AVX-512 SIMD. But our library isn't trying to be Sep. We're built for **database import workflows** where you need:
+Being honest, if pure parsing speed is your only concern, [Sep](https://github.com/nietras/Sep/) is faster. Sep can hit an insane 21 GB/s with AVX-512 SIMD. But our library isn't trying to be Sep. We're built for **database import workflows** where you need:

 - **IDataReader interface** - Stream directly to SqlBulkCopy without intermediate allocations
 - **Built-in compression** - Import `.csv.gz` files without extracting first
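
As a quick check on the updated tables in the hunk above, the "vs Dataplat" column can be reproduced from the timings alone. The sketch below is not part of the commit. It assumes each ratio is simply the slower time divided by the faster one, formatted to one decimal place, and the name `vs_baseline` is hypothetical.

```python
# Timings in milliseconds, copied from the added ("+") rows of the two tables.
SINGLE_COLUMN = {"Sep": 18, "Sylvan": 27, "Dataplat": 67, "CsvHelper": 76, "LumenWorks": 395}
ALL_COLUMNS = {"Sep": 30, "Sylvan": 35, "Dataplat": 55, "CsvHelper": 97, "LumenWorks": 102}


def vs_baseline(timings, baseline="Dataplat"):
    """Return a 'vs baseline' label for every library in `timings`.

    Assumption: the ratio is slower_time / faster_time, shown with one decimal.
    """
    base = timings[baseline]
    labels = {}
    for name, ms in timings.items():
        if name == baseline:
            labels[name] = "baseline"
        elif ms < base:
            # The library finished sooner than the baseline.
            labels[name] = f"{base / ms:.1f}x faster"
        else:
            # The library took longer than the baseline.
            labels[name] = f"{ms / base:.1f}x slower"
    return labels


if __name__ == "__main__":
    for title, table in (("Single column", SINGLE_COLUMN), ("All columns", ALL_COLUMNS)):
        print(title)
        for name, label in vs_baseline(table).items():
            print(f"  {name}: {table[name]} ms, {label}")
```

Under that assumption, `vs_baseline(SINGLE_COLUMN)["LumenWorks"]` gives `5.9x slower` and `vs_baseline(ALL_COLUMNS)["LumenWorks"]` gives `1.9x slower`, which lines up with the revised summary sentence in the diff.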
@@ -209,7 +209,7 @@ RowsPerSecond : 58327.1

 ## Standalone NuGet package

-If you're a .NET developer and want to use this outside of PowerShell, the CSV library is available as a standalone NuGet package. Check out the [landing page](https://dataplat.dbatools.io/csv) for a quick overview of features and benchmarks.
+If you're a .NET developer and want to use this outside of PowerShell, the CSV library is available as a standalone NuGet package. Check out the gorrrrgeous [landing page](https://dataplat.dbatools.io/csv) for a quick overview of features and benchmarks.

 ```bash
 dotnet add package Dataplat.Dbatools.Csv
