Revised performance benchmarks and descriptions to reflect updated test results and improved clarity. Enhanced language to better highlight speed improvements and the focus of the new CSV library for database import workflows.
-This post is about a pretty big update to the CSV import (and now export!) capabilities in dbatools. If you've used [Import-DbaCsv](https://dbatools.io/Import-DbaCsv), you've been using the LumenWorks CSV library under the hood for years. It's been rock solid and I've sung its praises many times. But LumenWorks was last updated [7-8 years ago](https://github.com/phatcher/CsvReader), and .NET has come a *long* way since then.
+This post is about a huge update to the CSV import (and now export!) capabilities in dbatools. If you've used [Import-DbaCsv](https://dbatools.io/Import-DbaCsv), you've been using the LumenWorks CSV library under the hood for years. It's been rock solid and I've sung its praises many times. But LumenWorks was last updated [7-8 years ago](https://github.com/phatcher/CsvReader), and .NET has come a *long* way since then.

 I've been using [Claude Code](https://claude.ai/code) for various projects and had a Max 20x account when Anthropic announced they'd be pretty much giving away Opus 4.5 for a week. Opus is known for its exceptional quality when it comes to software architecture so this is a PERFECT time to use its ultra big brain to rewrite the CSV library!

@@ -27,7 +27,7 @@ I asked Claude to create a replacement for LumenWorks that takes advantage of mo

 > Create a replacement for LumenWorks.Framework.IO.dll PLUS the additional functionality requested in dbatools issues on GitHub. This library was written over a decade ago. Considering the advances in .NET and SqlClient, please add a CSV reader of better quality (more functionality often seen in paid systems, faster) using recent .NET and Microsoft Data best practices.

-What came back was fast as heck and used several patterns (apparently `Span<T>`, `ArrayPool`, along with proper async) that simply didn't exist when LumenWorks was written. I'm a PowerShell developer so that doesn't mean much to me other than I love the speed.
+What came back was fast as 🔥 and used several patterns (apparently `Span<T>`, `ArrayPool`, along with proper async) that simply didn't exist when LumenWorks was written. I'm a PowerShell developer so that doesn't mean much to me other than I love the speed.

 ## The results

@@ -41,27 +41,27 @@ Here's the interesting thing: performance varies dramatically depending on how y

 | Library | Time (ms) | vs Dataplat |
 |---------|-----------|-------------|
-| Sep | 19 ms | 3.8x faster |
-| Sylvan | 29 ms | 2.5x faster |
-| **Dataplat** | **74 ms** | **baseline** |
-| CsvHelper | 76 ms | ~same |
-| LumenWorks | 433 ms | **5.9x slower** |
+| Sep | 18 ms | 3.7x faster |
+| Sylvan | 27 ms | 2.5x faster |
+| **Dataplat** | **67 ms** | **baseline** |
+| CsvHelper | 76 ms | 1.1x slower |
+| LumenWorks | 395 ms | **5.9x slower** |

 **All columns read (full row processing):**

 | Library | Time (ms) | vs Dataplat |
 |---------|-----------|-------------|
-| Sep | 35 ms | 2.1x faster |
-| Sylvan | 37 ms | 2.0x faster |
-| **Dataplat** | **73 ms** | **baseline** |
-| CsvHelper | 101 ms | 1.4x slower |
-| LumenWorks | 100 ms | 1.4x slower |
+| Sep | 30 ms | 1.8x faster |
+| Sylvan | 35 ms | 1.6x faster |
+| **Dataplat** | **55 ms** | **baseline** |
+| CsvHelper | 97 ms | 1.8x slower |
+| LumenWorks | 102 ms | 1.9x slower |

-For the single-column pattern (which is how SqlBulkCopy typically reads data), Dataplat is **~6x faster** than LumenWorks! For full row processing, we're still **~1.4x faster**.
+For the single-column pattern (which is how SqlBulkCopy typically reads data), Dataplat is **~6x faster** than LumenWorks! For full row processing, we're **~1.9x faster**.

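The "vs Dataplat" labels in the updated (`+`) table rows can be re-derived from the raw timings. The sketch below is only a sanity check, not part of the benchmark harness: the function name `versus_baseline` is hypothetical, and rounding to one decimal place is an assumption about how the labels were produced.

```python
# Timings in milliseconds, copied from the updated ("+") rows of the two tables.
single_column = {"Sep": 18, "Sylvan": 27, "Dataplat": 67, "CsvHelper": 76, "LumenWorks": 395}
all_columns = {"Sep": 30, "Sylvan": 35, "Dataplat": 55, "CsvHelper": 97, "LumenWorks": 102}


def versus_baseline(timings, baseline="Dataplat"):
    """Return {library: label} describing each timing relative to the baseline.

    Hypothetical helper: assumes ratios are rounded to one decimal place.
    """
    base = timings[baseline]
    labels = {}
    for name, ms in timings.items():
        if name == baseline:
            labels[name] = "baseline"
        elif ms < base:
            # Smaller time than the baseline: report how many times faster.
            labels[name] = f"{base / ms:.1f}x faster"
        else:
            # Larger time than the baseline: report how many times slower.
            labels[name] = f"{ms / base:.1f}x slower"
    return labels


if __name__ == "__main__":
    for title, timings in (("single column", single_column), ("all columns", all_columns)):
        print(title)
        for name, label in versus_baseline(timings).items():
            print(f"  {name}: {label}")
```

Under that rounding assumption, the computed labels match the updated rows, including the 5.9x (single column) and 1.9x (all columns) figures for LumenWorks quoted in the summary sentence.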
 ### Where we stand in 2025

-Being honest: if pure parsing speed is your only concern, [Sep](https://github.com/nietras/Sep/) is faster. Sep can hit 21 GB/s with AVX-512 SIMD. But our library isn't trying to be Sep. We're built for **database import workflows** where you need:
+Being honest, if pure parsing speed is your only concern, [Sep](https://github.com/nietras/Sep/) is faster. Sep can hit an insane 21 GB/s with AVX-512 SIMD. But our library isn't trying to be Sep. We're built for **database import workflows** where you need:

 - **IDataReader interface** - Stream directly to SqlBulkCopy without intermediate allocations
 - **Built-in compression** - Import `.csv.gz` files without extracting first
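
To make the "without extracting first" idea concrete, here is a minimal sketch in standard-library Python. It is an illustration of streaming decompression in general, not the Dataplat library's API: the function names `iter_gzipped_csv` and `write_sample` and the sample data are hypothetical.

```python
import csv
import gzip
import os
import tempfile


def iter_gzipped_csv(path):
    """Yield rows from a gzip-compressed CSV, decompressing as it is read.

    No extracted copy of the file is written to disk; the reader pulls
    decompressed text from the gzip stream a chunk at a time.
    """
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as handle:
        yield from csv.reader(handle)


def write_sample(path):
    """Write a tiny gzip-compressed CSV so the example is self-contained."""
    rows = [["id", "name"], ["1", "alpha"], ["2", "beta"]]
    with gzip.open(path, mode="wt", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sample.csv.gz")
        write_sample(sample)
        for row in iter_gzipped_csv(sample):
            print(row)
```

Because rows are yielded one at a time, a consumer such as a bulk loader can start work before the whole file has been decompressed, which is the property the bullet above is describing.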
@@ -209,7 +209,7 @@ RowsPerSecond : 58327.1

 ## Standalone NuGet package

-If you're a .NET developer and want to use this outside of PowerShell, the CSV library is available as a standalone NuGet package. Check out the [landing page](https://dataplat.dbatools.io/csv) for a quick overview of features and benchmarks.
+If you're a .NET developer and want to use this outside of PowerShell, the CSV library is available as a standalone NuGet package. Check out the gorrrrgeous [landing page](https://dataplat.dbatools.io/csv) for a quick overview of features and benchmarks.