
Sheet Lazy Loading Implementation #1741

Merged
tonyqus merged 14 commits into nissl-lab:master from tonyqus:sheet_loading_optimization
Mar 21, 2026

tonyqus (Member) commented Mar 21, 2026

Summary

BenchmarkDotNet v0.13.12, Windows 11 (10.0.26200.8037)
12th Gen Intel Core i5-12400, 1 CPU, 12 logical and 6 physical cores
.NET SDK 10.0.201
[Host] : .NET 8.0.25 (8.0.2526.11203), X64 RyuJIT AVX2
ShortRun : .NET 8.0.25 (8.0.2526.11203), X64 RyuJIT AVX2

Job=ShortRun IterationCount=3 LaunchCount=1
WarmupCount=3

| Method | Mean | Error | StdDev | Gen0 | Gen1 | Gen2 | Allocated |
|---|---:|---:|---:|---:|---:|---:|---:|
| XSSFWorkbookLoad_DisableSheetLazyLoading | 20,875.396 ms | 22,536.529 ms | 1,235.3032 ms | 587000.0000 | 491000.0000 | 14000.0000 | 5355942.48 KB |
| XSSFWorkbookLoad_EnableSheetLazyLoading | 2,443.219 ms | 5,871.220 ms | 321.8214 ms | 70000.0000 | 69000.0000 | 2000.0000 | 698898.02 KB |
| XSSFReaderLoad | 1.062 ms | 1.595 ms | 0.0874 ms | 91.7969 | 29.2969 | - | 852.25 KB |

Copilot AI and others added 10 commits March 20, 2026 12:21
- Add _worksheetLoaded, _loadLock, and _parseCount fields
- OnDocumentRead() defers parsing; lazy-loads on first access
- OnDocumentCreate() marks sheet as loaded immediately
- Add EnsureWorksheetLoaded() with double-checked locking
- Override PrepareForCommit()/Commit() to skip unloaded sheets
- Add EnsureWorksheetLoaded() call to all public/internal methods
  and property getters/setters that access worksheet data

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
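
For readers unfamiliar with the pattern, the commit above can be approximated by the following minimal sketch. It is not the actual XSSFSheet code: `LazySheetSketch`, `CreateNew()`, `RowCount`, and `Parse()` are hypothetical stand-ins, and only the field names `_worksheetLoaded`, `_loadLock`, `_parseCount` and the method name `EnsureWorksheetLoaded()` are taken from the commit message.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a sheet; NOT the real XSSFSheet implementation.
public class LazySheetSketch
{
    private readonly object _loadLock = new object();
    private volatile bool _worksheetLoaded;   // volatile so the lock-free fast path sees the update
    private int _parseCount;
    private string[] _rows = Array.Empty<string>();

    public int ParseCount => Volatile.Read(ref _parseCount);

    // Analogue of OnDocumentCreate(): a new in-memory sheet has nothing to parse.
    public static LazySheetSketch CreateNew()
    {
        var sheet = new LazySheetSketch();
        sheet._worksheetLoaded = true;
        return sheet;
    }

    // Every member that touches worksheet data calls EnsureWorksheetLoaded() first.
    public int RowCount
    {
        get
        {
            EnsureWorksheetLoaded();
            return _rows.Length;
        }
    }

    private void EnsureWorksheetLoaded()
    {
        if (_worksheetLoaded) return;        // first check, no lock taken
        lock (_loadLock)
        {
            if (_worksheetLoaded) return;    // second check, under the lock
            _rows = Parse();
            _worksheetLoaded = true;         // publish only after the data is ready
        }
    }

    // Stand-in for parsing sheet.xml; counts how often it runs.
    private string[] Parse()
    {
        Interlocked.Increment(ref _parseCount);
        return new[] { "row0", "row1", "row2" };
    }
}
```

Constructing a `LazySheetSketch` does no parsing; the first read of `RowCount` parses once, and later reads (from any thread) take the lock-free fast path.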
Set _worksheetLoaded = true before calling Read() to prevent
infinite recursion when Read() triggers callbacks (XSSFRow
constructor calls OnReadCell(), LastRowNum, PhysicalNumberOfRows)
that themselves call EnsureWorksheetLoaded().

Reset _worksheetLoaded = false on exception to allow retry.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
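
A minimal sketch of why the flag ordering in this commit matters, under the same caveat that the types here are hypothetical (`ReentrantSafeSheetSketch` and `FailNextRead` are illustration-only; `LastRowNum`, `Read()` and `EnsureWorksheetLoaded()` borrow names from the commit message). A C# `lock` is re-entrant for the thread that holds it, so the lock alone does not stop `Read()` from re-entering the loader through a callback; setting the flag first does.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in; NOT the real XSSFSheet implementation.
public class ReentrantSafeSheetSketch
{
    private readonly object _loadLock = new object();
    private volatile bool _worksheetLoaded;
    private readonly List<string> _rows = new List<string>();

    public int ParseCount { get; private set; }

    // Illustration-only switch that makes the next Read() throw once.
    public bool FailNextRead { get; set; }

    public int LastRowNum
    {
        get
        {
            EnsureWorksheetLoaded();
            return _rows.Count - 1;
        }
    }

    private void EnsureWorksheetLoaded()
    {
        if (_worksheetLoaded) return;
        lock (_loadLock)
        {
            if (_worksheetLoaded) return;
            _worksheetLoaded = true;       // set BEFORE Read() so callbacks return early
            try
            {
                Read();
            }
            catch
            {
                _rows.Clear();             // drop partial state
                _worksheetLoaded = false;  // allow a later retry
                throw;
            }
        }
    }

    private void Read()
    {
        ParseCount++;
        if (FailNextRead)
        {
            FailNextRead = false;
            throw new InvalidOperationException("simulated parse failure");
        }
        for (int i = 0; i < 3; i++)
        {
            _rows.Add("row" + i);
            // Callback during parsing, analogous to a row constructor reading LastRowNum.
            // Without the early flag this would re-enter Read() and recurse without bound.
            _ = LastRowNum;
        }
    }
}
```

One trade-off visible in this sketch: while `Read()` is running, another thread on the lock-free fast path would already see the flag as set. Whether that matters depends on how the sheet is shared across threads, which the commit message does not discuss.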
Tests verify that:
- Opening a workbook does not parse sheet XML
- GetSheetAt/GetSheet alone do not trigger parsing
- First content access triggers parse exactly once
- Subsequent accesses do not re-parse (_parseCount stays at 1)
- Newly created in-memory sheets work without parsing
- Data correctness after lazy load
- Multiple sheets are independently lazy-loaded

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: tonyqus <772561+tonyqus@users.noreply.github.com>
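
The behaviours listed in the test commit can be illustrated without NPOI itself. The sketch below swaps in the framework's `System.Lazy<T>` for the hand-written flag and lock, purely to show what "opening does not parse", "parse exactly once", and "sheets load independently" look like as checks. `LazySheetEntry` and `LazyWorkbookSketch` are hypothetical names, not NPOI types.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sheet wrapper built on Lazy<T>; NOT how XSSFSheet is implemented.
public sealed class LazySheetEntry
{
    private readonly Lazy<string[]> _rows;
    private int _parseCount;

    public LazySheetEntry(string name)
    {
        Name = name;
        // ExecutionAndPublication: at most one thread runs Parse(), others wait for the result.
        _rows = new Lazy<string[]>(Parse, LazyThreadSafetyMode.ExecutionAndPublication);
    }

    public string Name { get; }
    public int ParseCount => Volatile.Read(ref _parseCount);
    public bool IsParsed => _rows.IsValueCreated;
    public string[] Rows => _rows.Value;   // first read triggers Parse()

    private string[] Parse()
    {
        Interlocked.Increment(ref _parseCount);
        return new[] { Name + "!A1", Name + "!A2" };
    }
}

// Hypothetical workbook: creating it registers sheets but parses none of them.
public sealed class LazyWorkbookSketch
{
    private readonly Dictionary<string, LazySheetEntry> _sheets =
        new Dictionary<string, LazySheetEntry>();

    public LazyWorkbookSketch(params string[] sheetNames)
    {
        foreach (var name in sheetNames)
        {
            _sheets[name] = new LazySheetEntry(name);
        }
    }

    // Looking a sheet up returns the wrapper only; no parsing happens here.
    public LazySheetEntry GetSheet(string name) => _sheets[name];
}
```

Note that `Lazy<T>` in this mode caches an exception thrown by the factory and throws `InvalidOperationException` if the factory re-enters `Value` on the same instance, so it does not reproduce the retry-on-failure and callback-during-read behaviour described in the earlier commit.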
…f-sheets

Implement lazy loading for XSSFSheet — defer sheet.xml parsing to first content access
tonyqus added this to the NPOI 2.8.0 milestone Mar 21, 2026
tonyqus changed the title from "Sheet loading optimization" to "Sheet Lazy Loading Implementation" Mar 21, 2026
tonyqus merged commit d4d184e into nissl-lab:master Mar 21, 2026
2 of 3 checks passed