Implement lazy loading for xl/styles.xml in XSSFWorkbook#1731
Merged
- StylesTable(PackagePart part) constructor is now lazy: doesn't parse XML immediately. Parsing is deferred until first access via EnsureLoaded().
- Added _isLoaded and _isTouched flags to track state.
- Overrode PrepareForCommit() and Commit() to be no-ops when untouched, preserving original xl/styles.xml bytes in the output package.
- Added EnsureLoaded() calls to all data-accessing methods.
- Added EnsureLoaded() + MarkTouched() calls to all mutation methods.
- Modified Theme setter to only iterate fonts/borders if model is loaded; EnsureLoaded() applies the theme after parsing.
- Added 3 tests verifying: untouched preservation, touched modification, and basic workbook operations with lazy styles.
Co-authored-by: tonyqus <772561+tonyqus@users.noreply.github.com>
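The lazy-parse and dirty-tracking behavior listed in this commit can be illustrated with a minimal, self-contained sketch. It does not use NPOI types: RawPart, LazyStyles, FontCount, AddFont and ParseCount are hypothetical names, and the semicolon-separated text stands in for the real styles XML. Only the EnsureLoaded() / MarkTouched() / Commit() shape follows what the commit message describes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for a package part backed by raw bytes (not NPOI's PackagePart).
public sealed class RawPart
{
    public byte[] Bytes { get; set; }
    public RawPart(byte[] bytes) { Bytes = bytes; }
}

// Hypothetical, simplified model of the lazy-load + dirty-tracking pattern.
public sealed class LazyStyles
{
    private readonly RawPart _part;
    private readonly List<string> _fonts = new List<string>();
    private bool _isLoaded;   // set on first access
    private bool _isTouched;  // set by any mutation

    // Counts how often the raw bytes were parsed (for illustration only).
    public int ParseCount { get; private set; }

    public LazyStyles(RawPart part)
    {
        _part = part; // constructor stores the part but does not parse it
    }

    private void EnsureLoaded()
    {
        if (_isLoaded) return;
        _isLoaded = true;
        ParseCount++;
        string text = Encoding.UTF8.GetString(_part.Bytes);
        foreach (string name in text.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
            _fonts.Add(name);
    }

    private void MarkTouched() { _isTouched = true; }

    // Read access: loads on demand, does not mark the model dirty.
    public int FontCount
    {
        get { EnsureLoaded(); return _fonts.Count; }
    }

    // Mutation: loads on demand and marks the model dirty.
    public void AddFont(string name)
    {
        EnsureLoaded();
        MarkTouched();
        _fonts.Add(name);
    }

    // No-op when untouched, so the original bytes stay in the part.
    public void Commit()
    {
        if (!_isTouched) return;
        _part.Bytes = Encoding.UTF8.GetBytes(string.Join(";", _fonts));
    }
}
```

In this sketch a read followed by Commit() leaves the part's byte array untouched, while AddFont() followed by Commit() re-serializes it. The real StylesTable has many more accessors, each of which would need the same guard.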
…tion order, improve comments Co-authored-by: tonyqus <772561+tonyqus@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Implement lazy loading for xl/styles.xml in XSSFWorkbook" to "Implement lazy loading for xl/styles.xml in XSSFWorkbook" on Mar 13, 2026
Member
BenchmarkDotNet v0.13.12, Windows 11 (10.0.26200.7922)
Job=ShortRun IterationCount=3 LaunchCount=1
Before This PR
After This PR
Currently, opening any .xlsx file eagerly parses StylesTable and, on every save, clears and rewrites xl/styles.xml even when styles were never touched, causing byte-level diffs in round-trips with no style mutations.
Changes
StylesTable.cs: lazy parse + dirty tracking
- StylesTable(PackagePart part) no longer calls ReadFrom() in the constructor; XML parsing is deferred via a new EnsureLoaded() method
- _isLoaded: set on first access; triggers ReadFrom() + theme propagation
- _isTouched: set by any mutation; controls whether Commit() fires
- PrepareForCommit() and Commit() are no-ops when _isTouched = false, so the original ZipPackagePart bytes pass through unchanged to the output ZIP
- … true: no behavior change for new files
- … EnsureLoaded(); all write methods call both EnsureLoaded() and MarkTouched()
- Theme setter no longer iterates fonts/borders when the model is not yet loaded; EnsureLoaded() applies the theme post-parse
TestXSSFWorkbook.cs: three new tests
- LazyStyles_UntouchedStylesPreservedOnSave: asserts byte-identical xl/styles.xml after a no-style-touch round-trip
- LazyStyles_TouchedStylesModifiedOnSave: asserts xl/styles.xml differs after CreateFont(), and the result reloads correctly
- LazyStyles_BasicOperationsWorkWithoutForcingStylesLoad: smoke test for sheet/cell access and GetStylesSource() under lazy mode
Original prompt
Goal
Implement lazy loading for xl/styles.xml in XSSFWorkbook and ensure that if styles are never touched, saving the workbook does not re-serialize / commit a regenerated styles.xml into the package part, but instead preserves the original styles.xml bytes/stream from the source file.
Repo: nissl-lab/npoi
Background / Current Behavior
- XSSFWorkbook.OnDocumentRead() currently discovers a StylesTable via RelationParts and assigns it to stylesSource eagerly. (See ooxml/XSSF/UserModel/XSSFWorkbook.cs around OnDocumentRead() where stylesSource is set when p is StylesTable.)
- POIXMLDocumentPart.OnSave() calls PrepareForCommit() which clears the PackagePart, then Commit() which writes XML. This causes styles.xml to be rewritten if it is part of the document tree.
Required Behavior
Lazy-load styles
- … StylesTable on workbook open.
- … GetStylesSource() is called or any API that requires styles is invoked).
Preserve original styles.xml when styles are untouched
- … xl/styles.xml and the user never touches styles:
- … styles.xml bytes must remain unchanged in the output.
Normal behavior when styles are touched
- … StylesTable and commit it on save as before.
Implementation Details (Guidance)
A) XSSFWorkbook changes
Add fields to XSSFWorkbook to support lazy behavior, e.g.:
- … PackagePart for styles.xml (if present)
- … stylesLoaded, stylesTouched
During OnDocumentRead():
- … StylesTable instance.
- … PackagePart (and relationship id if needed) for later.
Update GetStylesSource():
- … stylesSource not yet loaded and a styles PackagePart exists, parse it to a StylesTable on demand.
- … StylesTable (but only when styles are first requested).
Ensure all internal places that need styles go through GetStylesSource().
B) Prevent styles part from being committed if untouched
- … POIXMLDocumentPart relation tree or …
Possible approaches:
- … stylesSource as null and do not include a live StylesTable part in relations when untouched.
C) Dirty tracking
- … stylesTouched = true when APIs are used that create/modify styles: CreateCellStyle, CreateFont, number format creation, custom styles, etc.
D) Tests
Add/extend tests under testcases/ooxml to cover:
- … styles.xml, do not touch styles, write out, and assert styles.xml bytes are identical.
- … styles.xml is modified (or at least not identical).
Use deterministic fixtures (add a small xlsx test file if needed) and read the xl/styles.xml entry from the written package for comparison.
Notes
- … .xlsx and .xlsm workbooks.
The following is the prior conversation context from the user's chat exploration (may be truncated):
User: Implement lazy loading for styles.xml in XSSFWorkbook and if styles are not touched, don't commit the styles to packagepart, use existing styles.xml stream instead
Assistant: To do this in XSSFWorkbook, you need two related changes:
- … styles.xml (don't parse it into a StylesTable unless a caller actually needs styles)
- … styles.xml into the package; instead preserve the original styles.xml bytes/stream from the source package.
Below is a concrete design that fits the current code structure shown in the repository.
What the current code does (relevant facts from repo)
- XSSFWorkbook.OnDocumentRead() currently walks RelationParts and assigns stylesSource when it encounters a StylesTable part:
    else if (p is StylesTable stylesTable)
        stylesSource = stylesTable;
This pull request was created from Copilot chat.
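For the byte-identity check the prompt asks for, one possible approach is to read the xl/styles.xml entry from the source and output packages with System.IO.Compression and compare the bytes. A minimal sketch, assuming both packages are available as streams; PackageEntryCompare, ReadEntry and EntryBytesEqual are hypothetical helper names, not part of NPOI or its test suite:

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical test helper: compares one entry across two ZIP packages.
public static class PackageEntryCompare
{
    // Reads a single entry of a ZIP package (an .xlsx file is a ZIP archive) into memory.
    public static byte[] ReadEntry(Stream package, string entryName)
    {
        // leaveOpen: true so the caller keeps ownership of the stream.
        using (var zip = new ZipArchive(package, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = zip.GetEntry(entryName);
            if (entry == null)
                throw new FileNotFoundException("Entry not found: " + entryName);

            using (Stream source = entry.Open())
            using (var buffer = new MemoryStream())
            {
                source.CopyTo(buffer);
                return buffer.ToArray();
            }
        }
    }

    // True when the named entry has identical decompressed bytes in both packages.
    public static bool EntryBytesEqual(Stream before, Stream after, string entryName)
    {
        return ReadEntry(before, entryName).SequenceEqual(ReadEntry(after, entryName));
    }
}
```

This compares the decompressed entry content, not the compressed ZIP records, so it is insensitive to compression level or entry ordering in the output package.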