Optimize XLS (BIFF8) reader performance with reduced per-cell overhead #4833

Open
kemo wants to merge 4 commits into PHPOffice:master from kemo:perf/xls-reader-optimizations

kemo (Contributor) commented Mar 11, 2026

Summary

  • Eliminates redundant updateInCollection() calls during cell reads by introducing Cell::setXfIndexNoUpdate() and reordering XF index assignment before setValueExplicit() across all 9 cell reader methods (readRk, readLabelSst, readNumber, readFormula, readBoolErr, readLabel, readMulRk, readMulBlank, readBlank)
  • Replaces O(n²) char-by-char string concatenation in SST reader with O(n) str_split/implode for both compressed-to-uncompressed expansion and CONTINUE record splicing
  • Caches read filter reference and sheet title per-sheet to avoid repeated method calls on every cell
  • Pre-computes cell coordinate strings ($columnString . $rowIndex) in all cell reader methods instead of inline concatenation
  • Caches SST entry locally in readLabelSst and formula result type byte in readFormula to reduce repeated lookups
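The SST change above can be illustrated with a minimal sketch. It assumes the usual BIFF8 convention that a "compressed" string stores one byte per character, so expanding it to UTF-16LE means following every byte with a zero byte. The function names are hypothetical and the PR's actual code in Reader/Xls.php may be organised differently; the point is only that one str_split plus one implode yields the same bytes as a per-character append loop.

```php
<?php
declare(strict_types=1);

// Hypothetical helpers for illustration; not the names used in the PR.

/** Baseline: append two bytes per character inside a loop. */
function expandCompressedLoop(string $compressed): string
{
    $out = '';
    $len = strlen($compressed);
    for ($i = 0; $i < $len; ++$i) {
        $out .= $compressed[$i] . "\0";
    }

    return $out;
}

/** Same output built with a single split and a single join. */
function expandCompressedImplode(string $compressed): string
{
    if ($compressed === '') {
        // str_split('') returns different results across PHP versions, so guard it.
        return '';
    }

    // Join the bytes with "\0", then add the zero byte for the last character.
    return implode("\0", str_split($compressed)) . "\0";
}
```

Both helpers return the same bytes for any input, so the swap would be a pure performance change under this assumption.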

Changes

  • src/PhpSpreadsheet/Cell/Cell.php: New setXfIndexNoUpdate() internal method that sets XF index without triggering updateInCollection()
  • src/PhpSpreadsheet/Reader/Xls.php: All 9 cell reader methods optimized, SST string concatenation replaced, cached $phpSheetTitle and $cachedReadFilter properties added, splice offset loops converted to indexed for-loops
  • src/PhpSpreadsheet/Reader/Xls/LoadSpreadsheet.php: Sets cached read filter and sheet title at the start of each sheet iteration
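To show why reordering the XF index assignment saves work, here is a toy model. ToyCell and ToyCollection are hypothetical stand-ins, not the PhpSpreadsheet classes; the sketch assumes, as the summary implies, that both the regular XF index setter and setValueExplicit() trigger a collection update, while setXfIndexNoUpdate() does not.

```php
<?php
declare(strict_types=1);

// Toy stand-ins (hypothetical): they only count collection updates.
final class ToyCollection
{
    public int $updates = 0;

    public function update(ToyCell $cell): void
    {
        ++$this->updates;
    }
}

final class ToyCell
{
    private int $xfIndex = 0;
    private mixed $value = null;

    public function __construct(private ToyCollection $collection)
    {
    }

    /** Regular setter: stores the index and notifies the collection. */
    public function setXfIndex(int $xfIndex): self
    {
        $this->xfIndex = $xfIndex;
        $this->collection->update($this);

        return $this;
    }

    /** Silent setter: stores the index without notifying the collection. */
    public function setXfIndexNoUpdate(int $xfIndex): self
    {
        $this->xfIndex = $xfIndex;

        return $this;
    }

    public function setValueExplicit(mixed $value): self
    {
        $this->value = $value;
        $this->collection->update($this);

        return $this;
    }

    public function getXfIndex(): int
    {
        return $this->xfIndex;
    }

    public function getValue(): mixed
    {
        return $this->value;
    }
}

/** Old order: value first, then XF index, so two updates per cell. */
function readCellOld(ToyCollection $collection, mixed $value, int $xfIndex): ToyCell
{
    return (new ToyCell($collection))->setValueExplicit($value)->setXfIndex($xfIndex);
}

/** New order: silent XF index first, then the value, so one update per cell. */
function readCellNew(ToyCollection $collection, mixed $value, int $xfIndex): ToyCell
{
    return (new ToyCell($collection))->setXfIndexNoUpdate($xfIndex)->setValueExplicit($value);
}
```

In this model the final cell state is identical in both orders; only the number of collection updates differs, which is the saving the PR targets in each of the 9 reader methods.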

Benchmark results (best of 3 runs, 5 iterations each)

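| Benchmark                       | Master    | Optimized | Change |
|---------------------------------|-----------|-----------|--------|
| Real XLS fixtures (6 files)     | 119.12 ms | 114.64 ms | -3.8%  |
| Synthetic (150k cells, 5 files) | 6,606 ms  | 6,461 ms  | -2.2%  |
| SST-heavy (100k string cells)   | 4,197 ms  | 3,977 ms  | -5.3%  |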
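
For readers who want to reproduce numbers of this kind, the following is one possible timing harness. It is a sketch under an assumption about what "best of 3 runs, 5 iterations each" means here: each run times 5 back-to-back iterations, and the fastest run is reported. The function names are hypothetical and this is not the benchmark script behind the figures above.

```php
<?php
declare(strict_types=1);

/**
 * Time $workload over $runs runs of $iterations iterations each
 * and return the fastest run's total time in milliseconds.
 */
function bestOfRuns(callable $workload, int $runs = 3, int $iterations = 5): float
{
    $best = INF;
    for ($run = 0; $run < $runs; ++$run) {
        $start = hrtime(true); // monotonic clock, nanoseconds
        for ($i = 0; $i < $iterations; ++$i) {
            $workload();
        }
        $elapsedMs = (hrtime(true) - $start) / 1e6;
        $best = min($best, $elapsedMs);
    }

    return $best;
}

/** Relative change of the candidate against the baseline, in percent. */
function percentChange(float $baselineMs, float $candidateMs): float
{
    return ($candidateMs - $baselineMs) / $baselineMs * 100.0;
}
```

With a real workload, the callable would load a fixture through the XLS reader on each branch, and percentChange() would turn the two timings into a "Change" value like those in the table.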

Test plan

  • Verify all 82 XLS reader tests pass (1425 assertions)
  • Verify cell values, styles, and XF indices are identical to master
  • Verify rich text formatting in SST strings is preserved
  • Verify shared formulas resolve correctly
  • Verify blank cells retain their style information

oleibman (Collaborator) commented

Is there a Benchmark test you can add to this PR?

kemo (Contributor, Author) commented Mar 24, 2026

Sure, I'll add a benchmark test for the BIFF8 reader to this PR.
