Skip to content

Commit 2f445a1

Browse files
committed
Change Hash Code for Worksheet
Fix #4192. Although that issue can be dealt with by changing user code, it would be better to fix it within PhpSpreadsheet. A cloned worksheet may have a pointer to a spreadsheet to which it is not attached. Code can assume it does belong to the spreadsheet, and throw an exception when the spreadsheet cannot find the worksheet in question. It may also not throw an exception when it should. In my comments to the issue, I was concerned that adding in the needed protection would add overhead to an extremely common situation (setting a cell's value) in order to avoid a pretty rare problem. However, there are problems with both the accuracy and efficiency of the existing code, and I think any performance losses caused by the additional checks will be offset by the performance gains and accuracy of the new code. Spreadsheet `getIndex` attempts to find the index of a worksheet within its spreadsheet collection. It does so by comparing the hash codes of each sheet in its collection with the hash code of the sheet it is looking for. Its major problem problem is performance-related, namely that it recomputes the hash code of the target sheet with each iteration. A more severe problem is the accuracy of the hash code. It generates this by hashing together the sheet title, the string range of its auto-filter, and a character representation of whether sheet protection is enabled. Title should definitely be part of the calculation (it must be unique for all sheets attached to a spreadsheet), but it is not clear why this subset of the other properties of Worksheet is used. It tries to save some cycles by using a `dirty` property to indicate whether re-hashing is necessary. It sets that property whenever the title changes, or when `setProtection` is called. So, it doesn't set it when auto-filter changes, and you can easily bypass `setProtection` when changing any of the `Protection` properties. Not to mention the many other properties of worksheet that can be changed. Additionally, if you clone a worksheet, the clone and the original will have the same hash code, which can lead to problems: ```php $clone = clone $original; $spreadsheet->getSheet($spreadsheet->getIndex($clone)) ->setCellValue('A1', 100); ``` That code will change the value of A1 in the original, not the clone. The `hash` property in Worksheet will now be calculated immediately when the object is constructed or cloned or unserialized. It will not be recalculated, and there is no longer a need for the `dirty` property, which is removed. Hash will be generated by spl_object_id, which was designed for this purpose. (So was spl_object_hash, but many online references suggest that \_id performs much better than \_hash.) Our problem example above will now throw an Exception, as it should, rather than changing the wrong cell. `setValueExplicit`, the problem in the original issue, will now test that the worksheet is attached to the spreadsheet before doing any style manipulation. In order that this not be a breaking change, `getHashCode` will continue to return string, but it is deprecated in favor of `getHashInt`, and Worksheet will no longer implement IComparable to facilitate the deprecation. I had a vague hope that this change might help with issue #641. It doesn't.
1 parent 8799a04 commit 2f445a1

File tree

6 files changed

+92
-24
lines changed

6 files changed

+92
-24
lines changed

src/PhpSpreadsheet/Cell/Cell.php

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -324,7 +324,7 @@ public function setValueExplicit(mixed $value, string $dataType = DataType::TYPE
324324
self::updateIfCellIsTableHeader($this->getParent()?->getParent(), $this, $oldValue, $value);
325325
$worksheet = $this->getWorksheet();
326326
$spreadsheet = $worksheet->getParent();
327-
if (isset($spreadsheet)) {
327+
if (isset($spreadsheet) && $spreadsheet->getIndex($worksheet, true) >= 0) {
328328
$originalSelected = $worksheet->getSelectedCells();
329329
$activeSheetIndex = $spreadsheet->getActiveSheetIndex();
330330
$style = $this->getStyle();

src/PhpSpreadsheet/ReferenceHelper.php

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -921,7 +921,7 @@ private function updateNamedRange(DefinedName $definedName, Worksheet $worksheet
921921
{
922922
$cellAddress = $definedName->getValue();
923923
$asFormula = ($cellAddress[0] === '=');
924-
if ($definedName->getWorksheet() !== null && $definedName->getWorksheet()->getHashCode() === $worksheet->getHashCode()) {
924+
if ($definedName->getWorksheet() !== null && $definedName->getWorksheet()->getHashInt() === $worksheet->getHashInt()) {
925925
/**
926926
* If we delete the entire range that is referenced by a Named Range, MS Excel sets the value to #REF!
927927
* PhpSpreadsheet still only does a basic adjustment, so the Named Range will still reference Cells.
@@ -940,7 +940,7 @@ private function updateNamedRange(DefinedName $definedName, Worksheet $worksheet
940940

941941
private function updateNamedFormula(DefinedName $definedName, Worksheet $worksheet, string $beforeCellAddress, int $numberOfColumns, int $numberOfRows): void
942942
{
943-
if ($definedName->getWorksheet() !== null && $definedName->getWorksheet()->getHashCode() === $worksheet->getHashCode()) {
943+
if ($definedName->getWorksheet() !== null && $definedName->getWorksheet()->getHashInt() === $worksheet->getHashInt()) {
944944
/**
945945
* If we delete the entire range that is referenced by a Named Formula, MS Excel sets the value to #REF!
946946
* PhpSpreadsheet still only does a basic adjustment, so the Named Formula will still reference Cells.

src/PhpSpreadsheet/Spreadsheet.php

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -660,13 +660,17 @@ public function getSheetByNameOrThrow(string $worksheetName): Worksheet
660660
*
661661
* @return int index
662662
*/
663-
public function getIndex(Worksheet $worksheet): int
663+
public function getIndex(Worksheet $worksheet, bool $noThrow = false): int
664664
{
665+
$wsHash = $worksheet->getHashInt();
665666
foreach ($this->workSheetCollection as $key => $value) {
666-
if ($value->getHashCode() === $worksheet->getHashCode()) {
667+
if ($value->getHashInt() === $wsHash) {
667668
return $key;
668669
}
669670
}
671+
if ($noThrow) {
672+
return -1;
673+
}
670674

671675
throw new Exception('Sheet does not exist.');
672676
}

src/PhpSpreadsheet/Worksheet/BaseDrawing.php

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -414,7 +414,7 @@ public function getHashCode(): string
414414
return md5(
415415
$this->name
416416
. $this->description
417-
. (($this->worksheet === null) ? '' : $this->worksheet->getHashCode())
417+
. (($this->worksheet === null) ? '' : (string) $this->worksheet->getHashInt())
418418
. $this->coordinates
419419
. $this->offsetX
420420
. $this->offsetY

src/PhpSpreadsheet/Worksheet/Worksheet.php

Lines changed: 16 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -20,7 +20,6 @@
2020
use PhpOffice\PhpSpreadsheet\Comment;
2121
use PhpOffice\PhpSpreadsheet\DefinedName;
2222
use PhpOffice\PhpSpreadsheet\Exception;
23-
use PhpOffice\PhpSpreadsheet\IComparable;
2423
use PhpOffice\PhpSpreadsheet\ReferenceHelper;
2524
use PhpOffice\PhpSpreadsheet\RichText\RichText;
2625
use PhpOffice\PhpSpreadsheet\Shared;
@@ -32,7 +31,7 @@
3231
use PhpOffice\PhpSpreadsheet\Style\Protection as StyleProtection;
3332
use PhpOffice\PhpSpreadsheet\Style\Style;
3433

35-
class Worksheet implements IComparable
34+
class Worksheet
3635
{
3736
// Break types
3837
public const BREAK_NONE = 0;
@@ -305,15 +304,10 @@ class Worksheet implements IComparable
305304
*/
306305
private ?Color $tabColor = null;
307306

308-
/**
309-
* Dirty flag.
310-
*/
311-
private bool $dirty = true;
312-
313307
/**
314308
* Hash.
315309
*/
316-
private string $hash;
310+
private int $hash;
317311

318312
/**
319313
* CodeName.
@@ -355,6 +349,7 @@ public function __construct(?Spreadsheet $parent = null, string $title = 'Worksh
355349
$this->autoFilter = new AutoFilter('', $this);
356350
// Table collection
357351
$this->tableCollection = new ArrayObject();
352+
$this->hash = spl_object_id($this);
358353
}
359354

360355
/**
@@ -383,6 +378,12 @@ public function __destruct()
383378
unset($this->rowDimensions, $this->columnDimensions, $this->tableCollection, $this->drawingCollection, $this->chartCollection, $this->autoFilter);
384379
}
385380

381+
public function __wakeup(): void
382+
{
383+
$this->hash = spl_object_id($this);
384+
$this->parent = null;
385+
}
386+
386387
/**
387388
* Return the cell collection.
388389
*/
@@ -897,7 +898,6 @@ public function setTitle(string $title, bool $updateFormulaCellReferences = true
897898

898899
// Set title
899900
$this->title = $title;
900-
$this->dirty = true;
901901

902902
if ($this->parent && $this->parent->getCalculationEngine()) {
903903
// New title
@@ -1032,7 +1032,6 @@ public function getProtection(): Protection
10321032
public function setProtection(Protection $protection): static
10331033
{
10341034
$this->protection = $protection;
1035-
$this->dirty = true;
10361035

10371036
return $this;
10381037
}
@@ -3014,7 +3013,7 @@ private function validateNamedRange(string $definedName, bool $returnNullIfInval
30143013

30153014
if ($namedRange->getLocalOnly()) {
30163015
$worksheet = $namedRange->getWorksheet();
3017-
if ($worksheet === null || $this->getHashCode() !== $worksheet->getHashCode()) {
3016+
if ($worksheet === null || $this->hash !== $worksheet->getHashInt()) {
30183017
if ($returnNullIfInvalid) {
30193018
return null;
30203019
}
@@ -3154,17 +3153,15 @@ public function garbageCollect(): static
31543153
}
31553154

31563155
/**
3157-
* Get hash code.
3158-
*
3159-
* @return string Hash code
3156+
* @deprecated 3.5.0 use getHashInt instead.
31603157
*/
31613158
public function getHashCode(): string
31623159
{
3163-
if ($this->dirty) {
3164-
$this->hash = md5($this->title . $this->autoFilter . ($this->protection->isProtectionEnabled() ? 't' : 'f') . __CLASS__);
3165-
$this->dirty = false;
3166-
}
3160+
return (string) $this->hash;
3161+
}
31673162

3163+
public function getHashInt(): int
3164+
{
31683165
return $this->hash;
31693166
}
31703167

@@ -3493,6 +3490,7 @@ public function __clone()
34933490
}
34943491
}
34953492
}
3493+
$this->hash = spl_object_id($this);
34963494
}
34973495

34983496
/**
Lines changed: 66 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,66 @@
1+
<?php
2+
3+
declare(strict_types=1);
4+
5+
namespace PhpOffice\PhpSpreadsheetTests\Worksheet;
6+
7+
use PhpOffice\PhpSpreadsheet\Exception as SpreadsheetException;
8+
use PhpOffice\PhpSpreadsheet\Spreadsheet;
9+
use PhpOffice\PhpSpreadsheet\Worksheet\Worksheet;
10+
use PHPUnit\Framework\TestCase;
11+
12+
class CloneTest extends TestCase
13+
{
14+
public function testUnattachedIndex(): void
15+
{
16+
$spreadsheet = new Spreadsheet();
17+
$sheet1 = $spreadsheet->getActiveSheet();
18+
$sheet1->getCell('A1')->setValue(10);
19+
$sheet2 = clone $sheet1;
20+
$sheet2->getCell('A1')->setValue(20);
21+
self::assertSame(0, $spreadsheet->getIndex($sheet1));
22+
$idx = $spreadsheet->getIndex($sheet2, true);
23+
self::assertSame(-1, $idx);
24+
$sheet2->setTitle('clone');
25+
$spreadsheet->addSheet($sheet2);
26+
$idx = $spreadsheet->getIndex($sheet2, true);
27+
self::assertSame(1, $idx);
28+
self::assertSame(10, $spreadsheet->getSheet(0)->getCell('A1')->getValue());
29+
self::assertSame(20, $spreadsheet->getSheet(1)->getCell('A1')->getValue());
30+
$spreadsheet->disconnectWorksheets();
31+
}
32+
33+
public function testGetCloneIndex(): void
34+
{
35+
$this->expectException(SpreadsheetException::class);
36+
$this->expectExceptionMessage('Sheet does not exist');
37+
$spreadsheet = new Spreadsheet();
38+
$sheet1 = $spreadsheet->getActiveSheet();
39+
$sheet1->getCell('A1')->setValue(10);
40+
$sheet2 = clone $sheet1;
41+
$spreadsheet->getSheet($spreadsheet->getIndex($sheet2))
42+
->setCellValue('A1', 100);
43+
}
44+
45+
public function testSerialize1(): void
46+
{
47+
// If worksheet attached to spreadsheet, can't serialize it.
48+
$this->expectException(SpreadsheetException::class);
49+
$this->expectExceptionMessage('cannot be serialized');
50+
$spreadsheet = new Spreadsheet();
51+
$sheet1 = $spreadsheet->getActiveSheet();
52+
serialize($sheet1);
53+
}
54+
55+
#[\PHPUnit\Framework\Attributes\RunInSeparateProcess]
56+
public function testSerialize2(): void
57+
{
58+
$sheet1 = new Worksheet();
59+
$sheet1->getCell('A1')->setValue(10);
60+
$serialized = serialize($sheet1);
61+
/** @var Worksheet */
62+
$newSheet = unserialize($serialized);
63+
self::assertSame(10, $newSheet->getCell('A1')->getValue());
64+
self::assertNotEquals($newSheet->getHashInt(), $sheet1->getHashInt());
65+
}
66+
}

0 commit comments

Comments
 (0)