Commit 6dbf4b8

derrickstolee authored and gitster committed
commit-graph: declare bankruptcy on GDAT chunks

The Generation Data (GDAT) and Generation Data Overflow (GDOV) chunks
store corrected commit date offsets, used for generation number v2.
Recent changes have demonstrated that previous versions of Git were
incorrectly parsing data from these chunks, but might have also been
writing them incorrectly.

I asserted [1] that the previous fixes were sufficient because the known
reasons for incorrectly writing generation number v2 data relied on
parsing the information incorrectly out of a commit-graph file, but the
previous versions of Git were not reading the generation number v2 data.

However, Patrick demonstrated [2] a case where in split commit-graphs
across an alternate boundary (and possibly some other special
conditions) it was possible to have a commit-graph that was generated by
a previous version of Git have incorrect generation number v2 data which
results in errors like the following:

  commit-graph generation for commit <oid> is 1623273624 < 1623273710

[1] https://lore.kernel.org/git/[email protected]/
[2] https://lore.kernel.org/git/Yh93vOkt2DkrGPh2@ncase/

Clearly, there is something else going on. The situation is not
completely understood, but the errors do not reproduce if the
commit-graphs are all generated by a Git version including these recent
fixes.

If we cannot trust the existing data in the GDAT and GDOV chunks, then
we can alter the format to change the chunk IDs for these chunks. This
causes the new version of Git to silently ignore the older chunks (and
disabling generation number v2 in the process) while writing new
commit-graph files with correct data in the GDA2 and GDO2 chunks.

Update commit-graph-format.txt including a historical note about these
deprecated chunks.

Reported-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Derrick Stolee <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
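
For readers unfamiliar with generation number v2: the corrected commit date of a commit is the larger of its own commit date and one more than the largest corrected commit date among its parents. The sketch below is illustrative only and is not code from this commit. The function names are hypothetical, and the parent value 1623273709 is made up so that the numbers line up with the error message quoted above. How Git's own verification arrives at that message is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Corrected commit date (generation number v2):
 * max(own commit date, 1 + largest corrected date among parents).
 * Hypothetical helper, not Git's implementation.
 */
static uint64_t corrected_date(uint64_t commit_date,
                               const uint64_t *parent_corrected,
                               size_t nr_parents)
{
    uint64_t result = commit_date;
    for (size_t i = 0; i < nr_parents; i++)
        if (parent_corrected[i] + 1 > result)
            result = parent_corrected[i] + 1;
    return result;
}

/*
 * Returns 1 if a stored corrected date is strictly greater than every
 * parent's corrected date, 0 otherwise.
 */
static int generation_is_consistent(uint64_t stored,
                                    const uint64_t *parent_corrected,
                                    size_t nr_parents)
{
    for (size_t i = 0; i < nr_parents; i++)
        if (stored <= parent_corrected[i])
            return 0;
    return 1;
}
```

With a single parent whose corrected date is 1623273709, a stored value of 1623273624 fails the check, while 1623273710 is the smallest value that passes.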
1 parent c8d67b9 commit 6dbf4b8

2 files changed, 12 insertions(+), 4 deletions(-)

Documentation/technical/commit-graph-format.txt

Lines changed: 10 additions & 2 deletions
@@ -93,7 +93,7 @@ CHUNK DATA:
 2 bits of the lowest byte, storing the 33rd and 34th bit of the
 commit time.
 
-Generation Data (ID: {'G', 'D', 'A', 'T' }) (N * 4 bytes) [Optional]
+Generation Data (ID: {'G', 'D', 'A', '2' }) (N * 4 bytes) [Optional]
 * This list of 4-byte values store corrected commit date offsets for the
 commits, arranged in the same order as commit data chunk.
 * If the corrected commit date offset cannot be stored within 31 bits,
@@ -104,7 +104,7 @@ CHUNK DATA:
 by compatible versions of Git and in case of split commit-graph chains,
 the topmost layer also has Generation Data chunk.
 
-Generation Data Overflow (ID: {'G', 'D', 'O', 'V' }) [Optional]
+Generation Data Overflow (ID: {'G', 'D', 'O', '2' }) [Optional]
 * This list of 8-byte values stores the corrected commit date offsets
 for commits with corrected commit date offsets that cannot be
 stored within 31 bits.
@@ -156,3 +156,11 @@ CHUNK DATA:
 TRAILER:
 
 H-byte HASH-checksum of all of the above.
+
+== Historical Notes:
+
+The Generation Data (GDA2) and Generation Data Overflow (GDO2) chunks have
+the number '2' in their chunk IDs because a previous version of Git wrote
+possibly erroneous data in these chunks with the IDs "GDAT" and "GDOV". By
+changing the IDs, newer versions of Git will silently ignore those older
+chunks and write the new information without trusting the incorrect data.
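
The relationship between the two chunks described in the format text above can be sketched as follows. This is a minimal illustration, not Git's reader. It assumes, as I understand the format, that an entry with its most-significant bit set holds in its lower 31 bits an index into the overflow chunk, and that all values are stored big-endian. The names `OFFSET_OVERFLOW_BIT`, `read_be32`, `read_be64` and `corrected_commit_date` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed flag: MSB set means "look in the overflow chunk". */
#define OFFSET_OVERFLOW_BIT 0x80000000u

/* Read a big-endian 32-bit value from raw bytes. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read a big-endian 64-bit value from raw bytes. */
static uint64_t read_be64(const unsigned char *p)
{
    return ((uint64_t)read_be32(p) << 32) | read_be32(p + 4);
}

/*
 * Return the corrected commit date for the commit at position pos.
 * gda2 points at the N * 4 byte Generation Data chunk, gdo2 at the
 * list of 8-byte values in the Generation Data Overflow chunk.
 */
static uint64_t corrected_commit_date(uint64_t commit_date,
                                      const unsigned char *gda2,
                                      const unsigned char *gdo2,
                                      size_t pos)
{
    uint32_t raw = read_be32(gda2 + 4 * pos);

    if (raw & OFFSET_OVERFLOW_BIT) {
        /* Offset did not fit in 31 bits: the rest is an index. */
        size_t idx = raw & ~OFFSET_OVERFLOW_BIT;
        return commit_date + read_be64(gdo2 + 8 * idx);
    }
    /* Common case: the 31-bit value is the offset itself. */
    return commit_date + raw;
}
```

A real reader would also bounds-check the index against the size of the overflow chunk before dereferencing it.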

commit-graph.c

Lines changed: 2 additions & 2 deletions
@@ -39,8 +39,8 @@ void git_test_write_commit_graph_or_die(void)
 #define GRAPH_CHUNKID_OIDFANOUT 0x4f494446 /* "OIDF" */
 #define GRAPH_CHUNKID_OIDLOOKUP 0x4f49444c /* "OIDL" */
 #define GRAPH_CHUNKID_DATA 0x43444154 /* "CDAT" */
-#define GRAPH_CHUNKID_GENERATION_DATA 0x47444154 /* "GDAT" */
-#define GRAPH_CHUNKID_GENERATION_DATA_OVERFLOW 0x47444f56 /* "GDOV" */
+#define GRAPH_CHUNKID_GENERATION_DATA 0x47444132 /* "GDA2" */
+#define GRAPH_CHUNKID_GENERATION_DATA_OVERFLOW 0x47444f32 /* "GDO2" */
 #define GRAPH_CHUNKID_EXTRAEDGES 0x45444745 /* "EDGE" */
 #define GRAPH_CHUNKID_BLOOMINDEXES 0x42494458 /* "BIDX" */
 #define GRAPH_CHUNKID_BLOOMDATA 0x42444154 /* "BDAT" */
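
Each constant above is the four ASCII bytes of the chunk name packed into a big-endian 32-bit integer, which is why changing the last character changes only the low byte. The sketch below shows that packing and why a reader looking for the new ID never sees the old chunk. It is an illustration under assumptions: `CHUNK_ID`, `struct toc_entry` and `find_chunk` are hypothetical names and do not reflect Git's chunk-format API.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack four ASCII characters into a big-endian 32-bit chunk ID. */
#define CHUNK_ID(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical table-of-contents entry: chunk ID plus file offset. */
struct toc_entry {
    uint32_t id;
    uint64_t offset;
};

/*
 * Look up a chunk by ID. Entries with IDs the caller never asks for
 * are simply skipped, so an old "GDAT" chunk is invisible to a reader
 * that only requests "GDA2".
 */
static const struct toc_entry *find_chunk(const struct toc_entry *toc,
                                          size_t nr, uint32_t wanted)
{
    for (size_t i = 0; i < nr; i++)
        if (toc[i].id == wanted)
            return &toc[i];
    return NULL;
}
```

Given a file written by an older Git that contains "CDAT" and "GDAT" entries, a lookup for `CHUNK_ID('G', 'D', 'A', '2')` returns NULL, and per the commit message the reader then proceeds without generation number v2.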
