Commit 9364ca2

Review and rewrite regeneration.txt help file. #15385
1 parent d2877e9 commit 9364ca2

1 file changed: +43 -34 lines changed

help/regeneration.txt

Lines changed: 43 additions & 34 deletions
@@ -7,10 +7,11 @@ resource (binary) files, others are Java source files that are stored
 
 If you're reading this, chances are that:
 
-1) you've hit a precommit check error that said you've modified a generated
+1) you've hit a build check error that said you've modified a generated
 resource and some checksums are out of sync.
 
-2) you need to regenerate one (or more) of these resources.
+2) you need to regenerate one (or more) of these resources because of
+third party library updates or other reasons.
 
 In many cases hitting (1) means you'll have to do (2) so let's discuss
 these in order.
@@ -20,44 +21,58 @@ Checksum validation errors
 --------------------------
 
 LUCENE-9868 introduced a system of storing (and validating) checksums of
-generated files so that they are not accidentally modified. This checksums
+generated files so that they are not accidentally modified. This checksum-checking
 system will fail the build with a message similar to this one:
 
-Execution failed for task ':lucene:core:generateStandardTokenizerChecksumCheck'.
-> Checksums mismatch for derived resources; you might have modified a generated resource (regenerate task: :lucene:core:generateStandardTokenizerIfChanged):
-Actual:
-lucene/core/[...]/StandardTokenizerImpl.java=3298326986432483248962398462938649869326
+Execution failed for task ':lucene:core:regenerateStandardTokenizerChecksumCheck'.
+> Checksums mismatch for generated resources; you might have modified a generated resource (regenerate task: regenerateStandardTokenizer, checksum file: .../lucene/core/src/generated/checksums/regenerateStandardTokenizer.json).
 
-Expected:
-lucene/core/[...]/StandardTokenizerImpl.java=8e33c2698446c1c7a9479796a41316d1932ceda8
+Current checksums:
+- lucene/core/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.java=f4d6311e03a97cb55301cfa55fab3338b1fdb725
+
+Expected checksums:
+- lucene/core/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.java=f4d6311e03a97cb55301cfa55fab3338b1fdb724
+
+Input files for this task are:
+- .../lucene/core/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.jflex
+- .../lucene/gradle/regenerate/jflex/skeleton.disable.buffer.expansion.txt
+
+Files generated by this task are:
+- .../lucene/core/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.java
 
 The message shows you which resources have mismatches on checksums (in this case
 StandardTokenizerImpl.java) but also the *module* where the generated
-resource exists and the *task name* that should be used to regenerate this resource:
+resource exists, the *task name* that should be used to regenerate this resource
+and the inputs and outputs of that task:
+
+:lucene:core:regenerateStandardTokenizer
 
-:lucene:core:generateStandardTokenizerIfChanged
+To resolve the problem, try to "git diff" the changes that caused the build failure
+(to see why the checksums have changed). Then decide if the change was accidental
+or intentional and either:
 
-To resolve the problem, try to:
+1) revert any accidental changes made to inputs/outputs of generated resources,
 
-1) "git diff" the changes that caused the build failure (to see why the checksums
-changed) and then decide whether to update the generated resource's template (or whatever
-it is using to emit the generated resource);
+2) regenerate the derived resources, possibly saving new checksums.
 
-2) regenerate the derived resources, possibly saving new checksums. If you decide to
-regenerate, just run the task hinted at in the error message, for example:
+If you decide to regenerate, just running the task hinted at in the error message
+should work fine, for example:
 
-gradlew :lucene:core:generateStandardTokenizerIfChanged
+gradlew :lucene:core:regenerateStandardTokenizer
 
 This regenerates all resources the task "generateStandardTokenizer" produces
-and updates the corresponding checksums.
+and updates the corresponding checksums. If you want to bypass any of gradle's
+smart up-to-date checks and force regeneration, use the standard gradle's switch:
+
+gradlew :lucene:core:regenerateStandardTokenizer --rerun-tasks
 
 
 Resource regeneration
 ---------------------
 
-The "convention" task for regenerating all derived resources in a given
-module is called "regenerate" and you can apply it to all Lucene modules
-by running:
+Lucene's "convention" task for regenerating all derived resources in a given
+module is called "regenerate" and you can apply it to all Lucene modules at
+once by running:
 
 gradlew regenerate
 
@@ -105,18 +120,12 @@ gradlew -p lucene/analysis/common regenerate --console=plain
 ...
 
 This shouldn't worry you at all - the internal tasks are skipped because
-the inputs and outputs of these task have not changed. If they have changed,
-the task is re-run and followed up by other tasks, such as code-formatting (tidy).
-
-Of course, sometimes you may want to *force* the regeneration task to run, even if the
-checksums indicate nothing has changed. This may happen because of several reasons:
-
-- the generation task has outputs but no inputs or the inputs are volatile. In this case
-only the outputs have checksums and the task will be skipped if the outputs haven't changed.
+the inputs and outputs of these task are up to date. If changes were detected, the task
+would be re-executed, followed by any necessary cleanup tasks (like code formatting).
 
-- you may want to run the regeneration task just to see that it actually runs and produces
-the same checksums (git diff should be clean). This would be a wise periodic sanity check
-to ensure everything works as expected.
+Sometimes you may want to *force* the regeneration task to run, even if the
+checksums indicate nothing has changed. This may happen if you're paranoid or don't trust
+gradle (and there's no reason you should).
 
 If you want to force-run the regeneration, use gradle's "--rerun-tasks" option:
 
@@ -131,7 +140,7 @@ Scoping the call to a particular task will also work:
 gradlew -p lucene/analysis/common regenerateUnicodeProps --rerun-tasks
 
 Finally, if you do feel like force-regenerating everything, remember to exclude this
-monster...
+monster (it takes quite a long time and requires significant heap memory)...
 
 gradlew regenerate -x regenerateUAX29URLEmailTokenizer --rerun-tasks
 
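
The checksum validation described in the help text can be illustrated with a minimal sketch. This is not Lucene's implementation, which lives in its Gradle build and is not shown in this commit. The sketch assumes SHA-1 digests (suggested only by the 40 hex digits in the sample message) and a plain mapping of relative path to expected checksum. The function names sha1_of and find_mismatches are hypothetical.

```python
import hashlib
from pathlib import Path


def sha1_of(path):
    """Return the hex SHA-1 digest of a file's bytes (SHA-1 is an assumption)."""
    return hashlib.sha1(Path(path).read_bytes()).hexdigest()


def find_mismatches(expected, root):
    """Compare stored checksums against the files currently on disk.

    expected: dict mapping a path relative to root to its stored checksum
              (a hypothetical layout, not the real checksum file format).
    Returns a dict mapping each mismatching path to (current, expected).
    """
    mismatches = {}
    for rel_path, want in expected.items():
        have = sha1_of(Path(root) / rel_path)
        if have != want:
            mismatches[rel_path] = (have, want)
    return mismatches
```

An empty result corresponds to the case where the check passes. A non-empty result corresponds to the "Current checksums" and "Expected checksums" pairs printed in the failure message.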
