Description
Bug report
Bug description:
difflib.HtmlDiff.make_table (and make_file as well) generate non-deterministic results: the output depends on how many tables have already been generated in the same process. This is not documented anywhere.
Background: I am using this functionality and run integration/unit tests for my own code that also check the HTML output. While doing so, I discovered that the results differ depending on the execution order of the tests themselves.
Let's consider this simple example:
import shutil
from difflib import HtmlDiff
from tempfile import NamedTemporaryFile
with NamedTemporaryFile() as file1, NamedTemporaryFile() as file2:
    file1.write(b'Hello World!\n')
    file2.write(b'Foo Bar\n')
    file1.seek(0)
    file2.seek(0)
html1 = HtmlDiff().make_table(fromlines=['Line 1\n', 'Line 2\n'], tolines=['Line 1\n', 'Line 3\n'])
html2 = HtmlDiff().make_table(fromlines=['Line 1\n', 'Line 2\n'], tolines=['Line 1\n', 'Line 3\n'])
print(html1)
print('=' * shutil.get_terminal_size().columns)
print(html2)
assert html1 == html2
This will fail with an assertion error due to different HTML:
<table class="diff" id="difflib_chg_to0__top"
cellspacing="0" cellpadding="0" rules="groups" >
<colgroup></colgroup> <colgroup></colgroup> <colgroup></colgroup>
<colgroup></colgroup> <colgroup></colgroup> <colgroup></colgroup>
<tbody>
<tr><td class="diff_next" id="difflib_chg_to0__0"><a href="#difflib_chg_to0__0">f</a></td><td class="diff_header" id="from0_1">1</td><td nowrap="nowrap">Line 1</td><td class="diff_next"><a href="#difflib_chg_to0__0">f</a></td><td class="diff_header" id="to0_1">1</td><td nowrap="nowrap">Line 1</td></tr>
<tr><td class="diff_next"><a href="#difflib_chg_to0__top">t</a></td><td class="diff_header" id="from0_2">2</td><td nowrap="nowrap">Line <span class="diff_chg">2</span></td><td class="diff_next"><a href="#difflib_chg_to0__top">t</a></td><td class="diff_header" id="to0_2">2</td><td nowrap="nowrap">Line <span class="diff_chg">3</span></td></tr>
</tbody>
</table>
================================================================================
<table class="diff" id="difflib_chg_to1__top"
cellspacing="0" cellpadding="0" rules="groups" >
<colgroup></colgroup> <colgroup></colgroup> <colgroup></colgroup>
<colgroup></colgroup> <colgroup></colgroup> <colgroup></colgroup>
<tbody>
<tr><td class="diff_next" id="difflib_chg_to1__0"><a href="#difflib_chg_to1__0">f</a></td><td class="diff_header" id="from1_1">1</td><td nowrap="nowrap">Line 1</td><td class="diff_next"><a href="#difflib_chg_to1__0">f</a></td><td class="diff_header" id="to1_1">1</td><td nowrap="nowrap">Line 1</td></tr>
<tr><td class="diff_next"><a href="#difflib_chg_to1__top">t</a></td><td class="diff_header" id="from1_2">2</td><td nowrap="nowrap">Line <span class="diff_chg">2</span></td><td class="diff_next"><a href="#difflib_chg_to1__top">t</a></td><td class="diff_header" id="to1_2">2</td><td nowrap="nowrap">Line <span class="diff_chg">3</span></td></tr>
</tbody>
</table>
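The two tables above differ only in the number embedded in the anchor prefixes (from0_/to0_ versus from1_/to1_). As a possible test-side workaround, the counter can be masked before comparing. This is only a sketch: the helper name normalize_prefixes and the placeholder N are made up for illustration, and the regular expression is inferred from the output shown above, so it could also rewrite diffed text that happens to contain a matching id="..." or href="#..." fragment.

```python
import re
from difflib import HtmlDiff

# Matches the numeric counter inside id="..." and href="#..." anchors,
# e.g. id="from0_1", id="difflib_chg_to0__top", href="#difflib_chg_to0__0".
# Pattern inferred from the output above, not from any documented format.
_PREFIX_RE = re.compile(r'((?:id="|href="#)(?:difflib_chg_)?(?:from|to))\d+_')


def normalize_prefixes(html):
    """Replace the per-table counter with a fixed placeholder (hypothetical helper)."""
    return _PREFIX_RE.sub(r'\g<1>N_', html)


fromlines = ['Line 1\n', 'Line 2\n']
tolines = ['Line 1\n', 'Line 3\n']

html1 = HtmlDiff().make_table(fromlines=fromlines, tolines=tolines)
html2 = HtmlDiff().make_table(fromlines=fromlines, tolines=tolines)

# After masking the counter, the two tables compare equal.
assert normalize_prefixes(html1) == normalize_prefixes(html2)
```

This keeps the tests independent of execution order, but it does not change what HtmlDiff itself emits.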
The specific issue is that the two tables get different indices. Digging through the code, this is because all instances share the same counter, _default_prefix, which is referenced through the class instead of self:
Lines 1886 to 1895 in 78aeb38
    def _make_prefix(self):
        """Create unique anchor prefixes"""

        # Generate a unique anchor prefix so multiple tables
        # can exist on the same HTML page without conflicts.
        fromprefix = "from%d_" % HtmlDiff._default_prefix
        toprefix = "to%d_" % HtmlDiff._default_prefix
        HtmlDiff._default_prefix += 1
        # store prefixes so line format method has access
        self._prefix = [fromprefix,toprefix]
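To illustrate what counting per instance would look like, here is a minimal sketch that overrides the method in a subclass. The class name PerInstanceHtmlDiff and the attribute _instance_prefix are hypothetical, and the sketch relies on the private method _make_prefix shown above, so it is a demonstration of the idea rather than a proposed patch or a supported API.

```python
from difflib import HtmlDiff


class PerInstanceHtmlDiff(HtmlDiff):
    """Hypothetical subclass that keeps the prefix counter on the instance."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Each instance starts counting at 0, independent of other instances.
        self._instance_prefix = 0

    def _make_prefix(self):
        # Same logic as the original method, but using self instead of the class.
        fromprefix = "from%d_" % self._instance_prefix
        toprefix = "to%d_" % self._instance_prefix
        self._instance_prefix += 1
        # store prefixes so line format method has access
        self._prefix = [fromprefix, toprefix]


fromlines = ['Line 1\n', 'Line 2\n']
tolines = ['Line 1\n', 'Line 3\n']

html1 = PerInstanceHtmlDiff().make_table(fromlines=fromlines, tolines=tolines)
html2 = PerInstanceHtmlDiff().make_table(fromlines=fromlines, tolines=tolines)

# Fresh instances now produce identical output.
assert html1 == html2
```

With this variant, several tables made by the same instance still get distinct prefixes, but tables from different instances placed on one HTML page would share anchor IDs, which is the conflict the original comment is trying to avoid.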
CPython versions tested on:
3.9, 3.11, CPython main branch
Operating systems tested on:
Linux