Empty strings inside of diff instead of content #4

@hellcp-work

dimapa seems to generate diffs that contain empty strings in place of the actual diff content after semantic cleanup. We found this in production: with the following script, the diff after cleanup contains 10 fewer lines than were originally in the file:

require 'dimapa'

# https://raw.githubusercontent.com/boxed/mutmut/refs/tags/2.0.0/tests/test_mutation.py
before = "        ('break', 'continue'),\n    ]\n)\ndef test_basic_mutations(original, expected):\n    actual, number_of_performed_mutations = mutate(Context(source=original, mutation_id=ALL, dict_synonyms=['Struct', 'FooBarDict']))\n    assert actual == expected, 'Performed {} mutations for original \"{}\"'.format(number_of_performed_mutations, original)\n\n\n@pytest.mark.parametrize(\n    'original, expected', [\n        ('x+=1', ['x=1', 'x-=1']),\n        ('x-=1', ['x=1', 'x+=1']),\n        ('x*=1', ['x=1', 'x/=1']),\n        ('x/=1', ['x=1', 'x*=1']),\n        ('x//=1', ['x=1', 'x/=1']),\n        ('x%=1', ['x=1', 'x/=1']),\n        ('x<<=1', ['x=1', 'x>>=1']),\n        ('x>>=1', ['x=1', 'x<<=1']),\n        ('x&=1', ['x=1', 'x|=1']),\n        ('x|=1', ['x=1', 'x&=1']),\n        ('x^=1', ['x=1', 'x&=1']),\n        ('x**=1', ['x=1', 'x*=1']),\n    ]\n)\ndef test_multiple_mutations(original, expected):\n    mutations = list_mutations(Context(source=original))\n    assert len(mutations) == 3\n    assert mutate(Context(source=original, mutation_id=mutations[0])) == (expected[0], 1)\n    assert mutate(Context(source=original, mutation_id=mutations[1])) == (expected[1], 1)\n\n\n@pytest.mark.parametrize(\n    'original, expected', [\n        ('a: int = 1', 'a: int = None'),\n        ('a: Optional[int] = None', 'a: Optional[int] = \"\"'),\n"
# https://raw.githubusercontent.com/boxed/mutmut/refs/tags/3.2.2/tests/test_mutation.py
after = "        ('x+=1', ['x=1', 'x-=1', 'x+=2']),\n        ('x-=1', ['x=1', 'x+=1', 'x-=2']),\n        ('x*=1', ['x=1', 'x/=1', 'x*=2']),\n        ('x/=1', ['x=1', 'x*=1', 'x/=2']),\n        ('x//=1', ['x=1', 'x/=1', 'x//=2']),\n        ('x%=1', ['x=1', 'x/=1', 'x%=2']),\n        ('x<<=1', ['x=1', 'x>>=1', 'x<<=2']),\n        ('x>>=1', ['x=1', 'x<<=1', 'x>>=2']),\n        ('x&=1', ['x=1', 'x|=1', 'x&=2']),\n        ('x|=1', ['x=1', 'x&=1', 'x|=2']),\n        ('x^=1', ['x=1', 'x&=1', 'x^=2']),\n        ('x**=1', ['x=1', 'x*=1', 'x**=2']),\n"

dimapa = DiMaPa.new

diff = dimapa.diff_main(before, after)
# Returns the diff in full here
p diff

dimapa.diff_cleanup_semantic(diff)
# Returns the diff with the first 10 lines missing
p diff

I tried the same input with the Python 3 diff_match_patch and with the original Ruby implementation of diff_match_patch. The Python 3 version works fine in this scenario, but none of the Ruby versions do, so the algorithm itself is sound and the bug is in the Ruby implementation.
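
One way to confirm the content loss without reading the printed diff by eye is to rebuild both inputs from the diff and compare them with the originals. This is a minimal sketch. It assumes each diff entry is a two-element array `[op, text]` with `op` being `:equal`, `:delete`, or `:insert`; that encoding is an assumption and may need adjusting to what `diff_main` actually returns. The helper names are hypothetical and not part of dimapa.

```ruby
# Assumption: a diff is an array of [op, text] pairs, where op is
# :equal, :delete, or :insert. Adjust the symbols if the library differs.

# Rebuilds the source text: every chunk except the inserted ones.
def diff_source_text(diff)
  diff.reject { |op, _| op == :insert }.map { |_, text| text }.join
end

# Rebuilds the destination text: every chunk except the deleted ones.
def diff_destination_text(diff)
  diff.reject { |op, _| op == :delete }.map { |_, text| text }.join
end

# True when the diff still reproduces both inputs exactly.
def diff_consistent?(diff, before, after)
  diff_source_text(diff) == before && diff_destination_text(diff) == after
end
```

With the script above, `diff_consistent?(diff, before, after)` would be expected to hold after `diff_main`, and a cleanup step that replaces content with empty strings should make it return false after `diff_cleanup_semantic`.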
