Empty strings inside of diff instead of content

`dimapa` appears to generate diffs that contain empty strings in place of the actual diff content when doing semantic cleanup. We found this in production: after the cleanup step, the following script produces a diff with 10 fewer lines than were originally in the file:
```ruby
require 'dimapa'

# https://raw.githubusercontent.com/boxed/mutmut/refs/tags/2.0.0/tests/test_mutation.py
before = "        ('break', 'continue'),\n    ]\n)\ndef test_basic_mutations(original, expected):\n    actual, number_of_performed_mutations = mutate(Context(source=original, mutation_id=ALL, dict_synonyms=['Struct', 'FooBarDict']))\n    assert actual == expected, 'Performed {} mutations for original \"{}\"'.format(number_of_performed_mutations, original)\n\n\n@pytest.mark.parametrize(\n    'original, expected', [\n        ('x+=1', ['x=1', 'x-=1']),\n        ('x-=1', ['x=1', 'x+=1']),\n        ('x*=1', ['x=1', 'x/=1']),\n        ('x/=1', ['x=1', 'x*=1']),\n        ('x//=1', ['x=1', 'x/=1']),\n        ('x%=1', ['x=1', 'x/=1']),\n        ('x<<=1', ['x=1', 'x>>=1']),\n        ('x>>=1', ['x=1', 'x<<=1']),\n        ('x&=1', ['x=1', 'x|=1']),\n        ('x|=1', ['x=1', 'x&=1']),\n        ('x^=1', ['x=1', 'x&=1']),\n        ('x**=1', ['x=1', 'x*=1']),\n    ]\n)\ndef test_multiple_mutations(original, expected):\n    mutations = list_mutations(Context(source=original))\n    assert len(mutations) == 3\n    assert mutate(Context(source=original, mutation_id=mutations[0])) == (expected[0], 1)\n    assert mutate(Context(source=original, mutation_id=mutations[1])) == (expected[1], 1)\n\n\n@pytest.mark.parametrize(\n    'original, expected', [\n        ('a: int = 1', 'a: int = None'),\n        ('a: Optional[int] = None', 'a: Optional[int] = \"\"'),\n"
# https://raw.githubusercontent.com/boxed/mutmut/refs/tags/3.2.2/tests/test_mutation.py
after = "        ('x+=1', ['x=1', 'x-=1', 'x+=2']),\n        ('x-=1', ['x=1', 'x+=1', 'x-=2']),\n        ('x*=1', ['x=1', 'x/=1', 'x*=2']),\n        ('x/=1', ['x=1', 'x*=1', 'x/=2']),\n        ('x//=1', ['x=1', 'x/=1', 'x//=2']),\n        ('x%=1', ['x=1', 'x/=1', 'x%=2']),\n        ('x<<=1', ['x=1', 'x>>=1', 'x<<=2']),\n        ('x>>=1', ['x=1', 'x<<=1', 'x>>=2']),\n        ('x&=1', ['x=1', 'x|=1', 'x&=2']),\n        ('x|=1', ['x=1', 'x&=1', 'x|=2']),\n        ('x^=1', ['x=1', 'x&=1', 'x^=2']),\n        ('x**=1', ['x=1', 'x*=1', 'x**=2']),\n"

dimapa = DiMaPa.new

diff = dimapa.diff_main(before, after)
# Prints the diff in full here
p diff

dimapa.diff_cleanup_semantic(diff)
# Prints the diff with the first 10 lines missing (empty strings in their place)
p diff
```

I tried the same input with the Python 3 `diff_match_patch` and with the original Ruby implementation of `diff_match_patch`. The Python 3 version works fine in this scenario, but none of the Ruby versions do, so the algorithm itself works and the bug is in the Ruby implementations.
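
In case it helps with triage, here is a minimal sketch of a check that flags this kind of content loss. It relies on the invariant that the equal and delete chunks of a diff concatenate back to the first text, and the equal and insert chunks to the second. It assumes a diff is an array of `[op, text]` pairs with `op` being `:equal`, `:delete`, or `:insert`; I have not confirmed that this matches `dimapa`'s output exactly, and `rebuild_texts` and `lossless?` are hypothetical helper names, not part of any library. The sample diffs are hand-written for illustration.

```ruby
# Rebuild both input texts from a diff.
# Assumed diff shape: [[op, text], ...] with op in :equal, :delete, :insert.
def rebuild_texts(diff)
  before = +""
  after = +""
  diff.each do |op, text|
    before << text if op == :equal || op == :delete
    after << text if op == :equal || op == :insert
  end
  [before, after]
end

# A diff is lossless when it reproduces both of the texts it was built from.
def lossless?(diff, before, after)
  rebuild_texts(diff) == [before, after]
end

# Hand-written examples: the second diff has an empty string where the
# deleted line should be, similar to the symptom described above.
good_diff = [[:equal, "a\n"], [:delete, "b\n"], [:insert, "c\n"]]
bad_diff  = [[:equal, "a\n"], [:delete, ""], [:insert, "c\n"]]

puts lossless?(good_diff, "a\nb\n", "a\nc\n")
puts lossless?(bad_diff, "a\nb\n", "a\nc\n")
```

Running the same check on the diff before and after `diff_cleanup_semantic` should show whether the cleanup step is where content goes missing.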
