Commit 02ae9a3

Merge branch 'develop' into fix-configure
2 parents 1d227f5 + 1fb58fa commit 02ae9a3

59 files changed

+1045
-383
lines changed

README.md

Lines changed: 15 additions & 2 deletions
@@ -10,7 +10,7 @@
 
 DiffDetective is an open-source Java library for variability-aware source code differencing and the **analysis of version histories of software product lines**. This means that DiffDetective can **turn a generic differencer into a variability-aware differencer** by means of pre- or post-processing. DiffDetective is centered around **formally verified** data structures for variability (variation trees) and variability-aware diffs (variation diffs). These data structures are **generic**, and DiffDetective currently implements **C preprocessor support** to parse respective annotations when used to implement variability. The picture below depicts the process of variability-aware differencing.
 
-<img alt="Variability-Aware Differencing Overview" src="docs/teaser.png" height="500" />
+<img alt="Variability-Aware Differencing Overview" src="src/main/java/org/variantsync/diffdetective/variation/diff/doc-files/variability-aware-differencing.png" height="500" />
 
 Given two states of a C-preprocessor annotated source code file (left), for example before and after a commit, DiffDetective constructs a variability-aware diff (right) that distinguishes changes to source code from changes to variability annotations. DiffDetective can construct such a variation diff either by first using a generic differencer and then separating the information (center path), or by first parsing both input versions to an abstract representation, a variation tree (center top and bottom), and constructing a variation diff using a tree differencing algorithm in a second step.
 
@@ -80,6 +80,18 @@ Additionally, there is a screencast available on YouTube, guiding you through th
 [![DiffDetective Demonstration](docs/yt_thumbnail.png)](https://www.youtube.com/watch?v=q6ight5EDQY)
 
 
+## Supported Differencing Algorithms
+
+In principle, any generic differencing algorithm (i.e., any algorithm that may operate on text or trees) can be made variability-aware with DiffDetective, as explained in our demo paper (see below). Some algorithms are integrated directly in the DiffDetective library, while others come as additional Maven projects.
+
+### Shipped with DiffDetective
+- Git Diff as implemented by [JGit](https://github.com/eclipse-jgit/jgit)
+- [GumTree](https://github.com/GumTreeDiff/gumtree), and all algorithms and matching engines supported by the GumTree library
+
+### Extra Modules
+- [TrueDiff](https://gitlab.rlp.net/plmz/truediff): Support for TrueDiff comes as [a separate Maven project](https://github.com/VariantSync/TrueDiffDetective).
+
+
 ## Publications
 
 ### Variability-Aware Differencing with DiffDetective (FSE 2024, ⭐ [Best Demo Paper](https://2024.esec-fse.org/info/awards) ⭐)
@@ -167,9 +179,10 @@ Edge-typed variation diffs and the replication package are implemented in a fork
 
 DiffDetective was extended and used within bachelor's and master's theses:
 
+- _Unparsing von Datenstrukturen zur Analyse von C-Präprozessor-Variabilität_, Eugen Shulimov, Bachelor's Thesis, 2025, [DOI 10.17619/UNIPB/1-2385](http://doi.org/10.17619/UNIPB/1-2385), (German): Eugen added an unparser for variation trees, essentially inverting the horizontal arrows in our commuting diagram at the top of this README file. The unparser for variation diffs reuses the unparser for variation trees by projecting a variation diff to its two variation trees (before and after the change), unparsing the trees, and then diffing the obtained text files to eventually compute a text-based diff.
 - _Constructing Variation Diffs Using Tree Diffing Algorithms_, Benjamin Moosherr, Bachelor's Thesis, 2023, [DOI 10.18725/OPARU-50108](https://dx.doi.org/10.18725/OPARU-50108): Benjamin added support for tree-differencing and integrated the GumTree differencer ([GitHub](https://github.com/GumTreeDiff/gumtree), [Paper](https://doi.org/10.1145/2642937.2642982)). In his thesis, Benjamin also reviewed a range of quality metrics for tree-diffs with a focus on their applicability for rating variability-aware diffs. The [org.variantsync.diffdetective.experiments.thesis_bm](src/main/java/org/variantsync/diffdetective/experiments/thesis_bm) package implements the corresponding empirical study and may serve as an example of how to use tree differencing.
 - _Reverse Engineering Feature-Aware Commits From Software Product-Line Repositories_, Lukas Bormann, Bachelor's Thesis, 2023, [10.18725/OPARU-47892](https://dx.doi.org/10.18725/OPARU-47892): Lukas implemented an algorithm for feature-based commit-untangling, which turns a variation diff into a series of smaller diffs, each of which contains an edit to a single feature or feature formula. This work was later refined in our publication _Views on Edits to Variational Software_ illustrated above.
-- _Inspecting the Evolution of Feature Annotations in Configurable Software_, Lukas Güthing, Master's Thesis, 2023: Lukas implemented different edge-types for associating variability annotations within variation diffs. He published his work later at VaMoS 2024 under the title _Explaining Edits to Variability Annotations in Evolving Software Product Lines_, illustrated above.
+- _Inspecting the Evolution of Feature Annotations in Configurable Software_, Lukas Güthing, Master's Thesis, 2023: Lukas implemented different edge-types for associating variability annotations within variation diffs. He published his work later at VaMoS 2024 under the title _Explaining Edits to Variability Annotations in Evolving Software Product Lines_, illustrated above. His work can be found in a [fork][forklg] of DiffDetective.
 - _Empirical Evaluation of Feature Trace Recording on the Edit History of Marlin_, Sören Viegener, Bachelor's Thesis, 2021, [DOI 10.18725/OPARU-38603](http://dx.doi.org/10.18725/OPARU-38603): In his thesis, Sören started the DiffDetective project and implemented the first version of an algorithm that parses text-based diffs of C-preprocessor files into variation diffs. He also came up with an initial classification of edits, which we wanted to reuse to evaluate [Feature Trace Recording](https://variantsync.github.io/FeatureTraceRecording/), a method for deriving variability annotations from annotated patches.
 
 [documentation]: https://variantsync.github.io/DiffDetective/docs/javadoc
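
The README text above says that a variation diff distinguishes changes to source code from changes to variability annotations. The following is only a rough sketch of that distinction on the level of single unified-diff lines. It is not DiffDetective's parser: the class `DiffLineClassifier` and its `Kind` enum are hypothetical names invented for this illustration, and the annotation check is a deliberate simplification that only recognizes lines starting with `#if`, `#el`, or `#endif`.

```java
/** Hypothetical helper for illustration only; not part of the DiffDetective API. */
final class DiffLineClassifier {
    enum Kind { CODE_ADDED, CODE_REMOVED, ANNOTATION_ADDED, ANNOTATION_REMOVED, UNCHANGED }

    /**
     * Classifies one line of a unified diff body.
     * Assumption: the first character is the diff marker ('+', '-', or ' ').
     */
    static Kind classify(String diffLine) {
        if (diffLine.isEmpty()) {
            return Kind.UNCHANGED;
        }
        char marker = diffLine.charAt(0);
        String content = diffLine.substring(1).stripLeading();
        // Simplification: "#if" also matches #ifdef/#ifndef, "#el" matches #else/#elif.
        boolean annotation = content.startsWith("#if")
                || content.startsWith("#el")
                || content.startsWith("#endif");
        switch (marker) {
            case '+':
                return annotation ? Kind.ANNOTATION_ADDED : Kind.CODE_ADDED;
            case '-':
                return annotation ? Kind.ANNOTATION_REMOVED : Kind.CODE_REMOVED;
            default:
                return Kind.UNCHANGED;
        }
    }
}
```

A real variability-aware differencer additionally has to track the nesting of annotations and their formulas, which this line-wise sketch ignores.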

docs/teaser.png

-286 KB
Binary file not shown.

src/main/java/org/variantsync/diffdetective/diff/git/GitDiffer.java

Lines changed: 3 additions & 0 deletions
@@ -21,9 +21,11 @@
 import org.variantsync.diffdetective.variation.DiffLinesLabel;
 import org.variantsync.diffdetective.variation.diff.VariationDiff;
 import org.variantsync.diffdetective.variation.diff.parse.VariationDiffParser;
+import org.variantsync.diffdetective.variation.tree.source.GitSource;
 
 import java.io.*;
 import java.nio.charset.StandardCharsets;
+import java.nio.file.Path;
 import java.util.ArrayList;
 import java.util.List;
 import java.util.Optional;
@@ -252,6 +254,7 @@ private static CommitDiffResult getPatchDiffs(
 
             final VariationDiff<DiffLinesLabel> variationDiff = VariationDiffParser.createVariationDiff(
                     fullDiff,
+                    new GitSource(repository, childCommit.getId().name(), Path.of(filename)),
                     repository.getParseOptions().variationDiffParseOptions()
             );

src/main/java/org/variantsync/diffdetective/diff/git/GitPatch.java

Lines changed: 14 additions & 2 deletions
@@ -1,17 +1,19 @@
 package org.variantsync.diffdetective.diff.git;
 
+import java.util.List;
+
 import org.eclipse.jgit.diff.DiffEntry;
 import org.variantsync.diffdetective.diff.text.TextBasedDiff;
+import org.variantsync.diffdetective.util.Source;
 import org.variantsync.diffdetective.variation.diff.Time;
 import org.variantsync.diffdetective.variation.diff.VariationDiff; // For Javadoc
-import org.variantsync.diffdetective.variation.diff.source.VariationDiffSource;
 
 /**
  * Interface for patches from a git repository.
  * A git patch is a {@link TextBasedDiff} from which {@link VariationDiff}s can be created.
  *
  */
-public interface GitPatch extends VariationDiffSource, TextBasedDiff {
+public interface GitPatch extends Source, TextBasedDiff {
     /**
      * Minimal default implementation of {@link GitPatch}
      * @param getDiff The diff in text form.
@@ -41,6 +43,16 @@ public GitPatch shallowClone() {
         public String toString() {
             return oldFileName + "@ " + getParentCommitHash + " (parent) to " + newFileName + " @ " + getCommitHash + " (child)";
         }
+
+        @Override
+        public String getSourceExplanation() {
+            return "SimpleGitPatch";
+        }
+
+        @Override
+        public List<Object> getSourceArguments() {
+            return List.of(getChangeType(), oldFileName(), newFileName(), getCommitHash(), getParentCommitHash());
+        }
     }
 
     /**

src/main/java/org/variantsync/diffdetective/diff/git/PatchDiff.java

Lines changed: 5 additions & 0 deletions
@@ -123,4 +123,9 @@ public String toString() {
     public GitPatch shallowClone() {
         return new GitPatch.SimpleGitPatch(getDiff(), getChangeType(), getFileName(Time.BEFORE), getFileName(Time.AFTER), getCommitHash(), getParentCommitHash());
     }
+
+    @Override
+    public String getSourceExplanation() {
+        return "PatchDiff";
+    }
 }

src/main/java/org/variantsync/diffdetective/examplesearch/ExampleFinder.java

Lines changed: 8 additions & 12 deletions
@@ -11,6 +11,7 @@
 import org.variantsync.diffdetective.show.Show;
 import org.variantsync.diffdetective.util.Assert;
 import org.variantsync.diffdetective.util.IO;
+import org.variantsync.diffdetective.util.Source;
 import org.variantsync.diffdetective.util.StringUtils;
 import org.variantsync.diffdetective.variation.DiffLinesLabel;
 import org.variantsync.diffdetective.variation.diff.Time;
@@ -24,7 +25,6 @@
 import org.variantsync.diffdetective.variation.diff.serialize.edgeformat.DefaultEdgeLabelFormat;
 import org.variantsync.diffdetective.variation.diff.serialize.nodeformat.MappingsDiffNodeFormat;
 import org.variantsync.diffdetective.variation.diff.serialize.treeformat.CommitDiffVariationDiffLabelFormat;
-import org.variantsync.diffdetective.variation.diff.source.VariationDiffSource;
 
 import java.io.IOException;
 import java.nio.file.Path;
@@ -106,14 +106,10 @@ private boolean checkIfExample(Analysis analysis, String localDiff) {
         // We do not want a variationDiff for the entire file but only for the local change to have a small example.
         final VariationDiff<DiffLinesLabel> localTree;
         try {
-            localTree = VariationDiff.fromDiff(localDiff, new VariationDiffParseOptions(annotationParser, true, true));
+            localTree = VariationDiff.fromDiff(localDiff, Source.findFirst(variationDiff, GitPatch.class), new VariationDiffParseOptions(annotationParser, true, true));
             // Not every local diff can be parsed to a VariationDiff because diffs are unaware of the underlying language (i.e., CPP).
             // We want only running examples whose diffs describe entire diff trees for easier understanding.
-            if (isGoodExample.test(localTree)) {
-                Assert.assertTrue(variationDiff.getSource() instanceof GitPatch);
-                final GitPatch variationDiffSource = (GitPatch) variationDiff.getSource();
-                localTree.setSource(variationDiffSource.shallowClone());
-            } else {
+            if (!isGoodExample.test(localTree)) {
                 return false;
             }
         } catch (DiffParseException e) {
@@ -149,9 +145,9 @@ public boolean analyzeVariationDiff(Analysis analysis) {
     }
 
     private void exportExample(final Analysis analysis, final String tdiff, final VariationDiff<DiffLinesLabel> vdiff, Path outputDir) {
-        Assert.assertTrue(vdiff.getSource() instanceof GitPatch);
         final Repository repo = analysis.getRepository();
-        final GitPatch patch = (GitPatch) vdiff.getSource();
+        final GitPatch patch = Source.findFirst(vdiff, GitPatch.class);
+        Assert.assertNotNull(patch);
         outputDir = outputDir.resolve(Path.of(repo.getRepositoryName() + "_" + patch.getCommitHash()));
         final String filename = patch.getFileName(Time.AFTER);
 
@@ -185,8 +181,8 @@ private void exportExample(final Analysis analysis, final String tdiff, final Va
     }
 
     static String getDiff(final VariationDiff<?> tree) {
-        final VariationDiffSource source = tree.getSource();
-        Assert.assertTrue(source instanceof TextBasedDiff);
-        return ((TextBasedDiff) source).getDiff();
+        TextBasedDiff textBasedDiff = Source.findFirst(tree, TextBasedDiff.class);
+        Assert.assertNotNull(textBasedDiff);
+        return textBasedDiff.getDiff();
     }
 }
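
The hunks above replace `instanceof` checks and casts on `getSource()` with `Source.findFirst(..., SomeType.class)` followed by a null check. The sketch below shows one way such a lookup could work: a depth-first search through a chain of sources that returns the first object of the requested type, or `null` if there is none. This is an assumption about the idea, not the actual `org.variantsync.diffdetective.util.Source` interface; `SketchSource`, `SketchPatch`, and `SketchDiff` are hypothetical names, and the real method signatures may differ.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a provenance interface; not the DiffDetective API. */
interface SketchSource {
    /** The sources this object was derived from (empty by default). */
    default List<SketchSource> getSources() {
        return Collections.emptyList();
    }

    /** Depth-first search for the first source of the given type; null if absent. */
    static <T> T findFirst(SketchSource root, Class<T> type) {
        if (root == null) {
            return null;
        }
        if (type.isInstance(root)) {
            return type.cast(root);
        }
        for (SketchSource parent : root.getSources()) {
            T found = findFirst(parent, type);
            if (found != null) {
                return found;
            }
        }
        return null;
    }
}

/** Hypothetical leaf source, e.g. a patch identified by its commit hash. */
final class SketchPatch implements SketchSource {
    final String commitHash;

    SketchPatch(String commitHash) {
        this.commitHash = commitHash;
    }
}

/** Hypothetical derived object that remembers where it came from. */
final class SketchDiff implements SketchSource {
    private final SketchSource origin;

    SketchDiff(SketchSource origin) {
        this.origin = origin;
    }

    @Override
    public List<SketchSource> getSources() {
        return Collections.singletonList(origin);
    }
}
```

Compared to a direct cast, a lookup of this shape keeps working when intermediate processing steps wrap the original source, which may be why the callers in this commit only assert that the result is not null.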

src/main/java/org/variantsync/diffdetective/experiments/thesis_bm/ConstructionValidation.java

Lines changed: 9 additions & 2 deletions
@@ -5,6 +5,7 @@
 import java.io.IOException;
 import java.io.OutputStreamWriter;
 import java.io.Writer;
+import java.nio.file.Path;
 import java.util.HashMap;
 import java.util.HashSet;
 import java.util.LinkedHashMap;
@@ -42,6 +43,7 @@
 import org.variantsync.diffdetective.variation.diff.Time;
 import org.variantsync.diffdetective.variation.diff.filter.VariationDiffFilter;
 import org.variantsync.diffdetective.variation.diff.parse.VariationDiffParser;
+import org.variantsync.diffdetective.variation.tree.source.GitSource;
 import org.variantsync.functjonal.category.InplaceSemigroup;
 import org.variantsync.functjonal.map.MergeMap;
 
@@ -375,6 +377,7 @@ private void counts(VariationDiff<DiffLinesLabel> tree, VariationDiffStatistics
     }
 
     private VariationDiff<DiffLinesLabel> parseVariationTree(Analysis analysis, RevCommit commit) throws IOException, DiffParseException {
+        String fileName = analysis.getCurrentPatch().getFileName(AFTER);
         try (BufferedReader afterFile =
                 new BufferedReader(
                     /*
@@ -386,10 +389,14 @@ private VariationDiff<DiffLinesLabel> parseVariationTree(Analysis analysis, RevC
                     GitDiffer.getBeforeFullFile(
                         analysis.getRepository(),
                         commit,
-                        analysis.getCurrentPatch().getFileName(AFTER)),
+                        fileName),
                     0xfeff)) // BOM, same as GitDiffer.BOM_PATTERN
         ) {
-            return VariationDiffParser.createVariationTree(afterFile, analysis.getRepository().getParseOptions().variationDiffParseOptions());
+            return VariationDiffParser.createVariationTree(
+                afterFile,
+                new GitSource(analysis.getRepository(), commit.getId().name(), Path.of(fileName)),
+                analysis.getRepository().getParseOptions().variationDiffParseOptions()
+            );
         }
     }

src/main/java/org/variantsync/diffdetective/experiments/thesis_es/UnparseAnalysis.java

Lines changed: 3 additions & 3 deletions
@@ -9,6 +9,7 @@
 import org.variantsync.diffdetective.util.CSV;
 import org.variantsync.diffdetective.util.FileUtils;
 import org.variantsync.diffdetective.util.IO;
+import org.variantsync.diffdetective.util.Source;
 import org.variantsync.diffdetective.util.StringUtils;
 import org.variantsync.diffdetective.variation.DiffLinesLabel;
 import org.variantsync.diffdetective.variation.VariationUnparser;
@@ -17,7 +18,6 @@
 import org.variantsync.diffdetective.variation.diff.construction.JGitDiff;
 import org.variantsync.diffdetective.variation.diff.parse.VariationDiffParseOptions;
 import org.variantsync.diffdetective.variation.tree.VariationTree;
-import org.variantsync.diffdetective.variation.tree.source.VariationTreeSource;
 
 public class UnparseAnalysis implements Analysis.Hooks {
 
@@ -215,7 +215,7 @@ public static String removeWhitespace(String string, boolean diff) {
     public static String parseUnparseTree(String text, VariationDiffParseOptions option) {
         String temp = "b";
         try {
-            VariationTree<DiffLinesLabel> tree = VariationTree.fromText(text, VariationTreeSource.Unknown, option);
+            VariationTree<DiffLinesLabel> tree = VariationTree.fromText(text, Source.Unknown, option);
             temp = VariationUnparser.unparseTree(tree);
         } catch (Exception e) {
             e.printStackTrace();
@@ -226,7 +226,7 @@ public static String parseUnparseTree(String text, VariationDiffParseOptions opt
     public static String parseUnparseDiff(String textDiff, VariationDiffParseOptions option) {
         String temp = "b";
         try {
-            VariationDiff<DiffLinesLabel> diff = VariationDiff.fromDiff(textDiff, option);
+            VariationDiff<DiffLinesLabel> diff = VariationDiff.fromDiff(textDiff, Source.Unknown, option);
             temp = VariationUnparser.unparseDiff(diff);
         } catch (Exception e) {
             e.printStackTrace();

src/main/java/org/variantsync/diffdetective/experiments/views/result/ViewEvaluation.java

Lines changed: 2 additions & 2 deletions
@@ -85,8 +85,8 @@ public String toCSV(String delimiter) {
                 // repo.getRepositoryName(),
                 commit,
                 file,
-                relevance.getFunctionName(),
-                // getQueryArguments(),
+                relevance.getSourceExplanation(),
+                // relevance.getSourceArguments(),
                 msNaive,
                 msOptimized,
                 diffStatistics.nodeCount,

src/main/java/org/variantsync/diffdetective/load/GitLoader.java

Lines changed: 2 additions & 4 deletions
@@ -1,7 +1,6 @@
 package org.variantsync.diffdetective.load;
 
 import net.lingala.zip4j.ZipFile;
-import net.lingala.zip4j.exception.ZipException;
 import org.apache.commons.io.FilenameUtils;
 import org.eclipse.jgit.api.Git;
 import org.eclipse.jgit.api.errors.GitAPIException;
@@ -86,10 +85,9 @@ public static Git fromZip(Path pathToZip) {
             return fromDirectory(unzippedRepoName);
         }
 
-        try {
-            ZipFile zipFile = new ZipFile(pathToZip.toFile());
+        try (ZipFile zipFile = new ZipFile(pathToZip.toFile())) {
             zipFile.extractAll(targetDir.toString());
-        } catch (ZipException e) {
+        } catch (IOException e) {
             Logger.warn("Failed to extract git repo from {} to {}", pathToZip, targetDir);
             return null;
         }
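
The `GitLoader` hunk above moves the `ZipFile` into a try-with-resources statement and widens the catch clause to `IOException`, presumably because the implicit `close()` call can itself throw an `IOException`. The sketch below illustrates the same pattern with the JDK's `java.util.zip.ZipFile` instead of zip4j, so that it runs without extra dependencies. The class `ZipEntryCounter` and its method are hypothetical names made up for this example.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.zip.ZipFile;

/** Hypothetical helper for illustration; uses the JDK ZipFile, not zip4j. */
final class ZipEntryCounter {
    /**
     * Returns the number of entries in the archive, or -1 if it cannot be opened or read.
     * The archive is closed automatically when the try block exits, also on failure.
     */
    static int countEntries(Path pathToZip) {
        try (ZipFile zipFile = new ZipFile(pathToZip.toFile())) {
            return zipFile.size();
        } catch (IOException e) {
            // Covers failures while opening, reading, and closing the archive.
            // java.util.zip.ZipException is a subclass of IOException.
            return -1;
        }
    }
}
```

With a plain `try` block, as in the removed lines, the archive would stay open until garbage collection if nothing closes it explicitly.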
