
Conversation

@InAnYan (Member) commented Nov 2, 2025

Closes https://github.com/JabRef/jabref-issue-melting-pot/issues/1063

This PR tries to run a duplicate check after an SLR (systematic literature review) search finishes.

Remarks:

  • In general, this should not happen: the SLR itself should internally remove (move, merge, or otherwise handle) duplicate papers (a rough sketch of the "merge" idea follows below).
  • I'm not sure that the chosen AutomaticDuplicateRemover approach is the right one.
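
To make the "merge" idea above concrete, here is a minimal sketch, not code from this PR: a hypothetical helper that copies fields present only on the duplicate into the entry that is kept, before the duplicate is dropped. DuplicateMerger and mergeInto are illustrative names; BibEntry, Field, getFields, getField and setField are existing JabRef APIs.

import org.jabref.model.entry.BibEntry;
import org.jabref.model.entry.field.Field;

public class DuplicateMerger {

    // Hypothetical helper (not part of JabRef or this PR): before the duplicate
    // is removed, copy every field that only the duplicate carries into the
    // entry that survives, so no information is lost by the removal.
    public static void mergeInto(BibEntry kept, BibEntry duplicate) {
        for (Field field : duplicate.getFields()) {
            if (kept.getField(field).isEmpty()) {
                duplicate.getField(field).ifPresent(value -> kept.setField(field, value));
            }
        }
    }
}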

Steps to test

  1. Check out the latest main branch
  2. Run an SLR with any query (for testing I used: greek, greeks, ancient greeks)
  3. I got around 458 entries

After running the duplicate finder and clicking "Keep merged" for every pair, I got 449 entries.

Then:

  1. Check out this PR branch
  2. Run the same SLR
  3. Verify that you get 449 entries

Mandatory checks

  • I own the copyright of the code submitted and I license it under the MIT license
  • I manually tested my changes in running JabRef (always required)
  • [/] I added JUnit tests for changes (if applicable)
  • [/] I added screenshots in the PR description (if change is visible to the user)
  • [/] I described the change in CHANGELOG.md in a way that is understandable for the average user (if change is visible to the user)
  • I checked the user documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request updating file(s) in https://github.com/JabRef/user-documentation/tree/main/en.

@jabref-machine (Collaborator) commented

Your code currently does not meet JabRef's code guidelines. IntelliJ auto format covers some cases. There seem to be issues with your code style and autoformat configuration. Please reformat your code (Ctrl+Alt+L) and commit, then push.

In special cases, consider using // @formatter:off and // @formatter:on annotations to allow deviation from the code style.

BibDatabase database = databaseContext.getDatabase();
List<BibEntry> entries = database.getEntries();
List<BibEntry> entriesToRemove = new ArrayList<>();
Set<BibEntry> handledEntries = new HashSet<>();
A reviewer (Member) commented on this snippet:

The way the looping is done now, it is guaranteed that you won't "handle" the same pair of entries twice, so the handledEntries set and the conditions are redundant.
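For illustration, here is my paraphrase of the reviewer's point, not code from the PR (duplicateCheck and databaseContext are assumed to be available in the surrounding method): when the inner loop index starts at i + 1, every unordered pair of entries is visited exactly once, so a handledEntries set has nothing left to filter out.

// j always starts after i, so the pair (entries.get(i), entries.get(j))
// can never be visited a second time.
for (int i = 0; i < entries.size(); i++) {
    for (int j = i + 1; j < entries.size(); j++) {
        if (duplicateCheck.isDuplicate(entries.get(i), entries.get(j), databaseContext.getMode())) {
            entriesToRemove.add(entries.get(j)); // no handledEntries check needed
        }
    }
}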

