Open
Changes from 3 commits
Commits
21 commits
d578f69
Port DuplicateCheck.similarity to StringSimilarity and add tests
TheYorouzoya Aug 15, 2025
022bcd4
Add ICORE2023 Rank Data To Resources
TheYorouzoya Aug 20, 2025
5bacb29
Add Core Classes For ICORE Rank Lookup
TheYorouzoya Aug 22, 2025
e035e0e
Integrate ICORE Rank Lookup Feature Into The GUI
TheYorouzoya Aug 26, 2025
3a36afe
Merge branch 'main' into add-ICORE-ranking-support
TheYorouzoya Aug 26, 2025
a7b1b94
Add Missing Localization Keys and Fix Broken Test
TheYorouzoya Aug 27, 2025
33ed0db
Use List.of() and fix grammar
koppor Aug 28, 2025
806309e
Reorder fields
koppor Aug 28, 2025
5330632
Fix unsupported operation exception
koppor Aug 28, 2025
b092aa9
Rename field
koppor Aug 28, 2025
52d3acd
Merge branch 'main' into add-ICORE-ranking-support
koppor Aug 28, 2025
75d2b47
Hotfix: calling of publish.yml
koppor Aug 28, 2025
45a1d10
Port DuplicateCheck.similarity to StringSimilarity and add tests
TheYorouzoya Aug 15, 2025
290cf6c
Add ICORE2023 Rank Data To Resources
TheYorouzoya Aug 20, 2025
93da09f
Add Core Classes For ICORE Rank Lookup
TheYorouzoya Aug 22, 2025
cbd4dda
Fix Merge Conflict From Upstream Fetch
TheYorouzoya Aug 26, 2025
2a1a73e
Add Missing Localization Keys and Fix Broken Test
TheYorouzoya Aug 27, 2025
6747ad1
Merged changes to FieldFactory and StandardField
TheYorouzoya Aug 28, 2025
ff5c144
Add Minor Fixes, Documentation, and Refactor
TheYorouzoya Aug 28, 2025
89c6600
Revert "Hotfix: calling of publish.yml"
TheYorouzoya Aug 28, 2025
c63e2da
Remove duplicate line in CHANGELOG
TheYorouzoya Aug 28, 2025
30 changes: 2 additions & 28 deletions jablib/src/main/java/org/jabref/logic/database/DuplicateCheck.java
@@ -289,9 +289,10 @@ public static double correlateByWords(final String s1, final String s2) {
         final String[] w1 = s1.split("\\s");
         final String[] w2 = s2.split("\\s");
         final int n = Math.min(w1.length, w2.length);
+        final StringSimilarity match = new StringSimilarity();
         int misses = 0;
         for (int i = 0; i < n; i++) {
-            double corr = similarity(w1[i], w2[i]);
+            double corr = match.similarity(w1[i], w2[i]);
             if (corr < 0.75) {
                 misses++;
             }
@@ -300,33 +301,6 @@ public static double correlateByWords(final String s1, final String s2) {
         return 1 - missRate;
     }

-    /**
-     * Calculates the similarity (a number within 0 and 1) between two strings.
-     * http://stackoverflow.com/questions/955110/similarity-string-comparison-in-java
-     */
-    private static double similarity(final String first, final String second) {

Review comment (Member): Nice find :)

-        final String longer;
-        final String shorter;
-
-        if (first.length() < second.length()) {
-            longer = second;
-            shorter = first;
-        } else {
-            longer = first;
-            shorter = second;
-        }
-
-        final int longerLength = longer.length();
-        // both strings are zero length
-        if (longerLength == 0) {
-            return 1.0;
-        }
-        final double distanceIgnoredCase = new StringSimilarity().editDistanceIgnoreCase(longer, shorter);
-        final double similarity = (longerLength - distanceIgnoredCase) / longerLength;
-        LOGGER.trace("Longer string: {} Shorter string: {} Similarity: {}", longer, shorter, similarity);
-        return similarity;
-    }
-
     /**
      * Checks if the two entries represent the same publication.
      */
@@ -0,0 +1,24 @@
package org.jabref.logic.icore;

import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ConferenceAcronymExtractor {
    // Regex that'll extract the string within the first deepest set of parentheses
    // A slight modification of: https://stackoverflow.com/a/17759264
    private static final Pattern PATTERN = Pattern.compile("\\(([^()]*)\\)");

    public static Optional<String> extract(String input) {

Review comment: Method lacks input validation for null parameter which could lead to NullPointerException. While Optional return is good, the input should be validated.

Review comment (Member): @NonNull jspecify annotation is OK. Please add JavaDoc.

        Matcher matcher = PATTERN.matcher(input);

        if (matcher.find()) {
            String match = matcher.group(1).strip();
            if (!match.isEmpty()) {
                return Optional.of(match);
            }
        }

        return Optional.empty();
    }
}
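
To make the behaviour of the pattern above concrete, here is a minimal, self-contained sketch. It reuses the regex from the diff; the class name `AcronymExtractionSketch`, the sample titles, and the `Objects.requireNonNull` guard are assumptions added for illustration (the review thread suggests a jspecify `@NonNull` annotation instead, which this sketch does not depend on).

```java
import java.util.Objects;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AcronymExtractionSketch {
    // Same pattern as in the diff: the first pair of parentheses that contains no nested parentheses.
    private static final Pattern PATTERN = Pattern.compile("\\(([^()]*)\\)");

    static Optional<String> extract(String input) {
        // Hypothetical guard (not part of the PR): fail fast with a clear message on null input.
        Objects.requireNonNull(input, "input must not be null");

        Matcher matcher = PATTERN.matcher(input);
        if (matcher.find()) {
            String match = matcher.group(1).strip();
            if (!match.isEmpty()) {
                return Optional.of(match);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical book titles, chosen only to exercise the pattern.
        System.out.println(extract("Proceedings of the International Conference on Software Engineering (ICSE 2023)")); // Optional[ICSE 2023]
        System.out.println(extract("Workshop (co-located with (ICSE))")); // Optional[ICSE]
        System.out.println(extract("No acronym here")); // Optional.empty
    }
}
```

Because only the first match is inspected, an input such as `"() (ICSE)"` yields an empty result: the first parenthesised group is blank and the search does not continue.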
138 changes: 138 additions & 0 deletions jablib/src/main/java/org/jabref/logic/icore/ConferenceRepository.java
@@ -0,0 +1,138 @@
package org.jabref.logic.icore;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

import org.jabref.logic.JabRefException;
import org.jabref.logic.util.strings.StringSimilarity;
import org.jabref.model.icore.ConferenceEntry;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVRecord;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * A Repository that loads and stores the latest ICORE Conference Ranking Data and allows lookups using a conference's
 * acronym or title.
 * <p>
 * The ranking data is sourced from <a href="https://portal.core.edu.au/conf-ranks/">the ICORE Conference Ranking Portal</a>.
 * Since the website does not expose an API endpoint to fetch this data programmatically, it must be manually exported
 * from the website and stored as a resource. This means that when new ranking data is released, the old data must be
 * replaced and the <code>ICORE_RANK_DATA_FILE</code> variable must be modified to point to the new data file.
 */
public class ConferenceRepository {
    private static final Logger LOGGER = LoggerFactory.getLogger(ConferenceRepository.class);
    private static final String ICORE_RANK_DATA_FILE = "/icore/ICORE2023.csv";

    private final Map<String, ConferenceEntry> acronymToConference = new HashMap<>();
    private final Map<String, ConferenceEntry> titleToConference = new HashMap<>();

    public ConferenceRepository() throws JabRefException {
        InputStream inputStream = getClass().getResourceAsStream(ICORE_RANK_DATA_FILE);

        if (inputStream == null) {
            throw new JabRefException("ICORE rank data file not found in resources");
        }

        loadConferenceDataFromInputStream(inputStream);
    }

    // Constructor to allow loading in test data

Review comment (Member), suggested change:
-    // Constructor to allow loading in test data
+    /// Constructor to allow loading in test data

JavaDoc in Markdown is ///

    public ConferenceRepository(InputStream testFileInputStream) throws JabRefException {
        loadConferenceDataFromInputStream(testFileInputStream);
    }

    private void loadConferenceDataFromInputStream(InputStream inputStream) throws JabRefException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));

        try (inputStream; reader) {
            Iterable<CSVRecord> records = CSVFormat.DEFAULT.builder()
                    .setHeader()
                    .setSkipHeaderRecord(true)
                    .get()
                    .parse(reader);

            for (CSVRecord record : records) {
                String id = record.get("Id").strip();
                String title = record.get("Title").strip().toLowerCase();
                String acronym = record.get("Acronym").strip().toUpperCase();
                String rank = record.get("Rank").strip();

                if (id.isEmpty() || title.isEmpty() || acronym.isEmpty() || rank.isEmpty()) {
                    LOGGER.warn("Missing fields in row in ICORE rank data: {}", record);
                    continue;
                }

                ConferenceEntry conferenceEntry = new ConferenceEntry(id, title, acronym, rank);
                acronymToConference.put(acronym, conferenceEntry);
                titleToConference.put(title, conferenceEntry);
            }
        } catch (IOException e) {
            throw new JabRefException("I/O Error while reading ICORE data from resource", e);
        }
    }

    public Optional<ConferenceEntry> getConferenceFromAcronym(String acronym) {
        String query = acronym.strip().toUpperCase();

        ConferenceEntry conference = acronymToConference.get(query);

        if (conference == null) {
            return Optional.empty();
        }

        return Optional.of(conference);
    }

    public Optional<ConferenceEntry> getConferenceFromBookTitle(String bookTitle) {
        String query = bookTitle.strip().toLowerCase();

        // Lucky path

Review comment: Comment does not add new information and can be plainly derived from the code. It should be removed as it doesn't provide additional context or reasoning.

Review comment (Member), suggested change:
-        // Lucky path
+

        ConferenceEntry conference = titleToConference.get(query);

Review comment: Comment 'Lucky path' doesn't add any new information and can be derived from the code itself. Comments should provide additional context or reasoning.

        if (conference != null) {
            return Optional.of(conference);
        }

        String bestMatch = fuzzySearchConferenceTitles(query);

Review comment (Member), suggested change: remove the blank line that follows the fuzzySearchConferenceTitles call.

        if (bestMatch.isEmpty()) {
            return Optional.empty();
        }

        conference = titleToConference.get(bestMatch);

        return Optional.of(conference);
    }

    /**
     * Searches all conference titles for the given query string using {@link StringSimilarity#similarity} as a matcher.
     * <p>
     * The threshold for matching is set at 0.9. This function will always return the conference title with the highest
     * similarity rating.
     *
     * @param query The query string to be searched
     * @return The conference title, if found. Otherwise, an empty string is returned.
     */
    private String fuzzySearchConferenceTitles(String query) {
        String bestMatch = "";
        double bestSimilarity = 0.0;
        final double FUZZY_SEARCH_THRESHOLD = 0.9;

Review comment (Member): Extract to private static final in the class.

        StringSimilarity matcher = new StringSimilarity();

Review comment (Member): Maybe, this can also be a private static final

        for (String conferenceTitle : titleToConference.keySet()) {
            double similarity = matcher.similarity(query, conferenceTitle);
            if (similarity >= FUZZY_SEARCH_THRESHOLD && similarity > bestSimilarity) {
                bestMatch = conferenceTitle;
                bestSimilarity = similarity;
            }
        }

        return bestMatch;
    }
}
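
The fuzzy lookup above can be tried in isolation with the following sketch. It is not the PR's implementation: the Levenshtein distance is a hand-written stand-in for the `info.debatty` metric, the class name `FuzzyTitleLookupSketch` and the sample titles are hypothetical, and the method returns an `Optional` rather than an empty string. The threshold is a class-level constant, in line with the reviewer's suggestion.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

final class FuzzyTitleLookupSketch {
    // Same threshold value as in the diff, extracted to a constant as the reviewer suggested.
    private static final double FUZZY_SEARCH_THRESHOLD = 0.9;

    // Plain dynamic-programming Levenshtein distance (stand-in for the library metric).
    static int levenshtein(String a, String b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            current[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                current[j] = Math.min(Math.min(current[j - 1] + 1, previous[j] + 1), previous[j - 1] + cost);
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    // Normalised similarity in [0, 1], ignoring case, following the formula used in the diff.
    static double similarity(String first, String second) {
        int longerLength = Math.max(first.length(), second.length());
        if (longerLength == 0) {
            return 1.0;
        }
        int distance = levenshtein(first.toLowerCase(Locale.ENGLISH), second.toLowerCase(Locale.ENGLISH));
        return (longerLength - distance) / (double) longerLength;
    }

    // Returns the most similar title at or above the threshold, if there is one.
    static Optional<String> bestMatch(String query, List<String> titles) {
        String best = null;
        double bestSimilarity = 0.0;
        for (String title : titles) {
            double candidate = similarity(query, title);
            if (candidate >= FUZZY_SEARCH_THRESHOLD && candidate > bestSimilarity) {
                best = title;
                bestSimilarity = candidate;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        // Hypothetical titles; the real data comes from the ICORE CSV resource.
        List<String> titles = List.of(
                "international conference on software engineering",
                "international conference on machine learning");
        // One missing letter in a 48-character title gives a similarity of 47/48, above the threshold.
        System.out.println(bestMatch("international conference on sofware engineering", titles));
        // A short fragment falls well below the threshold and yields no match.
        System.out.println(bestMatch("conference on software", titles));
    }
}
```

A threshold of 0.9 tolerates roughly one edit per ten characters of the longer string, so a single typo matches in a long title but abbreviated or truncated titles do not.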
@@ -3,8 +3,12 @@
 import java.util.Locale;

 import info.debatty.java.stringsimilarity.Levenshtein;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;

 public class StringSimilarity {
+    private static final Logger LOGGER = LoggerFactory.getLogger(StringSimilarity.class);
+
     private final Levenshtein METRIC_DISTANCE = new Levenshtein();
     // edit distance threshold for entry title comparison
     private final int METRIC_THRESHOLD = 4;
@@ -24,4 +28,31 @@ public double editDistanceIgnoreCase(String a, String b) {
         // TODO: Locale is dependent on the language of the strings. English is a good denominator.
         return METRIC_DISTANCE.distance(a.toLowerCase(Locale.ENGLISH), b.toLowerCase(Locale.ENGLISH));
     }
+
+    /**
+     * Calculates the similarity (a number within 0 and 1) between two strings.
+     * http://stackoverflow.com/questions/955110/similarity-string-comparison-in-java
+     */
+    public double similarity(final String first, final String second) {
+        final String longer;
+        final String shorter;
+
+        if (first.length() < second.length()) {
+            longer = second;
+            shorter = first;
+        } else {
+            longer = first;
+            shorter = second;
+        }
+
+        final int longerLength = longer.length();
+        // both strings are zero length

Review comment: This comment is trivial and can be derived directly from the code condition (longerLength == 0). It should be removed as it doesn't add new information.

+        if (longerLength == 0) {

Review comment (on lines +49 to +50): Comment is redundant as it simply restates what the code clearly shows. The comment doesn't provide additional information or reasoning.

Review comment (Author): The comment was added there by the original author of the code; I've merely ported it with the necessary modifications. That said, I will contest this review, as the lines

    final int longerLength = longer.length();
    // both strings are zero length
    if (longerLength == 0) {
        return 1.0;
    }

do not explicitly state, on their own, that the two input strings are equal. Further, the comment is present in the original Stack Overflow post. That being said, if you're adamant about it, I don't mind changing this.

Review comment (Member): Take the bot's opinions with care; they are not always that good.

+            return 1.0;
+        }
+        final double distanceIgnoredCase = editDistanceIgnoreCase(longer, shorter);
+        final double similarity = (longerLength - distanceIgnoredCase) / longerLength;
+        LOGGER.trace("Longer string: {} Shorter string: {} Similarity: {}", longer, shorter, similarity);
+        return similarity;
+    }
 }
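
The `similarity` method shown above can be read as a normalised edit distance. As a worked illustration, with $d$ standing for the case-insensitive Levenshtein distance and with example strings that are hypothetical rather than taken from the PR:

```latex
\[
\operatorname{similarity}(a, b) =
\begin{cases}
1 & \text{if } \max(|a|, |b|) = 0,\\[4pt]
\dfrac{\max(|a|, |b|) - d(a, b)}{\max(|a|, |b|)} & \text{otherwise.}
\end{cases}
\]
\[
a = \texttt{kitten},\quad b = \texttt{Sitting}:\qquad
d(a, b) = 3,\quad \max(|a|, |b|) = 7,\quad
\operatorname{similarity}(a, b) = \frac{7 - 3}{7} = \frac{4}{7} \approx 0.571
\]
```

A value of about 0.571 falls below both thresholds that appear in this diff: the 0.75 cut-off in `correlateByWords`, where it would count as a miss, and the 0.9 cut-off in `fuzzySearchConferenceTitles`.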
17 changes: 17 additions & 0 deletions jablib/src/main/java/org/jabref/model/icore/ConferenceEntry.java
@@ -0,0 +1,17 @@
package org.jabref.model.icore;

/**
 * A Conference Entry built from a subset of fields in the ICORE Ranking data
 */

Review comment (on lines +3 to +5): The comment merely restates what is obvious from the code and doesn't provide additional information about the purpose or constraints of the record.

public record ConferenceEntry(
        String id,
        String title,
        String acronym,
        String rank
) {
    private final static String URL_PREFIX = "https://portal.core.edu.au/conf-ranks/";

Review comment: Incorrect order of modifiers. According to Java conventions and effective Java principles, it should be 'private static final' instead of 'private final static'.

Review comment (Author): Will fix that in the next commit.

    public String getICOREURL() {
        return URL_PREFIX + id;
    }
}