Add #12810: Implement escaping for keyword separators (FKA #12888) #13583

miguel-cordoba · 2025-07-24T02:17:52Z

Addresses unresolved previous review comments

Closes #12810

Steps to test

See #12888.

Mandatory checks

I own the copyright of the code submitted and I license it under the MIT license
Change in CHANGELOG.md described in a way that is understandable for the average user (if change is visible to the user)
Tests created for changes (if applicable)
Manually tested changed features in running JabRef (always required)
Screenshots added in PR description (if change is visible to the user)
Checked developer's documentation: Is the information available and up to date? If not, I outlined it in this pull request.
Checked documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request to the documentation repository.

jablib/src/main/java/org/jabref/model/entry/KeywordList.java

miguel-cordoba · 2025-07-24T12:59:04Z

Hey there! I'm completely new to JabRef and to open source in general, and I am exploring how things work.

I noticed that when I enter keywords for a TechReport entry, for example: Ensenada, then Ensenada, Baja California, and then again Ensenada, the app seems to cut the word unexpectedly and shows only "E". I see that we check whether the keywordList already contains the current keyword (KeywordList:140), and I guess this check might interfere with the desired behaviour described here.

Really looking forward to your input!

jablib/src/main/java/org/jabref/model/entry/KeywordList.java

koppor · 2025-07-28T20:07:20Z

jablib/src/main/java/org/jabref/model/entry/KeywordList.java

+        StringBuilder currentToken = new StringBuilder();
+        boolean isEscaping = false;
+
+        for (int i = 0; i < keywordString.length(); i++) {


Note this is #12888 (comment) :)

Thanks for your review! As I dove into the project, I saw that this loop had not been addressed, so I used the code proposed in @ungerts' comment, since it is readable and solves the issue. Am I missing something?

This is very OK. I like links to provide context.

Other reviewers might think: why a for loop etc.

Ooh ok perfect :)
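
For readers without the diff at hand, the loop discussed in this thread can be illustrated with a minimal sketch. It assumes that a backslash escapes only the delimiter and that tokens are trimmed; the class name `EscapedSplitSketch` and its methods are hypothetical and are not the code in `KeywordList`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not JabRef code: splits a keyword string on a delimiter
// unless the delimiter is preceded by a backslash.
final class EscapedSplitSketch {

    static List<String> split(String keywordString, char delimiter) {
        List<String> tokens = new ArrayList<>();
        StringBuilder currentToken = new StringBuilder();
        boolean isEscaping = false;

        for (int i = 0; i < keywordString.length(); i++) {
            char c = keywordString.charAt(i);
            if (isEscaping) {
                // Assumption: only the delimiter is escapable; keep the backslash otherwise
                if (c != delimiter) {
                    currentToken.append('\\');
                }
                currentToken.append(c);
                isEscaping = false;
            } else if (c == '\\') {
                isEscaping = true;
            } else if (c == delimiter) {
                addToken(tokens, currentToken);
            } else {
                currentToken.append(c);
            }
        }
        if (isEscaping) {
            // A trailing backslash has nothing to escape and is kept literally
            currentToken.append('\\');
        }
        addToken(tokens, currentToken);
        return tokens;
    }

    // Trims the token, skips it if empty, and resets the buffer
    private static void addToken(List<String> tokens, StringBuilder token) {
        String trimmed = token.toString().trim();
        if (!trimmed.isEmpty()) {
            tokens.add(trimmed);
        }
        token.setLength(0);
    }
}
```

With `,` as the delimiter, the input `Ensenada\, Baja California, Mexico` yields two tokens, the first of which still contains its comma.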

koppor

Tests for org.jabref.model.entry.Keyword#getSubchainAsString are missing.

The two cases should have different string representations: Keyword > Keyword vs. Keyword \> Keyword

Think of the "round trip".
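
The round-trip requirement can be sketched as follows. This is an illustration under assumptions, not the behaviour of `Keyword#getSubchainAsString`: the class `KeywordChainRoundTripSketch` and both methods are hypothetical, nodes are trimmed, and the serialized form uses " > " between nodes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not JabRef code: a keyword chain is modelled as a list of node names.
final class KeywordChainRoundTripSketch {

    // Splits on the hierarchy delimiter; "\>" is read as a literal '>' inside a node
    static List<String> parseChain(String text, char hierarchyDelimiter) {
        List<String> nodes = new ArrayList<>();
        StringBuilder node = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean escapesDelimiter = c == '\\'
                    && i + 1 < text.length()
                    && text.charAt(i + 1) == hierarchyDelimiter;
            if (escapesDelimiter) {
                node.append(hierarchyDelimiter);
                i++; // skip the escaped delimiter
            } else if (c == hierarchyDelimiter) {
                nodes.add(node.toString().trim());
                node.setLength(0);
            } else {
                node.append(c);
            }
        }
        nodes.add(node.toString().trim());
        return nodes;
    }

    // Joins nodes with " > " and escapes any delimiter that occurs inside a node
    static String serializeChain(List<String> nodes, char hierarchyDelimiter) {
        List<String> escaped = new ArrayList<>();
        for (String n : nodes) {
            escaped.add(n.replace(String.valueOf(hierarchyDelimiter), "\\" + hierarchyDelimiter));
        }
        return String.join(" " + hierarchyDelimiter + " ", escaped);
    }
}
```

In this sketch, `Keyword > Keyword` parses to a chain of two nodes, while `Keyword \> Keyword` parses to a single node that contains the delimiter. Serializing either result gives back the original string, which is the property a round-trip test would assert.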

koppor · 2025-07-28T21:30:42Z

jablib/src/test/java/org/jabref/model/entry/KeywordListTest.java

@@ -115,4 +115,41 @@ void mergeTwoDistinctKeywordsShouldReturnTheTwoKeywordsMerged() {
    void mergeTwoListsOfKeywordsShouldReturnTheKeywordsMerged() {
        assertEquals(new KeywordList("Figma", "Adobe", "JabRef", "Eclipse", "JetBrains"), KeywordList.merge("Figma, Adobe, JetBrains, Eclipse", "Adobe, JabRef", ','));
    }
+
+    @Test
+    void parseKeywordWithEscapedDelimiterDoesNotSplitKeyword() {


Convert to ParameterizedTest

I gathered all the previous tests into a parameterized one, and I also added round-trip tests on KeywordList and Keyword. Thank you so much for your review!

trag-bot · 2025-07-29T16:13:45Z

jablib/src/test/java/org/jabref/model/entry/KeywordListTest.java

+        String serialized = parsed.toString();
+        KeywordList reparsed = KeywordList.parse(serialized, ',', '>');
+
+        assertEquals(parsed.toString(), reparsed.toString());


Comparing string representations instead of actual objects may hide structural differences. Should directly compare KeywordList objects to ensure complete equality.

I think this doesn't apply here?

Add a comment above the statement to indicate that the toString() functionality is tested.

Moreover, test against the original input:

Suggested change

-        assertEquals(parsed.toString(), reparsed.toString());
+        assertEquals(original, parsed.toString());

No need for reparsed.

jablib/src/test/java/org/jabref/model/entry/KeywordTest.java

koppor · 2025-07-29T17:34:51Z

jablib/src/test/java/org/jabref/model/entry/KeywordTest.java

+
+    @ParameterizedTest
+    @ValueSource(strings = {
+            "Keyword > Keyword",


I would expect all cases from provideParseKeywordCases there.

koppor · 2025-07-29T17:36:44Z

jablib/src/test/java/org/jabref/model/entry/KeywordListTest.java

+
+    @ParameterizedTest
+    @ValueSource(strings = {
+            "Keyword > Keyword",


Test all of provideParseKeywordCases

When extending this test, it turned out that the current implementation does not handle escaping as desired when deserializing, which made me extend the logic in Keyword#getSubchainAsString.

koppor · 2025-07-31T13:58:24Z

Thank you for your insights! I'll have more free time in the coming weeks and I'm happy to continue working on this.

Glad to hear it, and sorry that we did not see at first, when looking at the issue, that it would escalate 😅. Looking forward.

ryan-carpenter · 2025-08-05T09:08:27Z

In case it is not clear: Escaping should only apply to the keyword separator character. Keywords already work properly when there is no conflict between the keywords and the separator.

ryan-carpenter · 2025-08-05T09:14:36Z

There should NOT be any auto escaping.

I think this is right for supporting escaping. However, JabRef should check on import and auto-escape the separator character if it is contained in individual keywords/phrases. If the imported file type is one that puts each keyword on its own line (RIS, PubMed nbib), then characters that match the designated separator can be auto-escaped.
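
The import-time idea in the comment above could look roughly like the sketch below. It assumes the importer has already collected one keyword per source line and that the separator is escaped with a backslash; `ImportAutoEscapeSketch` and `joinEscaped` are hypothetical names, not JabRef importer API.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch, not JabRef code: builds a single keywords field from
// keywords that arrived one per line (as in RIS or PubMed nbib records).
final class ImportAutoEscapeSketch {

    static String joinEscaped(List<String> keywordsOnePerLine, char separator) {
        String sep = String.valueOf(separator);
        return keywordsOnePerLine.stream()
                .map(String::trim)
                .filter(keyword -> !keyword.isEmpty())
                // Escape separators inside a keyword so they survive a later split
                .map(keyword -> keyword.replace(sep, "\\" + sep))
                .collect(Collectors.joining(sep + " "));
    }
}
```

Because each keyword is known to be complete at import time, a separator inside it can be escaped without guessing. Spaces inside a keyword are left untouched.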

miguel-cordoba · 2025-08-06T16:16:53Z

In case it is not clear: Escaping should only apply to the keyword separator character. Keywords already work properly when there is no conflict between the keywords and the separator.

There should NOT be any auto escaping.

I think this is right for supporting escaping. However, JabRef should check on import and auto-escape the separator character if it is contained in individual keywords/phrases. If the imported file type is one that puts each keyword on its own line (RIS, PubMed nbib), then characters that match the designated separator can be auto-escaped.

Thank you, I will keep this in mind.

jablib/src/main/java/org/jabref/model/entry/BibEntry.java

jablib/src/main/java/org/jabref/model/entry/KeywordList.java

trag-bot · 2025-08-24T20:21:17Z

jablib/src/main/java/org/jabref/model/entry/Keyword.java

@@ -13,6 +13,7 @@
 */
 public class Keyword extends ChainNode<Keyword> implements Comparable<Keyword> {

+    // Note: {@link org.jabref.model.entry.KeywordList#parse(java.lang.String, java.lang.Character, java.lang.Character) offers configuration, which is not available here


The comment merely states what is visible from the code and doesn't provide additional value or reasoning. It should be removed or enhanced with actual implementation rationale.

trag-bot · 2025-08-24T20:21:18Z

jablib/src/main/java/org/jabref/model/entry/Keyword.java

+    /*
+     * Used for BibTex export, where we need to escape the delimiter with \
+     */


The spelling 'BibTex' in the comment is incorrect according to project standards. It should be 'BibTeX' for consistency in documentation and comments.

ryan-carpenter · 2025-08-29T07:42:22Z

Thanks for this contribution!

Think of "round trip"

I noticed examples showing commas without spaces. This is perfect if it matches the input. If the keywords contain ", " (a comma followed by a space, as in PubMed records), please keep the space.

trag-bot · 2025-08-31T21:41:23Z

jablib/src/test/java/org/jabref/model/entry/KeywordListTest.java

+    @MethodSource("provideParseKeywordCases")
+    void roundTripPreservesStructure(String original) {
+        KeywordList parsed = KeywordList.oldParse(original, ',', '>');
+        // We need to test the toString() functionality


Comment restates what is obvious from the code and doesn't provide additional value. Such comments should be removed as they create maintenance overhead.

miguel-cordoba · 2025-08-31T21:47:51Z

Sorry for the delay in progressing on this. I started a new job recently and my bandwidth has been limited, but I really want to finish what I started here.

I’ve been experimenting with different approaches. Right now, the changes work reasonably well for UI input, but I have not yet achieved complete round-trip fidelity for BibTeX serialization (with auto-escaping); some edge cases do not pass. Both UI input and BibTeX are parsed with the same parsing method in KeywordList.

I have now started considering the old parse method for BibTeX parsing, and I will tweak it further because I feel that could be a good direction: keeping the internal model exactly as it comes from the .bib file, but showing it "nice" in the UI (UI parsing and toString()).
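
One way to picture that separation is the sketch below: the stored value stays exactly as read from the .bib file, and only the display form drops the escape character. This is an assumption about the direction described above, not the PR's code; `DisplayFormSketch` and `toDisplay` are hypothetical names.

```java
// Hypothetical sketch, not JabRef code: derives a display string from a stored
// keyword without modifying the stored value itself.
final class DisplayFormSketch {

    static String toDisplay(String storedKeyword, char delimiter) {
        // Remove the backslash only where it escapes the delimiter
        return storedKeyword.replace("\\" + delimiter, String.valueOf(delimiter));
    }
}
```

A stored keyword `Ensenada\, Baja California` would then be shown as `Ensenada, Baja California`, while the escaped form is what gets written back to the file.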

Feel free to course-correct or share any observations. I am more familiar now with many JabRef concepts, but I am probably missing details or specifics such as the different formats.

jabref-machine · 2025-08-31T21:51:10Z

Note that your PR will not be reviewed/accepted until you have gone through the mandatory checks in the description and marked each of them exactly in the format of [x] (done), [ ] (not done yet) or [/] (not applicable).

trag-bot · 2025-08-31T21:51:19Z

@trag-bot didn't find any issues in the code! ✅✨

jabref-machine · 2025-08-31T21:59:11Z

JUnit tests of jablib are failing. You can see which checks are failing by locating the box "Some checks were not successful" on the pull request page. To see the test output, locate "Source Code Tests / Unit tests (pull_request)" and click on it.

You can then run these tests in IntelliJ to reproduce the failing tests locally. We offer a quick test running howto in the section Final build system checks in our setup guide.

miguel-cordoba and others added 5 commits July 24, 2025 02:24

fixes JabRef#12810: Implement escaping for keyword separators (FKA JabRef#12888) - Addresses unresolved previous reviews: JabRef#12888 (comment)

7b5505c

fixes JabRef#12810: Implement escaping for keyword separators (FKA JabRef#12888) - Addresses unresolved previous reviews: JabRef#12888 (comment) - Adds CHANGELOG.md entry

44257ad

fixes JabRef#12810: Implement escaping for keyword separators (FKA JabRef#12888) - Addresses unresolved previous reviews: JabRef#12888 (comment) - Adds CHANGELOG.md entry

e6c52dd

removes obvious comment, improves CHANGELOG message

d2ad73d

Merge branch 'main' into fix-for-issue-12810

465955a

miguel-cordoba marked this pull request as ready for review July 24, 2025 12:37

trag-bot bot reviewed Jul 24, 2025

jablib/src/main/java/org/jabref/model/entry/KeywordList.java (outdated, resolved)

trag-bot bot reviewed Jul 24, 2025

jablib/src/main/java/org/jabref/model/entry/KeywordList.java (outdated, resolved)

miguel-cordoba added 2 commits July 24, 2025 14:42

removes obvious comments

4928536

Merge remote-tracking branch 'origin/fix-for-issue-12810' into fix-for-issue-12810

2ccc24e

miguel-cordoba marked this pull request as draft July 24, 2025 12:43

miguel-cordoba marked this pull request as ready for review July 24, 2025 12:59

tackle List.of() review comment

1111794

trag-bot bot reviewed Jul 24, 2025

jablib/src/main/java/org/jabref/model/entry/KeywordList.java (outdated, resolved)

miguel-cordoba and others added 3 commits July 24, 2025 16:40

undo tackle List.of() review comment

83198c0

Merge branch 'main' into fix-for-issue-12810

cc5fa8b

Merge branch 'main' into fix-for-issue-12810

87d1ff6

koppor reviewed Jul 28, 2025

Merge branch 'main' into fix-for-issue-12810

3d2785b

koppor requested changes Jul 28, 2025

miguel-cordoba added 2 commits July 29, 2025 18:07

adds tests after review

1b455d3

adds trag-bot changes

2b85d0c

trag-bot bot reviewed Jul 29, 2025

jablib/src/test/java/org/jabref/model/entry/KeywordTest.java (outdated, resolved)

Merge branch 'main' into fix-for-issue-12810

4b31cd0

koppor reviewed Jul 29, 2025

miguel-cordoba added 2 commits July 30, 2025 13:30

- extends Keyword#toString to ensure round-trip integrity

d15f2ef

- extends unitTests

Merge remote-tracking branch 'origin/fix-for-issue-12810' into fix-for-issue-12810

d77bb5e

- removes autoscaping on keyword#toString -> KeywordListTest#roundTripPreservesStructure fails

4c5181b

koppor mentioned this pull request Aug 6, 2025

Uncaught java.lang.IndexOutOfBoundsException #13635

Closed

2 tasks

miguel-cordoba and others added 3 commits August 6, 2025 15:41

Merge branch 'main' into fix-for-issue-12810

b02aeea

Merge remote-tracking branch 'origin/fix-for-issue-12810' into fix-for-issue-12810

eef54b6

Working on new approach for parse/serielide depending on context (UI/BibTex)

ffeb78d

miguel-cordoba and others added 5 commits August 11, 2025 01:50

Merge branch 'main' into fix-for-issue-12810

1f321c2

Merge branch 'main' into fix-for-issue-12810

3e2636f

Working on new approach for parse/serielide depending on context (UI/BibTex)

634d302

Merge remote-tracking branch 'origin/fix-for-issue-12810' into fix-for-issue-12810

b2d902f

WIP: adds old KeyWordList#parse as #bibtexParse

42710fa

trag-bot bot reviewed Aug 13, 2025

jablib/src/main/java/org/jabref/model/entry/BibEntry.java (outdated, resolved)

trag-bot bot reviewed Aug 13, 2025

jablib/src/main/java/org/jabref/model/entry/BibEntry.java (outdated, resolved)

Merge branch 'main' into fix-for-issue-12810

4b2a8d8

trag-bot bot reviewed Aug 13, 2025

jablib/src/main/java/org/jabref/model/entry/KeywordList.java (resolved)

miguel-cordoba and others added 2 commits August 24, 2025 22:11

Merge branch 'main' into fix-for-issue-12810

a2031ed

WIP: adds KeywordList#bibtexSerialize (autoescapes delimiter). Needs more and better tests.

af3c50e

trag-bot bot reviewed Aug 24, 2025

WIP: extends KeywordList#parse and #bibtexSerialize

e955fd1

trag-bot bot reviewed Aug 31, 2025

Merge branch 'main' into fix-for-issue-12810

a3d69f2
