Update JDK and libraries #109

Open
lfoppiano wants to merge 1 commit into master from feature/update-libraries

Conversation

@lfoppiano
Collaborator

@lfoppiano lfoppiano commented Apr 8, 2026

With the release of Grobid 0.9.0, I took the opportunity to update some libraries and the way Grobid is called.

Summary by CodeRabbit

  • Chores

    • Upgraded Gradle build system from 7.2 to 8.10.2
    • Upgraded Java requirement to JDK 21 (LTS) for main application
    • Updated core dependencies for improved stability and compatibility
    • Modernized HTTP client implementation
  • Documentation

    • Updated installation guide to require Java JDK 21 (LTS)
    • Enhanced GROBID setup instructions with version guidance (0.8.x recommended)
    • Changed GROBID configuration parameter from grobidPath to grobidHost

Signed-off-by: Luca Foppiano <luca@foppiano.org>
@coderabbitai

coderabbitai Bot commented Apr 8, 2026

📝 Walkthrough


The PR upgrades the build system from Gradle 7.2 to 8.10.2, migrates the main project to Java 21 LTS (pubmed-glutton to Java 17), and updates the Shadow plugin to a new maintainer. Dependency versions are modernized, Apache HttpClient is replaced with JDK 11+ HttpClient in GROBID integration, and Gradle tasks are reconfigured with unified JVM argument handling via a shared addOpensArgs list.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Gradle Build System Upgrade: build.gradle, gradle/wrapper/gradle-wrapper.properties, settings.gradle | Updated Gradle wrapper to 8.10.2, migrated to Java 21 toolchain, upgraded Shadow plugin to com.gradleup.shadow:shadow-gradle-plugin:8.3.5, modernized dependencies (JUnit 4.13.2, commons-io 2.15.1, log4j-to-slf4j 2.23.1, httpclient 4.5.14), adjusted Woodstox/StAX dependencies, and introduced shared ext.addOpensArgs for unified JVM flags across all JavaExec tasks. Added Foojay toolchain resolver plugin. |
| pubmed-glutton Build Configuration: pubmed-glutton/build.gradle, pubmed-glutton/gradle/wrapper/gradle-wrapper.properties, pubmed-glutton/Readme.md | Aligned Shadow plugin and Gradle wrapper to 8.10.2, configured Java 17 toolchain (due to Jersey 1.8 / log4j 1.2.x compatibility constraints), updated task syntax, and added build-status warning documenting missing KBEnvironment / KBStagingEnvironment dependencies. |
| GROBID HTTP Client Refactoring: src/main/java/com/scienceminer/glutton/utils/grobid/GrobidClient.java, src/main/java/com/scienceminer/glutton/utils/grobid/GrobidClientAsync.java | Migrated from Apache HttpClient to JDK 11+ HttpClient with URL encoding, explicit connect/request timeouts, and extracted response parsing into private parseGrobidResponse(InputStream) method. Updated exception handling for InterruptedException and IOException. Changed async callback parameter type from Consumer<GrobidResponseStaxHandler.GrobidResponse> to Consumer<GrobidResponse>. |
| Documentation Updates: doc/Install.md | Updated Java version requirement from JDK 1.11+ to JDK 21 LTS, added Grobid version guidance (0.8.x stable, 0.7.x+ compatible), changed GROBID configuration parameter from grobidPath to grobidHost, and documented Foojay resolver behavior. |
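
To make the GROBID HTTP Client Refactoring entry above concrete, here is a minimal sketch of a form-encoded POST built with the JDK `java.net.http` API. It is not the PR's code: the class name, the timeout values, and the host `http://localhost:8070/api` are assumptions for illustration. The `citations` and `consolidateCitation` field names and the `/processCitation` path are taken from the snippets quoted later in this review.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

// Illustrative sketch only; names and timeout values are assumptions.
class GrobidFormRequestSketch {

    static final Duration CONNECT_TIMEOUT = Duration.ofSeconds(5);   // assumed value
    static final Duration REQUEST_TIMEOUT = Duration.ofSeconds(60);  // assumed value

    // The connect timeout is configured once, on the shared client.
    static HttpClient newClient() {
        return HttpClient.newBuilder().connectTimeout(CONNECT_TIMEOUT).build();
    }

    // URL-encode each form value so characters such as '&' cannot break the body.
    static String formBody(String rawCitation, String consolidation) {
        return "citations=" + URLEncoder.encode(rawCitation, StandardCharsets.UTF_8)
                + "&consolidateCitation=" + URLEncoder.encode(consolidation, StandardCharsets.UTF_8);
    }

    // The request timeout is configured per request.
    static HttpRequest processCitationRequest(String grobidHost, String rawCitation, String consolidation) {
        return HttpRequest.newBuilder(URI.create(grobidHost + "/processCitation"))
                .timeout(REQUEST_TIMEOUT)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(
                        formBody(rawCitation, consolidation), StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        // Builds the request without sending it, so no GROBID server is needed.
        HttpRequest request = processCitationRequest(
                "http://localhost:8070/api", "Smith & Jones, 2020", "1");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(formBody("Smith & Jones, 2020", "1"));
    }
}
```

Sending would then be a call such as `newClient().send(request, HttpResponse.BodyHandlers.ofInputStream())`, which can throw `IOException` and `InterruptedException`, the two exceptions the summary says the refactoring now handles.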

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Gradle hops forward, from seven to eight,
Java twenty-one now sets the state,
HttpClient's new jig replaces the old,
Dependencies aligned, a build story retold! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Update JDK and libraries' accurately reflects the main changes across the pull request: upgrading Java toolchains, Gradle versions, and multiple dependencies. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 6

🧹 Nitpick comments (1)
pubmed-glutton/build.gradle (1)

63-74: Consider prioritizing dependency updates for security.

Several dependencies have known CVEs or are EOL:

  • log4j:1.2.17 - EOL, multiple CVEs
  • jackson-*:2.9.0.pr3 - Pre-release from 2017, known CVEs
  • xstream:1.4.8 - Multiple CVEs

This is documented as a tracked follow-up in pubmed-glutton/Readme.md, but given the severity of some vulnerabilities (especially log4j and Jackson), consider prioritizing this modernization.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pubmed-glutton/build.gradle` around lines 63 - 74, The dependency block pins
insecure/outdated libraries (log4j:1.2.17, jackson-*:2.9.0.pr3, xstream:1.4.8,
etc.); update these artifacts in the build.gradle implementation declarations to
secure supported releases (replace log4j 1.x with Log4j2 or a maintained logging
backend, upgrade
jackson-core/jackson-databind/jackson-annotations/jackson-dataformat-yaml to a
current patched 2.x release, and bump xstream to the latest secure version), run
the test suite and dependency vulnerability scan, and update any code/config
that needs API changes and the project documentation to reflect the new
versions.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ad4996b4-07ad-445d-98d7-b1d3741bcf7f

📥 Commits

Reviewing files that changed from the base of the PR and between acda234 and c2eee7a.

📒 Files selected for processing (9)
  • build.gradle
  • doc/Install.md
  • gradle/wrapper/gradle-wrapper.properties
  • pubmed-glutton/Readme.md
  • pubmed-glutton/build.gradle
  • pubmed-glutton/gradle/wrapper/gradle-wrapper.properties
  • settings.gradle
  • src/main/java/com/scienceminer/glutton/utils/grobid/GrobidClient.java
  • src/main/java/com/scienceminer/glutton/utils/grobid/GrobidClientAsync.java

Comment thread build.gradle
@@ -80,19 +85,31 @@ dependencies {

implementation 'jakarta.validation:jakarta.validation-api:2.0.2'

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check what validation API version Dropwizard 4 brings transitively

# Search for validation API usage in the codebase
echo "=== Validation imports in codebase ==="
rg -n "import.*validation" --type java | head -30

Repository: kermitt2/biblio-glutton

Length of output: 345


🌐 Web query:

What jakarta.validation-api version does Dropwizard 4.0.0 require?

💡 Result:

Dropwizard 4.0.0 (via its dropwizard-dependencies BOM) pins jakarta.validation:jakarta.validation-api to version 3.0.2. (central.sonatype.com)


Remove explicit jakarta.validation-api version pin to align with Dropwizard 4.

The explicit pin to jakarta.validation-api:2.0.2 overrides Dropwizard 4.0.0's managed version 3.0.2. While version 2.0.2 does use the Jakarta namespace, Dropwizard 4.x manages this dependency explicitly and pinning to an older version may introduce compatibility issues. Remove this explicit dependency and let Dropwizard manage the validation API version, or upgrade to 3.0.2 if an explicit version is required.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@build.gradle` at line 86, Remove the explicit dependency pin for
jakarta.validation-api:2.0.2 in the build.gradle by deleting or replacing the
implementation entry for 'jakarta.validation:jakarta.validation-api:2.0.2' so
the project uses the Dropwizard 4-managed version (3.0.2); if you must declare
it explicitly, update the coordinate to
'jakarta.validation:jakarta.validation-api:3.0.2' to match Dropwizard 4's
managed version.

Comment thread doc/Install.md
* First download and install GROBID as indicated in the [documentation](https://grobid.readthedocs.io/en/latest/Install-Grobid/), normally as a docker image to take advantage of Deep Learning models for more accurate parsing of bibliographical references. **Recommended Grobid version: the latest stable 0.8.x release** (see [Grobid releases](https://github.com/kermitt2/grobid/releases)). biblio-glutton communicates with Grobid only via HTTP (the `/api/isalive` and `/api/processCitation` endpoints), so any Grobid 0.7.x or later release is API-compatible.

* Start the service as documented [here](https://grobid.readthedocs.io/en/latest/Grobid-service/). You can change the `port` used by GROBID when strating the docker container, or by updating the service config file under `grobid/grobid-home/config/grobid.yaml`.

⚠️ Potential issue | 🟡 Minor

Typo: "strating" should be "starting".

📝 Fix typo
-* Start the service as documented [here](https://grobid.readthedocs.io/en/latest/Grobid-service/). You can change the `port` used by GROBID when strating the docker container, or by updating the service config file under `grobid/grobid-home/config/grobid.yaml`.
+* Start the service as documented [here](https://grobid.readthedocs.io/en/latest/Grobid-service/). You can change the `port` used by GROBID when starting the docker container, or by updating the service config file under `grobid/grobid-home/config/grobid.yaml`.
🧰 Tools
🪛 LanguageTool

[grammar] ~79-~79: Ensure spelling is correct
Context: ...n change the port used by GROBID when strating the docker container, or by updating th...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@doc/Install.md` at line 79, Typo in the documentation: replace the misspelled
word "strating" with "starting" in the sentence that mentions changing the
`port` used by GROBID; edit the Install.md line containing "You can change the
`port` used by GROBID when strating the docker container" to read "starting the
docker container" so the sentence is correct.

Comment thread settings.gradle
Comment on lines +1 to +3
plugins {
    id 'org.gradle.toolchains.foojay-resolver-convention' version '0.8.0'
}

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

What is the latest version of org.gradle.toolchains.foojay-resolver-convention plugin?

💡 Result:

The latest version of the org.gradle.toolchains.foojay-resolver-convention Gradle plugin is 1.0.0 (released May 19, 2025). [1]

Usage:

plugins {
  id("org.gradle.toolchains.foojay-resolver-convention") version "1.0.0"
}

Upgrade Foojay resolver plugin to version 1.0.0.

Good addition—the Foojay toolchain resolver enables automatic JDK provisioning. However, version 0.8.0 is outdated; the latest stable version is 1.0.0 (released May 2025). Update to:

plugins {
    id 'org.gradle.toolchains.foojay-resolver-convention' version '1.0.0'
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@settings.gradle` around lines 1 - 3, Update the Foojay toolchain resolver
plugin version used in the Gradle build: locate the plugins block that applies
id 'org.gradle.toolchains.foojay-resolver-convention' and change its version
from '0.8.0' to '1.0.0' so the plugins declaration reads the newer stable
release.

Comment thread src/main/java/com/scienceminer/glutton/utils/grobid/GrobidClient.java
Comment on lines +96 to 105
private GrobidResponse parseGrobidResponse(InputStream body) throws ServiceException {
    try {
        XMLStreamReader2 reader = (XMLStreamReader2) inputFactory.createXMLStreamReader(body);
        GrobidResponseStaxHandler handler = new GrobidResponseStaxHandler();
        StaxUtils.traverse(reader, handler);
        return handler.getResponse();
    } catch (XMLStreamException e) {
        throw new ServiceException(502, "Cannot parse the response from GROBID", e);
    }
}

⚠️ Potential issue | 🟡 Minor

XMLStreamReader is not closed after parsing.

The XMLStreamReader2 created at line 98 is never explicitly closed. The caller closes the underlying InputStream (lines 85-87), but the XMLStreamReader holds its own parser resources that should be released. Per the relevant snippet in StaxUtils.java, traverse() does not close the reader.

🛡️ Proposed fix to close the reader
 private GrobidResponse parseGrobidResponse(InputStream body) throws ServiceException {
+    XMLStreamReader2 reader = null;
     try {
-        XMLStreamReader2 reader = (XMLStreamReader2) inputFactory.createXMLStreamReader(body);
+        reader = (XMLStreamReader2) inputFactory.createXMLStreamReader(body);
         GrobidResponseStaxHandler handler = new GrobidResponseStaxHandler();
         StaxUtils.traverse(reader, handler);
         return handler.getResponse();
     } catch (XMLStreamException e) {
         throw new ServiceException(502, "Cannot parse the response from GROBID", e);
+    } finally {
+        if (reader != null) {
+            try {
+                reader.close();
+            } catch (XMLStreamException ignored) {
+                // Best-effort cleanup
+            }
+        }
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/com/scienceminer/glutton/utils/grobid/GrobidClient.java` around
lines 96 - 105, parseGrobidResponse creates an XMLStreamReader2 via inputFactory
and calls StaxUtils.traverse(reader, handler) but never closes the reader;
update parseGrobidResponse to ensure the XMLStreamReader2 (reader) is closed
after parsing with a try/finally that calls reader.close() (XMLStreamReader does
not implement AutoCloseable, so try-with-resources is not an option), preserving
the existing ServiceException handling (wrap XMLStreamException as before) and
still returning handler.getResponse(); reference symbols: parseGrobidResponse,
XMLStreamReader2, inputFactory, StaxUtils.traverse, GrobidResponseStaxHandler.
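
The close-in-finally pattern can be tried in isolation with the JDK's own StAX implementation. The sketch below is an assumption-laden stand-in, not the project's code: it uses `javax.xml.stream.XMLStreamReader` instead of Woodstox's `XMLStreamReader2`, a hypothetical `countElements` method instead of `StaxUtils.traverse`, and `IllegalStateException` instead of `ServiceException`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Illustrative sketch only; class and method names are hypothetical.
class StaxCloseSketch {

    private static final XMLInputFactory INPUT_FACTORY = XMLInputFactory.newInstance();

    // Counts start elements, standing in for a handler-based traversal.
    static int countElements(InputStream body) {
        XMLStreamReader reader = null;
        try {
            reader = INPUT_FACTORY.createXMLStreamReader(body);
            int count = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    count++;
                }
            }
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Cannot parse the response", e);
        } finally {
            // XMLStreamReader is not AutoCloseable, hence the explicit finally block.
            if (reader != null) {
                try {
                    reader.close();
                } catch (XMLStreamException ignored) {
                    // Best-effort cleanup
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] xml = "<biblStruct><title>A</title><date>2020</date></biblStruct>"
                .getBytes(StandardCharsets.UTF_8);
        // reader.close() does not close the stream, so the caller still owns it.
        try (InputStream body = new ByteArrayInputStream(xml)) {
            System.out.println(countElements(body)); // prints 3
        }
    }
}
```

Note that `XMLStreamReader.close()` releases parser resources but is documented not to close the underlying input source, which is why the stream is still managed separately by the caller.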

Comment thread src/main/java/com/scienceminer/glutton/utils/grobid/GrobidClientAsync.java
Comment on lines +77 to 92
+        httpClient.sendAsync(request, HttpResponse.BodyHandlers.ofInputStream())
+                .thenAccept(response -> {
+                    if (response.statusCode() != HttpURLConnection.HTTP_OK) {
+                        throw new ServiceException(502, "Error while connecting to GROBID service. Error code: " + response.statusCode());
                     }
-                }
-            }
-
-            @Override
-            public void failed(Exception ex) {
-                throw new ServiceException(502, "Cannot parse the response from GROBID", ex);
-            }
-
-            @Override
-            public void cancelled() {
-                throw new ServiceException(502, "Cannot parse the response from GROBID");
-            }
-        });
-
+                    try (InputStream body = response.body()) {
+                        callback.accept(parseGrobidResponse(body));
+                    } catch (IOException e) {
+                        throw new ServiceException(502, "Cannot read the response from GROBID", e);
+                    }
+                })
+                .exceptionally(ex -> {
+                    LOGGER.warn("Async GROBID call failed", ex);
+                    throw new ServiceException(502, "Async GROBID call failed", ex);
+                });
     }

⚠️ Potential issue | 🟠 Major

Async error handling: thrown exception is lost since the CompletableFuture isn't returned.

The exceptionally() stage throws ServiceException, but since the CompletableFuture isn't returned or awaited, the caller has no way to observe failures. The exception gets wrapped in a CompletionException and effectively goes nowhere.

Consider either:

  1. Return the CompletableFuture<Void> so callers can handle errors, or
  2. Pass an error callback alongside the success callback.
🔧 Option 1: Return the future
-    public void processCitation(String rawCitation, String consolidation, Consumer<GrobidResponse> callback) throws ServiceException {
+    public CompletableFuture<Void> processCitation(String rawCitation, String consolidation, Consumer<GrobidResponse> callback) {
         String formBody = "citations=" + URLEncoder.encode(rawCitation, StandardCharsets.UTF_8)
                 + "&consolidateCitation=" + URLEncoder.encode(consolidation, StandardCharsets.UTF_8);

         HttpRequest request = HttpRequest.newBuilder(URI.create(grobidPath + "/processCitation"))
                 .timeout(REQUEST_TIMEOUT)
                 .header("Content-Type", "application/x-www-form-urlencoded")
                 .POST(HttpRequest.BodyPublishers.ofString(formBody, StandardCharsets.UTF_8))
                 .build();

-        httpClient.sendAsync(request, HttpResponse.BodyHandlers.ofInputStream())
+        return httpClient.sendAsync(request, HttpResponse.BodyHandlers.ofInputStream())
                 .thenAccept(response -> {
                     // ... existing logic
                 })
                 .exceptionally(ex -> {
                     LOGGER.warn("Async GROBID call failed", ex);
-                    throw new ServiceException(502, "Async GROBID call failed", ex);
+                    return null; // Or rethrow to propagate to caller
                 });
     }
🔧 Option 2: Add error callback
-    public void processCitation(String rawCitation, String consolidation, Consumer<GrobidResponse> callback) throws ServiceException {
+    public void processCitation(String rawCitation, String consolidation, 
+                                Consumer<GrobidResponse> onSuccess, 
+                                Consumer<Throwable> onError) {
         // ... build request ...
         
         httpClient.sendAsync(request, HttpResponse.BodyHandlers.ofInputStream())
                 .thenAccept(response -> {
                     // ... call onSuccess ...
                 })
                 .exceptionally(ex -> {
                     LOGGER.warn("Async GROBID call failed", ex);
+                    onError.accept(ex);
                     return null;
                 });
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/com/scienceminer/glutton/utils/grobid/GrobidClientAsync.java`
around lines 77 - 92, The async branch currently throws exceptions inside the
CompletableFuture chain but the method doesn't return that future, so callers
cannot observe failures; change the method in GrobidClientAsync to return the
CompletableFuture<Void> produced by the
httpClient.sendAsync(...).thenAccept(...).exceptionally(...) chain (i.e., update
the method signature to return CompletableFuture<Void> and return the result of
the httpClient.sendAsync call) so that any ServiceException or
CompletionException propagates to callers for handling; ensure you reference the
httpClient.sendAsync(...) -> thenAccept(response -> { ... }) ->
exceptionally(...) chain when making the change.
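
The difference between dropping and returning the future can be seen without a GROBID server. The following is a minimal sketch under stated assumptions: `fakeSend` is a hypothetical stand-in for `httpClient.sendAsync(...)`, `process` stands in for `processCitation`, and `IllegalStateException` replaces `ServiceException`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.Consumer;

// Illustrative sketch only; all names here are hypothetical.
class AsyncFailureSketch {

    // Stand-in for httpClient.sendAsync(...): completes with a status code.
    static CompletableFuture<Integer> fakeSend(int status) {
        return CompletableFuture.supplyAsync(() -> status);
    }

    // Returning the chained future lets the caller observe failures.
    static CompletableFuture<Void> process(int status, Consumer<String> callback) {
        return fakeSend(status).thenAccept(code -> {
            if (code != 200) {
                throw new IllegalStateException("Error code: " + code);
            }
            callback.accept("parsed response");
        });
    }

    public static void main(String[] args) {
        // Success path: the callback runs and join() returns normally.
        process(200, System.out::println).join();

        // Failure path: the exception thrown inside thenAccept reaches the
        // caller wrapped in a CompletionException.
        try {
            process(503, System.out::println).join();
        } catch (CompletionException e) {
            System.out.println("caller saw: " + e.getCause().getMessage());
        }
    }
}
```

If `process` returned `void` instead, the second call would fail silently, which is the situation the review comment describes. Callers that prefer not to block can attach `whenComplete` or `exceptionally` to the returned future rather than calling `join()`.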

Comment on lines +94 to +103
private GrobidResponse parseGrobidResponse(InputStream body) throws ServiceException {
    try {
        XMLStreamReader2 reader = (XMLStreamReader2) inputFactory.createXMLStreamReader(body);
        GrobidResponseStaxHandler handler = new GrobidResponseStaxHandler();
        StaxUtils.traverse(reader, handler);
        return handler.getResponse();
    } catch (XMLStreamException e) {
        throw new ServiceException(502, "Cannot parse the response from GROBID", e);
    }
}

⚠️ Potential issue | 🟡 Minor

XMLStreamReader is not closed after parsing.

Same issue as in GrobidClient.java: the XMLStreamReader2 is never explicitly closed after traversal.

🛡️ Proposed fix to close the reader
 private GrobidResponse parseGrobidResponse(InputStream body) throws ServiceException {
+    XMLStreamReader2 reader = null;
     try {
-        XMLStreamReader2 reader = (XMLStreamReader2) inputFactory.createXMLStreamReader(body);
+        reader = (XMLStreamReader2) inputFactory.createXMLStreamReader(body);
         GrobidResponseStaxHandler handler = new GrobidResponseStaxHandler();
         StaxUtils.traverse(reader, handler);
         return handler.getResponse();
     } catch (XMLStreamException e) {
         throw new ServiceException(502, "Cannot parse the response from GROBID", e);
+    } finally {
+        if (reader != null) {
+            try {
+                reader.close();
+            } catch (XMLStreamException ignored) {
+                // Best-effort cleanup
+            }
+        }
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/com/scienceminer/glutton/utils/grobid/GrobidClientAsync.java`
around lines 94 - 103, In parseGrobidResponse, the XMLStreamReader2 created by
inputFactory.createXMLStreamReader(body) is never closed; update
parseGrobidResponse to ensure the XMLStreamReader2 (reader) is closed in a
finally block (or equivalent) after StaxUtils.traverse completes — e.g., assign
the reader to a variable before traversing, call handler.getResponse(), and in
finally call reader.close() (catching any XMLStreamException from close and
wrapping or logging as a ServiceException) so resources are always released;
reference symbols: parseGrobidResponse, XMLStreamReader2 reader,
inputFactory.createXMLStreamReader, StaxUtils.traverse,
GrobidResponseStaxHandler.
