jonathan-buttner
Contributor

This PR adds functionality to handle the document field as either an object or a string.

Originally we only accepted the response format as:

{
     "id": "1983d114-a6e8-4940-b121-eb4ac3f6f703",
     "results": [
          {
               "document": {
                    "text": "Washington, D.C.  is the capital of the United States. It is a federal district."
               },
               "index": 2,
               "relevance_score": 0.98005307
          },
          {
               "document": {
                    "text": "abc."
               },
               "index": 3,
               "relevance_score": 0.27904198
          },
          {
               "document": {
                    "text": "Carson City is the capital city of the American state of Nevada."
               },
               "index": 0,
               "relevance_score": 0.10194652
          }
     ],
     "usage": {
          "total_tokens": 15
     }
}

Now we also support document as a string:

{
     "id": "1983d114-a6e8-4940-b121-eb4ac3f6f703",
     "results": [
          {
               "document": "Washington, D.C.  is the capital of the United States. It is a federal district.",
               "index": 2,
               "relevance_score": 0.98005307
          },
          {
               "document":  "abc",
               "index": 3,
               "relevance_score": 0.27904198
          },
          {
               "document": "Carson City is the capital city of the American state of Nevada.",
               "index": 0,
               "relevance_score": 0.10194652
          }
     ],
     "usage": {
          "total_tokens": 15
     }
}
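
As a rough illustration of the two shapes above, here is a minimal sketch of normalizing a `document` value that has already been decoded into plain Java objects. It is not the parser code from this PR (which works on the streaming parser); `DocumentFieldSketch` and `extractText` are hypothetical names used only for this example.

```java
import java.util.Map;

// Hypothetical helper, not the PR's implementation: normalizes a decoded
// "document" value that is either a plain string or an object with "text".
final class DocumentFieldSketch {

    static String extractText(Object document) {
        if (document == null) {
            return null; // the document field may be absent from a result
        }
        if (document instanceof String) {
            return (String) document; // string form: "document": "abc"
        }
        if (document instanceof Map) {
            // object form: "document": { "text": "abc." }
            Object text = ((Map<?, ?>) document).get("text");
            if (text instanceof String) {
                return (String) text;
            }
            throw new IllegalArgumentException("[document.text] must be a string");
        }
        throw new IllegalArgumentException(
            "[document] must be a string or an object, got " + document.getClass().getSimpleName()
        );
    }

    public static void main(String[] args) {
        System.out.println(extractText(Map.of("text", "abc."))); // prints "abc."
        System.out.println(extractText("abc"));                  // prints "abc"
    }
}
```

Both response shapes end up as the same string, so callers would not need to know which form the service returned.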

jonathan-buttner added the >bug, :ml (Machine learning), and Team:ML (Meta label for the ML team) labels Oct 17, 2025
@elasticsearchmachine
Collaborator

Hi @jonathan-buttner, I've created a changelog YAML for you.

jonathan-buttner added the v9.2.0, v8.19.6, v9.1.6, v9.2.1, and auto-backport (Automatically create backport pull requests when merged) labels Oct 17, 2025
@jonathan-buttner jonathan-buttner marked this pull request as ready for review October 17, 2025 15:48
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

Contributor

@DonalEvans DonalEvans left a comment

Just some optional formatting changes.

* @return the parsed response
* @throws IOException if there is an error parsing the response
*/
public static InferenceServiceResults fromResponse(HttpResult response) throws IOException {
Contributor

Would it be worth updating the javadoc for this method to include the new behaviour? If you do decide to update it, putting <pre> tags around the JSON parts would be helpful, to make the javadoc more readable in the IDE. Right now, when mousing over the method name to show the javadoc, the JSON is formatted all on one line, which is very difficult to read.
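
For reference, a minimal sketch of what the `<pre>` suggestion could look like. The class and method here (`JavadocPreSketch`, `isSupportedShape`) are hypothetical and exist only to carry the Javadoc; the real `fromResponse` comment may be worded differently.

```java
import java.util.Map;

final class JavadocPreSketch {

    /**
     * Returns whether a decoded {@code document} value has one of the two supported shapes.
     *
     * <p>Object form:
     * <pre>{@code
     * {
     *   "document": { "text": "abc." },
     *   "index": 3,
     *   "relevance_score": 0.27904198
     * }
     * }</pre>
     *
     * <p>String form:
     * <pre>{@code
     * {
     *   "document": "abc",
     *   "index": 3,
     *   "relevance_score": 0.27904198
     * }
     * }</pre>
     *
     * @param document the decoded value of the {@code document} field
     * @return {@code true} if the value is a string or an object
     */
    static boolean isSupportedShape(Object document) {
        return document instanceof String || document instanceof Map;
    }
}
```

Wrapping the JSON in `<pre>{@code ... }</pre>` keeps the line breaks and indentation when the IDE renders the Javadoc, and avoids having to escape the quotes and braces.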

private record Response(List<ResultItem> results) {
@SuppressWarnings("unchecked")
public static final ConstructingObjectParser<Response, Void> PARSER = new ConstructingObjectParser<>(
Response.class.getSimpleName(),
Contributor

Using getSimpleName() here will result in any errors during parsing referencing Response which is pretty vague if we need to use it for debugging. Using any of the other "name" methods would be too verbose, but is there some way we could include the name of the parent class? Or is that overkill?

Contributor Author

Hmm, typically we just do the getSimpleName(). I think the error will have the stacktrace though which should get us to the code that is failing 🤔
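
If including the parent class name were ever wanted, one possible approach is to walk `getEnclosingClass()` and join the simple names. This is only a sketch of that idea under assumed names: `ParserNameSketch` and `qualifiedSimpleName` are hypothetical, and the nested `Response` class is a stand-in for the record in the diff.

```java
final class ParserNameSketch {

    // Stand-in for the nested response type; the real parent class is not shown in this thread.
    static final class Response {}

    // Joins the simple names of a nested type and all of its enclosing classes,
    // e.g. "Outer.Inner", without the package prefix that getName() would add.
    static String qualifiedSimpleName(Class<?> type) {
        Class<?> enclosing = type.getEnclosingClass();
        return enclosing == null
            ? type.getSimpleName()
            : qualifiedSimpleName(enclosing) + "." + type.getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(Response.class.getSimpleName());      // prints "Response"
        System.out.println(qualifiedSimpleName(Response.class)); // prints "ParserNameSketch.Response"
    }
}
```

This stays shorter than `getName()` or `getCanonicalName()`, which both include the package, while still pointing at the owning class in a parse error.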

} else {
parser.nextToken();
}
private record ResultItem(int index, float relevance_score, @Nullable Document document) {
Contributor

Nitpick, but relevance_score should probably be relevanceScore for code style consistency.
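
A small sketch of how the rename can stay independent of the wire format: the record component uses camelCase while the JSON key stays `relevance_score`. This uses a plain map instead of the actual parser, and `ResultItemSketch`, `fromMap`, and the field-name constants are hypothetical names for illustration.

```java
import java.util.Map;

// Hypothetical record: the component is camelCase, the JSON key is kept in a constant.
record ResultItemSketch(int index, float relevanceScore, String document) {

    static final String INDEX_FIELD = "index";
    static final String RELEVANCE_SCORE_FIELD = "relevance_score";
    static final String DOCUMENT_FIELD = "document";

    // Builds an item from an already decoded result object (string-form document only).
    static ResultItemSketch fromMap(Map<String, Object> fields) {
        int index = ((Number) fields.get(INDEX_FIELD)).intValue();
        float relevanceScore = ((Number) fields.get(RELEVANCE_SCORE_FIELD)).floatValue();
        Object document = fields.get(DOCUMENT_FIELD);
        return new ResultItemSketch(index, relevanceScore, document instanceof String ? (String) document : null);
    }
}
```

Only the Java identifier changes; the key looked up in the response is still `relevance_score`.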

Comment on lines 109 to 118
private final String WASHINGTON_TEXT = "Washington, D.C..";
private final String CAPITAL_PUNISHMENT_TEXT =
"Capital punishment has existed in the United States since before the United States was a country. ";
private final String CARSON_CITY_TEXT = "Carson City is the capital city of the American state of Nevada.";

private final List<RankedDocsResults.RankedDoc> responseLiteralDocsWithText = List.of(
new RankedDocsResults.RankedDoc(2, 0.98005307F, WASHINGTON_TEXT),
new RankedDocsResults.RankedDoc(3, 0.27904198F, CAPITAL_PUNISHMENT_TEXT),
new RankedDocsResults.RankedDoc(0, 0.10194652F, CARSON_CITY_TEXT)
);
Contributor

Could these constants be moved to the top of the class and made static? Also, if you do make them all static, for style consistency, responseLiteralDocsWithText should be in all caps.
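
A sketch of what that suggestion might look like, with the values copied from the snippet above. `RerankTestConstantsSketch` and the nested `RankedDoc` record are hypothetical stand-ins so the example compiles on its own; the real test would keep using `RankedDocsResults.RankedDoc`.

```java
import java.util.List;

final class RerankTestConstantsSketch {

    // Hypothetical stand-in for RankedDocsResults.RankedDoc.
    record RankedDoc(int index, float relevanceScore, String text) {}

    // Declared once at the top of the class, static and in upper snake case.
    private static final String WASHINGTON_TEXT = "Washington, D.C..";
    private static final String CAPITAL_PUNISHMENT_TEXT =
        "Capital punishment has existed in the United States since before the United States was a country. ";
    private static final String CARSON_CITY_TEXT = "Carson City is the capital city of the American state of Nevada.";

    static final List<RankedDoc> RESPONSE_LITERAL_DOCS_WITH_TEXT = List.of(
        new RankedDoc(2, 0.98005307F, WASHINGTON_TEXT),
        new RankedDoc(3, 0.27904198F, CAPITAL_PUNISHMENT_TEXT),
        new RankedDoc(0, 0.10194652F, CARSON_CITY_TEXT)
    );
}
```

Since `List.of` returns an immutable list, sharing it as a static constant across test methods should be safe.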

@jonathan-buttner jonathan-buttner enabled auto-merge (squash) October 17, 2025 17:21
@jonathan-buttner jonathan-buttner merged commit a1d7b8a into elastic:main Oct 17, 2025
34 checks passed
jonathan-buttner added a commit to jonathan-buttner/elasticsearch that referenced this pull request Oct 17, 2025
… string or object (elastic#136751)

* Fixing rerank response parser

* Update docs/changelog/136751.yaml

* [CI] Auto commit changes from spotless

* Addressing feedback

* Updating comment about response format

---------

Co-authored-by: elasticsearchmachine <[email protected]>
@elasticsearchmachine
Collaborator

💚 Backport successful

Status Branch Result
9.2
9.1

elasticsearchmachine pushed a commit that referenced this pull request Oct 17, 2025
… string or object (#136751) (#136762)

* Fixing rerank response parser

* Update docs/changelog/136751.yaml

* [CI] Auto commit changes from spotless

* Addressing feedback

* Updating comment about response format

---------

Co-authored-by: elasticsearchmachine <[email protected]>
elasticsearchmachine pushed a commit that referenced this pull request Oct 17, 2025
… string or object (#136751) (#136763)

* Fixing rerank response parser

* Update docs/changelog/136751.yaml

* [CI] Auto commit changes from spotless

* Addressing feedback

* Updating comment about response format

---------

Co-authored-by: elasticsearchmachine <[email protected]>
@jonathan-buttner jonathan-buttner deleted the ml-jinaai-rerank-fix branch October 17, 2025 19:15
jonathan-buttner added a commit to jonathan-buttner/elasticsearch that referenced this pull request Oct 17, 2025
… string or object (elastic#136751)

* Fixing rerank response parser

* Update docs/changelog/136751.yaml

* [CI] Auto commit changes from spotless

* Addressing feedback

* Updating comment about response format

---------

Co-authored-by: elasticsearchmachine <[email protected]>
(cherry picked from commit a1d7b8a)
@jonathan-buttner
Contributor Author

💚 All backports created successfully

Status Branch Result
8.19

Questions ?

Please refer to the Backport tool documentation

elasticsearchmachine pushed a commit that referenced this pull request Oct 17, 2025
… string or object (#136751) (#136765)

* Fixing rerank response parser

* Update docs/changelog/136751.yaml

* [CI] Auto commit changes from spotless

* Addressing feedback

* Updating comment about response format

---------


(cherry picked from commit a1d7b8a)

Co-authored-by: elasticsearchmachine <[email protected]>

Labels

auto-backport Automatically create backport pull requests when merged >bug :ml Machine learning Team:ML Meta label for the ML team v8.19.6 v9.1.6 v9.2.0 v9.2.1 v9.3.0

3 participants