@jonathan-buttner (Contributor) commented Jun 26, 2025

This PR addresses the feedback item from the original PR: #127939

This PR adds support for embedding_type so users can specify the format of the text embeddings returned by the upstream service.

Supported embedding types are:

  • float (default if not included)
  • byte
  • bit (or binary)
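
To make the mapping in the list above concrete, here is a minimal sketch of how a configured string could be resolved to an embedding type. The enum `EmbeddingTypeSketch` and its `fromString` method are hypothetical names chosen for illustration, not the classes used in this PR, and the exact-match comparison is an assumption.

```java
// Hypothetical enum for illustration only; not the type used in the PR.
enum EmbeddingTypeSketch {
    FLOAT, BYTE, BIT;

    // Resolves the configured string to an embedding type.
    // A missing value falls back to FLOAT, and "binary" is treated as an alias for "bit".
    // Exact, case-sensitive matching is an assumption of this sketch.
    static EmbeddingTypeSketch fromString(String value) {
        if (value == null) {
            return FLOAT;
        }
        switch (value) {
            case "float":
                return FLOAT;
            case "byte":
                return BYTE;
            case "bit":
            case "binary":
                return BIT;
            default:
                throw new IllegalArgumentException("Unknown embedding_type [" + value + "]");
        }
    }
}
```

With this sketch, `fromString("binary")` and `fromString("bit")` resolve to the same constant, and an absent setting resolves to `FLOAT`.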

Comments:
#127939 (comment)

Example

PUT _inference/text_embedding/test-text-embedding
{
    "service": "custom",
    "service_settings": {
        "secret_parameters": {
            "api_key": "<api key>"
        },
        "url": "https://api.cohere.com/v2/embed",
        "headers": {
            "Authorization": "bearer ${api_key}",
            "Content-Type": "application/json"
        },
        "request": "{\"texts\": ${input}, \"model\": \"embed-v4.0\", \"input_type\": ${input_type}, \"embedding_types\": [\"binary\"]}",
        "response": {
            "json_parser": {
                "text_embeddings":"$.embeddings.binary[*]",
                "embedding_type": "binary"
            }
        },
        "input_type": {
            "translation": {
                "ingest": "search_document",
                "search": "search_query"
            },
            "default": "search_document"
        }
    }
}

POST _inference/text_embedding/test-text-embedding
{
    "input": ["The quick brown fox jumps over the lazy dog", "awesome"]
}

@jonathan-buttner added the >non-issue, :ml (Machine learning), Team:ML (Meta label for the ML team), and v9.2.0 labels Jun 26, 2025
@jonathan-buttner marked this pull request as ready for review June 27, 2025 14:57
@elasticsearchmachine (Collaborator) commented:

Pinging @elastic/ml-core (Team:ML)

Map<String, Object> map,
ConfigurationParseContext context,
TaskType taskType,
String inferenceId
@jonathan-buttner (Contributor, Author) commented:

inferenceId wasn't being used.

SimilarityMeasure similarity = extractSimilarity(map, ModelConfigurations.SERVICE_SETTINGS, validationException);
Integer dims = removeAsType(map, DIMENSIONS, Integer.class);
Integer maxInputTokens = removeAsType(map, MAX_INPUT_TOKENS, Integer.class);
return new TextEmbeddingSettings(similarity, dims, maxInputTokens, DenseVectorFieldMapper.ElementType.FLOAT);
@jonathan-buttner (Contributor, Author) commented:

The element type logic has been delegated to the TextEmbeddingResponseParser, so I'm removing the hard-coded float here.


if (in.getTransportVersion().before(TransportVersions.ML_INFERENCE_CUSTOM_SERVICE_EMBEDDING_TYPE)
    && in.getTransportVersion().isPatchFrom(TransportVersions.ML_INFERENCE_CUSTOM_SERVICE_EMBEDDING_TYPE_8_19) == false) {
    in.readOptionalEnum(DenseVectorFieldMapper.ElementType.class);
@jonathan-buttner (Contributor, Author) commented:

For older versions, we'll read the value but ignore it. It should only ever be float, which is what the TextEmbeddingResponseParser defaults to.

}
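
The read-and-ignore pattern discussed here can be sketched in isolation as follows. This is only an illustration under stated assumptions: the class `LegacyFieldSkipSketch`, the integer version numbers, and the wire layout (a presence flag, an ordinal, then a dimensions value) are all hypothetical, it uses plain `java.io` streams rather than the Elasticsearch stream classes, and it leaves out the patch-version check that the real condition includes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch; the wire layout and version numbers are assumptions, not the PR's.
final class LegacyFieldSkipSketch {
    // Assumed version from which writers no longer send the element type.
    static final int FIELD_REMOVED_VERSION = 2;

    // Encodes a payload the way a writer at the given version would in this sketch.
    static byte[] write(int version, int dimensions) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            if (version < FIELD_REMOVED_VERSION) {
                out.writeBoolean(true); // presence flag of the optional legacy field
                out.writeInt(0);        // ordinal standing in for "float"
            }
            out.writeInt(dimensions);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the payload, consuming and discarding the legacy field when the sender is old.
    static int readDimensions(int version, byte[] payload) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload));
            if (version < FIELD_REMOVED_VERSION) {
                if (in.readBoolean()) {
                    in.readInt(); // read to keep the stream aligned, then ignore
                }
            }
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the sketch is that the legacy bytes still have to be consumed so that the fields after them line up, even though the value itself is no longer used.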

- protected abstract T transform(Map<String, Object> extractedField);
+ protected abstract InferenceServiceResults transform(Map<String, Object> extractedField);
@jonathan-buttner (Contributor, Author) commented:

The TextEmbeddingResponseParser will now return different types of results depending on the embedding type, so we need this to be the base results type to cover all of them.
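
As a rough illustration of why the return type is widened, the sketch below has one parser return either float or byte results behind a shared base type. Every type here (`ResultsSketch`, `FloatResultsSketch`, `ByteResultsSketch`, `BaseParserSketch`, `EmbeddingParserSketch`) and the `text_embeddings` map key are hypothetical stand-ins, not the classes changed in this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// All types below are hypothetical stand-ins, not the classes touched by the PR.
interface ResultsSketch {}

final class FloatResultsSketch implements ResultsSketch {
    final List<float[]> embeddings;
    FloatResultsSketch(List<float[]> embeddings) { this.embeddings = embeddings; }
}

final class ByteResultsSketch implements ResultsSketch {
    final List<byte[]> embeddings;
    ByteResultsSketch(List<byte[]> embeddings) { this.embeddings = embeddings; }
}

abstract class BaseParserSketch {
    // Returning the shared base type lets a subclass pick its concrete result at runtime.
    protected abstract ResultsSketch transform(Map<String, Object> extractedField);
}

final class EmbeddingParserSketch extends BaseParserSketch {
    private final boolean useBytes;

    EmbeddingParserSketch(boolean useBytes) { this.useBytes = useBytes; }

    @Override
    @SuppressWarnings("unchecked")
    protected ResultsSketch transform(Map<String, Object> extractedField) {
        // Assumed shape: a list of numeric rows stored under "text_embeddings".
        List<List<Number>> raw = (List<List<Number>>) extractedField.get("text_embeddings");
        if (useBytes) {
            List<byte[]> out = new ArrayList<>();
            for (List<Number> row : raw) {
                byte[] values = new byte[row.size()];
                for (int i = 0; i < values.length; i++) {
                    values[i] = row.get(i).byteValue();
                }
                out.add(values);
            }
            return new ByteResultsSketch(out);
        }
        List<float[]> out = new ArrayList<>();
        for (List<Number> row : raw) {
            float[] values = new float[row.size()];
            for (int i = 0; i < values.length; i++) {
                values[i] = row.get(i).floatValue();
            }
            out.add(values);
        }
        return new FloatResultsSketch(out);
    }
}
```

A generic parameter fixed at compile time could not express this choice, since the concrete result depends on a setting that is only known at runtime.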

- public static final TextEmbeddingSettings NON_TEXT_EMBEDDING_TASK_TYPE_SETTINGS = new TextEmbeddingSettings(null, null, null, null);
+ public static final TextEmbeddingSettings NON_TEXT_EMBEDDING_TASK_TYPE_SETTINGS = new TextEmbeddingSettings(null, null, null);

public static TextEmbeddingSettings fromMap(Map<String, Object> map, TaskType taskType, ValidationException validationException) {
Reviewer (Member) commented:

We never included elementType in the toXContent method, so we don't have to worry about backwards compatibility with older versions of this model, correct?

@jonathan-buttner (Contributor, Author) commented:

Yep, that's correct. It was just hard-coded previously.

@jonathan-buttner enabled auto-merge (squash) July 8, 2025 17:57
@jonathan-buttner merged commit 172637b into elastic:main Jul 8, 2025
33 checks passed
@jonathan-buttner deleted the cs-embedding-types branch July 8, 2025 20:55
