Zipkin not working with Opensearch - appears to be double encoding UTF-8 #3796

@mikebars

Describe the Bug

Zipkin fails its health check (HTTP 503, status DOWN) when using OpenSearch as storage, but passes it when using Elasticsearch.

Steps to Reproduce

  1. Run docker compose up and wait for the containers to become healthy
  2. Run curl --verbose http://0.0.0.0:9200 to see the Elasticsearch / OpenSearch version information
  3. Run curl --verbose http://0.0.0.0:9411/health to see the Zipkin health status

Elasticsearch (working)

docker-compose.yml:

services:
  elasticsearch:
    container_name: elasticsearch
    environment:
      - _JAVA_OPTIONS=-Xms512m -Xmx512m -XX:UseSVE=0
      - action.destructive_requires_name=false
      - discovery.type=single-node
      - http.host=0.0.0.0
      - transport.host=127.0.0.1
      - xpack.monitoring.collection.enabled=false
      - xpack.security.enabled=false
      - xpack.security.http.ssl.enabled=false
    healthcheck:
      interval: 5s
      retries: 10
      start_period: 10s
      test: curl --silent http://localhost:9200/_cluster/health | grep --extended-regexp '"status":"(green|yellow)"'
      timeout: 10s
    image: elastic/elasticsearch:8.17.2
    restart: on-failure
    ports:
      - "9200:9200"
      - "9300:9300"

  zipkin:
    container_name: zipkin
    depends_on:
      elasticsearch:
        condition: service_healthy
    environment:
      - ES_HOSTS=http://elasticsearch:9200
      - JAVA_OPTS=-XX:UseSVE=0
      - STORAGE_TYPE=elasticsearch
    image: openzipkin/zipkin:latest
    ports:
      - "9411:9411"
    restart: on-failure

output of curl --verbose http://0.0.0.0:9200:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 0.0.0.0:9200...
* Connected to 0.0.0.0 (127.0.0.1) port 9200 (#0)
> GET / HTTP/1.1
> Host: 0.0.0.0:9200
> User-Agent: curl/7.88.1
> Accept: */*
> 
< HTTP/1.1 200 OK
< X-elastic-product: Elasticsearch
< content-type: application/json
< content-length: 541
< 
{ [541 bytes data]

100   541  100   541    0     0  46409      0 --:--:-- --:--:-- --:--:-- 49181
* Connection #0 to host 0.0.0.0 left intact
{
  "name" : "662459531c8d",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "uw14QxYCTM2HEg5OZsnWKg",
  "version" : {
    "number" : "8.17.2",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "747663ddda3421467150de0e4301e8d4bc636b0c",
    "build_date" : "2025-02-05T22:10:57.067596412Z",
    "build_snapshot" : false,
    "lucene_version" : "9.12.0",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}

Highlighted part of response

< content-type: application/json

output of curl --verbose http://0.0.0.0:9411/health:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 0.0.0.0:9411...
* Connected to 0.0.0.0 (127.0.0.1) port 9411 (#0)
> GET /health HTTP/1.1
> Host: 0.0.0.0:9411
> User-Agent: curl/7.88.1
> Accept: */*
> 
< HTTP/1.1 200 OK
< content-type: application/json; charset=utf-8
< content-length: 209
< server: Armeria/1.31.3
< date: Mon, 17 Feb 2025 20:21:47 GMT
< 
{ [209 bytes data]

100   209  100   209    0     0  20727      0 --:--:-- --:--:-- --:--:-- 20900
* Connection #0 to host 0.0.0.0 left intact
{
  "status" : "UP",
  "zipkin" : {
    "status" : "UP",
    "details" : {
      "ElasticsearchStorage{initialEndpoints=http://elasticsearch:9200, index=zipkin}" : {
        "status" : "UP"
      }
    }
  }
}

OpenSearch (not working)

docker-compose.yml:

services:
  opensearch:
    container_name: opensearch
    environment:
      - _JAVA_OPTIONS=-XX:UseSVE=0
      - action.destructive_requires_name=false
      - DISABLE_INSTALL_DEMO_CONFIG=true
      - DISABLE_SECURITY_PLUGIN=true
      - discovery.type=single-node
      - http.host=0.0.0.0
      - transport.host=127.0.0.1
    healthcheck:
      interval: 5s
      retries: 10
      start_period: 10s
      test: curl --silent http://localhost:9200/_cluster/health | grep --extended-regexp '"status":"(green|yellow)"'
      timeout: 10s
    image: opensearchproject/opensearch:latest
    restart: on-failure
    ports:
      - "9200:9200"
      - "9600:9600"

  zipkin:
    container_name: zipkin
    depends_on:
      opensearch:
        condition: service_healthy
    environment:
      - ES_HOSTS=http://opensearch:9200
      - JAVA_OPTS=-XX:UseSVE=0
      - STORAGE_TYPE=elasticsearch
    image: openzipkin/zipkin:latest
    ports:
      - "9411:9411"
    restart: on-failure

output of curl --verbose http://0.0.0.0:9200:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 0.0.0.0:9200...
* Connected to 0.0.0.0 (127.0.0.1) port 9200 (#0)
> GET / HTTP/1.1
> Host: 0.0.0.0:9200
> User-Agent: curl/7.88.1
> Accept: */*
> 
< HTTP/1.1 200 OK
< content-type: application/json; charset=UTF-8
< content-length: 568
< 
{ [568 bytes data]

100   568  100   568    0     0  50818      0 --:--:-- --:--:-- --:--:-- 51636
* Connection #0 to host 0.0.0.0 left intact
{
  "name" : "bd49f6011512",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "dBYK3GZERkeEqDPB51Pghg",
  "version" : {
    "distribution" : "opensearch",
    "number" : "2.19.0",
    "build_type" : "tar",
    "build_hash" : "fd9a9d90df25bea1af2c6a85039692e815b894f5",
    "build_date" : "2025-02-05T16:13:57.130576800Z",
    "build_snapshot" : false,
    "lucene_version" : "9.12.1",
    "minimum_wire_compatibility_version" : "7.10.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "The OpenSearch Project: https://opensearch.org/"
}

Highlighted part of response

< content-type: application/json; charset=UTF-8

output of curl --verbose http://0.0.0.0:9411/health:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 0.0.0.0:9411...
* Connected to 0.0.0.0 (127.0.0.1) port 9411 (#0)
> GET /health HTTP/1.1
> Host: 0.0.0.0:9411
> User-Agent: curl/7.88.1
> Accept: */*
> 
< HTTP/1.1 503 Service Unavailable
< content-type: application/json; charset=utf-8
< content-length: 1055
< server: Armeria/1.31.3
< date: Mon, 17 Feb 2025 20:08:03 GMT
< 
{ [1055 bytes data]

100  1055  100  1055    0     0   104k      0 --:--:-- --:--:-- --:--:--  114k
* Connection #0 to host 0.0.0.0 left intact
{
  "status" : "DOWN",
  "zipkin" : {
    "status" : "DOWN",
    "details" : {
      "ElasticsearchStorage{initialEndpoints=http://opensearch:9200, index=zipkin}" : {
        "status" : "DOWN",
        "details" : {
          "error" : "IllegalArgumentException: .version.number not found in response: �\u0006\u0000\u0000sNaPpY\u0000�\u0001\u0000�爞�\u0004l{\n  \"name\" : \"bd49f6011512\",\u0001\u001B\u001Ccluster_\u0015#\u0018docker-\r\u00186%\u0000\fuuid\u0005HTdBYK3GZERkeEqDPB51Pghg\t-\u0018version\u0001(<{\n    \"distribut\r\u0017(\"opensearch\u00053\u0001�\u0010umber\u00014\u0018\"2.19.0\u0011\u0019 build_typ\t�\btar6\u001A\u0000\fhash\u00057�fd9a9d90df25bea1af2c6a85039692e815b894f5\"\u0001�\u0004  \rY\fdate\u0005?t2025-02-05T16:13:57.130576800Z6t\u0000\u001Csnapshot\u00019\u0010false\rS\u0018lucene_=\u0001\u0018\"9.12.1\u0011?dminimum_wire_compatibility25\u0000\f7.109\u0002\u00115\u0010indexr6\u0000\u00015\f\n  }\u0001�\u0018\"taglin\t� The OpenS%tH Project: https://o5� .org/\"\n}\n"
        }
      }
    }
  }
}
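
To pull the failure reason out of a health response like the one above, a small Python sketch can walk the JSON and collect each storage component's error. The function names are hypothetical, and the structure is assumed from this one response only. The sketch also flags the ASCII marker sNaPpY, which appears near the start of the quoted payload and is the stream identifier of the Snappy framing format; the check says nothing about why the marker is there.

```python
import json


def storage_errors(health: dict) -> dict:
    """Map each storage component name to its error string, if it has one."""
    details = health.get("zipkin", {}).get("details", {})
    errors = {}
    for name, component in details.items():
        error = component.get("details", {}).get("error")
        if error is not None:
            errors[name] = error
    return errors


def mentions_snappy_framing(error: str) -> bool:
    """Return True if the error text contains the Snappy stream marker."""
    return "sNaPpY" in error


# Trimmed copy of the response above; the error string is cut off after the marker.
sample = json.loads(r'''
{
  "status": "DOWN",
  "zipkin": {
    "status": "DOWN",
    "details": {
      "ElasticsearchStorage{initialEndpoints=http://opensearch:9200, index=zipkin}": {
        "status": "DOWN",
        "details": {
          "error": "IllegalArgumentException: .version.number not found in response: \ufffd\u0006\u0000\u0000sNaPpY"
        }
      }
    }
  }
}
''')

for name, error in storage_errors(sample).items():
    verdict = "contains Snappy marker" if mentions_snappy_framing(error) else "no marker"
    print(f"{name}: {verdict}")
```

Against the healthy Elasticsearch response shown earlier, storage_errors would return an empty mapping, because that component has no details.error entry.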

Expected Behaviour

Zipkin should report a healthy status with OpenSearch, the way it does with Elasticsearch.

Notes

Since Elasticsearch is returning a response with

< content-type: application/json

and Opensearch is returning a response with

< content-type: application/json; charset=UTF-8

I wonder if the root cause might be that the OpenSearch response is being "double encoded" as UTF-8, since the logic I looked at appears to be common to both Elasticsearch and OpenSearch.
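
One way to probe the double-encoding hypothesis is sketched below in Python. This is not Zipkin's code: the helper names parse_content_type and double_encode_utf8 are hypothetical, and "double encoding" is modelled here as UTF-8 bytes misread as Latin-1 and then encoded as UTF-8 again, which is an assumption about what that would mean in practice. The sketch parses the two Content-Type values quoted above and shows what a second round of encoding does to ASCII and non-ASCII text.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_content_type(value: str) -> Tuple[str, Optional[str]]:
    """Split a Content-Type header value into (media type, charset)."""
    message = Message()
    message["Content-Type"] = value
    return message.get_content_type(), message.get_content_charset()


def double_encode_utf8(text: str) -> bytes:
    """Encode as UTF-8, misread the bytes as Latin-1, then encode again."""
    return text.encode("utf-8").decode("latin-1").encode("utf-8")


# Both servers declare the same media type; only the charset parameter differs.
print(parse_content_type("application/json"))
print(parse_content_type("application/json; charset=UTF-8"))

# ASCII-only text survives a second round of UTF-8 encoding unchanged.
ascii_json = '{"number" : "2.19.0"}'
print(double_encode_utf8(ascii_json) == ascii_json.encode("utf-8"))

# Non-ASCII text does not: each multi-byte character grows.
print(double_encode_utf8("é"))
```

The OpenSearch root response shown above is plain ASCII, so under this model a second round of encoding would leave it byte-for-byte unchanged. If that holds, the charset parameter alone may not explain the unreadable payload in the health error, and it could be worth also comparing the other headers exchanged between Zipkin and OpenSearch, such as Accept-Encoding and Content-Encoding.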
