Description
Component(s)
service
What happened?
Describe the bug
When defining a Service in Kubernetes, the fields name, protocol, and appProtocol follow conventions used by both Kubernetes and Istio for protocol interpretation.
Istio relies on the port name prefix to determine the protocol, but the OpenTelemetry Operator currently defines the collector Service as:
spec:
  ports:
    - appProtocol: http
      name: otlp-http
      port: 4318
      protocol: TCP
      targetPort: 4318
Here, the name is otlp-http, which Istio does not interpret as HTTP or HTTPS because it does not follow the <protocol>[-<suffix>] naming convention.
Additionally, appProtocol is set to http, which takes precedence over the name and causes Istio to treat the traffic as HTTP, not HTTPS, even when TLS configuration is added.
This results in Istio not initiating TLS and producing errors.
Steps to reproduce
- Deploy the OpenTelemetry Collector Operator with TLS configuration enabled.
- Inspect the generated Service definition.
- Observe that the port name and appProtocol do not follow Kubernetes/Istio conventions.
- Deploy Istio and check that TLS is not initiated correctly.
What did you expect to see?
When the operator configures the port for TLS (HTTPS), both the name and appProtocol should follow conventions:
- name: https-otlp
- appProtocol: https
This ensures compatibility with Istio and adherence to Kubernetes standards.
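As a rough sketch, the generated port entry could look like the fragment below when TLS is configured on the OTLP HTTP receiver. Only name and appProtocol differ from the current Service; port, protocol, and targetPort are copied from the existing definition and are assumed to stay unchanged.

```yaml
spec:
  ports:
    # Name uses the <protocol>-<suffix> form that Istio recognizes.
    - name: https-otlp
      # appProtocol takes precedence over the name, so it must also say https.
      appProtocol: https
      port: 4318
      protocol: TCP
      targetPort: 4318
```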
What did you see instead?
Istio interprets the Service as HTTP traffic, does not initiate TLS, and fails the handshake.
Collector version
0.135.0
Environment information
Environment
- Istio version: 2.6.7
- Platform: OpenShift
OpenTelemetry Collector configuration
The collector configuration has been anonymized for privacy. Relevant parts include:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
        tls:
          cert_file: /etc/otel/tls/tls.crt
          key_file: /etc/otel/tls/tls.key
exporters:
  otlp:
    endpoint: https://<redacted-endpoint>:8200
    headers:
      Authorization: <redacted>
      X-Scope-OrgID: <redacted>
    tls:
      insecure: false
      insecure_skip_verify: true
service:
  pipelines:
    metrics:
      receivers: [otlp, prometheus]
      exporters: [otlp]
The configuration shows that TLS is correctly defined at the receiver level, but the Service definition does not reflect this in its port naming or appProtocol.
Log output
Several curl requests against the collector Service produced the following output:
sh-5.1$ curl -v http://otel-collector.{namespace}.svc.cluster.local:4318/v1/metrics
* Connected to otel-collector.{namespace}.svc.cluster.local ({IP}) port 4318 (#0)
> GET /v1/metrics HTTP/1.1
> Host: otel-collector.{namespace}.svc.cluster.local:4318
> User-Agent: curl/7.76.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 400 Bad Request
< x-envoy-upstream-service-time: 3
< date: Thu, 13 Nov 2025 12:10:29 GMT
< server: envoy
< transfer-encoding: chunked
<
Client sent an HTTP request to an HTTPS server.
* Connection #0 to host otel-collector.{namespace}.svc.cluster.local left intact
sh-5.1$ curl -v https://otel-collector.{namespace}.svc.cluster.local:4318/v1/metrics
* Connected to otel-collector.{namespace}.svc.cluster.local ({IP}) port 4318 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
* TLSv1.0 (OUT), TLS header, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* (5454) (IN), , Unknown (72):
* error:0A00010B:SSL routines::wrong version number
* Closing connection 0
curl: (35) error:0A00010B:SSL routines::wrong version number
sh-5.1$ curl -vk https://otel-collector.{namespace}.svc.cluster.local:4318/v1/metrics
* Connected to otel-collector.{namespace}.svc.cluster.local ({IP}) port 4318 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
* TLSv1.0 (OUT), TLS header, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* (5454) (IN), , Unknown (72):
* error:0A00010B:SSL routines::wrong version number
* Closing connection 0
curl: (35) error:0A00010B:SSL routines::wrong version number
Additional context
This misconfiguration can cause traffic interpretation issues in environments using Istio, leading to TLS handshake failures.
Implemented workaround:
To mitigate the issue, we defined a ServiceEntry (SE) and DestinationRule (DR) in Istio. This ensures proper TLS handling and protocol recognition even with the current operator configuration.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  annotations:
  creationTimestamp: "2025-11-14T14:48:25Z"
  generation: 1
  name: {name}
  namespace: {namespace}
spec:
  host: {otel-collector-host}
  subsets:
    - name: without-sidecar
      trafficPolicy:
        tls:
          mode: DISABLE
---
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  creationTimestamp: "2025-11-14T14:45:13Z"
  generation: 1
  name: override-tls
  namespace: {namespace}
spec:
  hosts:
    - {otel-collector-host}
  ports:
    - name: otlp-tls
      number: 4318
      protocol: TLS
  resolution: DNS