Open
Labels
kind/bug: Categorizes issue or PR as related to a bug.
Description
What steps did you take and what happened:
Prerequisites:
- Contour uses a client certificate for Envoy upstream TLS.
- `HTTPProxy` uses upstream TLS.
- `HTTPProxy` uses Envoy's active health checks.
All of these must be true for the bug to occur. If any one of them is removed, the problem goes away.
The Problem:
The upstream service is unavailable for a certain period of time:
- Scenario 1: The client gets a `503 Service Unavailable` "no healthy upstream" error for 4 minutes after Envoy restarts.
  - This is counted from the moment Envoy is back up and ready with all configuration from Contour. It doesn't include the time Envoy takes to become ready to serve requests.
  - The downtime lasts for `healthyThresholdCount` * `no_traffic_interval` (for example: 4 * 60s = 4min).
  - The `no_traffic_interval` defaults to 60 seconds in Envoy and is not configurable in Contour.
- Scenario 2: The client gets a `503 Service Unavailable` "no healthy upstream" error for several seconds after rotating the Envoy client certificate.
  - The downtime lasts for `healthyThresholdCount` * `intervalSeconds` seconds (for example: 4 * 5s = 20s).
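To make the arithmetic in both scenarios concrete, here is a minimal sketch that computes the expected downtime from the health check settings. The class and function names (`HealthCheckPolicy`, `downtime_after_envoy_restart`, `downtime_after_cert_rotation`) are hypothetical and are not part of Contour or Envoy. Only the two formulas and the 60-second `no_traffic_interval` default come from the description above.

```python
from dataclasses import dataclass

# Envoy's default no_traffic_interval in seconds, as stated in this report.
ENVOY_DEFAULT_NO_TRAFFIC_INTERVAL_S = 60


@dataclass(frozen=True)
class HealthCheckPolicy:
    """Hypothetical holder for the two HTTPProxy health check values used here."""
    interval_seconds: int
    healthy_threshold_count: int


def downtime_after_envoy_restart(
    policy: HealthCheckPolicy,
    no_traffic_interval_s: int = ENVOY_DEFAULT_NO_TRAFFIC_INTERVAL_S,
) -> int:
    """Scenario 1: healthyThresholdCount * no_traffic_interval, in seconds."""
    return policy.healthy_threshold_count * no_traffic_interval_s


def downtime_after_cert_rotation(policy: HealthCheckPolicy) -> int:
    """Scenario 2: healthyThresholdCount * intervalSeconds, in seconds."""
    return policy.healthy_threshold_count * policy.interval_seconds


# Values taken from the examples in the report.
policy = HealthCheckPolicy(interval_seconds=5, healthy_threshold_count=4)
print(downtime_after_envoy_restart(policy))  # 240 seconds (4 min)
print(downtime_after_cert_rotation(policy))  # 20 seconds
```

With these example values, lowering `healthyThresholdCount` shortens both windows, but only Scenario 2 is affected by `intervalSeconds`.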
What did you expect to happen:
There should be no service interruption.
Anything else you would like to add:
I've added the steps to reproduce this in the comments below.
Environment:
- Contour version:
- Kubernetes version: (use `kubectl version`):
- Kubernetes installer & version:
- Cloud provider or hardware configuration:
- OS (e.g. from `/etc/os-release`):