🧘‍♂️ Lightweight fault tolerance primitives for your resilient and modern Python microservices
Hyx (/ˈhʌɪx/) is a set of well-known stability patterns that are commonly needed when you build microservice-based applications. Hyx is meant to be Hystrix (Java), resilience4j (Java) or Polly (C#) but for the Python world.
- Implements five commonly used resiliency patterns with various configurations based on advice and experience of industry leaders (e.g. AWS, Google, Netflix)
- Idiomatic Pythonic implementation based on decorators and context managers
- AsyncIO Native Implementation
- Built-in telemetry support for OpenTelemetry, Prometheus, and StatsD
- Lightweight. Readable Codebase. High Test Coverage
- Python 3.9+
- AsyncIO-powered applications (no sync support?)
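
To illustrate the decorator and context-manager style mentioned above, here is a minimal sketch built only on the standard library. It is not Hyx's API: `SimpleBulkhead`, `max_concurrency`, `active`, and `peak` are hypothetical names chosen for this example. It only caps concurrency and makes excess callers wait; it does not reject calls that exceed a queue limit.

```python
import asyncio
import functools


class SimpleBulkhead:
    """Hypothetical helper (not part of Hyx): caps concurrent executions."""

    def __init__(self, max_concurrency: int) -> None:
        self._max = max_concurrency
        self._semaphore = None  # created lazily, inside the running event loop
        self.active = 0  # calls currently inside the bulkhead
        self.peak = 0  # highest concurrency observed so far

    async def __aenter__(self):
        if self._semaphore is None:
            self._semaphore = asyncio.Semaphore(self._max)
        await self._semaphore.acquire()  # excess callers wait here
        self.active += 1
        self.peak = max(self.peak, self.active)
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self.active -= 1
        self._semaphore.release()
        return False  # never swallow exceptions

    def __call__(self, func):
        # Decorator form: run the wrapped coroutine inside the context manager.
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            async with self:
                return await func(*args, **kwargs)

        return wrapper


bulkhead = SimpleBulkhead(max_concurrency=2)


@bulkhead
async def fetch(i: int) -> int:
    await asyncio.sleep(0.01)  # stand-in for real I/O
    return i


async def main():
    return await asyncio.gather(*(fetch(i) for i in range(5)))


results = asyncio.run(main())
```

The same object can also be used as `async with bulkhead:` around an arbitrary block, which is why both forms share one implementation here.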
Hyx can be installed from PyPI:

    pip install hyx
    # or via uv
    uv add hyx

For telemetry support, install with the appropriate extras:

    pip install hyx[otel]        # OpenTelemetry
    pip install hyx[prometheus]  # Prometheus
    pip install hyx[statsd]      # StatsD

| Component | Problem | Solution | Implemented? |
|---|---|---|---|
| 🔁 Retry | Failures happen sometimes, but they self-recover after a short time | Automatically retry operations on temporary failures | ✅ |
| 💾 Cache | | | |
| ⚡️ Circuit Breaker | When downstream microservices become overloaded, sending even more load only makes things worse | Temporarily stop sending requests to failing microservices when error thresholds are exceeded. Then check if the pause helped them recover | ✅ |
| ⏱ Timeout | Sometimes operations take too long. We can't wait forever, and after a certain point success becomes unlikely | Bound waiting to a reasonable amount of time | ✅ |
| 🚰 Bulkhead | Without limits, some code can consume too many resources, bringing down the whole application (and upstream services) or slowing down other parts | Limit the number of concurrent calls, queue excess calls, and fail calls that exceed capacity | ✅ |
| 🏃‍♂️ Rate Limiter | A microservice can be called at any rate, including one that could bring it down if triggered accidentally | Limit the rate at which your system can be accessed | ✅ |
| 🤝 Fallback | Nothing guarantees that your dependencies will work. What do you do when they fail? | Degrade gracefully by providing default values or placeholders when dependencies are down | ✅ |
Inspired by Polly's Resiliency Policies
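
As a rough illustration of the Retry and Timeout rows in the table above, the sketch below combines the two using only `asyncio`. It does not use Hyx itself: `simple_retry`, its `attempts`, `delay`, and `on` parameters, and the `flaky` function are hypothetical names for this example, and the exponential backoff is one possible choice among many.

```python
import asyncio
import functools


def simple_retry(attempts: int, delay: float, on=(Exception,)):
    """Hypothetical decorator (not Hyx's API): retry an async callable."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return await func(*args, **kwargs)
                except on:
                    if attempt == attempts:
                        raise  # out of attempts: surface the last error
                    # Exponential backoff: delay, 2 * delay, 4 * delay, ...
                    await asyncio.sleep(delay * 2 ** (attempt - 1))

        return wrapper

    return decorator


calls = 0


@simple_retry(attempts=3, delay=0.01, on=(ConnectionError,))
async def flaky() -> str:
    """Fails twice with a temporary error, then succeeds."""
    global calls
    calls += 1
    if calls < 3:
        raise ConnectionError("temporary failure")
    return "ok"


async def call_with_timeout() -> str:
    # Bound the total waiting time, retries included, to one second.
    return await asyncio.wait_for(flaky(), timeout=1.0)


result = asyncio.run(call_with_timeout())
```

Placing the timeout outside the retry bounds the whole operation; placing it inside would bound each attempt instead. Which one fits depends on the caller's latency budget.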
