# Etcd Cluster Health Check

## Description

Implements a passive health check mechanism: when a connection, read, or write fails, the failure is recorded against that endpoint.

## Methods

* [init](#init)
* [report_failure](#report_failure)
* [get_target_status](#get_target_status)

### init

`syntax: health_check, err = health_check.init(params)`

Initializes the health check object, overriding the default params with the given ones. In case of failure, returns `nil` and a string describing the error.

### report_failure

`syntax: health_check.report_failure(etcd_host)`

Reports a health check failure, which counts toward the number of failures required to mark a target as "failed".
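
For example, after a request to an endpoint fails, the caller can record the failure (a minimal sketch; assumes `health_check` was obtained from `init`, and the endpoint URL is illustrative):

```lua
-- record one failure against this endpoint; once max_fails failures
-- accumulate within fail_timeout, the endpoint is marked unhealthy
health_check.report_failure("http://127.0.0.1:12379")
```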
| 24 | + |
### get_target_status

`syntax: healthy, err = health_check.get_target_status(etcd_host)`

Gets the current health status of the target.
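
A minimal usage sketch (assumes `health_check` was obtained from `init`; the endpoint URL is illustrative):

```lua
local healthy, err = health_check.get_target_status("http://127.0.0.1:12379")
if not healthy then
    -- the endpoint is currently marked unhealthy (or err describes
    -- why the status could not be fetched), so skip it when connecting
end
```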
| 30 | + |
## Config

| name         | Type    | Requirement | Default | Description |
| ------------ | ------- | ----------- | ------- | ------------------------------------------------------------ |
| shm_name     | string  | required    |         | the name of the declared `lua_shared_dict` used to store the health status of endpoints. |
| fail_timeout | integer | optional    | 10s     | sets the time during which the specified number of unsuccessful attempts to communicate with the endpoint must happen to mark the endpoint unavailable, and also sets the period of time the endpoint stays marked unavailable. |
| max_fails    | integer | optional    | 1       | sets the number of failed attempts that must occur during the `fail_timeout` period for the endpoint to be marked unavailable. |

Lua example:

```lua
local health_check, err = require("resty.etcd.health_check").init({
    shm_name = "healthcheck_shm",
    fail_timeout = 10,
    max_fails = 1,
})
```

Within a `fail_timeout` window, if `max_fails` consecutive failures occur, the endpoint is marked as unhealthy, and the unhealthy endpoint will not be chosen for connections during the following `fail_timeout` period.

The health check mechanism switches endpoints only when the previously chosen endpoint is marked as unhealthy.

The failure counter and health status of each etcd endpoint are shared across workers and among different etcd clients.

Also note that the `fail_timeout` and `max_fails` of the health check cannot be changed once it has been created.

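The failure-counting behavior above can be sketched by combining the two calls (assumes the `init` config shown earlier, with `max_fails = 1` and `fail_timeout = 10`; the endpoint URL is illustrative):

```lua
local endpoint = "http://127.0.0.1:12379"

-- one reported failure reaches max_fails (1), so the endpoint is
-- marked unhealthy for the next fail_timeout (10) seconds
health_check.report_failure(endpoint)

local healthy = health_check.get_target_status(endpoint)
-- healthy is now false, and stays false until fail_timeout elapses
```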
## Synopsis

```nginx
http {
    # required: declares a shared memory zone to store endpoints' health status
    lua_shared_dict healthcheck_shm 1m;

    server {
        location = /healthcheck {
            content_by_lua_block {
                -- the health check feature is optional, and can be
                -- enabled with the following configuration
                local health_check, err = require("resty.etcd.health_check").init({
                    shm_name = "healthcheck_shm",
                    fail_timeout = 10,
                    max_fails = 1,
                })

                local etcd, err = require("resty.etcd").new({
                    protocol = "v3",
                    http_host = {
                        "http://127.0.0.1:12379",
                        "http://127.0.0.1:22379",
                        "http://127.0.0.1:32379",
                    },
                    user = 'root',
                    password = 'abc123',
                })
            }
        }
    }
}
```