|
1 | 1 | # Automated Test Equipment (ATE) Client |
2 | 2 |
|
3 | | -Standard ATE is used in various silicion manufacturing stages, as well as |
4 | | -for device provisioning. The ATE system generally runs on a PC system running |
5 | | -the Windows operating system. |
| 3 | +The Automated Test Equipment (ATE) client library and associated test programs |
| 4 | +are used to drive provisioning flows for OpenTitan devices. The client |
| 5 | +communicates with one or more |
| 6 | +[Provisioning Appliance (PA)](https://github.com/lowRISC/opentitan-provisioning/wiki/pa) |
| 7 | +servers to perform secure provisioning operations. |
6 | 8 |
|
7 | | -The ATE client connects to the [Provisioning Appliance](https://github.com/lowRISC/opentitan-provisioning/wiki/pa) to perform |
8 | | -provisioning operations. |
| 9 | +## Client-Side Load Balancing and Failover |
9 | 10 |
|
10 | | -## Developer Notes |
| 11 | +The ATE client library supports gRPC client-side load balancing, allowing it |
| 12 | +to distribute requests across multiple Provisioning Appliance (PA) server |
| 13 | +instances. This enhances reliability and scalability. |
11 | 14 |
|
12 | | -## Run ATE Client (Linux) |
| 15 | +### Enabling Load Balancing |
13 | 16 |
|
14 | | -Run the following steps before proceeding. |
| 17 | +To enable load balancing, you must provide a list of server addresses in a |
| 18 | +gRPC-compliant format via the `--pa_target` command-line argument when running |
| 19 | +a test program (e.g., `cp` or `ft`). |
15 | 20 |
|
16 | | -* Generate [enpoint certificates](https://github.com/lowRISC/opentitan-provisioning/wiki/auth#endpoint-certificates). |
17 | | -* Start [PA server](https://github.com/lowRISC/opentitan-provisioning/wiki/pa#start-pa-server). |
| 21 | +* **Target URI Format**: The target should be specified using gRPC's |
| 22 | + name-syntax. |
| 23 | + * For IPv4: `ipv4:<ip_addr1>:<port1>,<ip_addr2>:<port2>,...` |
| 24 | + * For IPv6: `ipv6:[<ip_addr1>]:<port1>,[<ip_addr2>]:<port2>,...` |
18 | 25 |
|
19 | | -Take note of the PA server target address and port number. In the following |
20 | | -command we start the client pointing to `localhost:5001`. |
| 26 | + Example: `--pa_target="ipv4:10.0.0.1:50051,10.0.0.2:50051"` |
| 27 | + |
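
When the list of PA servers comes from a script or an inventory file, the target string can be assembled rather than typed by hand. The following is a minimal sketch; `build_pa_target` is a hypothetical helper name and is not part of the ATE client or its test programs. It only joins already-formatted `host:port` entries with commas behind the scheme prefix described above.

```bash
# build_pa_target SCHEME ADDR...
# Joins host:port entries into a gRPC target string for --pa_target.
# SCHEME must be "ipv4" or "ipv6". IPv6 entries must already be bracketed,
# for example "[::1]:5001". This helper is illustrative only.
build_pa_target() {
  local scheme="$1"
  shift
  if [ "$#" -eq 0 ]; then
    echo "build_pa_target: at least one address is required" >&2
    return 1
  fi
  case "$scheme" in
    ipv4|ipv6) ;;
    *)
      echo "build_pa_target: unsupported scheme '$scheme'" >&2
      return 1
      ;;
  esac
  # With IFS set to a comma, "$*" expands to the arguments joined by commas.
  local IFS=','
  printf '%s:%s\n' "$scheme" "$*"
}
```

For example, `--pa_target="$(build_pa_target ipv4 10.0.0.1:50051 10.0.0.2:50051)"` expands to the same value as the example above.
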
| 28 | +### Load Balancing Policies |
| 29 | + |
| 30 | +You can select a load balancing policy using the `--load_balancing_policy` |
| 31 | +argument. If unspecified, gRPC's default (`pick_first`) is used. |
| 32 | + |
| 33 | +* `pick_first` (Default): The client attempts to connect to the first |
| 34 | + address in the list. All RPCs are sent to this single server. If the |
| 35 | + connection fails, it will try the next address in the list. This policy |
| 36 | + provides basic failover but does not distribute load. |
| 37 | +* `round_robin`: The client connects to all servers in the list and |
| 38 | + distributes RPCs across them in a round-robin fashion. This policy |
| 39 | + provides both load balancing and high-availability failover. |
| 40 | + |
| 41 | +### Failover Scenarios |
| 42 | + |
| 43 | +The behavior of the client during server outages depends on the configured |
| 44 | +policy. |
| 45 | + |
| 46 | +* **Partial Outage (with `round_robin`)**: If one server in the pool becomes |
| 47 | + unavailable, the gRPC runtime detects the failed connection and |
| 48 | + temporarily removes it from the pool of healthy endpoints. Subsequent API |
| 49 | + calls are routed transparently to the remaining healthy servers, so from |
| 50 | + the caller's perspective new operations continue to succeed. A call that |
| 51 | + was already in flight on the failed connection may still return an error. |
| 52 | + |
| 53 | +* **Total Outage**: If all server endpoints become unavailable, any API call |
| 54 | + made through the library will fail. |
| 55 | + * The C API functions (e.g., `InitSession`, `DeriveTokens`) will return a |
| 56 | + non-zero status code. This code will correspond to the gRPC status |
| 57 | + code `UNAVAILABLE` (14). |
| 58 | + * Callers must check the return value of every function call to handle |
| 59 | + this scenario gracefully. A persistent failure with this status code |
| 60 | + indicates that the client cannot reach any of the configured |
| 61 | + provisioning servers. |
| 62 | + |
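
At the shell level, one way to separate a brief outage from a persistent one is to re-run a command a bounded number of times. The sketch below rests on two assumptions that the text above does not confirm: that the test program exits with a non-zero status when its API calls fail, and that re-running it is safe for the flow in question. `retry_cmd` is a hypothetical helper name, not something provided by this repository.

```bash
# retry_cmd MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or MAX_ATTEMPTS attempts have been made.
# Returns 0 on success, otherwise the exit status of the last attempt.
# Assumption: COMMAND signals failure through a non-zero exit status.
retry_cmd() {
  local max_attempts="$1"
  local delay="$2"
  shift 2
  local attempt=1
  local status=1
  while [ "$attempt" -le "$max_attempts" ]; do
    status=0
    "$@" || status=$?
    if [ "$status" -eq 0 ]; then
      return 0
    fi
    echo "attempt $attempt/$max_attempts failed with exit status $status" >&2
    attempt=$((attempt + 1))
    # Wait only if another attempt will follow.
    if [ "$attempt" -le "$max_attempts" ]; then
      sleep "$delay"
    fi
  done
  return "$status"
}
```

A call such as `retry_cmd 3 5 bazelisk run //src/ate/test_programs:cp -- --pa_target="ipv4:10.0.0.1:50051,10.0.0.2:50051"` would make up to three attempts, five seconds apart. If every attempt fails, the outage is likely persistent and the PA servers themselves should be checked.
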
| 63 | +## Client Lifecycle and Resource Management |
| 64 | + |
| 65 | +The ATE client is designed to be a long-lived object that manages the |
| 66 | +underlying gRPC channel, including all network connections and load balancing |
| 67 | +state. To ensure optimal performance and efficient resource use, follow these |
| 68 | +best practices. |
| 69 | + |
| 70 | +### Singleton Client Instance |
| 71 | + |
| 72 | +It is strongly recommended to treat the `ate_client_ptr` as a singleton within |
| 73 | +your application. You should call `CreateClient` once when your program |
| 74 | +initializes and reuse that same client instance for all subsequent gRPC calls. |
| 75 | + |
| 76 | +Repeatedly calling `CreateClient` and `DestroyClient` for different operations |
| 77 | +is an anti-pattern. Each call to `CreateClient` initializes a new gRPC channel, |
| 78 | +which involves setting up new TCP connections, performing TLS handshakes (if |
| 79 | +enabled), and resolving server addresses. This process is computationally |
| 80 | +expensive and introduces significant latency. |
| 81 | + |
| 82 | +### When to Call `DestroyClient` |
| 83 | + |
| 84 | +The `DestroyClient` function should only be called when you are certain that no |
| 85 | +more gRPC calls will be made for the remainder of the program's lifetime, |
| 86 | +typically during application shutdown. Calling `DestroyClient` will tear down |
| 87 | +all underlying network connections, and any subsequent attempt to use the |
| 88 | +client instance will result in an error. |
| 89 | + |
| 90 | +## Monitoring and Debugging |
| 91 | + |
| 92 | +While the ATE client library does not expose a direct API to query the health |
| 93 | +of individual server endpoints, it is possible to monitor the underlying gRPC |
| 94 | +channel's behavior using gRPC's built-in tracing capabilities. This is an |
| 95 | +effective method for debugging connection issues and observing the load |
| 96 | +balancer's real-time behavior. |
| 97 | + |
| 98 | +### Enabling gRPC Tracing |
| 99 | + |
| 100 | +You can enable detailed logging by setting environment variables in your shell |
| 101 | +before launching the application that uses the ATE client library. |
| 102 | + |
| 103 | +```bash |
| 104 | +# Enable tracing for connectivity state, resolvers, and load balancing |
| 105 | +export GRPC_TRACE=connectivity_state,resolver,load_balancer |
| 106 | + |
| 107 | +# Set the logging verbosity for maximum detail |
| 108 | +export GRPC_VERBOSITY=DEBUG |
| 109 | +``` |
| 110 | + |
| 111 | +### Interpreting the Output |
| 112 | + |
| 113 | +When tracing is enabled, the gRPC runtime will print detailed logs to `stderr`. |
| 114 | +If a server in the load balancing pool becomes unavailable, you will see log |
| 115 | +entries showing the subchannel's state changing from `READY` to `CONNECTING` |
| 116 | +and then to `TRANSIENT_FAILURE`. When the server becomes available again, the |
| 117 | +logs will show the state transitioning back to `READY`. |
| 118 | + |
| 119 | +This provides a definitive, real-time view of the connection health from the |
| 120 | +client's perspective and is a useful tool in active debugging sessions. |
| 121 | + |
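
Trace output at `DEBUG` verbosity can be large, so it may help to capture `stderr` to a file and keep only the lines that mention a connectivity state. The helpers below are a minimal sketch with hypothetical names. They match on the state names listed above and assume nothing else about the log line format, which can differ between gRPC versions.

```bash
# filter_state_changes
# Reads gRPC trace output on stdin and prints only the lines that mention
# one of the connectivity states READY, CONNECTING, or TRANSIENT_FAILURE.
filter_state_changes() {
  # grep exits with status 1 when nothing matches; treat that as success.
  grep -E 'READY|CONNECTING|TRANSIENT_FAILURE' || true
}

# count_state STATE
# Prints the number of lines on stdin that mention STATE.
count_state() {
  grep -c -- "$1" || true
}
```

For example, append `2> grpc_trace.log` to the command that launches the test program, then run `filter_state_changes < grpc_trace.log` to see the transitions, or `count_state TRANSIENT_FAILURE < grpc_trace.log` for a rough measure of how often a connection dropped.
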
| 122 | +## Running an ATE Test Program |
| 123 | + |
| 124 | +Before running, ensure you have: |
| 125 | +* Generated the required |
| 126 | + [endpoint certificates](https://github.com/lowRISC/opentitan-provisioning/wiki/auth#endpoint-certificates). |
| 127 | +* Started one or more |
| 128 | + [PA servers](https://github.com/lowRISC/opentitan-provisioning/wiki/pa#start-pa-server). |
| 129 | + |
| 130 | +The following example shows how to run the `cp` test program with load |
| 131 | +balancing enabled against two PA servers. |
21 | 132 |
|
22 | 133 | ```console |
23 | | -bazelisk build //src/ate:ate_main |
24 | | -bazel-bin/src/ate/ate_main \ |
25 | | - --target=localhost:5001 \ |
| 134 | +# The specific test program can be :cp or :ft |
| 135 | +bazelisk run //src/ate/test_programs:cp -- \ |
| 136 | + --pa_target="ipv4:localhost:5001,localhost:5002" \ |
| 137 | + --load_balancing_policy="round_robin" \ |
26 | 138 | --enable_mtls \ |
27 | 139 | --client_key=$(pwd)/config/certs/out/ate-client-key.pem \ |
28 | 140 | --client_cert=$(pwd)/config/certs/out/ate-client-cert.pem \ |
29 | | - --ca_root_certs=$(pwd)/config/certs/out/ca-cert.pem |
| 141 | + --ca_root_certs=$(pwd)/config/certs/out/ca-cert.pem \ |
| 142 | + --sku="sival" \ |
| 143 | + --sku_auth_pw="test_password" |
30 | 144 | ``` |
31 | 145 |
|
32 | 146 | ## Read More |
33 | 147 |
|
34 | | -* [Provisioning Appliance](https://github.com/lowRISC/opentitan-provisioning/wiki/pa) |
35 | | -* [Documentation index](https://github.com/lowRISC/opentitan-provisioning/wiki/Home) |
| 148 | +* [Provisioning Appliance](https://github.com/lowRISC/opentitan-provisioning/wiki/pa) |
| 149 | +* [Documentation index](https://github.com/lowRISC/opentitan-provisioning/wiki/Home) |