Proxy Socket Reconnection & NM REST Timeouts #226

@DavidEdell

If the refdm socket connection in proxy_cli.c is not available, cace_amp_proxy_cli_real_connect() will retry the connection with delays until it is available again.

When a request to send a packet is made through the NM REST API and the connection is not available, the service pends until a connection is made. This leads to an unacceptably long delay in responding to the REST API request.

In this scenario, the API should return immediately (waiting for at most one reconnect attempt) with success or failure. Alternatively, it could return a pending status code if we want to queue the send and keep retrying in the background, though that may cause more user confusion than a simple failure.

The simplest solution would be to use a short timeout in real_connect and have the callers that need it handle reconnects over a longer period.
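To illustrate the short-timeout idea, here is a minimal sketch of a bounded connect helper. It assumes the refdm socket is a UNIX-domain stream socket; `try_connect_bounded` and its parameters are hypothetical names for illustration and are not taken from proxy_cli.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not the existing real_connect): try to connect to a
 * UNIX-domain socket at most max_attempts times, sleeping delay_ms between
 * failed attempts. Returns a connected fd, or -1 with errno set by the last
 * failing call. */
static int try_connect_bounded(const char *path, int max_attempts, long delay_ms)
{
    assert(path != NULL);

    struct sockaddr_un addr;
    if (strlen(path) >= sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path); /* length checked above */

    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0) {
            return -1;
        }
        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
            return fd; /* connected; caller owns the descriptor */
        }
        int saved = errno; /* close() may overwrite errno */
        close(fd);
        errno = saved;

        /* Sleep only if another attempt will follow. */
        if (attempt + 1 < max_attempts) {
            struct timespec ts = { delay_ms / 1000, (delay_ms % 1000) * 1000000L };
            nanosleep(&ts, NULL);
        }
    }
    return -1;
}
```

With something like this, a REST-driven send could call the helper with `max_attempts` of 1 and report failure right away, while a background task could keep calling it in its own loop on a longer period.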

I haven't traced precisely how the nm_rest.c calls lead to the calls in proxy_cli.c, but another potential approach would be to pass the timeout period explicitly as an optional argument down to the real_connect function. This would need to be paired with a timed wait on the mutex_lock so that the REST-driven call can time out quickly while others (i.e., the discrete receive task) can still pend for longer durations.
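The timed wait on the mutex could look roughly like the sketch below. It assumes the lock in proxy_cli.c is a POSIX `pthread_mutex_t`, which I have not verified; `lock_with_timeout` is a hypothetical helper, and a negative timeout is used here to mean "pend indefinitely".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: lock m, waiting at most timeout_ms milliseconds.
 * A negative timeout_ms blocks without a deadline (existing behaviour).
 * Returns 0 on success, ETIMEDOUT if the deadline passed, or another
 * error number on failure. */
static int lock_with_timeout(pthread_mutex_t *m, long timeout_ms)
{
    assert(m != NULL);

    if (timeout_ms < 0) {
        return pthread_mutex_lock(m);
    }

    /* pthread_mutex_timedlock takes an absolute CLOCK_REALTIME deadline. */
    struct timespec deadline;
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0) {
        return errno;
    }
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    return pthread_mutex_timedlock(m, &deadline);
}
```

Under this sketch, the REST-driven path would pass a small timeout and map `ETIMEDOUT` to a failure response, and the receive task would pass a negative value to keep its current blocking behaviour.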

Labels: bug
