Skip to content

Make MVEC Lib tolerant to dropped messages. #14

Open
Ryanbahl9 wants to merge 8 commits intomainfrom
ryan/mvec_remove_future_queue
Open

Make MVEC Lib tolerant to dropped messages. #14
Ryanbahl9 wants to merge 8 commits intomainfrom
ryan/mvec_remove_future_queue

Conversation

@Ryanbahl9
Copy link

@Ryanbahl9 Ryanbahl9 commented Feb 25, 2026

Description:

  • MVEC lib previously had a future queue that would be filled with expected responses.

The Problem:

  • If a single message got lost on the CAN bus, then the queue of futures would become de-synced compared to the actual responses arriving from the MVEC.
  • This would cause the next response to fulfill the wrong promise, and subsequent responses would cascade into the wrong callers.

The Solution:

  • Get rid of the queue in favor of a single std::optional<std::promise<T>> member variable per response type.
  • When the client sends out a message that expects a reply, MVEC Lib stores a promise in the corresponding member variable and returns a std::future<T> to the caller.
  • If the client tries to send out a new message of the same type while there is an unresolved promise for that message type:
    • The old promise is destroyed immediately (no timeout, no waiting)
    • The caller's old future throws broken_promise on get()
    • The new message is sent and a new promise/future pair is created
  • Callers are responsible for their own timeouts via future.wait_for()
  • Replaced single shared promises_mutex_ with per-type mutexes (query_mutex_, command_mutex_, population_mutex_) so parse and request methods for different types don't block each other

Additional Fixes:

  • Removed static from local vars in (get_last_fuse_status, get_last_relay_status, get_last_error_status): I don't think we want those to share state across different instances. (We don't really want different instances regardless).
  • Added try/catch to set_single_relay() in mvec_node.cpp to handle broken_promise from concurrent service calls (matches existing pattern in set_multi_relay())
  • Updated hardware test to match new API — futures return T directly, not optional<T>
  • Fixed API calls in README examples (open() to openSocket() and setReceptionCallback to setOnReceiveCallback)

Pros for this Method:

  • Simpler than the queue. One slot per type, no sync issues from lost messages
  • For the MVEC, there is no need to queue messages. There are only 3 types of messages that expect a response:
    • Relay State Query (doesn't change state, no need to queue)
    • Population Query (doesn't change state, no need to queue)
    • Relay Command (sets every relay at once, so each command effectively overwrites the last)
  • No optional wrapping on the future. Callers get std::future<T> not std::future<std::optional<T>>
  • Per-type mutexes allow a query, command, and population request to all be in-flight simultaneously

Cons for this Method:

  • Callers need to handle broken_promise exceptions if they send a new request of the same type while one is pending
  • Can't queue messages. It is up to the client/driver to handle sending multiple messages immediately after each other

Tests:

  • Added 17 new test cases (mvec_relay_socketcan.cpp) covering: basic flow, abandon-and-resend, mismatched response, no-promise response drop, timeout, cross-type isolation, concurrent threading safety

auto query_frame = relay_impl_.getRelayQueryMessage();
std::lock_guard<std::mutex> lock(query_mutex_);

// If a request is already in-flight, check if it has timed out
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this logic necessary? Do we actually care to reject or would we rather just resend? The other promise would just time out because we resent and the new one would be used.

But since we only have 1 request in flight at a time, that should not matter?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree after coming back to this. I've simplified the logic just to abandon an existing query if a new query of the same type comes before the first one receive a response. With a single threaded client, this should only happen if the CAN message is lost.


// Create a new promise and get its future
std::promise<MvecRelayQueryReply> promise;
std::promise<std::optional<MvecRelayQueryReply>> promise;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why promise optional? That seems redundant?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The optional is removed

/// @brief Mutex protecting promise queues for thread safety
std::mutex promises_mutex_;
/// @brief Single-slot promise for population query response
std::optional<std::promise<std::optional<MvecPopulationReply>>> population_reply_promise_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if you need optional here! Just don't fulfill the promise, the future will take care of not having a value!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made this change as well

@Ryanbahl9 Ryanbahl9 force-pushed the ryan/mvec_remove_future_queue branch from 5a97d6e to 6ac8c32 Compare March 17, 2026 20:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants