fix(uraft): stabilize tcp status queries #738

Closed
ralcolea wants to merge 1 commit into dev from fix-uraft-tcp-status

Conversation

@ralcolea
Contributor

@ralcolea ralcolea commented Feb 6, 2026

Previously, the uRaft process executed blocking helper operations (e.g. fork(), poll(), read()) inside runCommand(), all on the same io_context thread that also handled TCP status accept/write operations. As a result, the status handler could be delayed under load, and short-lived clients (such as nc) could disconnect before a response was written. This made the TCP status endpoint sensitive to contention and timing, leading to sporadic query failures.

This commit restructures the uRaft status server to decouple status serving from Raft execution, eliminating the random failures observed during TCP status queries.

Key changes:

  • Introduce a dedicated io_context and thread for TCP status handling.
  • Maintain a frequently refreshed, immutable cache of the serialized status response.
  • Serve incoming status connections directly from the cached snapshot instead of generating responses per connection.
  • Improve socket lifecycle handling and error reporting.
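The cached-snapshot idea can be sketched roughly as follows (an illustrative fragment, not the PR's actual code: the names `StatusSnapshot`, `refreshCache`, and `currentStatus` are assumed, and the C++11 `std::atomic_load`/`std::atomic_store` free functions stand in for the `std::atomic<std::shared_ptr<...>>` member the PR uses):

```cpp
// Sketch: a writer thread periodically publishes an immutable serialized
// status snapshot; connection handlers grab the current snapshot with an
// atomic load, so no reader ever blocks on status regeneration.
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

using StatusSnapshot = std::shared_ptr<const std::vector<uint8_t>>;

static StatusSnapshot cached_status_;  // refreshed by the status thread

// Writer side: build a fresh immutable buffer and publish it atomically.
void refreshCache(const std::string &serialized) {
    auto snapshot = std::make_shared<const std::vector<uint8_t>>(
        serialized.begin(), serialized.end());
    std::atomic_store(&cached_status_, snapshot);
}

// Reader side: the atomic load bumps the ref count, so the buffer stays
// valid for this connection even if the cache is refreshed meanwhile.
StatusSnapshot currentStatus() {
    return std::atomic_load(&cached_status_);
}
```

A connection handler would call `currentStatus()` once per accepted socket and write the bytes it holds, never touching the Raft thread.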

This commit fixes LS-282.

To test the fix, a test-fix-uraft-tcp-status branch was created in the Ansible repo.
This branch modifies the uraft_deadlock_multiple_elections scenario to validate the stability of the uRaft TCP status endpoint.
The new test is expected to fail on SaunaFS v5.6.0 and to pass on branches that include the fix.

Signed-off-by: Crash <crash@leil.io>

@gemini-code-assist

Summary of Changes

Hello @ralcolea, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a stability issue in uRaft TCP status queries that arose from blocking operations on the main thread, which caused delays and sporadic failures for status clients. By introducing a dedicated thread for status serving and implementing a cached response mechanism, the system can now handle status requests efficiently and reliably. This change prevents contention with core Raft operations, ensuring the status endpoint remains responsive even under load.

Highlights

  • Dedicated Thread for Status Serving: The TCP status handling now operates on its own io_context and dedicated thread, effectively decoupling it from the main Raft execution thread to prevent contention.
  • Cached Status Responses: An immutable cache of the serialized Raft status is maintained and refreshed periodically. Incoming connections are served directly from this cache, eliminating the need for on-demand status generation for each request.
  • Improved Socket Handling and Error Reporting: Enhanced error reporting for Boost.Asio operations has been implemented, along with proper socket shutdown and close procedures. The TCP_NODELAY option is also set for status connections to optimize performance.
  • Refactored Status Data Generation: The storeData method has been refactored to use a more robust state name determination and a generic lambda for appending node-specific information, improving code readability and maintainability.
Changelog
  • src/uraft/uraftstatus.cc
    • Introduced a dedicated io_context (status_io_) and a new thread (status_thread_) for handling TCP status connections, isolating it from the main Raft execution.
    • Implemented a periodic refresh mechanism for an atomic shared pointer (cached_status_) that stores the serialized Raft status, ensuring clients receive up-to-date information quickly.
    • Modified uRaftStatusConnection to receive and write cached data, and added robust socket shutdown/close logic with detailed error logging for Boost.Asio operations.
    • Refactored the storeData method to use a switch statement for determining Raft state names and a generic lambda function for formatting and appending node-specific data, reducing code duplication.
    • Enhanced the acceptConnection method with comprehensive error handling, the ability to set the TCP_NODELAY option on accepted sockets, and logic to ensure only one outstanding accept operation at a time.
  • src/uraft/uraftstatus.h
    • Added necessary headers such as <atomic>, <cstdint>, <memory>, <thread>, and <vector> to support the new threading and caching mechanisms.
    • Updated uRaftStatusConnection and uRaftStatus classes to explicitly delete copy and move constructors/assignment operators, adhering to the Rule of Five for proper resource management.
    • Modified uRaftStatusConnection to utilize a shared pointer for status data, facilitating efficient sharing of the cached response.
    • Added new private members to uRaftStatus, including status_io_, status_work_guard_, status_thread_, status_cache_timer_, and cached_status_, to manage the dedicated status serving infrastructure.
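The dedicated-thread pattern described in the changelog can be illustrated with a minimal stand-in (assumed names; a `std::thread` plus a sleep loop replaces the PR's Boost.Asio `io_context`, work guard, and `deadline_timer` for brevity):

```cpp
// Sketch: a background thread invokes a refresh callback at a fixed
// period, keeping the cache warm without involving the Raft thread.
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>
#include <utility>

class StatusRefresher {
public:
    StatusRefresher(std::function<void()> refresh,
                    std::chrono::milliseconds period)
        : refresh_(std::move(refresh)), period_(period),
          thread_([this] { run(); }) {}

    ~StatusRefresher() {
        stop_.store(true);   // signal the loop, then wait for it to exit
        thread_.join();
    }

private:
    void run() {
        while (!stop_.load()) {
            refresh_();      // e.g. serialize current Raft status
            std::this_thread::sleep_for(period_);
        }
    }

    std::function<void()> refresh_;
    std::chrono::milliseconds period_;
    std::atomic<bool> stop_{false};
    std::thread thread_;     // declared last: starts after members are ready
};
```

In the PR, the equivalent loop is driven by a timer on the status-only `io_context`, which also owns the acceptor, so accept, write, and refresh all stay off the Raft thread.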


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request is a solid improvement for stabilizing the uRaft TCP status queries. Decoupling the status serving via a dedicated thread and caching the response is a well-thought-out architectural change that should resolve the reported flakiness. The implementation is generally of high quality, with good attention to threading, resource management, and error handling. I've identified one critical issue that would cause a compilation failure and another area where code can be simplified for better maintainability. After addressing these points, this should be an excellent contribution.


Copilot AI left a comment


Pull request overview

This PR stabilizes the uRaft TCP status query endpoint by decoupling status serving from Raft execution, preventing sporadic failures when clients disconnect before receiving responses.

Changes:

  • Introduced a dedicated io_context and thread for TCP status handling, isolating it from the main Raft operations
  • Implemented a frequently-refreshed immutable cache of status responses to serve connections instantly without per-request generation
  • Improved error handling and socket lifecycle management throughout the status server

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
src/uraft/uraftstatus.h Added dedicated threading infrastructure (io_context, work guard, thread), atomic status cache, cache refresh timer, and deleted copy/move operations
src/uraft/uraftstatus.cc Implemented status caching mechanism, dedicated status thread, improved error logging, refactored status generation with defensive coding, and enhanced socket lifecycle management

@ralcolea
Contributor Author

ralcolea commented Feb 9, 2026

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request effectively addresses the instability of TCP status queries by decoupling the status server from the main Raft logic. The introduction of a dedicated io_context and thread for status handling, along with a cached status response, is a solid architectural improvement that eliminates contention and potential blocking issues. The code quality is high, with robust resource management, improved error handling, and good use of modern C++ features. The changes make the uRaft status endpoint more reliable and performant under load. The AI-generated comment regarding redundant error_code.clear() calls is valid and has been retained.


Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

@ralcolea ralcolea force-pushed the fix-uraft-tcp-status branch 2 times, most recently from 0bb3117 to e02d8c4 on February 9, 2026 17:15
@ralcolea ralcolea requested a review from Copilot on February 9, 2026 17:15
@ralcolea
Contributor Author

ralcolea commented Feb 9, 2026

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request significantly refactors the uRaft status server to improve stability and responsiveness. By introducing a dedicated I/O context and thread for status handling, and implementing a frequently refreshed, immutable cache for status responses, the changes effectively decouple status serving from Raft execution. This addresses the issue of blocking operations in the main thread causing delays and sporadic query failures. The refactoring also includes improved socket lifecycle management and more robust error reporting. Overall, these are positive and well-considered architectural changes.


Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

@ralcolea ralcolea force-pushed the fix-uraft-tcp-status branch from e02d8c4 to 33a00b1 on February 9, 2026 22:11
@ralcolea ralcolea requested a review from Copilot on February 9, 2026 22:13
@ralcolea
Contributor Author

ralcolea commented Feb 9, 2026

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request effectively addresses the instability of uRaft TCP status queries by decoupling the status server from the main Raft logic. The introduction of a dedicated io_context and thread for status handling, along with a frequently refreshed status cache, is a solid architectural improvement that should eliminate the contention and timing issues. The code is well-structured, with robust error handling for network operations and a clean separation of concerns. The suggestion to optimize status string generation aligns with our guidelines for performance-critical sections.


Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

@ralcolea ralcolea force-pushed the fix-uraft-tcp-status branch from 33a00b1 to f4f7712 on February 9, 2026 22:42
@ralcolea ralcolea requested a review from Copilot on February 9, 2026 22:43
@ralcolea
Contributor Author

ralcolea commented Feb 9, 2026

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request effectively addresses the instability of TCP status queries by decoupling the status serving logic from the main Raft execution thread. The introduction of a dedicated io_context and thread for status handling, combined with a frequently refreshed, immutable cache for the status response, is a robust solution that eliminates the contention and timing issues of the previous implementation.

The changes are well-executed:

  • A new status_io_ context and status_thread_ correctly isolate the status server.
  • The use of std::atomic<std::shared_ptr<...>> with acquire-release semantics for cached_status_ provides an elegant and correct lock-free mechanism for sharing status data between threads.
  • Error handling for socket operations, especially for accept, is now much more comprehensive and resilient, with clear classification of errors and appropriate retry strategies (immediate, backoff, or stop).
  • Resource management is improved, with proper cleanup of threads and Asio objects in the destructor, and better encapsulation in the header files (e.g., making members private, deleting copy/move operations).
  • The implementation of storeData is made more efficient and safer by using fmt::format_to and fixing a potential integer underflow bug.

Overall, this is a high-quality change that significantly improves the stability and robustness of the uRaft status endpoint.
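The retry taxonomy praised above might look roughly like this (a sketch with assumed names; the PR presumably classifies `boost::system::error_code` values from `async_accept`, while this fragment uses raw `errno` constants):

```cpp
// Sketch: classify accept() failures into three recovery strategies so a
// transient error neither spins the CPU nor kills the status server.
#include <cerrno>

enum class AcceptAction { ImmediateRetry, BackoffRetry, Stop };

AcceptAction classifyAcceptError(int err) {
    switch (err) {
    case 0:              // success: immediately accept the next client
    case ECONNABORTED:   // client vanished before accept: benign
    case EINTR:          // interrupted by a signal: just retry
        return AcceptAction::ImmediateRetry;
    case EMFILE:         // per-process fd limit hit
    case ENFILE:         // system-wide fd limit hit
    case ENOBUFS:        // transient resource exhaustion
    case ENOMEM:
        return AcceptAction::BackoffRetry;  // wait, then retry
    default:             // acceptor is unusable (e.g. bad fd)
        return AcceptAction::Stop;
    }
}
```

The backoff branch is what prevents a busy-loop: the caller would arm a retry timer instead of re-posting `async_accept` immediately.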


Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.


Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

@ralcolea ralcolea force-pushed the fix-uraft-tcp-status branch from f4f7712 to 1f0fa70 on February 10, 2026 09:21
@ralcolea ralcolea requested a review from Copilot on February 10, 2026 09:21
@ralcolea
Contributor Author

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request significantly improves the stability of the uRaft TCP status endpoint by decoupling status serving from the main Raft logic. The introduction of a dedicated thread, an io_context, and a cached status response is a robust solution to the previously observed random query failures. The error handling has been made much more resilient, and the overall code quality is high. My review includes a few suggestions to further enhance code clarity and resource management in the new implementation.


Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@ralcolea ralcolea force-pushed the fix-uraft-tcp-status branch from 1f0fa70 to 7a085e7 on February 10, 2026 09:46
@ralcolea ralcolea requested a review from Copilot on February 10, 2026 09:46
@ralcolea
Contributor Author

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request significantly improves the stability and performance of the uRaft TCP status server. By introducing a dedicated thread and io_context for status requests, along with a frequently updated cache, it effectively decouples status serving from the main Raft logic. This prevents blocking operations in the Raft thread from affecting status queries. The error handling for socket operations is now much more robust, with detailed classification of accept errors and proper backoff strategies to prevent CPU spinning. The implementation also shows great attention to detail with modern C++ practices, such as improved resource management in destructors, efficient string formatting with fmt, and correct use of atomics for thread-safe data sharing.

I have one minor suggestion to improve code clarity in the destructor's error handling. Overall, this is an excellent and well-executed refactoring.


Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

@ralcolea ralcolea force-pushed the fix-uraft-tcp-status branch from 7a085e7 to 85f729c on February 10, 2026 14:48
@ralcolea ralcolea marked this pull request as ready for review on February 10, 2026 15:28
@ralcolea ralcolea force-pushed the fix-uraft-tcp-status branch from 85f729c to 95b1ecd on February 10, 2026 18:19
Contributor

@lgsilva3087 lgsilva3087 left a comment


Great work!!!

#include <fmt/format.h>

#include "uraftstatus.h"
namespace {
Contributor


Consider adding an empty line before the namespace.

Contributor Author


Nice catch 👍
Applied.

syslog(priority, "uRaftStatus: %s: %s (%d)", what, error.message().c_str(), error.value());
}

void shutdownAndClose(boost::asio::ip::tcp::socket &socket) {
Contributor


Consider shutDownAndCloseSocket

Contributor Author


I liked your proposal. Updated.

}

void uRaftStatusConnection::set_data(std::shared_ptr<const std::vector<uint8_t>> data) {
data_ = std::move(data);
Contributor


Consider to add this comment:

// Safe to move: the parameter is passed by value (not a reference), so the caller's
// shared_ptr is unaffected. The underlying vector data remains valid until this
// connection destroys its reference, even if the cache is refreshed with new data.

Contributor Author


Very nice suggestion. This is not straightforward.
Thanks for pointing it out! 👍
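The point of the suggested comment can be shown in isolation (a simplified stand-in for the real class, not the PR's exact code):

```cpp
// Sketch: set_data takes the shared_ptr BY VALUE, so std::move only
// steals the callee's copy; the caller's pointer, and the cached vector
// it shares, remain valid until every holder releases its reference.
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

struct Connection {
    void set_data(std::shared_ptr<const std::vector<uint8_t>> data) {
        data_ = std::move(data);  // moves the by-value copy only
    }
    std::shared_ptr<const std::vector<uint8_t>> data_;
};
```

After the call the reference count is shared between the caller and the connection, which is exactly why a cache refresh cannot invalidate an in-flight response.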

boost::system::error_code result_error;
conn->socket().set_option(boost::asio::ip::tcp::no_delay(true), result_error);
logAsioError(LOG_WARNING, "set_option(TCP_NODELAY) failed", result_error);

Contributor


Consider adding this comment:

// The atomic load increases the shared_ptr ref count, ensuring the data remains valid
// until the connection completes.

Contributor Author


Great suggestion! 👍
This comment is really important. It explains why it's always safe to rely on the value of cached_status_.

case AcceptAction::ImmediateRetry:
acceptConnection();
break;

Contributor


Consider removing the empty lines, as in other code sections.

Contributor Author


Applied.

}

acceptConnection();
switch(classifyAcceptError(error)) {
Contributor


Please format this line.

Contributor Author


Fixed.


boost::asio::deadline_timer status_cache_timer_;
boost::asio::deadline_timer retry_timer_;
std::atomic<std::shared_ptr<const std::vector<uint8_t>>> cached_status_;
Contributor


Consider documenting this variable.

Contributor Author


Nice suggestion! 👍
Documentation was added for cached_status_ and the other attributes related to this change.

Contributor


Consider including in the tests a scenario with multiple concurrent connections.

Contributor Author


Nice suggestion 👍
I will update the test in the Ansible repo (test-fix-uraft-tcp-status) to run on multiple servers and generate more concurrent connections.

@@ -1,20 +1,100 @@
#include "common/platform.h"
#include "common/platform.h" // IWYU pragma: keep
Collaborator


I think a copyright header should be added at the top of this file.

Contributor Author


The copyright header is missing from some uRaft files, such as uraftstatus.{h,cc}, uraftcontroller.{h,cc}, and uraft.{h,cc}.
However, that change will be made in a dedicated PR.

Previously, the uRaft process executed blocking helper operations
(e.g. fork(), poll(), read()) inside runCommand(), all on the same
io_context thread that also handled TCP status accept/write operations.
As a result, the status handler could be delayed under load, and
short-lived clients (such as nc) could disconnect before a response
was written. This made the TCP status endpoint sensitive to contention
and timing, leading to sporadic query failures.

This commit restructures the uRaft status server to decouple status
serving from Raft execution, eliminating the random failures observed
during TCP status queries.

Key changes:
- Introduce a dedicated io_context and thread for TCP status handling.
- Maintain a frequently refreshed, immutable cache of the serialized
  status response.
- Serve incoming status connections directly from the cached snapshot
  instead of generating responses per connection.
- Improve socket lifecycle handling and error reporting.
- Replace boost::format() by fmt::format() in storeData() function.

This commit fixes LS-282.

Signed-off-by: Crash <crash@leil.io>
@ralcolea ralcolea force-pushed the fix-uraft-tcp-status branch from 95b1ecd to e97b110 on February 11, 2026 12:24
@ralcolea
Contributor Author

After running some tests with different commands and the nc tool, I arrived at the following conclusions:

  1. The issue was not related to uRaft, but to the behavior of the nc tool used to query the uRaft status.
  2. This PR is not needed: adjusting the parameters passed to the nc tool mitigates the issue.

The failures are always reported when using a command like echo | nc sfsmaster 9428 to query the uRaft status.

🔥 What happens when using echo | nc

echo | nc localhost 9428

Client behavior (nc tool):

  1. Connect
  2. Send \n
  3. Close write side (FIN)
  4. Wait for server reply

Now the server (uRaftStatus):

  1. async_accept() succeeds
  2. storeData() prepares status buffer
  3. async_write() starts

Here’s the key:

Because the client has already sent FIN on its write side, the TCP state becomes:

Client:  FIN_WAIT_2
Server:  CLOSE_WAIT

If the client closes very quickly after sending the newline, the full sequence may be:

  1. Client connects
  2. Sends \n
  3. Immediately exits (nc ends fast)
  4. Kernel sends FIN
  5. Server starts async_write
  6. TCP stack may respond with RST if client already gone

When that happens:

  • async_write() may fail silently
  • Write may be interrupted
  • Client may receive partial or empty data

🎯 This is a classic TCP race

When a client:

  • sends data
  • immediately exits

the server may receive:

  • FIN
  • RST
  • or ACK+FIN

depending on timing.

If server writes after client is already gone:

  • write fails
  • kernel may discard buffer
  • connection resets

The proof is that there are commands that do not produce the original failures:

# Original command that produces failures
while sleep 0.01; do echo | nc localhost 9428 | grep state -q || echo fail $(date +%T); done

# Successful command
while sleep 0.01; do nc localhost 9428 < /dev/null | grep state -q || echo fail $(date +%T); done

# Another successful command
while sleep 0.01; do echo -n "" | nc localhost 9428 | grep state -q || echo fail $(date +%T); done
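The half-close sequence can be reproduced with plain POSIX sockets (a hedged demo using `socketpair()`, which follows stream half-close semantics but, not being TCP, cannot produce the RST; it shows that the client's FIN alone does not prevent the reply from arriving — the failure needs the client process to exit before the server's write completes):

```cpp
// Sketch: the "client" end sends a newline and shuts down its write side
// (what `echo | nc` does), yet its read side stays open, so the server's
// reply still arrives intact.
#include <sys/socket.h>
#include <unistd.h>
#include <string>

std::string halfCloseRoundTrip(const std::string &reply) {
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return "";
    int client = fds[0];
    int server = fds[1];

    // Client: send the request, then FIN the write side only.
    (void)write(client, "\n", 1);
    shutdown(client, SHUT_WR);

    // Server: drain the request until EOF, then write the status reply.
    char buf[64];
    while (read(server, buf, sizeof(buf)) > 0) {}
    (void)write(server, reply.data(), reply.size());

    // Client: its read side is still open, so the reply can be read.
    ssize_t n = read(client, buf, sizeof(buf));
    close(client);
    close(server);
    return std::string(buf, n > 0 ? static_cast<size_t>(n) : 0);
}
```

In the failing `echo | nc` runs the client process exits immediately after the FIN, so the kernel answers the server's in-flight write with an RST and the reply is lost, which matches the behavior observed above.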

@ralcolea ralcolea closed this Feb 12, 2026