Conversation
|
@ymmt2005 |
|
@nojima |
|
@ymmt2005 |
Is this really a problem? I guess Kubernetes Pod restarts |
ymmt2005
left a comment
There was a problem hiding this comment.
Looks mostly okay.
Please update docs/design.md to align with the new design.
src/config.hpp
Outdated
| public: | ||
| // Setup default configurations. | ||
| config(): m_vip("127.0.0.1"), m_tempdir(DEFAULT_TMPDIR) { | ||
| config(): m_vip(std::optional("127.0.0.1")), m_tempdir(DEFAULT_TMPDIR) { |
There was a problem hiding this comment.
Shouldn't we initialize m_leader_election_method here as well?
It looks inconsistent to initialize the field elsewhere.
There was a problem hiding this comment.
m_leader_election_method is initialized with the following line:
yrmcds::leader_election_method m_leader_election_method = yrmcds::leader_election_method::virtual_ip;There was a problem hiding this comment.
I know. I thought it was inconsistent.
There was a problem hiding this comment.
I will move all member initializations to the declaration statement.
I think it is easier to read when they are grouped together rather than separated in the constructor and member declarations.
| m_reactor.add_resource( | ||
| make_server_socket(g_config.vip(), port, w, true), | ||
| cybozu::reactor::EVENT_IN); | ||
| if( g_config.vip() ) { |
There was a problem hiding this comment.
I guess this is always true because m_vip is initialized to "127.0.0.1".
Am I right?
There was a problem hiding this comment.
Since m_vip is always set, I think using optional for this might be confusing and lead to bugs.
I'll leave this to you to figure out how to fix it.
There was a problem hiding this comment.
When leader_election_method is file, yrmcds actually has no virtual IPs. Therefore, I feel that the type of m_vip should be std::optional to model the situation correctly.
What is wrong is the initial value of m_vip. It should be initialized to be nullopt and should only have the user-specified value or 127.0.0.1 when in virtual_ip mode.
There was a problem hiding this comment.
↑This was wrong.
Since the initial value of leader_election_method is virtual_ip, the initial value of m_vip must be 127.0.0.1.
Therefore, we must empty m_vip when leader_election_method is changed to other than virtual_ip.
| fd = cybozu::tcp_connect(master_host.c_str(), g_config.repl_port()); | ||
| } catch( std::runtime_error& err ) { | ||
| logger::error() << "Failed to connect to the master (" << master_host << "): " << err.what(); | ||
| m_reactor.run_once(); |
There was a problem hiding this comment.
What does this intend to do? A comment would be helpful.
There was a problem hiding this comment.
It was called in the existing error handling, so I called it in the code I added.
if( fd == -1 ) {
m_reactor.run_once();
return false;
}I'm not aware of the original intent of the code, so I'm just guessing: since signal handling, etc., depends on the reactor, we should sometimes run the reactor during re-connecting to the master.
| m_reactor.add_resource( | ||
| make_server_socket(g_config.vip(), g_config.port(), w, true), | ||
| cybozu::reactor::EVENT_IN); | ||
| if( g_config.vip() ) { |
There was a problem hiding this comment.
Ditto. This is always true, IIUC.
I consider this to be critical. Consider the behavior of yrmcds running as a slave when master dies: the slave checks whether it can promote itself to master, and if it cannot promote itself to master for 5 seconds, it attempts to connect to the current master. If the connection fails, the slave returns to the loop of trying to be promoted to master again. If the pod crashes when connecting to master, all data held by the slave will be lost at that point. |
Makes sense. Thank you. |
|
I want to merge #91 first, so I'll leave this PR merge for a while. |
We are planning to deploy yrmcds to our Kubernetes cluster.
Currently yrmcds uses virtual_ip to detect whether the instance is master or not. However, virtual IPs are difficult to utilize in Kubernetes clusters. Therefore, I implemented the file-based leader election method. Strictly speaking, this PR does not implement leader election algorithm, but the way for yrmcds to receive the result of a leader election.
This PR introduces a new option
leader_election_method:When
leader_election_methodisfile, a yrmcds instance is considered to be master when a file exists on the certain path (specified bymaster_file).When
leader_election_methodisvirtual_ip, the current behavior remains.This PR also adds a new option
master_host, which is a address used by slaves to connect to the current master.Minor fixes
This PR also fixes minor issues I found during implementing this PR.
tcp_connect, yrmcds crushes without retry. This is an issue because we use dynamically created/updatedServiceresources to connect to the master node on Kubernetes cluster.