Skip to content

Raftsimple - Simple key-value store example #376

Open
ntdkhiem wants to merge 21 commits intoetcd-io:mainfrom
ntdkhiem:raftsimple
Open

Raftsimple - Simple key-value store example #376
ntdkhiem wants to merge 21 commits intoetcd-io:mainfrom
ntdkhiem:raftsimple

Conversation

@ntdkhiem
Copy link

@ntdkhiem ntdkhiem commented Jan 30, 2026

It has been over a year since the last PR for issue #2 , so I just wanted to take a crack at it for the sake of my own learning.

The goal is to provide a clean, minimal example of implementing key-value store with Raft. I have stripped down the complexity of the existing raftexample from etcd by making all nodes run in-memory, allowing the transport layer to be as simple as possible (instead of TCP) and removing complex WAL/Snapshot persistence for now to focus on the core Raft flow. Thanks to @mhagger suggestions in etcd-io/etcd#15471, the implementation became so much simpler and easier for me to follow.

At the moment, this PR will just be a Draft is now ready to be reviewed. The basic node communication is working, but I am planning to:

  • Re-implementing a basic snapshotting mechanism.
  • Adding tests.

As a first-time contributor to this project, I am very open to feedback on the structure and approach. Does this simplified design align with what you envisioned @ahrtr ?

@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ntdkhiem
Once this PR has been reviewed and has the lgtm label, please assign spzala for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Copy link

Hi @ntdkhiem. Thanks for your PR.

I'm waiting for a etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ntdkhiem ntdkhiem force-pushed the raftsimple branch 2 times, most recently from e91e416 to aeefc80 Compare February 3, 2026 02:14
@ntdkhiem ntdkhiem marked this pull request as ready for review February 3, 2026 02:16
ntdkhiem and others added 18 commits February 2, 2026 21:21
Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
- POST method doens't work at the moment.

Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
- ref: etcd-io/etcd#15471

Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
@ntdkhiem
Copy link
Author

@ahrtr I fully understand that this issue is very low in priority for you right now. If I may, what are something I can do to help facilitating your reviewing process?

  • Is there too many etcd-related context in this PR that I need to change?
  • Do I need to enhance the snapshotting functionality or any other features to make this PR a legit one?
  • Do I need to follow certain standards or methodologies?
  • Should I create a quick documentation explaining how I use raft in this PR?
  • Should I add more unit and integration test cases to test the overall program?
  • Any other things you have in mind?

@ahrtr
Copy link
Member

ahrtr commented Feb 11, 2026

Sorry, did not get time to review this recently.

Before I review it sometime later, can you please

  • ensure all the workflow green
  • show me the steps on how you verify the example locally

@ahrtr
Copy link
Member

ahrtr commented Feb 11, 2026

pls resolve the Go Vulnerability in a separate PR, once it's merged, then rebase this PR. thx

@ntdkhiem
Copy link
Author

ntdkhiem commented Feb 11, 2026

Wait...This is weird. I accidentally pushed a new commit to resolve the vulnerability before reading your latest comment. But, when I reverted the commit and forced push, it still passed the Go Vulnerability Checker / test. Should I make a separate PR anyway?

@ahrtr
Copy link
Member

ahrtr commented Feb 11, 2026

Should I make a separate PR anyway?

yes, pls

@ntdkhiem
Copy link
Author

ntdkhiem commented Feb 11, 2026

Sorry, did not get time to review this recently.

Before I review it sometime later, can you please

  • ensure all the workflow green
  • show me the steps on how you verify the example locally

Totally understandable!

The project is located in contrib/raftsimple. After you built the project, running it will spin up 3 Raft nodes by default (but you can specify the number of nodes with -nodes). Each Raft node will have a key-value store API, exposing at port 9120+nodeID (where nodeID is one-based indexing)

The API allows four methods: GET, POST, PUT, DELETE to either retrieve/create a key or add/delete a Raft node.

On a separate terminal:

curl http://127.0.0.1:9121/foo -XPUT -d bar

will add/replace the key foo with the value bar to the store.

curl http://127.0.0.1:9121/foo

will retrieve the value from key foo. Will throw Failed to GET if the key doesn't exist.

curl http://127.0.0.1:9121/4 -XPOST

will create a new Raft node and key-value store API, exposing at port 9124. Will throw Failed to POST if the target node is already existed.

curl http://127.0.0.1:9121/1 -XDELETE

will delete Raft node 1, along with its key-value store API. That means port 9121 should be no longer available.

curl http://127.0.0.1:9122/1 -XPOST

will revive Raft node 1, kicking off its snapshot reloading, along with its key-value store API at port 9121.

During executing these commands, you can monitor the main terminal to see Raft in effect.

@ntdkhiem
Copy link
Author

pls resolve the Go Vulnerability in a separate PR, once it's merged, then rebase this PR. thx

Sorry, I'm quite confused about your comment. The vulnerability is found in the Go standard library that I use in my code. A fix would be to bump up Go version from 1.25.6 to 1.25.7 in contrib/raftsimple/go.mod. If I make a PR for this, wouldn't it merge this entire project to the codebase?

Thank you in advance.

@ahrtr
Copy link
Member

ahrtr commented Feb 11, 2026

sorry, I mixed the raftexample/go.mod with the go.mod under the root directory. You need to bump your go.mod in this PR.

- Resolve GO-2026-4337

Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
- When a node get deleted and restored, it attempts to replay all
messages which will also include the one entry removing itself. When
this happens, the node is removed and the rest of the cluster panic
because it couldn't send MsgHeartBeat to that node.

Signed-off-by: Khiem Nguyen <ppkhiemnguyen@gmail.com>
@ahrtr
Copy link
Member

ahrtr commented Feb 12, 2026

I see that you reuse most of the existing source code of raftexample, which is good.

  • Please follow exactly the same usage as raftexample. e.g remove the flag --nodes. Read the readme.
  • Essentially we need to reuse raftexample, and just to replace the etcd stuff (e.g. snap, wal etc.) with a simplified implementation.
  • I don't mind multiple commits in one PR, but ensure they are well organized (try to remove some unnecessary intermediate commits)
  • add a readme for this example. I expect you should be able to reuse raftexample/README.md with minimal change.

@ntdkhiem
Copy link
Author

ntdkhiem commented Feb 17, 2026

@ahrtr , I’ve simplified the network layer to run all nodes within a single process using in-memory communication.

If following the exact usage, the raftexample runs on multiple processes, communicating via TCP. I will have to rely on rafthttp from etcd (defeats the purpose of no etcd-related context) or refactor the network layer to use TCP connection.

Am I required to do that or is there some flexibility in this layer? I thought this was supposed to only demonstrate Raft and make sure everything is as simple as possible. Please advice.

PS: Thanks for reminding me. I will clean up the git history and add README once everything is good to go.

@ntdkhiem
Copy link
Author

@ahrtr a quick nudge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Comments