
support numa_binds, eg: numactl --physcpubind= --membind= python -m sgl… #1890

Open

poryfly wants to merge 1 commit into kvcache-ai:main from poryfly:support_numa_binds_run

Conversation

@poryfly (Contributor) commented Mar 18, 2026

Support numa_bind, e.g.:

numactl --physcpubind=$CPU_CORES_0 --membind=$NUMA_NODE_0 python -m sglang.launch_server...

The machine's resources are partitioned by NUMA node, and multiple sets of services are deployed to fully utilize the hardware.
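The deployment pattern described above can be sketched as a small dry-run script. The core counts, node IDs, and ports below are illustrative placeholders, not values from the PR; the script only prints the commands it would run, so you can check the per-node split before launching anything.

```shell
# Dry-run sketch: print one numactl-wrapped launch command per NUMA node.
# CORES_PER_NODE and the port base are assumptions; replace `echo` with
# the real command (plus your model flags) to actually launch.
CORES_PER_NODE=48
for NODE in 0 1; do
  START=$((NODE * CORES_PER_NODE))
  END=$((START + CORES_PER_NODE - 1))
  echo "numactl --physcpubind=${START}-${END} --membind=${NODE}" \
       "python -m sglang.launch_server --port $((30000 + NODE))"
done
```

Checking the printed core ranges against `numactl --hardware` before launching avoids accidentally splitting one instance across sockets.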


@gemini-code-assist (Contributor)

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces support for NUMA (Non-Uniform Memory Access) binding, allowing the system to respect numactl configurations for CPU buffer management. By detecting the NUMA nodes a process is allowed to access, it ensures that resources are utilized more efficiently, especially in environments where multiple services are deployed across different NUMA nodes to maximize hardware potential.

Highlights

  • NUMA Binding Support: Implemented logic to detect and utilize NUMA node binding specified by numactl --membind or --cpunodebind for CPU buffer management, ensuring that thread pools are mapped to the correct NUMA nodes.
  • New NUMA Detection Function: Added a new helper function _get_allowed_numa_nodes that uses libc.syscall to query the system's memory policy and determine the NUMA nodes accessible to the current process.
  • Dynamic Thread Pool NUMA Mapping: Modified the BaseMoEWrapper to dynamically set subpool_numa_map based on the detected NUMA nodes, falling back to sequential IDs if no explicit binding is active or detection fails, and issuing a warning for misconfigurations.



@gemini-code-assist (bot) left a review comment


Code Review

This pull request introduces support for NUMA bindings by detecting the allowed NUMA nodes from the process's memory policy. This is a valuable enhancement for optimizing performance on multi-NUMA systems. The implementation uses a Linux-specific syscall via ctypes to achieve this. My review focuses on improving the robustness and portability of this new functionality.

Comment on lines +52 to +53
# SYS_get_mempolicy: 239 on x86-64
SYS_get_mempolicy = 239

Severity: high

The syscall number for get_mempolicy is hardcoded for the x86-64 architecture. This will cause the function to fail on other architectures, such as aarch64, which uses a different syscall number. To ensure portability, you should detect the machine's architecture and use the appropriate syscall number.

    # SYS_get_mempolicy: 239 on x86-64, 236 on aarch64
    arch = platform.machine()
    if arch == "x86_64":
        SYS_get_mempolicy = 239
    elif arch == "aarch64":
        SYS_get_mempolicy = 236
    else:
        warnings.warn(
            f"NUMA node detection via get_mempolicy is not supported on "
            f"architecture '{arch}'. Falling back to sequential NUMA IDs."
        )
        return None

Comment on lines +100 to +101
except Exception:
return None

Severity: medium

The try...except block catches a broad Exception and silently returns None. This can hide underlying issues, making debugging difficult. It would be better to at least log a warning to inform the user that NUMA detection failed and why.

    except Exception as e:
        warnings.warn(f"Failed to get NUMA policy via syscall: {e}. Falling back to sequential NUMA IDs.")
        return None

@ErvinXie (Collaborator)

Thanks for raising this issue — the use case of running multiple instances on different NUMA nodes is definitely valid.

We've taken a slightly different approach in #1891 + kvcache-ai/sglang#28: instead of auto-detecting the numactl membind policy via Linux syscalls, we expose an explicit --kt-numa-nodes CLI parameter.

Usage example — deploy two instances on a dual-NUMA machine:

# Instance 1: bind to NUMA node 0
python -m sglang.launch_server \
  --kt-threadpool-count 1 --kt-numa-nodes 0 \
  --kt-cpuinfer 48 --port 30000 ...

# Instance 2: bind to NUMA node 1
python -m sglang.launch_server \
  --kt-threadpool-count 1 --kt-numa-nodes 1 \
  --kt-cpuinfer 48 --port 30001 ...

This way you don't need numactl at all — KTransformers handles the NUMA binding internally via hwloc/libnuma, and you just specify which node(s) to use.

Why explicit over auto-detect:

  • No dependency on x86-64 specific syscall numbers (SYS_get_mempolicy = 239)
  • Works on any Linux architecture (ARM, etc.)
  • Consistent with KTransformers' existing configuration style (explicit params > implicit detection)
  • No need to wrap the process with numactl externally

Would love to hear your feedback on whether this approach covers your use case!

