
CP-309998: ignore small amount of pages in other nodes #6884

Merged
edwintorok merged 1 commit into master from private/marcusg/cp-309998
Feb 4, 2026

Conversation


@mg12 mg12 commented Feb 3, 2026

This is an optimisation of the node-count calculation until we fix all the small Xen-internal allocations that are made for the VM outside the guest's main memory. It synchronises the behaviour with the RRD VM.numa_nodes calculation.

rrdp-squeezed currently uses a threshold of >4096 pages when counting
the number of nodes. This ignores the small internal data structures
that Xen or other kernel devices may sometimes allocate for the VM
outside the node where the VM's main memory is allocated.

This is a temporary fix until CP-311303 accounts for the small number
of pages that sometimes appear outside the VM's main node. In
experiments it is usually a single-digit number, often 1; the maximum
observed was around 2200 pages.

Without this fix, the VM.numa_nodes calculation differs from the one
returned by rrdp-squeezed, and VM.numa_nodes is over-sensitive to
these small numbers of pages in other nodes.

@mg12 mg12 requested review from edwintorok and lindig February 3, 2026 17:02

psafont commented Feb 4, 2026

Could you explain in the commit why 16 MiB is used? I think it's because it's the minimum allowed, but it would be good to confirm and document it.


mg12 commented Feb 4, 2026

4096 is used because it's the value used in the RRD behaviour of the VM.numa_nodes calculation, so that both code paths (RRD and the VM.numa_nodes field) return the same result.

@lindig lindig left a comment

I agree with @psafont that this needs a comment here explaining what this calculation does.


Signed-off-by: Marcus Granado <marcus.granado@cloud.com>
@mg12 mg12 force-pushed the private/marcusg/cp-309998 branch from b662272 to 464de44 on February 4, 2026 12:18

mg12 commented Feb 4, 2026

I've updated the PR description and the commit message to convey more clearly what this calculation does.

@edwintorok

dss_numa_info should call XenctrlExt.DomainNuma.state to calculate the number of NUMA nodes and the available memory, sharing the code with xenopsd. That way workarounds like this can be centralized, code duplication avoided, and we won't reintroduce this difference when dropping the workaround.

Let's merge this workaround for now and then open another PR to share the code and reduce the tech debt being introduced here.

@edwintorok edwintorok added this pull request to the merge queue Feb 4, 2026
Merged via the queue into master with commit ba43e00 Feb 4, 2026
28 checks passed

edwintorok commented Feb 4, 2026

To avoid a magic constant (4096), we could perhaps ignore pages up to memory_overhead + video_memory instead (we only claim the main memory; any overhead, the p2m pages, etc. could be allocated from somewhere else if we ran out).

Try filtering out all zero-sized nodes in the VM's memory usage, then sort the per-node memory in decreasing order, and use a fold to go through the nodes, subtracting from a running main_memory_used and incrementing main_memory_nodes. Once main_memory_used hits 0, then from the next NUMA node (if any) increment overhead_memory_nodes instead, and stop incrementing main_memory_nodes.
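A minimal sketch of that fold; the names (classify_nodes, main_memory, per_node_pages) are illustrative, not from the xapi codebase:

```ocaml
(* Hypothetical sketch of the suggested classification: the largest
   nodes that together satisfy the VM's main memory claim count as
   main-memory nodes; any remaining non-empty nodes are overhead. *)
let classify_nodes ~main_memory per_node_pages =
  let sorted =
    per_node_pages
    |> List.filter (fun pages -> pages > 0L)      (* drop zero-sized nodes *)
    |> List.sort (fun a b -> Int64.compare b a)   (* decreasing order *)
  in
  let _, main_nodes, overhead_nodes =
    List.fold_left
      (fun (remaining, main_nodes, overhead_nodes) pages ->
        if remaining > 0L then
          (* still covering the VM's main memory claim *)
          (Int64.sub remaining pages, main_nodes + 1, overhead_nodes)
        else
          (* claim already satisfied: anything further is overhead *)
          (remaining, main_nodes, overhead_nodes + 1))
      (main_memory, 0, 0) sorted
  in
  (main_nodes, overhead_nodes)
```

With main_memory = 1_000_000L pages and per-node usage [999_000L; 2_200L; 1L; 0L], this yields 2 main-memory nodes (the first two together cover the claim) and 1 overhead node.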

There is a xapi function to compute the overhead; rrdp probably can't use it directly, but it might be able to link memory.ml and calculate the overhead itself.

@lindig lindig left a comment

Still think there should be a code comment: (* see commit message *)

in
let optimised = nodes = 1 || nodes < host_nodes in
{optimised; nodes; memory}
with Failure _ -> default

While you are here: is this unexpected and should be logged?


4 participants