Conversation

@edwintorok (Member) commented Jan 22, 2026

Measured the actual increase in host memory usage when increasing the number of vCPUs on a VM from 1 to 64:

```
delta vcpu,delta memory_overhead_pages,coeff
1,264,264
3,724,241.333
7,1848,264
15,3960,264
31,8186,264.065
63,16635,264.048
```

Ran the test on both an AMD and Intel host and got similar results.
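
For context, the `coeff` column above is simply the measured page delta divided by the vCPU delta. A minimal illustration (not part of the test code):

```ocaml
(* Illustrative only: pages-per-vCPU coefficient from a measured delta. *)
let coeff ~delta_vcpu ~delta_pages =
  float_of_int delta_pages /. float_of_int delta_vcpu

let () =
  (* e.g. the 63-vCPU row above: 16635 extra pages / 63 extra vCPUs *)
  Printf.printf "%.3f pages/vCPU\n" (coeff ~delta_vcpu:63 ~delta_pages:16635)
```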

Currently XAPI uses 256 pages/vCPU, which is an underestimate.
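
To put the gap in perspective (my own arithmetic, assuming 4 KiB pages and the ~264 pages/vCPU measured above), the old estimate is short by about 8 pages per vCPU, i.e. roughly 2 MiB for a 64-vCPU VM:

```ocaml
(* Illustrative only: rough per-VM shortfall of the old 256 pages/vCPU
   estimate, assuming 4 KiB pages and ~264 pages/vCPU measured. *)
let page_size_kib = 4
let shortfall_pages ~vcpus = vcpus * (264 - 256)

let () =
  Printf.printf "64-vCPU VM underestimated by ~%d KiB (~%d MiB)\n"
    (shortfall_pages ~vcpus:64 * page_size_kib)
    (shortfall_pages ~vcpus:64 * page_size_kib / 1024)
```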

This can lead to internal errors raised by xenguest, or NOT_ENOUGH_FREE_MEMORY errors raised by xenopsd, after assert_can_boot_here has already replied yes, even when booting VMs sequentially.
It could also lead XAPI to choose the wrong host to evacuate a VM to, which could lead to RPU migration failures.

This is a pre-existing bug, affecting both the versions of Xen in XS8 and XS9.

PR to feature branch because this will need testing together with all the other NUMA changes; it may expose latent bugs elsewhere.

The new testcase will get its own PR because it is quite large.

@psafont (Member) commented Jan 22, 2026

Does this depend on the Xen version? Do we know if XS8 is affected as well?

@edwintorok (Member, Author):

> Does this depend on the Xen version? Do we know if XS8 is affected as well?

In theory yes, but 265 seems to work for both. We may want to backport this to the LCM branch eventually, once it is all merged to master.

> This is a pre-existing bug, affecting both the versions of Xen in XS8 and XS9.

Currently 265 seems to work for both XS8 and XS9, although we will need to do somewhat wider testing on different hardware.

@TSnake41 commented Jan 22, 2026

265 (and even 256) feels like a lot of pages per vCPU (about 1 MiB); do you know what those pages are used for?
Worth noting that enabling features like nested virtualization would likely increase this number considerably in practice (especially when nested virtualization is actively used).

@bernhardkaindl (Collaborator) left a comment

> 265 (and even 256) feels like a lot of pages per vCPU (about 1 MiB); do you know what those pages are used for?

Apparently for a lot of things. I looked at past discussions on this: not even the Xen hypervisor maintainers know the exact number, and it has been said that it depends even on hardware capabilities and Xen's command-line settings.

Off-topic for this PR, but regarding the per-domain overhead: XenServer also carries Xen patches that increase the overhead per domain, so the testing done by Edwin is probably the best estimate we currently have.

> Worth noting that enabling features like nested virtualization would likely increase this number considerably in practice (especially when nested virtualization is actively used).

Thanks, yes, I think so too. Nested virt is a big topic that we can't simply enable, though: making it secure for production will take considerable engineering before a production-ready implementation lands upstream. So that is unfortunately quite forward-looking, but it does need to be on the list of changes to make when nested virt is productized.

@edwintorok (Member, Author) commented Jan 23, 2026

The 265 isn't very deterministic; with a newly installed Xen I now get 274 sometimes:

```
[WARNING]Memory overhead was underestimated by XAPI: 17920 < 18010, diff: 90. VMs required for failure: 3168
[2026-01-23T10:10:49.472943674-00:00|0000000000000000]  vcpu,memory_overhead_pages,coeff,vms
[2026-01-23T10:10:49.472926622-00:00|0000000000000000]  1,268,268,9223372036854775807
[2026-01-23T10:10:49.472929769-00:00|0000000000000000]  3,797,265.667,9223372036854775807
[2026-01-23T10:10:49.472932197-00:00|0000000000000000]  7,1854,264.857,9223372036854775807
[2026-01-23T10:10:49.472936030-00:00|0000000000000000]  15,3825,255,9223372036854775807
[2026-01-23T10:10:49.472938456-00:00|0000000000000000]  31,8487,273.774,9223372036854775807
[2026-01-23T10:10:49.472942689-00:00|0000000000000000]  63,17231,273.508,3168
[2026-01-23T10:10:49.473145625-00:00|0000000000000000]  [INFO]VM memory_overhead_pages = ... + vcpu * 273.774 =~ ... + vcpu * 274
[2026-01-23T10:10:49.473191606-00:00|0000000000000000]  [WARNING]With 3168 VMs it might be possible to trigger OOM error
```

So we might need a higher number, although 3168 is more than the maximum supported number of VMs per host, so in practice this won't cause an OOM on its own (but it could in combination with other inaccuracies).
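
For scale (my back-of-the-envelope calculation, assuming 4 KiB pages): a 90-page underestimate per 64-vCPU VM is about 360 KiB, so at the reported 3168 VMs the cumulative error is roughly 1 GiB:

```ocaml
(* Back-of-the-envelope: cumulative unaccounted memory at the VM count
   reported in the log above. Assumes 4 KiB pages. *)
let page_size = 4096
let underestimate_pages_per_vm = 90 (* diff reported for one 64-vCPU VM *)
let vms = 3168                      (* "VMs required for failure" *)

let () =
  let bytes = vms * underestimate_pages_per_vm * page_size in
  Printf.printf "~%.2f GiB unaccounted\n" (float_of_int bytes /. 1073741824.)
```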

@edwintorok edwintorok marked this pull request as draft January 26, 2026 09:15
@edwintorok (Member, Author):

Converted to draft; need to re-evaluate with the new Xen patches applied.

@edwintorok (Member, Author):

Interesting: if I update this to 274, then the test says it should be 282. Going to see if it stabilizes.

@edwintorok (Member, Author):

> Interesting: if I update this to 274, then the test says it should be 282. Going to see if it stabilizes.

Turns out the patch was completely wrong: the overhead is outside the shadow allocation, not inside it, so increasing shadow usage doesn't make the overall VM memory usage estimate any more accurate.

Moved the value outside the shadow allocation and repeated the tests, and now I get a stable value. It is still host dependent though: on one Intel host I get 294-256=38 extra pages per vCPU, on the other 265-256=9. I used the higher number.
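
A minimal sketch of the resulting model (hypothetical names; the real accounting lives in XAPI/xenopsd's memory calculations and differs in detail): the 256 pages/vCPU stay inside the shadow allocation, and the measured extra is accounted for separately, outside shadow.

```ocaml
(* Illustrative only (hypothetical names): per-vCPU overhead split after this
   change. Assumes 4 KiB pages; constants taken from the measurements above. *)
let shadow_pages_per_vcpu = 256      (* already reserved as shadow memory *)
let extra_pages_per_vcpu = 294 - 256 (* = 38, measured, outside shadow *)

let vcpu_overhead_pages ~vcpus =
  vcpus * (shadow_pages_per_vcpu + extra_pages_per_vcpu)

let () =
  Printf.printf "64 vCPUs -> %d pages (~%d MiB)\n"
    (vcpu_overhead_pages ~vcpus:64)
    (vcpu_overhead_pages ~vcpus:64 * 4 / 1024)
```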

@edwintorok edwintorok marked this pull request as ready for review January 27, 2026 11:08
@edwintorok edwintorok changed the title from "CA-423172: Xen uses ~265 pages/vCPU, not 256" to "CA-423172: Xen uses ~294 pages/vCPU, not 256" on Jan 27, 2026
edwintorok added a commit that referenced this pull request Jan 27, 2026
The only Xen command-line option related to this is `low_mem_virq_limit`,
which is 64 MiB.

A new quicktest has shown that we are sometimes off by ~10 MiB or more
(between `Host.compute_free_memory` and the actual free memory as measured
by a call to Xenctrl physinfo), and that we get failures booting VMs even
after `assert_can_boot_here` said yes. Sometimes the error messages are
quite ugly: internal xenguest/xenopsd errors instead of
HOST_NOT_ENOUGH_FREE_MEMORY.

After this change (together with
#6854) the new quicktest
doesn't fail anymore.

PR to feature branch because this will need testing together with all
the other NUMA changes; it may expose latent bugs elsewhere.


The new testcase will get its own PR because it is quite large.
@edwintorok edwintorok force-pushed the private/edvint/memory5 branch 2 times, most recently from 120b1a2 to 224b54f on January 27, 2026 11:27
Signed-off-by: Edwin Török <edwin.torok@citrix.com>
Measured the actual increase in host memory usage when increasing the number of
vCPUs on a VM from 1 to 64:

```
vcpu,memory_overhead_pages,coeff
1,264,264
2,558,279
3,776,258.667
4,1032,258
5,1350,270
6,1614,269
7,1878,268.286
8,2056,257
9,2406,267.333
10,2670,267
11,2934,266.727
12,3198,266.5
13,3462,266.308
14,3726,266.143
15,3990,266
16,4254,265.875
17,4518,265.765
18,4782,265.667
19,5046,265.579
20,5310,265.5
21,5574,265.429
22,5838,265.364
23,6102,265.304
24,6366,265.25
25,6630,265.2
26,6894,265.154
27,7158,265.111
28,7422,265.071
29,7686,265.034
30,7952,265.067
31,8216,265.032
32,8480,265
33,8744,264.97
34,9009,264.971
35,9276,265.029
36,9543,265.083
37,9810,265.135
38,10076,265.158
39,10340,265.128
40,10604,265.1
41,10869,265.098
42,11133,265.071
43,11397,265.047
44,11662,265.045
45,11925,265
46,12191,265.022
47,12454,264.979
0,30,0
1,294,294
2,558,279
3,822,274
4,1086,271.5
5,1350,270
6,1614,269
7,1878,268.286
8,2142,267.75
9,2406,267.333
10,2670,267
11,2934,266.727
12,3198,266.5
13,3462,266.308
14,3726,266.143
15,3990,266
16,4254,265.875
17,4518,265.765
18,4782,265.667
19,5046,265.579
20,5310,265.5
21,5574,265.429
22,5838,265.364
23,6102,265.304
24,6366,265.25
25,6630,265.2
26,6894,265.154
27,7158,265.111
28,7422,265.071
29,7686,265.034
30,7952,265.067
31,8216,265.032
32,8480,265
33,8744,264.97
34,9011,265.029
35,9278,265.086
36,9546,265.167
37,9811,265.162
38,10076,265.158
39,10340,265.128
40,10603,265.075
41,10869,265.098
42,11132,265.048
43,11397,265.047
44,11663,265.068
45,11925,265
46,12191,265.022
47,12456,265.021
[INFO]VM memory_overhead_pages = ... + vcpu * 294 =~ ... + vcpu * 294
```

We already allocate 256 pages/vcpu as part of shadow, so we need an extra 294-256=38 pages/vcpu.

This can lead to internal errors raised by xenguest, or NOT_ENOUGH_FREE_MEMORY
errors raised by xenopsd, after `assert_can_boot_here` has already replied yes,
even when booting VMs sequentially.
It could also lead XAPI to choose the wrong host to evacuate a VM to, which
could lead to RPU migration failures.

This is a pre-existing bug, affecting both the versions of Xen in XS8 and XS9.

We cannot allocate this from shadow, because Xen doesn't allocate these
pages from shadow, so the memory usage would never converge.

On another host the measured overhead is less, take the maximum for now:
```
[INFO]VM memory_overhead_pages = ... + vcpu * 264.067 =~ ... + vcpu * 265
```

Also, the amount of shadow memory reserved is nearly twice as much as
needed, especially since shadow paging is compiled out of Xen, but
overestimates are OK, and we might fix that separately.

Signed-off-by: Edwin Török <edwin.torok@citrix.com>
@edwintorok edwintorok force-pushed the private/edvint/memory5 branch from 224b54f to ec3bd4a on January 27, 2026 13:59
@edwintorok edwintorok added this pull request to the merge queue Jan 27, 2026
Merged via the queue into xapi-project:feature/numa-xs9 with commit 72c7a25 Jan 27, 2026
16 checks passed