Commit 6586a26

docs: Include free page hinting docs
Update docs to include details about the new free page hinting feature in firecracker. Signed-off-by: Jack Thomson <[email protected]>
1 parent e1d6d83 commit 6586a26

1 file changed

+126
-4
lines changed

docs/ballooning.md

Lines changed: 126 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -44,10 +44,13 @@ The device has two optional features which can be enabled with the following
4444
options:
4545

4646
- `free_page_reporting`: A mechanism for the guest to continually report ranges
47-
of memory which the guest is not using and can be reclaimed. Read more here
48-
- `free_page_hinting`: A dev-preview feature which is a different mechanism to
49-
reclaim memory from the guest, this is instead triggered from the host. Read
50-
more here.
47+
of memory which the guest is not using and can be reclaimed.
48+
[Read more here](#virtio-balloon-free-page-reporting)
49+
- `free_page_hinting`: A
50+
[developer-preview](../docs/RELEASE_POLICY.md#developer-preview-features)
51+
feature which is a different mechanism to reclaim memory from the guest; this
52+
is instead triggered from the host.
53+
[Read more here](#virtio-balloon-free-page-hinting)
5154

5255
## Security disclaimer
5356

@@ -314,6 +317,125 @@ the `page_reporting_order` module parameter in the guest kernel. The page order
314317
comes with trade-offs between performance and memory reclaimed; a good target is
315318
to have the reported ranges match the backing page size.
316319

320+
## Virtio balloon free page hinting
321+
322+
Free page hinting is a
323+
[developer-preview](../docs/RELEASE_POLICY.md#developer-preview-features)
324+
feature, which allows the guest driver to report ranges of memory which are not
325+
being used. In Firecracker, the balloon device will `madvise` the range with the
326+
`MADV_DONTNEED` flag, reducing the RSS of the guest. Free page hinting differs
327+
from reporting as this is instead initiated from the host side, giving more
328+
flexibility on when to reclaim memory.
329+
330+
To enable free page hinting when creating the balloon device, the
331+
`free_page_hinting` attribute should be set in the JSON object.
332+
333+
An example of how to configure the device to enable free page hinting:
334+
335+
```console
336+
socket_location=...
337+
amount_mib=...
338+
deflate_on_oom=...
339+
polling_interval=...
340+
341+
curl --unix-socket $socket_location -i \
342+
-X PUT 'http://localhost/balloon' \
343+
-H 'Accept: application/json' \
344+
-H 'Content-Type: application/json' \
345+
-d "{
346+
\"amount_mib\": $amount_mib, \
347+
\"deflate_on_oom\": $deflate_on_oom, \
348+
\"stats_polling_interval_s\": $polling_interval, \
349+
\"free_page_hinting\": true \
350+
}"
351+
```
352+
353+
Free page hinting is initiated and managed by Firecracker. The core mechanism
354+
to control a run is the `cmd_id`: when Firecracker sets the `cmd_id` to a new
355+
number, the driver acknowledges this and starts reporting ranges, which
356+
Firecracker frees. Once the driver has reported all the ranges it can find,
357+
it updates the `cmd_id` to reflect this. The driver then holds these
358+
ranges until Firecracker sends the stop command, which allows the guest driver
359+
to reclaim the memory. The time required for the guest to complete a hinting
360+
run depends on many factors and is mostly dictated by the guest; in testing,
361+
however, the average time was ~0.2 seconds for a 1GB VM.
362+
363+
This control mechanism in Firecracker is managed through three separate
364+
endpoints: `/balloon/hinting/start`, `/balloon/hinting/status` and
365+
`/balloon/hinting/stop`. For simple operation, call the start endpoint with
366+
`acknowledge_on_stop = true`, which will automatically send the stop command
367+
once the driver has finished.
368+
369+
An example of sending this command:
370+
371+
```console
372+
curl --unix-socket $socket_location -i \
373+
-X POST 'http://localhost/balloon/hinting/start' \
374+
-H 'Accept: application/json' \
375+
-H 'Content-Type: application/json' \
376+
-d "{
377+
\"acknowledge_on_stop\": true \
378+
}"
379+
```
380+
381+
For fine-grained control, set `acknowledge_on_stop = false`; Firecracker will
382+
then not send the acknowledge message. This can be used to get the guest to
383+
hold onto more memory. Using the `/balloon/hinting/status` endpoint, you can
384+
get the last `cmd_id` sent by Firecracker and the last update from the guest.
385+
386+
An example of the status request and response:
387+
388+
```console
389+
curl --unix-socket $socket_location -i \
390+
-X GET 'http://localhost/balloon/hinting/status' \
391+
-H 'Accept: application/json' \
392+
-H 'Content-Type: application/json'
393+
```
394+
395+
Response:
396+
397+
```json
398+
{
399+
"host_cmd": 1,
400+
"guest_cmd": 2
401+
}
402+
```
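
The two fields in the status response above can be pulled out in a script. The following is a minimal sketch, not part of Firecracker: `hinting_field` is a hypothetical helper name, and it assumes the response body is passed without HTTP headers (for example by calling `curl` with `-s` instead of `-i`) and that both fields are plain non-negative integers. A JSON-aware tool would be more robust than `sed`.

```shell
# Hypothetical helper: extract an integer field from the status JSON body.
# Usage: hinting_field <field_name> <json_body>
hinting_field() {
    # Strip whitespace so the pattern does not depend on formatting,
    # then capture the digits that follow "<field_name>":
    printf '%s' "$2" | tr -d ' \t\n' \
        | sed -n "s/.*\"$1\":\([0-9][0-9]*\).*/\1/p"
}

# Body copied from the example response above.
status_body='{ "host_cmd": 1, "guest_cmd": 2 }'

host_cmd=$(hinting_field host_cmd "$status_body")
guest_cmd=$(hinting_field guest_cmd "$status_body")
echo "host_cmd=$host_cmd guest_cmd=$guest_cmd"
```

With the example body, this prints `host_cmd=1 guest_cmd=2`.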
403+
404+
An example of the stop endpoint:
405+
406+
```console
407+
curl --unix-socket $socket_location -i \
408+
-X POST 'http://localhost/balloon/hinting/stop' \
409+
-H 'Accept: application/json' \
410+
-H 'Content-Type: application/json' \
411+
-d "{}"
412+
```
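
The fine-grained flow (start without automatic acknowledgement, inspect the status, then stop) could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the socket path is passed in as the first argument, and the fixed `sleep` is only a stand-in for whatever condition the caller uses to decide when the guest may reclaim its memory. The endpoints, methods and payloads are the ones shown in the examples above.

```shell
# Hypothetical wrappers around the three hinting endpoints.
# The first argument is always the API socket path.

# hinting_start <socket> <acknowledge_on_stop: true|false>
hinting_start() {
    curl --unix-socket "$1" -s \
        -X POST 'http://localhost/balloon/hinting/start' \
        -H 'Accept: application/json' \
        -H 'Content-Type: application/json' \
        -d "{\"acknowledge_on_stop\": $2}"
}

# hinting_status <socket>
hinting_status() {
    curl --unix-socket "$1" -s \
        -X GET 'http://localhost/balloon/hinting/status' \
        -H 'Accept: application/json'
}

# hinting_stop <socket>
hinting_stop() {
    curl --unix-socket "$1" -s \
        -X POST 'http://localhost/balloon/hinting/stop' \
        -H 'Accept: application/json' \
        -H 'Content-Type: application/json' \
        -d '{}'
}

# manual_hinting_run <socket> <hold_seconds>
manual_hinting_run() {
    hinting_start "$1" false   # start a run without automatic acknowledgement
    sleep "$2"                 # placeholder: guest keeps holding hinted ranges
    hinting_status "$1"        # print host_cmd / guest_cmd
    hinting_stop "$1"          # allow the guest to reclaim the memory
}
```

A call such as `manual_hinting_run "$socket_location" 5` would then perform one manually acknowledged run.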
413+
414+
On snapshot restore, the `cmd_id` is **always** set to the stop `cmd_id` to
415+
allow the guest to reclaim the memory. If you have a particular use-case which
416+
requires this not to be the case, please raise an issue with a description of
417+
your scenario.
418+
419+
> [!WARNING]
420+
>
421+
> Free page hinting was primarily designed for live migration. Because of this,
422+
> there is a caveat in the device spec which means the guest is able to reclaim
423+
> memory before Firecracker even receives the range to free. This can lead to a
424+
> scenario where the device frees memory that has been reclaimed in the guest,
425+
> potentially corrupting memory. The chances of this race happening are low, but
426+
> not impossible; hence the developer-preview status.
427+
>
428+
> We are currently working with the kernel community on a feature that will
429+
> eliminate this race. Once this has been resolved, we will update the device.
430+
>
431+
> One way to safely use this feature when using UFFD is:
432+
>
433+
> 1. Enable `WRITEPROTECT` on the VM memory before starting a hinting run.
434+
> 1. Track ranges that are written to.
435+
> 1. Skip these ranges when Firecracker reports them for freeing.
436+
>
437+
> This will prevent ranges which have been reclaimed from being freed.
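
The "skip these ranges" step comes down to an overlap check between a range that is about to be freed and the ranges known to have been written. The sketch below only illustrates that arithmetic; it is not a UFFD handler, and the function names and the one-range-per-line `<start> <length>` bookkeeping format are assumptions made for this example.

```shell
# Hypothetical bookkeeping: written ranges are stored one per line as
# "<start> <length>" (byte offsets into guest memory).

# ranges_overlap <start_a> <len_a> <start_b> <len_b>
# Succeeds when the half-open ranges [start, start+len) intersect.
ranges_overlap() {
    [ "$1" -lt $(( $3 + $4 )) ] && [ "$3" -lt $(( $1 + $2 )) ]
}

# safe_to_free <start> <len> <written_ranges>
# Succeeds only if the hinted range touches no written range.
safe_to_free() {
    while read -r w_start w_len; do
        [ -n "$w_start" ] || continue          # ignore empty lines
        if ranges_overlap "$1" "$2" "$w_start" "$w_len"; then
            return 1                           # written since the run began
        fi
    done <<EOF
$3
EOF
    return 0
}
```

For example, with written ranges `4096 4096` and `16384 8192`, `safe_to_free 0 4096` succeeds, while `safe_to_free 20480 4096` fails because it falls inside the second written range.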
438+
317439
## Balloon Caveats
318440

319441
- Firecracker has no control over the speed of inflation or deflation; this is
