Commit a34497f

docs: Add free page hinting documentation
Update docs to include details about the new free page hinting feature in firecracker. Signed-off-by: Jack Thomson <[email protected]>
1 parent a02bc71 commit a34497f

1 file changed

+125
-4
lines changed

docs/ballooning.md

Lines changed: 125 additions & 4 deletions
@@ -44,10 +44,12 @@ The device has two optional features which can be enabled with the following
4444
options:
4545

4646
- `free_page_reporting`: A mechanism for the guest to continually report ranges
47-
of memory which the guest is not using and can be reclaimed. Read more here
48-
- `free_page_hinting`: A dev-preview feature which is a different mechanism to
49-
reclaim memory from the guest, this is instead triggered from the host. Read
50-
more here.
47+
of memory which the guest is not using and can be reclaimed.
48+
[Read more here](#virtio-balloon-free-page-reporting)
49+
- [(Developer Preview)](../docs/RELEASE_POLICY.md#developer-preview-features)
50+
`free_page_hinting`: A mechanism to reclaim memory from the guest which is
51+
instead triggered from the host.
52+
[Read more here](#virtio-balloon-free-page-hinting)
5153

5254
## Security disclaimer
5355

@@ -314,6 +316,125 @@ the `page_reporting_order` module parameter in the guest kernel. The page order
314316
comes with trade-offs between performance and memory reclaimed; a good target is
315317
to have the reported ranges match the backing page size.
316318

319+
## Virtio balloon free page hinting
320+
321+
Free page hinting is a
322+
[developer-preview](../docs/RELEASE_POLICY.md#developer-preview-features)
323+
feature, which allows the guest driver to report ranges of memory which are not
324+
being used. In Firecracker, the balloon device will `madvise` the range with the
325+
`MADV_DONTNEED` flag, reducing the RSS of the guest. Free page hinting differs
326+
from reporting as this is instead initiated from the host side, giving more
327+
flexibility on when to reclaim memory.
328+
329+
To enable free page hinting when creating the balloon device, the
330+
`free_page_hinting` attribute should be set in the JSON object.
331+
332+
An example of how to configure the device to enable free page hinting:
333+
334+
```console
335+
socket_location=...
336+
amount_mib=...
337+
deflate_on_oom=...
338+
polling_interval=...
339+
340+
curl --unix-socket $socket_location -i \
341+
-X PUT 'http://localhost/balloon' \
342+
-H 'Accept: application/json' \
343+
-H 'Content-Type: application/json' \
344+
-d "{
345+
\"amount_mib\": $amount_mib, \
346+
\"deflate_on_oom\": $deflate_on_oom, \
347+
\"stats_polling_interval_s\": $polling_interval, \
348+
\"free_page_hinting\": true \
349+
}"
350+
```
351+
352+
Free page hinting is initiated and managed by Firecracker. The core mechanism to
353+
control a run is the `cmd_id`: when Firecracker sets the `cmd_id` to a new
354+
number, the driver will acknowledge this and start reporting ranges, which
355+
Firecracker will free. Once the driver has reported all the ranges it can find,
356+
it will update the `cmd_id` to reflect this. The driver will then hold these
357+
ranges until Firecracker sends the stop command which allows the guest driver to
358+
reclaim the memory. The time required for the guest to complete a hinting run is
359+
dependent on many factors and is mostly dictated by the guest. However, in
360+
testing the average time is ~0.2 seconds for a 1 GB VM.
361+
362+
This control mechanism in Firecracker is managed through three separate
363+
endpoints: `/balloon/hinting/start`, `/balloon/hinting/status` and
364+
`/balloon/hinting/stop`. For simple operation, call the start endpoint with
365+
`acknowledge_on_stop = true`, which will automatically send the stop command
366+
once the driver has finished.
367+
368+
An example of sending this command:
369+
370+
```console
371+
curl --unix-socket $socket_location -i \
372+
-X POST 'http://localhost/balloon/hinting/start' \
373+
-H 'Accept: application/json' \
374+
-H 'Content-Type: application/json' \
375+
-d "{
376+
\"acknowledge_on_stop\": true \
377+
}"
378+
```
379+
380+
For fine-grained control, set `acknowledge_on_stop = false`; Firecracker will then
381+
not send the acknowledge message. This can be used to get the guest to hold onto
382+
more memory. Using the `/status` endpoint, you can get information about the
383+
last `cmd_id` sent by Firecracker and the last update from the guest.
384+
385+
An example of the status request and response:
386+
387+
```console
388+
curl --unix-socket $socket_location -i \
389+
-X GET 'http://localhost/balloon/hinting/status' \
390+
-H 'Accept: application/json' \
391+
-H 'Content-Type: application/json'
392+
```
393+
394+
Response:
395+
396+
```json
397+
{
398+
"host_cmd": 1,
399+
"guest_cmd": 2
400+
}
401+
```
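To compare the two values in a script, the fields can be pulled out of the response body. The following is only a minimal sketch: `hinting_field` is a hypothetical helper (not part of Firecracker), and it assumes the response is a flat JSON object with unsigned integer values, as in the example above. It uses `sed` so that no extra tooling is required.

```shell
# Hypothetical helper: print the integer value of one field from the
# hinting status JSON. Assumes a flat object with unsigned integer values.
#   $1 = field name (for example host_cmd or guest_cmd)
#   $2 = JSON text returned by /balloon/hinting/status
hinting_field() {
    printf '%s' "$2" | tr -d '\n' |
        sed -n "s/.*\"$1\"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p"
}
```

For instance, `hinting_field guest_cmd "$body"` prints the guest value when `$body` holds the response shown above; nothing is printed if the field is missing.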
402+
403+
An example of the stop endpoint:
404+
405+
```console
406+
curl --unix-socket $socket_location -i \
407+
-X POST 'http://localhost/balloon/hinting/stop' \
408+
-H 'Accept: application/json' \
409+
-H 'Content-Type: application/json' \
410+
-d "{}"
411+
```
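Putting the three endpoints together, a manually controlled run might look like the sketch below. It is illustrative only: the function names are hypothetical, `socket_location` is assumed to be set as in the earlier examples, and the fixed delay is an arbitrary placeholder, since how to detect that the guest has finished from the status values is not covered here.

```shell
# Hypothetical helper: build the JSON body for the start endpoint.
#   $1 = true or false
hinting_start_body() {
    printf '{"acknowledge_on_stop": %s}' "$1"
}

# Hypothetical helper: call one of the hinting endpoints.
#   $1 = HTTP method, $2 = start|status|stop, $3 = optional JSON body
# Assumes socket_location points at the Firecracker API socket.
hinting_call() {
    if [ -n "${3:-}" ]; then
        curl --silent --unix-socket "$socket_location" \
            -X "$1" "http://localhost/balloon/hinting/$2" \
            -H 'Accept: application/json' \
            -H 'Content-Type: application/json' \
            -d "$3"
    else
        curl --silent --unix-socket "$socket_location" \
            -X "$1" "http://localhost/balloon/hinting/$2" \
            -H 'Accept: application/json'
    fi
}

# Start a run without automatic acknowledgement, wait, inspect, then stop.
#   $1 = seconds to wait before checking status (arbitrary default of 1)
manual_hinting_run() {
    hinting_call POST start "$(hinting_start_body false)"
    sleep "${1:-1}"
    hinting_call GET status
    hinting_call POST stop '{}'
}
```

Calling `manual_hinting_run 2` would start a run, wait two seconds, print the status response, and then send the stop command so the guest can reclaim the memory.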
412+
413+
On snapshot restore, the `cmd_id` is **always** set to the stop `cmd_id` to
414+
allow the guest to reclaim the memory. If you have a particular use case which
415+
requires this not to be the case, please raise an issue with a description of
416+
your scenario.
417+
418+
> [!WARNING]
419+
>
420+
> Free page hinting was primarily designed for live migration. Because of this,
421+
> there is a caveat to the device spec which means the guest is able to reclaim
422+
> memory before Firecracker even receives the range to free. This can lead to a
423+
> scenario where the device frees memory that has been reclaimed in the guest,
424+
> potentially corrupting memory. The chances of this race happening are low, but
425+
> not impossible; hence the developer-preview status.
426+
>
427+
> We are currently working with the kernel community on a feature that will
428+
> eliminate this race. Once this has been resolved, we will update the device.
429+
>
430+
> One way to safely use this feature when using UFFD is:
431+
>
432+
> 1. Enable `WRITEPROTECT` on the VM memory before starting a hinting run.
433+
> 1. Track ranges that are written to.
434+
> 1. Skip these ranges when Firecracker reports them for freeing.
435+
>
436+
> This will prevent ranges which have been reclaimed from being freed.
437+
317438
## Balloon Caveats
318439

319440
- Firecracker has no control over the speed of inflation or deflation; this is
