@@ -44,10 +44,12 @@ The device has two optional features which can be enabled with the following
4444options:
4545
4646- ` free_page_reporting ` : A mechanism for the guest to continually report ranges
47- of memory which the guest is not using and can be reclaimed. Read more here
48- - ` free_page_hinting ` : A dev-preview feature which is a different mechanism to
49- reclaim memory from the guest, this is instead triggered from the host. Read
50- more here.
47+ of memory which the guest is not using and can be reclaimed.
48+ [Read more here](#virtio-balloon-free-page-reporting)
49+ - [(Developer Preview)](../docs/RELEASE_POLICY.md#developer-preview-features)
50+ `free_page_hinting`: A mechanism to reclaim memory from the guest; unlike
51+ reporting, it is triggered from the host.
52+ [Read more here](#virtio-balloon-free-page-hinting)
5153
5254## Security disclaimer
5355
@@ -314,6 +316,125 @@ the `page_reporting_order` module parameter in the guest kernel. The page order
314316comes with trade-offs between performance and memory reclaimed; a good target is
315317to have the reported ranges match the backing page size.
316318
319+ ## Virtio balloon free page hinting
320+
321+ Free page hinting is a
322+ [developer-preview](../docs/RELEASE_POLICY.md#developer-preview-features)
323+ feature that allows the guest driver to report ranges of memory which are not
324+ being used. In Firecracker, the balloon device will `madvise` the range with the
325+ `MADV_DONTNEED` flag, reducing the RSS of the guest. Free page hinting differs
326+ from reporting in that it is initiated from the host side, giving more
327+ flexibility on when to reclaim memory.
328+
329+ To enable free page hinting when creating the balloon device, set the
330+ `free_page_hinting` attribute to `true` in the JSON object.
331+
332+ An example of how to configure the device to enable free page hinting:
333+
334+ ``` console
335+ socket_location=...
336+ amount_mib=...
337+ deflate_on_oom=...
338+ polling_interval=...
339+
340+ curl --unix-socket $socket_location -i \
341+ -X PUT 'http://localhost/balloon' \
342+ -H 'Accept: application/json' \
343+ -H 'Content-Type: application/json' \
344+ -d "{
345+ \"amount_mib\": $amount_mib, \
346+ \"deflate_on_oom\": $deflate_on_oom, \
347+ \"stats_polling_interval_s\": $polling_interval, \
348+ \"free_page_hinting\": true \
349+ }"
350+ ```
351+
352+ Free page hinting is initiated and managed by Firecracker. The core mechanism
353+ for controlling a run is the `cmd_id`: when Firecracker sets the `cmd_id` to a
354+ new number, the driver acknowledges this and starts reporting ranges, which
355+ Firecracker frees. Once the device has reported all the ranges it can find, it
356+ updates the `cmd_id` to reflect this. The device then holds these ranges until
357+ Firecracker sends the stop command, which allows the guest driver to reclaim the
358+ memory. The time required for the guest to complete a hinting run depends on
359+ many factors and is mostly dictated by the guest; in testing, however, the
360+ average time was ~0.2 seconds for a 1GB VM.
361+
362+ This control mechanism in Firecracker is managed through three separate
363+ endpoints: `/balloon/hinting/start`, `/balloon/hinting/status` and
364+ `/balloon/hinting/stop`. For simple operation, call the start endpoint with
365+ `acknowledge_on_stop = true`, which will automatically send the stop command
366+ once the driver has finished.
367+
368+ An example of sending this command:
369+
370+ ``` console
371+ curl --unix-socket $socket_location -i \
372+ -X POST 'http://localhost/balloon/hinting/start' \
373+ -H 'Accept: application/json' \
374+ -H 'Content-Type: application/json' \
375+ -d "{
376+ \"acknowledge_on_stop\": true \
377+ }"
378+ ```
379+
380+ For fine-grained control, set `acknowledge_on_stop = false` so that Firecracker
381+ does not send the acknowledge message. This can be used to get the guest to hold
382+ onto more memory. Using the `/status` endpoint, you can get information about the
383+ last `cmd_id` sent by Firecracker and the last update from the guest.
384+
385+ An example of the status request and response:
386+
387+ ``` console
388+ curl --unix-socket $socket_location -i \
389+ -X GET 'http://localhost/balloon/hinting/status' \
390+ -H 'Accept: application/json' \
391+ -H 'Content-Type: application/json'
392+ ```
393+
394+ Response:
395+
396+ ``` json
397+ {
398+ "host_cmd": 1,
399+ "guest_cmd": 2
400+ }
401+ ```
402+
403+ An example of the stop endpoint:
404+
405+ ``` console
406+ curl --unix-socket $socket_location -i \
407+ -X POST 'http://localhost/balloon/hinting/stop' \
408+ -H 'Accept: application/json' \
409+ -H 'Content-Type: application/json' \
410+ -d "{}"
411+ ```
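
The three endpoints above can be combined into a manual hinting run. The following is a minimal sketch, not a definitive workflow: the function names (`hinting_call`, `hinting_start`, `hinting_status`, `hinting_stop`) are hypothetical helpers, `$socket_location` is assumed to be set as in the earlier examples, and `-s` is used instead of `-i` so that only the response body is printed. The endpoints and request bodies are the ones shown in this section.

```shell
# Hypothetical wrappers around the hinting endpoints documented above.
# Assumes $socket_location points at the Firecracker API socket.

hinting_call() {
    # $1 = HTTP method, $2 = endpoint name (start|status|stop),
    # $3 = optional JSON request body
    if [ -n "$3" ]; then
        curl --unix-socket "$socket_location" -s \
            -X "$1" "http://localhost/balloon/hinting/$2" \
            -H 'Accept: application/json' \
            -H 'Content-Type: application/json' \
            -d "$3"
    else
        curl --unix-socket "$socket_location" -s \
            -X "$1" "http://localhost/balloon/hinting/$2" \
            -H 'Accept: application/json'
    fi
}

# $1 = true|false, passed through as acknowledge_on_stop
hinting_start()  { hinting_call POST start "{\"acknowledge_on_stop\": $1}"; }
hinting_status() { hinting_call GET status ""; }
hinting_stop()   { hinting_call POST stop "{}"; }

# Example manual run (left commented out so that sourcing this file has no
# side effects):
#   hinting_start false   # start a run without the automatic stop
#   hinting_status        # inspect host_cmd / guest_cmd
#   hinting_stop          # let the guest reclaim the hinted memory
```

How long to wait between `hinting_status` calls, and which `host_cmd` and `guest_cmd` values to wait for before calling `hinting_stop`, are left to the caller, since the text above does not specify them.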
412+
413+ On snapshot restore, the `cmd_id` is **always** set to the stop `cmd_id` to
414+ allow the guest to reclaim the memory. If you have a particular use-case which
415+ requires this not to be the case, please raise an issue with a description of
416+ your scenario.
417+
418+ > [!WARNING]
419+ >
420+ > Free page hinting was primarily designed for live migration. Because of this,
421+ > there is a caveat in the device spec: the guest is able to reclaim memory
422+ > before Firecracker even receives the range to free. This can lead to a
423+ > scenario where the device frees memory that has been reclaimed in the guest,
424+ > potentially corrupting memory. The chances of this race happening are low, but
425+ > not impossible; hence the developer-preview status.
426+ >
427+ > We are currently working with the kernel community on a feature that will
428+ > eliminate this race. Once this has been resolved, we will update the device.
429+ >
430+ > One way to safely use this feature when using UFFD is:
431+ >
432+ > 1. Enable `WRITEPROTECT` on the VM memory before starting a hinting run.
433+ > 1. Track ranges that are written to.
434+ > 1. Skip these ranges when Firecracker reports them for freeing.
435+ >
436+ > This will prevent ranges which have been reclaimed from being freed.
437+
317438## Balloon Caveats
318439
319440- Firecracker has no control over the speed of inflation or deflation; this is