Commit 2908c10

ParagEkbote authored and pytorchmergebot committed
Document the default garbage_collection_threshold value and improve the organization of cuda docs (pytorch#155341)
Fixes pytorch#150917

As mentioned in the issue, I've updated the documentation of `garbage_collection_threshold` and improved the organization. Could you please review?

Pull Request resolved: pytorch#155341
Approved by: https://github.com/AlannaBurke, https://github.com/ngimel
1 parent d41f62b commit 2908c10

1 file changed, +5 -3 lines changed

docs/source/notes/cuda.rst

Lines changed: 5 additions & 3 deletions
@@ -511,6 +511,8 @@ Available options:
 80% of the total memory allocated to the GPU application). The algorithm prefers
 to free old & unused blocks first to avoid freeing blocks that are actively being
 reused. The threshold value should be between greater than 0.0 and less than 1.0.
+The default value is set at 1.0.
+
 ``garbage_collection_threshold`` is only meaningful with ``backend:native``.
 With ``backend:cudaMallocAsync``, ``garbage_collection_threshold`` is ignored.
 * ``expandable_segments`` (experimental, default: `False`) If set to `True`, this setting instructs
@@ -546,20 +548,20 @@ Available options:
 appended to the end of the segment. This process does not create as many slivers
 of unusable memory, so it is more likely to succeed at finding this memory.
 
-`pinned_use_cuda_host_register` option is a boolean flag that determines whether to
+* `pinned_use_cuda_host_register` option is a boolean flag that determines whether to
 use the CUDA API's cudaHostRegister function for allocating pinned memory instead
 of the default cudaHostAlloc. When set to True, the memory is allocated using regular
 malloc and then pages are mapped to the memory before calling cudaHostRegister.
 This pre-mapping of pages helps reduce the lock time during the execution
 of cudaHostRegister.
 
-`pinned_num_register_threads` option is only valid when pinned_use_cuda_host_register
+* `pinned_num_register_threads` option is only valid when pinned_use_cuda_host_register
 is set to True. By default, one thread is used to map the pages. This option allows
 using more threads to parallelize the page mapping operations to reduce the overall
 allocation time of pinned memory. A good value for this option is 8 based on
 benchmarking results.
 
-`pinned_use_background_threads` option is a boolean flag to enable background thread
+* `pinned_use_background_threads` option is a boolean flag to enable background thread
 for processing events. This avoids any slow path associated with querying/processing of
 events in the fast allocation path. This feature is disabled by default.
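
For readers who want to try the options touched by this diff, the following is a minimal sketch of how they might be combined. It assumes the options are passed through the `PYTORCH_CUDA_ALLOC_CONF` environment variable as comma-separated `option:value` pairs and that the variable is set before `torch` initializes CUDA. The helper `build_alloc_conf` and its parameter names are hypothetical and are not part of PyTorch. The range check follows the documented wording (greater than 0.0 and less than 1.0), and the thread-count check reflects the note that `pinned_num_register_threads` is only valid together with `pinned_use_cuda_host_register`.

```python
import os


def build_alloc_conf(gc_threshold=None, use_host_register=False,
                     register_threads=None, background_threads=False):
    """Hypothetical helper: assemble an allocator config string.

    Only options explicitly requested are emitted; anything left out
    keeps whatever default the allocator uses.
    """
    parts = []
    if gc_threshold is not None:
        # The docs ask for a value strictly between 0.0 and 1.0.
        if not 0.0 < gc_threshold < 1.0:
            raise ValueError("gc_threshold must be > 0.0 and < 1.0")
        parts.append(f"garbage_collection_threshold:{gc_threshold}")
    if register_threads is not None and not use_host_register:
        # The thread count only applies to the cudaHostRegister path.
        raise ValueError("register_threads requires use_host_register=True")
    if use_host_register:
        parts.append("pinned_use_cuda_host_register:True")
        if register_threads is not None:
            parts.append(f"pinned_num_register_threads:{register_threads}")
    if background_threads:
        parts.append("pinned_use_background_threads:True")
    return ",".join(parts)


conf = build_alloc_conf(gc_threshold=0.8, use_host_register=True,
                        register_threads=8)
# Assumption: this must happen before torch initializes the CUDA allocator.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = conf
print(conf)
```

The values 0.8 and 8 are taken from the documentation text in the diff (the 80% example and the benchmark-based suggestion); they are starting points, not tuned recommendations.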
