Commit d598dcb

pluto: Add documentation to pluto/README.md
1 parent 4aa9df9 commit d598dcb

1 file changed

+257
-1
lines changed

pluto/README.md

Lines changed: 257 additions & 1 deletion
@@ -10,6 +10,262 @@ What is it?
1010
===========
1111

1212
Pluto contains high-level abstractions for memory resource management and offloading data.
13-
It relies on the low-level hic (HIP/CUDA abstraction) library to support GPUs.
13+
The memory resource management is based on and compatible with C++17 `std::pmr::memory_resource` and `std::pmr::polymorphic_allocator`,
14+
and is extended with asynchronous (de)allocation methods.
15+
16+
GPU specific memory resources are available, delegating to the low-level hic (HIP/CUDA abstraction) library to support GPUs.
17+
18+
Pluto can be used and configured both from C++ and Fortran.
19+
20+
The concepts
21+
============
22+
23+
### pluto::memory_resource
24+
25+
The `pluto::memory_resource` abstract class is an alias for
26+
[`std::pmr::memory_resource`](https://en.cppreference.com/w/cpp/memory/memory_resource)
27+
and provides the following noteworthy member functions:
28+
```c++
29+
void* allocate(std::size_t bytes, std::size_t alignment) {
30+
return do_allocate(bytes, alignment);
31+
}
32+
void deallocate(void* ptr, std::size_t bytes, std::size_t alignment) {
33+
do_deallocate(ptr, bytes, alignment);
34+
}
35+
```
36+
Concrete implementations deriving from `pluto::memory_resource` must implement these functions:
37+
```c++
38+
void* do_allocate(std::size_t bytes, std::size_t alignment) override;
39+
void do_deallocate(void* ptr, std::size_t bytes, std::size_t alignment) override;
40+
```
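
As an illustration of deriving a concrete resource, here is a minimal sketch that uses only the standard library. Because `pluto::memory_resource` is described above as an alias for `std::pmr::memory_resource`, the sketch derives from the standard type directly. The class name `counting_resource` and the helper `counting_resource_demo` are hypothetical and not part of Pluto. Note that `std::pmr::memory_resource` also declares `do_is_equal` as pure virtual, so a concrete class has to implement it too.

```c++
#include <cassert>
#include <cstddef>
#include <memory_resource>

// Hypothetical example class (not part of Pluto): counts the bytes currently
// handed out and forwards the actual work to an upstream resource.
class counting_resource : public std::pmr::memory_resource {
public:
    explicit counting_resource(
        std::pmr::memory_resource* upstream = std::pmr::new_delete_resource())
        : upstream_(upstream) {}

    std::size_t bytes_in_use() const { return bytes_in_use_; }

private:
    void* do_allocate(std::size_t bytes, std::size_t alignment) override {
        void* ptr = upstream_->allocate(bytes, alignment);
        bytes_in_use_ += bytes;  // only counted if the upstream call did not throw
        return ptr;
    }

    void do_deallocate(void* ptr, std::size_t bytes, std::size_t alignment) override {
        upstream_->deallocate(ptr, bytes, alignment);
        bytes_in_use_ -= bytes;
    }

    // Third pure virtual function of std::pmr::memory_resource.
    bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
        return this == &other;
    }

    std::pmr::memory_resource* upstream_;
    std::size_t bytes_in_use_ = 0;
};

// Usage: returns the number of bytes in use while one array is alive.
inline std::size_t counting_resource_demo() {
    counting_resource mr;
    void* ptr = mr.allocate(10 * sizeof(double), 64);
    std::size_t live = mr.bytes_in_use();
    mr.deallocate(ptr, 10 * sizeof(double), 64);
    assert(mr.bytes_in_use() == 0);
    return live;
}
```

A resource written this way can be passed anywhere a `std::pmr::memory_resource*` is accepted.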
41+
42+
Pluto provides 8 predefined concrete implementations:
43+
| memory_resource | memory_pool_resource |
44+
|-------------------------|------------------------------|
45+
| pluto::host_resource | pluto::host_pool_resource |
46+
| pluto::device_resource | pluto::device_pool_resource |
47+
| pluto::pinned_resource | pluto::pinned_pool_resource |
48+
| pluto::managed_resource | pluto::managed_pool_resource |
49+
50+
These predefined memory resources have unlimited lifetime and have memory tracking and tracing capability. \
51+
See [Predefined pluto::memory\_resources](#predefined-plutomemory_resources) below for details on each memory resource.
52+
53+
For convenience, pluto also provides aliases to two predefined standard library `std::pmr::memory_resources`:
54+
- pluto::new_delete_resource -> [std::pmr::new_delete_resource](https://en.cppreference.com/w/cpp/memory/new_delete_resource)
55+
- pluto::null_memory_resource -> [std::pmr::null_memory_resource](https://en.cppreference.com/w/cpp/memory/null_memory_resource)
56+
57+
#### Example:
58+
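```C++
std::size_t bytes = 10 * sizeof(double);
std::size_t alignment = 64;
pluto::memory_resource* mr = pluto::host_resource();
// allocate() returns void*, so cast to the element type
double* data = static_cast<double*>(mr->allocate(bytes, alignment));
// ... use data ...
// deallocate() must receive the same bytes and alignment that were passed to allocate()
mr->deallocate(data, bytes, alignment);
```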
66+
67+
### pluto::async_memory_resource
68+
69+
The `pluto::async_memory_resource` extends `pluto::memory_resource` with asynchronous allocation and deallocation. The asynchronous member functions take an extra `pluto::stream_view` argument, which represents a CUDA or HIP stream (`cudaStream_t` or `hipStream_t`). \
70+
The extra member functions are:
71+
```c++
72+
void* allocate_async(std::size_t bytes, std::size_t alignment, pluto::stream_view stream) {
73+
return do_allocate_async(bytes, alignment, stream);
74+
}
75+
void deallocate_async(void* ptr, std::size_t bytes, std::size_t alignment, pluto::stream_view stream) {
76+
do_deallocate_async(ptr, bytes, alignment, stream);
77+
}
78+
```
79+
Concrete implementations deriving from `pluto::async_memory_resource` then further implement these functions:
80+
```c++
81+
void* do_allocate_async(std::size_t bytes, std::size_t alignment, pluto::stream_view stream) override;
82+
void do_deallocate_async(void* ptr, std::size_t bytes, std::size_t alignment, pluto::stream_view stream) override;
83+
```
84+
85+
### pluto::allocator
86+
87+
The `pluto::allocator<T>` extends [`std::pmr::polymorphic_allocator`](https://en.cppreference.com/w/cpp/memory/polymorphic_allocator)
88+
which implements all functions required of a C++ [Allocator](https://en.cppreference.com/w/cpp/named_req/Allocator) so that it can be given to [AllocatorAwareContainers](https://en.cppreference.com/w/cpp/named_req/AllocatorAwareContainer) such as `std::vector`, `std::map`, `std::set`, `std::list`, `std::string`.
89+
It internally uses a `pluto::memory_resource*` for allocation and deallocation.
90+
The noteworthy functions are:
91+
```c++
/// Default constructor; a configurable default pluto::memory_resource will be used.
/// This default can be set with `std::pmr::set_default_resource()` or `pluto::set_default_resource()`
pluto::allocator<T>();

/// Constructor using a given memory_resource. Note this is compatible with any third-party `std::pmr::memory_resource`
pluto::allocator<T>(pluto::memory_resource*);

/// Return a newly allocated array of `size` elements (not bytes, unlike pluto::memory_resource)
T* allocate(std::size_t size);

/// Deallocate a given array of `size` elements (not bytes, unlike pluto::memory_resource)
void deallocate(T* ptr, std::size_t size);

/// Asynchronously allocate an array of `size` elements (not bytes) on the given stream
T* allocate_async(std::size_t size, pluto::stream_view stream);

/// Asynchronously deallocate a given array of `size` elements (not bytes) on the given stream
void deallocate_async(T* ptr, std::size_t size, pluto::stream_view stream);
```
111+
112+
When `(de)allocate_async` is used with a `memory_resource` that does not derive from `pluto::async_memory_resource`, then `(de)allocate` will be used instead.
113+
The functions `(de)allocate` and `(de)allocate_async` also have overloads with a first argument `std::string_view label`, which can be used for tracing memory (de)allocations if the concrete memory resource in use supports it.
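
The fallback from `(de)allocate_async` to `(de)allocate` can be pictured with a small standard-library sketch. Everything in it is a hypothetical stand-in rather than Pluto's actual code: `stream_view` is an empty placeholder, `async_memory_resource` mirrors the interface described above, and using `dynamic_cast` for the dispatch is an assumption about one possible way to implement it.

```c++
#include <cassert>
#include <cstddef>
#include <memory_resource>

// Hypothetical placeholder for pluto::stream_view.
struct stream_view {};

// Hypothetical mirror of the async interface described above.
class async_memory_resource : public std::pmr::memory_resource {
public:
    void* allocate_async(std::size_t bytes, std::size_t alignment, stream_view stream) {
        return do_allocate_async(bytes, alignment, stream);
    }
    void deallocate_async(void* ptr, std::size_t bytes, std::size_t alignment, stream_view stream) {
        do_deallocate_async(ptr, bytes, alignment, stream);
    }

protected:
    virtual void* do_allocate_async(std::size_t, std::size_t, stream_view) = 0;
    virtual void do_deallocate_async(void*, std::size_t, std::size_t, stream_view) = 0;
};

// Use the async path when the resource supports it, otherwise the plain one.
inline void* allocate_async_or_sync(std::pmr::memory_resource* mr, std::size_t bytes,
                                    std::size_t alignment, stream_view stream) {
    if (auto* async_mr = dynamic_cast<async_memory_resource*>(mr)) {
        return async_mr->allocate_async(bytes, alignment, stream);
    }
    return mr->allocate(bytes, alignment);
}

inline void deallocate_async_or_sync(std::pmr::memory_resource* mr, void* ptr, std::size_t bytes,
                                     std::size_t alignment, stream_view stream) {
    if (auto* async_mr = dynamic_cast<async_memory_resource*>(mr)) {
        async_mr->deallocate_async(ptr, bytes, alignment, stream);
        return;
    }
    mr->deallocate(ptr, bytes, alignment);
}

// Fake async resource that only counts how often the async path is taken.
class fake_async_resource : public async_memory_resource {
public:
    int async_calls = 0;

private:
    void* do_allocate(std::size_t b, std::size_t a) override {
        return std::pmr::new_delete_resource()->allocate(b, a);
    }
    void do_deallocate(void* p, std::size_t b, std::size_t a) override {
        std::pmr::new_delete_resource()->deallocate(p, b, a);
    }
    bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
        return this == &other;
    }
    void* do_allocate_async(std::size_t b, std::size_t a, stream_view) override {
        ++async_calls;
        return do_allocate(b, a);
    }
    void do_deallocate_async(void* p, std::size_t b, std::size_t a, stream_view) override {
        ++async_calls;
        do_deallocate(p, b, a);
    }
};

// Usage: the async resource takes the async path twice, the plain one never.
inline int async_dispatch_demo() {
    stream_view stream;
    fake_async_resource async_mr;
    void* a = allocate_async_or_sync(&async_mr, 64, 16, stream);
    deallocate_async_or_sync(&async_mr, a, 64, 16, stream);

    std::pmr::memory_resource* plain = std::pmr::new_delete_resource();
    void* b = allocate_async_or_sync(plain, 64, 16, stream);
    deallocate_async_or_sync(plain, b, 64, 16, stream);
    return async_mr.async_calls;
}
```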
114+
115+
#### Examples:
116+
117+
- Use allocator to allocate array of 10 elements using pre-defined `pluto::host_resource()` \
118+
C++:
119+
```c++
120+
pluto::allocator<double> alloc(pluto::host_resource());
121+
double* data = alloc.allocate(10);
122+
alloc.deallocate(data, 10);
123+
```
124+
Fortran:
125+
```fortran
126+
real(8), pointer :: array1d(:)
127+
type(pluto_allocator) :: alloc
128+
alloc = pluto%make_allocator(pluto%host_resource())
129+
call alloc%allocate(array1d, [10])
130+
call alloc%deallocate(array1d)
131+
```
132+
- Use memory pool in allocator-aware type using pre-defined `pluto::host_pool_resource()`
133+
```c++
134+
std::vector<double, pluto::allocator<double>> vector(pluto::host_pool_resource());
135+
vector.resize(10);
136+
```
137+
- The latter can be done via the `std::pmr::vector` as well due to the `std::pmr::memory_resource` compatibility:
138+
```c++
139+
std::pmr::vector<double> vector(pluto::host_pool_resource());
140+
vector.resize(10);
141+
```
142+
- We don't need to explicitly pass the `memory_resource` to the `std::pmr::vector` constructor
  when the default has been set beforehand:
144+
```c++
145+
std::pmr::set_default_resource(pluto::host_pool_resource());
146+
std::pmr::vector<double> vector;
147+
vector.resize(10);
148+
```
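
One detail of the default-resource mechanism is worth a sketch: `std::pmr::set_default_resource()` returns the previous default, and a default-constructed `std::pmr::vector` captures whatever default is current at construction time. The example below uses only the standard library; a `std::pmr::monotonic_buffer_resource` stands in for a Pluto resource, and the function name `default_resource_demo` is hypothetical.

```c++
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <vector>

inline bool default_resource_demo() {
    // Small stack arena standing in for a pool resource; the null upstream makes
    // any overflow throw std::bad_alloc instead of silently using the heap.
    std::byte buffer[1024];
    std::pmr::monotonic_buffer_resource arena(buffer, sizeof(buffer),
                                              std::pmr::null_memory_resource());

    // set_default_resource returns the previous default so it can be restored.
    std::pmr::memory_resource* previous = std::pmr::set_default_resource(&arena);

    bool used_arena = false;
    {
        std::pmr::vector<double> vector;  // captures the current default here
        vector.resize(10);
        used_arena = (vector.get_allocator().resource() == &arena);
    }  // the vector is destroyed before the arena goes out of scope

    std::pmr::set_default_resource(previous);
    return used_arena;
}
```

Changing the default later does not affect containers that already exist, and the resource has to outlive every container that captured it.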
149+
150+
### pluto::{host,device} namespace
151+
152+
In namespaces `pluto::host` and `pluto::device`, pluto manages defaults per memory space, independently from `std::pmr::{get,set}_default_resource()`.
153+
154+
The following functions exist for C++:
155+
```c++
156+
pluto::host::set_default_resource(pluto::memory_resource*);
157+
pluto::host::get_default_resource() -> pluto::memory_resource*;
158+
pluto::device::set_default_resource(pluto::memory_resource*);
159+
pluto::device::get_default_resource() -> pluto::memory_resource*;
160+
```
161+
The following routines exist for Fortran (pseudocode):
162+
```
163+
pluto%host%set_default_resource( type(pluto_memory_resource) )
164+
pluto%host%get_default_resource() -> type(pluto_memory_resource)
165+
pluto%device%set_default_resource( type(pluto_memory_resource) )
166+
pluto%device%get_default_resource() -> type(pluto_memory_resource)
167+
```
168+
169+
The following C++ classes extend `pluto::allocator<T>`:
170+
```c++
171+
pluto::host::allocator<T>
172+
pluto::device::allocator<T>
173+
```
174+
The only difference with `pluto::allocator` is that the default constructor won't use `std::pmr::get_default_resource()`, \
175+
but rather `pluto::{host,device}::get_default_resource()`.
176+
177+
In Fortran you would create allocators that use the memory space specific defaults via:
178+
```fortran
179+
type(pluto_allocator) :: host_alloc, device_alloc
180+
host_alloc = pluto%host%make_allocator()
181+
device_alloc = pluto%device%make_allocator()
182+
```
183+
184+
The initial values returned by `pluto::host::get_default_resource()` and `pluto::device::get_default_resource()` are `pluto::host_resource()` and `pluto::device_resource()` respectively, unless overridden via environment variables:
185+
```sh
186+
export PLUTO_HOST_MEMORY_RESOURCE=pluto::pinned_pool_resource
187+
export PLUTO_DEVICE_MEMORY_RESOURCE=pluto::device_pool_resource
188+
```
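
How a name from the environment could be mapped to a resource is sketched below with the standard library only. This is an assumption about the general mechanism, not Pluto's implementation: the function `resource_from_env`, the registry map, the variable name used in the demo, and the choice to fall back silently on unknown names are all hypothetical.

```c++
#include <cassert>
#include <cstdlib>
#include <map>
#include <memory_resource>
#include <string>

using resource_registry = std::map<std::string, std::pmr::memory_resource*>;

// Look up the resource named by environment variable `variable`.
// Returns `fallback` when the variable is unset or names an unknown resource.
inline std::pmr::memory_resource* resource_from_env(const char* variable,
                                                    const resource_registry& registry,
                                                    std::pmr::memory_resource* fallback) {
    const char* value = std::getenv(variable);
    if (value == nullptr) {
        return fallback;
    }
    auto it = registry.find(value);
    return it != registry.end() ? it->second : fallback;
}

// Usage: with the (hypothetical) variable unset, the fallback is selected.
inline bool resource_from_env_demo() {
    resource_registry registry{
        {"std::pmr::new_delete_resource", std::pmr::new_delete_resource()},
        {"std::pmr::null_memory_resource", std::pmr::null_memory_resource()},
    };
    std::pmr::memory_resource* fallback = std::pmr::new_delete_resource();
    return resource_from_env("PLUTO_SKETCH_UNSET_VARIABLE", registry, fallback) == fallback;
}
```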
189+
190+
Predefined pluto::memory_resources
191+
----------------------------------
192+
193+
Pluto provides a number of predefined concrete `pluto::memory_resources` via accessor functions
returning `pluto::memory_resource*` in C++ or `type(pluto_memory_resource)` in Fortran.
Each has a C++ and a Fortran accessor function and is also registered by name:
196+
- **pluto::new_delete_resource** \
197+
Alias to `std::pmr::new_delete_resource`, using C++ new and delete.
198+
199+
- **pluto::null_memory_resource** \
  Alias to `std::pmr::null_memory_resource`, whose `allocate` always throws `std::bad_alloc`.
201+
202+
- **pluto::host_resource** \
203+
Allocates host CPU memory aligned to 256 bytes.
204+
205+
- **pluto::host_pool_resource** \
206+
A memory pool based on pluto::host_resource
207+
208+
- **pluto::pinned_resource** \
209+
Allocates host-pinned (a.k.a. page-locked) CPU memory aligned to 256 bytes.
210+
211+
- **pluto::pinned_pool_resource** \
212+
A memory pool based on pluto::pinned_resource
213+
214+
- **pluto::device_resource** \
215+
A `pluto::async_memory_resource` that allocates device resident memory. \
216+
Internally this uses `cudaMalloc` or `hipMalloc` for allocate and
217+
`cudaMallocAsync` or `hipMallocAsync` for allocate_async
218+
219+
- **pluto::device_pool_resource** \
220+
A memory pool based on pluto::device_resource
221+
222+
- **pluto::managed_resource** \
223+
Allocates UVM a.k.a. managed memory accessible from both host and device.\
224+
Internally this uses `cudaMallocManaged` or `hipMallocManaged`
225+
226+
- **pluto::managed_pool_resource** \
227+
A memory pool based on pluto::managed_resource
228+
229+
## Data transfer
230+
231+
### pluto::memcpy\_{host,device}\_to\_{device,host}
232+
233+
These functions take `void*` pointers and a size in bytes. \
An optional `pluto::stream_view` argument makes the data transfer asynchronous.
235+
236+
### pluto::copy\_{host,device}\_to\_{device,host}
237+
238+
These functions take templated `T*` pointers and a size in number of elements. \
An optional `pluto::stream_view` argument makes the data transfer asynchronous.
240+
241+
242+
### memcpy\_{host,device}\_to\_{device,host}\_2D and copy\_{host,device}\_to\_{device,host}\_2D
243+
244+
Like the above, but for non-contiguous slices. This is useful for `atlas::MultiField`.
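
The difference between the byte-based, element-based and 2D variants can be shown with a host-only sketch built on `std::memcpy`. The function names and the 2D parameter list (pitches and width in elements, plus a row count) are hypothetical and chosen for illustration only; the actual Pluto signatures may differ, and no device transfer is involved here.

```c++
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Byte-based copy: the size is given in bytes.
inline void memcpy_sketch(void* dst, const void* src, std::size_t bytes) {
    std::memcpy(dst, src, bytes);
}

// Element-based copy: the size is a number of elements of type T.
template <typename T>
void copy_sketch(T* dst, const T* src, std::size_t size) {
    static_assert(std::is_trivially_copyable_v<T>);
    memcpy_sketch(dst, src, size * sizeof(T));
}

// 2D copy: `height` rows of `width` elements each. Consecutive rows start
// `src_pitch` / `dst_pitch` elements apart, so the slices need not be contiguous.
template <typename T>
void copy_2d_sketch(T* dst, std::size_t dst_pitch, const T* src, std::size_t src_pitch,
                    std::size_t width, std::size_t height) {
    for (std::size_t row = 0; row < height; ++row) {
        copy_sketch(dst + row * dst_pitch, src + row * src_pitch, width);
    }
}

// Usage: copy the first 2 columns of a 3x4 row-major array into a packed 3x2 array.
inline double copy_2d_demo() {
    const double src[12] = {0, 1, 2, 3, 10, 11, 12, 13, 20, 21, 22, 23};
    double dst[6] = {};
    copy_2d_sketch(dst, 2, src, 4, 2, 3);
    assert(dst[3] == 11 && dst[5] == 21);
    double sum = 0;
    for (double value : dst) {
        sum += value;
    }
    return sum;  // 0 + 1 + 10 + 11 + 20 + 21
}
```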
245+
246+
247+
## Tracing and tracking memory
248+
249+
The pluto predefined memory resources have tracking and tracing capability.
250+
To enable tracing, e.g. for debugging, set environment variable `PLUTO_TRACE=1`.
251+
Finer control is also possible programmatically:
252+
```c++
253+
bool previous_status = pluto::trace::enable(true);
254+
// ... do stuff ...
255+
pluto::trace::enable(previous_status);
256+
```
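
The save-and-restore pattern above lends itself to a small RAII guard, so the previous status is restored even when an exception leaves the scope. The sketch below is self-contained: the namespace `trace_sketch` with its `enable` function is a stand-in that only mimics the "set new status, return previous status" behaviour shown above, and `scoped_enable` is a hypothetical helper, not a Pluto class.

```c++
#include <cassert>

namespace trace_sketch {

// Stand-in for the tracing switch: stores the status in a function-local static.
inline bool& status() {
    static bool enabled = false;
    return enabled;
}

// Sets the new status and returns the previous one.
inline bool enable(bool on) {
    bool previous = status();
    status() = on;
    return previous;
}

inline bool enabled() { return status(); }

// Hypothetical RAII guard: enables or disables tracing for one scope.
class scoped_enable {
public:
    explicit scoped_enable(bool on) : previous_(enable(on)) {}
    ~scoped_enable() { enable(previous_); }
    scoped_enable(const scoped_enable&) = delete;
    scoped_enable& operator=(const scoped_enable&) = delete;

private:
    bool previous_;
};

}  // namespace trace_sketch

// Usage: tracing is on inside the scope and restored afterwards.
inline bool scoped_trace_demo() {
    bool inside = false;
    {
        trace_sketch::scoped_enable guard(true);
        inside = trace_sketch::enabled();
    }
    return inside && !trace_sketch::enabled();
}
```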
257+
The trace output is written to the `pluto::trace::out` stream, which defaults to `std::cout`. This can be changed, e.g.
258+
```c++
259+
std::stringstream pluto_trace_stream;
260+
pluto::trace::set(pluto_trace_stream);
261+
```
262+
263+
A memory usage report can be obtained, which reports on the use of each of the pluto predefined memory resources.
264+
Other user-defined memory resources are not taken into consideration.
265+
```
266+
pluto::memory::report() -> std::string
267+
```
268+
269+
# Real Examples
14270

15271
See the examples subdirectory for how to use Pluto.
