This repository was archived by the owner on Mar 21, 2024. It is now read-only.
8 changes: 8 additions & 0 deletions docs/extended_api/memory_model.md
@@ -12,6 +12,13 @@ It is low across threads within a block, but high across arbitrary threads in th

To account for non-uniform thread synchronization costs that are not always low, CUDA C++ extends the standard C++ memory model and concurrency facilities in the `cuda::` namespace with **thread scopes**, retaining the syntax and semantics of standard C++ by default.

## Asynchronous operations

[Asynchronous operations], such as the copy operations performed by [`memcpy_async`], are performed _as-if_ by new _asynchronous threads_.

[Asynchronous operations]: extended_api/asynchronous_operations.md
[`memcpy_async`]: extended_api/asynchronous_operations/memcpy_async.md
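
As an illustration of the paragraph above, here is a minimal sketch of a block-wide `cuda::memcpy_async` whose completion is observed through a `cuda::barrier`. The copy into shared memory is performed as-if by asynchronous threads, so the requesting threads only read `tile` after `arrive_and_wait()` returns. The kernel name `double_elements`, the helper `run_double_elements`, and the constant `kBlockSize` are hypothetical names chosen for this example, and the sketch assumes a device and toolkit that support `cuda::barrier` (compute capability 7.0 or newer).

```cuda
#include <cooperative_groups.h>
#include <cuda/barrier>
#include <cuda_runtime.h>
#include <vector>

namespace cg = cooperative_groups;

constexpr int kBlockSize = 128;  // hypothetical tile size, one int per thread

// Each block stages its slice of `in` into shared memory with an asynchronous
// copy, waits for the copy to complete, then writes the doubled values to `out`.
__global__ void double_elements(const int* in, int* out) {
    auto block = cg::this_thread_block();
    __shared__ int tile[kBlockSize];
    __shared__ cuda::barrier<cuda::thread_scope_block> bar;

    if (block.thread_rank() == 0) {
        init(&bar, block.size());  // one expected arrival per thread in the block
    }
    block.sync();  // make the initialized barrier visible to the whole block

    // The copy is requested collectively by the block and bound to `bar`.
    const int* src = in + blockIdx.x * kBlockSize;
    cuda::memcpy_async(block, tile, src, sizeof(int) * kBlockSize, bar);

    // Returns once every thread has arrived and the copy has completed.
    bar.arrive_and_wait();

    out[blockIdx.x * kBlockSize + threadIdx.x] = 2 * tile[threadIdx.x];
}

// Host-side helper (hypothetical): runs the kernel and checks every element.
bool run_double_elements(int blocks) {
    const int n = blocks * kBlockSize;
    const size_t bytes = n * sizeof(int);
    std::vector<int> host_in(n), host_out(n, 0);
    for (int i = 0; i < n; ++i) host_in[i] = i;

    int* in = nullptr;
    int* out = nullptr;
    if (cudaMalloc(&in, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&out, bytes) != cudaSuccess) { cudaFree(in); return false; }

    cudaMemcpy(in, host_in.data(), bytes, cudaMemcpyHostToDevice);
    double_elements<<<blocks, kBlockSize>>>(in, out);
    cudaError_t status = cudaMemcpy(host_out.data(), out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(in);
    cudaFree(out);
    if (status != cudaSuccess) return false;

    for (int i = 0; i < n; ++i) {
        if (host_out[i] != 2 * host_in[i]) return false;
    }
    return true;
}
```

Calling `run_double_elements(4)` from a host `main` launches four blocks and reports whether every output element is twice its input. Reading `tile` before the barrier wait would race with the asynchronous copy, which is why the wait sits between the copy request and the first use of the staged data.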

## Thread Scopes

A _thread scope_ specifies the kind of threads that can synchronize with each other using synchronization primitives such as [`atomic`] or [`barrier`].
@@ -39,6 +46,7 @@ Each program thread is related to each other program thread by one or more threa
- Each GPU thread is related to each other GPU thread in the same CUDA device by the *device* thread scope: `thread_scope_device`.
- Each GPU thread is related to each other GPU thread in the same CUDA thread block by the *block* thread scope: `thread_scope_block`.
- Each thread is related to itself by the `thread` thread scope: `thread_scope_thread`.
- Asynchronous threads are related to the thread requesting the asynchronous operations via all scope relationships.
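
The scope relationships listed above can be sketched with two atomics of different scopes. In this hedged example, threads of one block synchronize through a block-scoped atomic in shared memory, and only one thread per block touches a device-scoped atomic in global memory. The names `count_threads` and `run_count_threads` are hypothetical, and the sketch assumes a libcu++ version that provides `cuda::atomic_ref`.

```cuda
#include <cuda/atomic>
#include <cuda_runtime.h>

// Counts the launched threads: per block with a block-scoped atomic,
// then across blocks with a device-scoped atomic.
__global__ void count_threads(int* total) {
    __shared__ int block_count;
    if (threadIdx.x == 0) {
        block_count = 0;  // plain store, before any atomic_ref to it exists
    }
    __syncthreads();

    // Only threads of this block access block_count, so block scope suffices.
    cuda::atomic_ref<int, cuda::thread_scope_block> block_ref(block_count);
    block_ref.fetch_add(1, cuda::std::memory_order_relaxed);
    __syncthreads();

    if (threadIdx.x == 0) {
        // Threads from different blocks update *total, so device scope is needed.
        cuda::atomic_ref<int, cuda::thread_scope_device> device_ref(*total);
        device_ref.fetch_add(block_ref.load(cuda::std::memory_order_relaxed),
                             cuda::std::memory_order_relaxed);
    }
}

// Host-side helper (hypothetical): returns the counted total, or -1 on failure.
int run_count_threads(int blocks, int threads_per_block) {
    int* total = nullptr;
    if (cudaMalloc(&total, sizeof(int)) != cudaSuccess) return -1;
    cudaMemset(total, 0, sizeof(int));

    count_threads<<<blocks, threads_per_block>>>(total);

    int host_total = -1;
    cudaError_t status = cudaMemcpy(&host_total, total, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(total);
    return status == cudaSuccess ? host_total : -1;
}
```

With 4 blocks of 128 threads, `run_count_threads(4, 128)` is expected to return the number of launched threads. Using `thread_scope_block` for the shared counter reflects the cheaper synchronization within a block, while the global counter needs at least `thread_scope_device` because its writers are in different blocks.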

## Synchronization primitives
