# Table Comparing Syntax for Different Compute APIs

|Term|CUDA|HIP|HC|C++AMP|OpenCL|
|---|---|---|---|---|---|
|Device|`int deviceId`|`int deviceId`|`hc::accelerator`|`concurrency::accelerator`|`cl_device`|
|Queue|`cudaStream_t`|`hipStream_t`|`hc::accelerator_view`|`concurrency::accelerator_view`|`cl_command_queue`|
|Event|`cudaEvent_t`|`hipEvent_t`|`hc::completion_future`|`concurrency::completion_future`|`cl_event`|
|Memory|`void *`|`void *`|`void *`; `hc::array`; `hc::array_view`|`concurrency::array`; `concurrency::array_view`|`cl_mem`|
| | | | | | |
| |grid|grid|extent|extent|NDRange|
| |block|block|tile|tile|work-group|
| |thread|thread|thread|thread|work-item|
| |warp|warp|wavefront|N/A|sub-group|
| | | | | | |
|Thread-index|`threadIdx.x`|`hipThreadIdx_x`|`t_idx.local[0]`|`t_idx.local[0]`|`get_local_id(0)`|
|Block-index|`blockIdx.x`|`hipBlockIdx_x`|`t_idx.tile[0]`|`t_idx.tile[0]`|`get_group_id(0)`|
|Block-dim|`blockDim.x`|`hipBlockDim_x`|`t_ext.tile_dim[0]`|`t_ext.tile_dim0`|`get_local_size(0)`|
|Grid-dim|`gridDim.x`|`hipGridDim_x`|`t_ext[0]`|`t_ext[0]`|`get_global_size(0)`|
| | | | | | |
|Device Kernel|`__global__`|`__global__`|lambda inside `hc::parallel_for_each` or `[[hc]]`|`restrict(amp)`|`__kernel`|
|Device Function|`__device__`|`__device__`|`[[hc]]` (detected automatically in many cases)|`restrict(amp)`|Implied in device compilation|
|Host Function|`__host__` (default)|`__host__` (default)|`[[cpu]]` (default)|`restrict(cpu)` (default)|Implied in host compilation|
|Host + Device Function|`__host__` `__device__`|`__host__` `__device__`|`[[hc]]` `[[cpu]]`|`restrict(amp,cpu)`|No equivalent|
|Kernel Launch|`<<< >>>`|`hipLaunchKernel`|`hc::parallel_for_each`|`concurrency::parallel_for_each`|`clEnqueueNDRangeKernel`|
| | | | | | |
|Global Memory|`__global__`|`__global__`|Unnecessary / Implied|Unnecessary / Implied|`__global`|
|Group Memory|`__shared__`|`__shared__`|`tile_static`|`tile_static`|`__local`|
|Constant|`__constant__`|`__constant__`|Unnecessary / Implied|Unnecessary / Implied|`__constant`|
| | | | | | |
| |`__syncthreads`|`__syncthreads`|`tile_static.barrier()`|`t_idx.barrier()`|`barrier(CLK_LOCAL_MEMFENCE)`|
|Atomic Builtins|`atomicAdd`|`atomicAdd`|`hc::atomic_fetch_add`|`concurrency::atomic_fetch_add`|`atomic_add`|
|Precise Math|`cos(f)`|`cos(f)`|`hc::precise_math::cos(f)`|`concurrency::precise_math::cos(f)`|`cos(f)`|
|Fast Math|`__cos(f)`|`__cos(f)`|`hc::fast_math::cos(f)`|`concurrency::fast_math::cos(f)`|`native_cos(f)`|
|Vector|`float4`|`float4`|`hc::short_vector::float4`|`concurrency::graphics::float_4`|`float4`|

### Notes

- For HC and C++AMP, the table assumes a captured `tiled_ext` named `t_ext` and a captured extent named `ext`. These languages pass information to the kernel through captured variables rather than special built-in functions, so the exact variable names may vary.
- The indexing entries (from Thread-index through Grid-dim) show the terminology for a 1D grid. Some APIs use the reverse order of xyz / 012 indexing for 3D grids.
- HC allows tile dimensions to be specified at runtime, while C++AMP requires them to be specified at compile time. Thus the HC syntax for tile dimensions is `t_ext.tile_dim[0]`, while the C++AMP syntax is `t_ext.tile_dim0`.
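
The 1D indexing rows are most often combined into a single flat position per thread. The following is a minimal host-side sketch in plain C++ rather than real device code: `ThreadCoords`, `global_index`, and `emulate_launch` are hypothetical names introduced only for illustration, and the loop stands in for a kernel launch. It assumes the usual combination `block-index * block-dim + thread-index`, which is what `blockIdx.x * blockDim.x + threadIdx.x` computes in CUDA and what `get_global_id(0)` returns in OpenCL when no global offset is used.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for the per-thread values named in the table.
struct ThreadCoords {
    std::size_t thread_index;  // threadIdx.x, hipThreadIdx_x, t_idx.local[0], get_local_id(0)
    std::size_t block_index;   // blockIdx.x, hipBlockIdx_x, t_idx.tile[0], get_group_id(0)
    std::size_t block_dim;     // blockDim.x, hipBlockDim_x, get_local_size(0)
};

// Flat position of one thread within a 1D grid.
std::size_t global_index(const ThreadCoords& c) {
    assert(c.thread_index < c.block_dim);  // a thread index never reaches the block size
    return c.block_index * c.block_dim + c.thread_index;
}

// Sequentially emulates a 1D launch of num_blocks blocks of block_dim threads.
// Each emulated thread writes its own flat index into its own output slot.
std::vector<std::size_t> emulate_launch(std::size_t num_blocks, std::size_t block_dim) {
    std::vector<std::size_t> out(num_blocks * block_dim, 0);
    for (std::size_t block = 0; block < num_blocks; ++block) {
        for (std::size_t thread = 0; thread < block_dim; ++thread) {
            const ThreadCoords c{thread, block, block_dim};
            out[global_index(c)] = global_index(c);
        }
    }
    return out;
}
```

For example, `emulate_launch(4, 8)` fills a 32-element vector in which every slot holds its own position, showing that each (block, thread) pair maps to a distinct flat index.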
