Struct cooperative_groups
Global functions from the cooperative_groups.h header.
Namespace: Hybridizer.Runtime.CUDAImports
Assembly: Hybridizer.Runtime.CUDAImports.dll
Syntax
[IntrinsicInclude("cooperative_groups.h", Flavor = 1)]
public struct cooperative_groups
Methods
coalesced_threads()
Declaration
[IntrinsicFunction("cooperative_groups::coalesced_threads")]
public static coalesced_group coalesced_threads()
Returns
Type | Description |
---|---|
coalesced_group |
this_grid()
Constructs a grid_group
Declaration
[IntrinsicFunction("cooperative_groups::this_grid")]
public static grid_group this_grid()
Returns
Type | Description |
---|---|
grid_group |
this_thread()
Constructs a generic thread_group containing only the calling thread
Declaration
[IntrinsicFunction("cooperative_groups::this_thread")]
public static thread_group this_thread()
Returns
Type | Description |
---|---|
thread_group |
this_thread_block()
Constructs a thread_block group
Declaration
[IntrinsicFunction("cooperative_groups::this_thread_block")]
public static thread_block this_thread_block()
Returns
Type | Description |
---|---|
thread_block |
tile_partition_1(thread_block)
The tiled_partition<tilesz>(parent) method is a collective operation that partitions the parent group into a one-dimensional, row-major, tiling of subgroups.
A total of (size(parent)/tilesz) subgroups will be created; the parent group size must therefore be evenly divisible by tilesz. The allowed parent groups are thread_block or thread_block_tile<size>.
The implementation may cause the calling thread to wait until all the members of the parent group have invoked the operation before resuming execution.
Functionality is limited to the native hardware sizes 1, 2, 4, 8, 16, and 32. size(parent) must be greater than the template Size parameter; otherwise the results are undefined. This method fixes tilesz (the template Size parameter) at 1.
Declaration
[IntrinsicFunction("cooperative_groups::tiled_partition<1>")]
public static thread_block_tile_1 tile_partition_1(thread_block group)
Parameters
Type | Name | Description |
---|---|---|
thread_block | group |
Returns
Type | Description |
---|---|
thread_block_tile_1 |
tile_partition_16(thread_block)
The tiled_partition<tilesz>(parent) method is a collective operation that partitions the parent group into a one-dimensional, row-major, tiling of subgroups.
A total of (size(parent)/tilesz) subgroups will be created; the parent group size must therefore be evenly divisible by tilesz. The allowed parent groups are thread_block or thread_block_tile<size>.
The implementation may cause the calling thread to wait until all the members of the parent group have invoked the operation before resuming execution.
Functionality is limited to the native hardware sizes 1, 2, 4, 8, 16, and 32. size(parent) must be greater than the template Size parameter; otherwise the results are undefined. This method fixes tilesz (the template Size parameter) at 16.
Declaration
[IntrinsicFunction("cooperative_groups::tiled_partition<16>")]
public static thread_block_tile_16 tile_partition_16(thread_block group)
Parameters
Type | Name | Description |
---|---|---|
thread_block | group |
Returns
Type | Description |
---|---|
thread_block_tile_16 |
tile_partition_2(thread_block)
The tiled_partition<tilesz>(parent) method is a collective operation that partitions the parent group into a one-dimensional, row-major, tiling of subgroups.
A total of (size(parent)/tilesz) subgroups will be created; the parent group size must therefore be evenly divisible by tilesz. The allowed parent groups are thread_block or thread_block_tile<size>.
The implementation may cause the calling thread to wait until all the members of the parent group have invoked the operation before resuming execution.
Functionality is limited to the native hardware sizes 1, 2, 4, 8, 16, and 32. size(parent) must be greater than the template Size parameter; otherwise the results are undefined. This method fixes tilesz (the template Size parameter) at 2.
Declaration
[IntrinsicFunction("cooperative_groups::tiled_partition<2>")]
public static thread_block_tile_2 tile_partition_2(thread_block group)
Parameters
Type | Name | Description |
---|---|---|
thread_block | group |
Returns
Type | Description |
---|---|
thread_block_tile_2 |
tile_partition_32(thread_block)
The tiled_partition<tilesz>(parent) method is a collective operation that partitions the parent group into a one-dimensional, row-major, tiling of subgroups.
A total of (size(parent)/tilesz) subgroups will be created; the parent group size must therefore be evenly divisible by tilesz. The allowed parent groups are thread_block or thread_block_tile<size>.
The implementation may cause the calling thread to wait until all the members of the parent group have invoked the operation before resuming execution.
Functionality is limited to the native hardware sizes 1, 2, 4, 8, 16, and 32. size(parent) must be greater than the template Size parameter; otherwise the results are undefined. This method fixes tilesz (the template Size parameter) at 32.
Declaration
[IntrinsicFunction("cooperative_groups::tiled_partition<32>")]
public static thread_block_tile_32 tile_partition_32(thread_block group)
Parameters
Type | Name | Description |
---|---|---|
thread_block | group |
Returns
Type | Description |
---|---|
thread_block_tile_32 |
tile_partition_4(thread_block)
The tiled_partition<tilesz>(parent) method is a collective operation that partitions the parent group into a one-dimensional, row-major, tiling of subgroups.
A total of (size(parent)/tilesz) subgroups will be created; the parent group size must therefore be evenly divisible by tilesz. The allowed parent groups are thread_block or thread_block_tile<size>.
The implementation may cause the calling thread to wait until all the members of the parent group have invoked the operation before resuming execution.
Functionality is limited to the native hardware sizes 1, 2, 4, 8, 16, and 32. size(parent) must be greater than the template Size parameter; otherwise the results are undefined. This method fixes tilesz (the template Size parameter) at 4.
Declaration
[IntrinsicFunction("cooperative_groups::tiled_partition<4>")]
public static thread_block_tile_4 tile_partition_4(thread_block group)
Parameters
Type | Name | Description |
---|---|---|
thread_block | group |
Returns
Type | Description |
---|---|
thread_block_tile_4 |
tile_partition_8(thread_block)
The tiled_partition<tilesz>(parent) method is a collective operation that partitions the parent group into a one-dimensional, row-major, tiling of subgroups.
A total of (size(parent)/tilesz) subgroups will be created; the parent group size must therefore be evenly divisible by tilesz. The allowed parent groups are thread_block or thread_block_tile<size>.
The implementation may cause the calling thread to wait until all the members of the parent group have invoked the operation before resuming execution.
Functionality is limited to the native hardware sizes 1, 2, 4, 8, 16, and 32. size(parent) must be greater than the template Size parameter; otherwise the results are undefined. This method fixes tilesz (the template Size parameter) at 8.
Declaration
[IntrinsicFunction("cooperative_groups::tiled_partition<8>")]
public static thread_block_tile_8 tile_partition_8(thread_block group)
Parameters
Type | Name | Description |
---|---|---|
thread_block | group |
Returns
Type | Description |
---|---|
thread_block_tile_8 |
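The partitioning rules shared by tile_partition_1 through tile_partition_32 can be checked on the host with plain arithmetic. The following is a minimal C++ sketch, not Hybridizer or CUDA code: `is_native_tile_size`, `static_tile_count`, `TileCoord`, and `locate` are hypothetical helper names introduced only for illustration. It assumes the row-major tiling assigns a thread to tile `rank / tilesz` at position `rank % tilesz`, as the descriptions above suggest.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true only for the native tile sizes 1, 2, 4, 8, 16, 32
// (a non-zero power of two no larger than 32).
constexpr bool is_native_tile_size(std::uint32_t tilesz) {
    return tilesz != 0 && tilesz <= 32 && (tilesz & (tilesz - 1)) == 0;
}

// Hypothetical helper: number of tiles created from a parent group.
// Mirrors the documented requirement that size(parent) is evenly
// divisible by tilesz.
inline std::uint32_t static_tile_count(std::uint32_t parent_size,
                                       std::uint32_t tilesz) {
    assert(is_native_tile_size(tilesz));
    assert(parent_size % tilesz == 0);
    return parent_size / tilesz;
}

// Position of one thread after partitioning (illustrative type).
struct TileCoord {
    std::uint32_t tile_index;    // which subgroup the thread belongs to
    std::uint32_t rank_in_tile;  // the thread's rank inside that subgroup
};

// Hypothetical helper: one-dimensional, row-major assignment of a parent
// rank to a tile and a rank within that tile.
inline TileCoord locate(std::uint32_t parent_rank, std::uint32_t tilesz) {
    assert(is_native_tile_size(tilesz));
    return TileCoord{parent_rank / tilesz, parent_rank % tilesz};
}
```

Under these assumptions, a 256-thread block split into tiles of 32 yields 8 tiles, and the thread with rank 37 in a 16-wide tiling lands in tile 2 at rank 5.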
tiled_partition(coalesced_group, UInt32)
Overload for coalesced_group parents: the returned subgroups are also of type coalesced_group, so the partition stays coalesced.
Declaration
[IntrinsicFunction("cooperative_groups::tiled_partition")]
public static coalesced_group tiled_partition(coalesced_group parent, uint tilesz)
Parameters
Type | Name | Description |
---|---|---|
coalesced_group | parent | |
System.UInt32 | tilesz | |
Returns
Type | Description |
---|---|
coalesced_group |
tiled_partition(thread_block, UInt32)
Thread block type overload: returns a basic thread_group for now (may be specialized later)
Declaration
[IntrinsicFunction("cooperative_groups::tiled_partition")]
public static thread_group tiled_partition(thread_block parent, uint tilesz)
Parameters
Type | Name | Description |
---|---|---|
thread_block | parent | |
System.UInt32 | tilesz | |
Returns
Type | Description |
---|---|
thread_group |
tiled_partition(thread_group, UInt32)
The tiled_partition(parent, tilesz) method is a collective operation that partitions the parent group into a one-dimensional, row-major, tiling of subgroups.
A total of ((size(parent)+tilesz-1)/tilesz) subgroups will be created where threads having identical k = (thread_rank(parent)/tilesz) will be members of the same subgroup.
The implementation may cause the calling thread to wait until all the members of the parent group have invoked the operation before resuming execution.
Functionality is limited to power-of-two sized subgroup instances of at most 32 threads. In _CG_VERSION 1000, only thread_block, thread_block_tile<>, and their subgroups can be partitioned with tiled_partition().
Declaration
[IntrinsicFunction("cooperative_groups::tiled_partition")]
public static thread_group tiled_partition(thread_group parent, uint tilesz)
Parameters
Type | Name | Description |
---|---|---|
thread_group | parent | |
System.UInt32 | tilesz | |
Returns
Type | Description |
---|---|
thread_group |
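The subgroup count and membership rule quoted for tiled_partition(parent, tilesz) can be traced with a small host-side C++ sketch. This is not Hybridizer or CUDA code. `dynamic_tile_count` and `tile_sizes` are hypothetical helper names, and the sketch only follows the formulas in the description: the count is (size(parent)+tilesz-1)/tilesz, and threads sharing k = thread_rank(parent)/tilesz are in the same subgroup. It makes no claim about whether a partially filled last subgroup is usable on a given CUDA version.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: subgroup count from the documented formula,
// i.e. size(parent) / tilesz rounded up.
inline std::uint32_t dynamic_tile_count(std::uint32_t parent_size,
                                        std::uint32_t tilesz) {
    assert(tilesz != 0);
    return (parent_size + tilesz - 1) / tilesz;
}

// Hypothetical helper: number of threads that end up in each subgroup
// when every parent rank is assigned to subgroup k = rank / tilesz.
inline std::vector<std::uint32_t> tile_sizes(std::uint32_t parent_size,
                                             std::uint32_t tilesz) {
    std::vector<std::uint32_t> sizes(dynamic_tile_count(parent_size, tilesz));
    for (std::uint32_t rank = 0; rank < parent_size; ++rank) {
        ++sizes[rank / tilesz];  // threads with identical k share a subgroup
    }
    return sizes;
}
```

With 100 threads and tilesz = 32, the formula gives 4 subgroups; the first three hold 32 threads each and the last holds the remaining 4.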