Mojo struct

ScatterGatherAmd

struct ScatterGatherAmd[thread_layout: Layout, num_threads: Int = thread_layout.size(), thread_scope: ThreadScope = ThreadScope(0), block_dim_count: Int = 1]

Tile-based AMD data movement delegate for scatter-gather operations.

This struct moves tiles between DRAM (global memory) and registers (local memory) on AMD GPUs, using an AMD buffer resource created from the tensor passed to `__init__`.

Parameters

thread_layout (Layout): The layout defining thread organization.
num_threads (Int): Total number of threads (defaults to thread_layout size).
thread_scope (ThreadScope): The scope of thread execution (block or warp).
block_dim_count (Int): Number of block dimensions.

Fields

buffer (AMDBufferResource): The AMD buffer resource created from the tensor passed to `__init__`.

Implemented traits

AnyType, UnknownDestructibility

Aliases

`__del__is_trivial`

alias __del__is_trivial = True

Methods

`__init__`

__init__(out self, tensor: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment])

Initialize the scatter-gather delegate with a tensor.

Args:

tensor (LayoutTensor): The layout tensor to create an AMD buffer resource from.

`copy`

copy(self, dst_reg_tile: LayoutTensor[dtype, layout, origin, address_space=AddressSpace(5), element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], src_gmem_tile: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], src_tensor: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], offset: OptionalReg[UInt] = None)

Copy data from DRAM to registers (local memory).

Args:

dst_reg_tile (LayoutTensor): Destination register tile in local address space.
src_gmem_tile (LayoutTensor): Source global memory tile.
src_tensor (LayoutTensor): Source tensor for the copy operation.
offset (OptionalReg): Optional offset for the copy operation.

copy(self, dst_gmem_tile: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], src_reg_tile: LayoutTensor[dtype, layout, origin, address_space=AddressSpace(5), element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment])

Copy data from registers (local memory) to DRAM.

Args:

dst_gmem_tile (LayoutTensor): Destination global memory tile.
src_reg_tile (LayoutTensor): Source register tile in local address space.
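
The two `copy` overloads can be combined into a load/store round trip. The kernel below is an untested sketch under several assumptions that this page does not confirm: the import path `linalg.structuring`, the tile sizes and tensor shape, the per-thread fragment shape derived from `thread_layout`, and the idea that a store uses a delegate constructed from the destination tensor. Only the `__init__` and `copy` call shapes come from the signatures above. Host-side launch code is omitted; the block would need `num_threads` (here 64) threads.

```mojo
from gpu import block_idx
from gpu.memory import AddressSpace
from layout import Layout, LayoutTensor

# ASSUMPTION: the module path is not stated on this page; adjust as needed.
from linalg.structuring import ScatterGatherAmd

# Hypothetical sizes chosen for illustration only.
alias dtype = DType.float32
alias M = 64
alias N = 256
alias BM = 16  # tile rows handled by one block
alias BN = 64  # tile columns handled by one block
alias tensor_layout = Layout.row_major(M, N)

# 4 x 16 = 64 threads. ASSUMPTION: each thread then owns a
# (BM / 4) x (BN / 16) = 4 x 4 fragment of the tile.
alias thread_layout = Layout.row_major(4, 16)
alias frag_layout = Layout.row_major(BM // 4, BN // 16)


fn copy_tile_kernel(
    src: LayoutTensor[dtype, tensor_layout, MutableAnyOrigin],
    dst: LayoutTensor[dtype, tensor_layout, MutableAnyOrigin],
):
    # __init__ creates the AMD buffer resource from the given tensor.
    # ASSUMPTION: one delegate per tensor (one for loads, one for stores).
    var loader = ScatterGatherAmd[thread_layout](src)
    var storer = ScatterGatherAmd[thread_layout](dst)

    # Per-thread register tile in the local address space
    # (AddressSpace(5) in the signatures above).
    var reg_tile = LayoutTensor[
        dtype,
        frag_layout,
        MutableAnyOrigin,
        address_space = AddressSpace.LOCAL,
    ].stack_allocation()

    # Global-memory tiles owned by this block.
    var src_tile = src.tile[BM, BN](Int(block_idx.y), Int(block_idx.x))
    var dst_tile = dst.tile[BM, BN](Int(block_idx.y), Int(block_idx.x))

    # DRAM -> registers: (dst_reg_tile, src_gmem_tile, src_tensor).
    loader.copy(reg_tile, src_tile, src)

    # Registers -> DRAM: (dst_gmem_tile, src_reg_tile).
    storer.copy(dst_tile, reg_tile)
```

The optional `offset` argument of the first overload is left at its default (`None`) here, since this page does not describe its units or semantics.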
