Mojo struct
TMATensorTileArray
@register_passable(trivial)
struct TMATensorTileArray[num_of_tensormaps: Int, dtype: DType, cta_tile_layout: Layout, desc_layout: Layout]
An array of TMA descriptors.
Parameters
- num_of_tensormaps (Int): The number of TMA descriptors (also known as tensor maps).
- dtype (DType): The data type of the tensor elements.
- cta_tile_layout (Layout): The layout of the tile in shared memory, typically specified as row_major.
- desc_layout (Layout): The layout of the descriptor, which can differ from the shared memory layout to accommodate hardware requirements such as WGMMA.
Aliases
descriptor_bytes = 128: Size of a TMA descriptor in bytes. This constant is used to calculate the offset of each TMA descriptor in device memory.
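The offset calculation that descriptor_bytes supports can be sketched as follows. This is a minimal model in plain Python, not the Mojo API. It assumes the descriptors are packed back to back in a single byte buffer, so that descriptor i starts at byte i * 128. The helper names descriptor_offset and required_buffer_bytes are hypothetical and exist only for illustration.

```python
DESCRIPTOR_BYTES = 128  # mirrors the documented descriptor_bytes alias


def descriptor_offset(index: int) -> int:
    """Byte offset of descriptor `index`, assuming contiguous packing."""
    if index < 0:
        raise IndexError("descriptor index must be non-negative")
    return index * DESCRIPTOR_BYTES


def required_buffer_bytes(num_of_tensormaps: int) -> int:
    """Bytes needed for `num_of_tensormaps` packed descriptors (assumption)."""
    return num_of_tensormaps * DESCRIPTOR_BYTES


if __name__ == "__main__":
    # Print the starting byte of each descriptor in a four-entry array.
    for i in range(4):
        print(i, descriptor_offset(i))
```

Under this assumption, the buffer backing an array of num_of_tensormaps descriptors would need at least num_of_tensormaps * 128 bytes.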
Fields
- tensormaps_ptr (UnsafePointer[SIMD[uint8, 1]]): A static tuple of pointers to TMA descriptors. This field stores an array of pointers to TMATensorTile instances, where each pointer references a TMA descriptor in device memory. The array has a fixed size determined by the num_of_tensormaps parameter. The TMA descriptors are used by the GPU hardware to efficiently transfer data between global and shared memory with specific memory access patterns defined by the layouts.
Implemented traits
AnyType, Copyable, ExplicitlyCopyable, Movable, UnknownDestructibility
Methods
__init__
__init__(out self, tensormaps_device: DeviceBuffer[uint8])
Initializes a new TMATensorTileArray.
Args:
- tensormaps_device (DeviceBuffer[uint8]): Device buffer to store TMA descriptors.
__getitem__
__getitem__(self, index: Int) -> UnsafePointer[TMATensorTile[dtype, cta_tile_layout, desc_layout]]
Retrieves a pointer to the TMA descriptor at the given index.
Args:
- index (Int): Index of the TMA descriptor.
Returns:
An UnsafePointer to the TMATensorTile at the specified index.