Mojo struct

TMATensorTileArray

@register_passable(trivial) struct TMATensorTileArray[num_of_tensormaps: Int, dtype: DType, cta_tile_layout: Layout, desc_layout: Layout]

An array of TMA descriptors.

Parameters

  • num_of_tensormaps (Int): The number of TMA descriptors (also known as tensor maps).
  • dtype (DType): The data type of the tensor elements.
  • cta_tile_layout (Layout): The layout of the tile in shared memory, typically specified as row_major.
  • desc_layout (Layout): The layout of the descriptor, which can differ from the shared memory layout to accommodate hardware requirements such as those of WGMMA.

Aliases

  • descriptor_bytes = 128: Size of a single TMA descriptor in bytes. This constant is used to calculate the offset of each TMA descriptor in device memory.
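
The offset calculation mentioned above can be sketched as plain arithmetic. The following is a minimal host-side model in Python, not the Mojo API: the function names `descriptor_offset` and `buffer_size` are hypothetical, and the sketch assumes the descriptors are packed back to back with no padding, which this page does not state explicitly.

```python
# Hypothetical model of the offset arithmetic; this is not the Mojo API.
DESCRIPTOR_BYTES = 128  # mirrors the documented descriptor_bytes alias


def descriptor_offset(index: int) -> int:
    """Byte offset of descriptor `index`.

    Assumption: descriptors are stored contiguously, one every
    DESCRIPTOR_BYTES bytes, starting at offset 0.
    """
    return index * DESCRIPTOR_BYTES


def buffer_size(num_of_tensormaps: int) -> int:
    """Total bytes needed to hold `num_of_tensormaps` packed descriptors."""
    return num_of_tensormaps * DESCRIPTOR_BYTES
```

Under that assumption, an array with `num_of_tensormaps = 4` would need a 512-byte buffer, and the descriptor at index 3 would start at byte 384.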

Fields

  • tensormaps_ptr (UnsafePointer[SIMD[uint8, 1]]): A pointer to the TMA descriptors in device memory. The descriptors are stored as raw bytes, and each one is accessed as a TMATensorTile instance. The number of descriptors is fixed by the num_of_tensormaps parameter.

    The TMA descriptors are used by the GPU hardware to efficiently transfer data between global and shared memory with specific memory access patterns defined by the layouts.

Implemented traits

AnyType, Copyable, ExplicitlyCopyable, Movable, UnknownDestructibility

Methods

__init__

__init__(out self, tensormaps_device: DeviceBuffer[uint8])

Initializes a new TMATensorTileArray.

Args:

  • tensormaps_device (DeviceBuffer[uint8]): Device buffer to store TMA descriptors.

__getitem__

__getitem__(self, index: Int) -> UnsafePointer[TMATensorTile[dtype, cta_tile_layout, desc_layout]]

Retrieves a pointer to the TMA descriptor at the given index.

Args:

  • index (Int): Index of the TMA descriptor.

Returns:

UnsafePointer to the TMATensorTile at the specified index.
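
To illustrate how indexing into a byte buffer of descriptors might behave, here is a small host-side model in Python. It is a sketch under stated assumptions rather than the library's implementation: the class `TensorMapArrayModel` is hypothetical, a `bytearray` stands in for the device buffer, a 128-byte `memoryview` slice stands in for the returned pointer, and the bounds check is an addition of this sketch that the page does not document for the Mojo method.

```python
# Hypothetical host-side model; this is not the Mojo API.
DESCRIPTOR_BYTES = 128  # mirrors the documented descriptor_bytes alias


class TensorMapArrayModel:
    """Stand-in for an array of descriptors backed by one byte buffer."""

    def __init__(self, num_of_tensormaps: int, buffer: bytearray) -> None:
        # Assumption: the buffer must hold all descriptors back to back.
        if len(buffer) < num_of_tensormaps * DESCRIPTOR_BYTES:
            raise ValueError("buffer too small for the requested descriptors")
        self.num_of_tensormaps = num_of_tensormaps
        self._view = memoryview(buffer)

    def __getitem__(self, index: int) -> memoryview:
        # Bounds check added for the sketch; not documented for the real method.
        if not 0 <= index < self.num_of_tensormaps:
            raise IndexError("descriptor index out of range")
        start = index * DESCRIPTOR_BYTES
        # The slice aliases the underlying buffer, loosely like a pointer.
        return self._view[start:start + DESCRIPTOR_BYTES]
```

Because the returned view aliases the original buffer, writing through `model[2]` changes the bytes starting at offset 256 of the backing buffer, which is loosely analogous to writing through the returned pointer.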