Mojo struct

UMMAInsDescriptor

@register_passable(trivial) struct UMMAInsDescriptor[mma_kind: UMMAKind]

Descriptor for UMMA instructions.

This struct represents a descriptor that encodes information about UMMA instructions. The descriptor contains the following bit fields:

  • Sparsity (3 bits): A 2-bit sparsity selector plus a 1-bit sparsity enable flag. Currently defaults to dense matrices.
  • Saturate for integer types (1 bit): Whether to saturate the result for integer types. Currently not supported.
  • Matrix D type (2 bits): Data type of matrix D.
  • Matrix A type (3 bits): Data type of matrix A.
  • Matrix B type (3 bits): Data type of matrix B.
  • Negate A matrix (1 bit): Whether to negate matrix A. Currently defaults to False.
  • Negate B matrix (1 bit): Whether to negate matrix B. Currently defaults to False.
  • Transpose A (1 bit): Whether to transpose matrix A.
  • Transpose B (1 bit): Whether to transpose matrix B.
  • N, Dimension of Matrix B (6 bits): Number of columns in matrix B. 3 LSBs are unused.
  • M, Dimension of Matrix A (5 bits): Number of rows in matrix A. 4 LSBs are unused.
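As a worked illustration of the dropped least significant bits, the Python sketch below converts M and N into the values stored in their fields. It is not part of the Mojo API: `encode_dims` is a hypothetical helper, and it assumes N is stored as N >> 3 in a 6-bit field and M as M >> 4 in a 5-bit field, following the linked PTX table.

```python
def encode_dims(m: int, n: int) -> tuple[int, int]:
    """Return (m_field, n_field) for the descriptor's M and N fields.

    Assumption: N drops its 3 LSBs (6-bit field) and M drops its 4 LSBs
    (5-bit field), so N must be a multiple of 8 and M a multiple of 16.
    """
    if n % 8 != 0 or m % 16 != 0:
        raise ValueError("N must be a multiple of 8 and M a multiple of 16")
    m_field, n_field = m >> 4, n >> 3
    # Each encoded value must be positive and fit in its bit field.
    if not (0 < m_field < (1 << 5) and 0 < n_field < (1 << 6)):
        raise ValueError("dimension does not fit in its bit field")
    return m_field, n_field


m_field, n_field = encode_dims(128, 256)
print(m_field, n_field)  # 8 32
```

Under these assumptions, a 128 x 256 output tile is stored as M = 8 and N = 32.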

See: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=tcgen05%2520mma#tcgen05-instuction-desc-kind-tf32-f16-f8f6f4

Parameters

  • mma_kind (UMMAKind): The kind of UMMA instruction.

Fields

  • desc (SIMD[uint32, 1]): The 32-bit descriptor value that encodes UMMA instruction information. This field stores the complete descriptor with all bit fields packed into a single 32-bit integer:
    • Bits 0-1: Sparsity selector (2 bits)
    • Bit 2: Sparsity enable (1 bit)
    • Bit 3: Saturate for integer types (1 bit)
    • Bits 4-5: Matrix D type (2 bits)
    • Bit 6: Reserved (1 bit)
    • Bits 7-9: Matrix A type (3 bits)
    • Bits 10-12: Matrix B type (3 bits)
    • Bit 13: Negate A matrix (1 bit)
    • Bit 14: Negate B matrix (1 bit)
    • Bit 15: Transpose A (1 bit)
    • Bit 16: Transpose B (1 bit)
    • Bits 17-22: N, Dimension of Matrix B (6 bits)
    • Bit 23: Reserved (1 bit)
    • Bits 24-28: M, Dimension of Matrix A (5 bits)
    • Bit 29: Reserved (1 bit)
    • Bits 30-31: Maximum shift while attempting B matrix reuse in .ws (2 bits)
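The bit positions above can be sketched as a small packing routine. This is a Python illustration of the layout only, not the Mojo implementation: `FIELDS` and `pack_descriptor` are hypothetical names, and the data type code used in the example is a placeholder, not a documented encoding.

```python
# (lowest bit, width) per field, transcribed from the bit list above.
FIELDS = {
    "sparsity_selector": (0, 2),
    "sparsity_enable": (2, 1),
    "saturate": (3, 1),
    "d_type": (4, 2),
    "a_type": (7, 3),
    "b_type": (10, 3),
    "negate_a": (13, 1),
    "negate_b": (14, 1),
    "transpose_a": (15, 1),
    "transpose_b": (16, 1),
    "n_dim": (17, 6),
    "m_dim": (24, 5),
    "max_shift": (30, 2),
}


def pack_descriptor(**fields: int) -> int:
    """Pack already-encoded field values into one 32-bit integer.

    Fields that are not passed stay zero; reserved bits are never set.
    """
    desc = 0
    for name, value in fields.items():
        lsb, width = FIELDS[name]  # raises KeyError for unknown field names
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bit(s)")
        desc |= value << lsb
    return desc


# d_type=1 is a placeholder type code chosen only for illustration.
desc = pack_descriptor(d_type=1, transpose_b=1, n_dim=32, m_dim=8)
print(f"{desc:#010x}")  # 0x08410010
```

Keeping the layout in one table means each field's position and width are stated once, and the range check rejects values that would spill into a neighbouring field.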

Implemented traits

AnyType, UnknownDestructibility

Methods

__init__

@implicit __init__(value: SIMD[uint32, 1]) -> Self

Initialize descriptor with raw 32-bit value.

This constructor creates a descriptor directly from a 32-bit integer that already contains correctly formatted bit fields. Because it is marked @implicit, a SIMD[uint32, 1] value converts to a descriptor implicitly.

Args:

  • value (SIMD[uint32, 1]): A 32-bit integer containing the complete descriptor bit layout.
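Because this constructor takes the raw value as-is, it can help to sanity-check a hand-built value first. The Python sketch below is one way to do that, using the bit positions listed under Fields. Both helpers are hypothetical, and the rule that reserved bits 6, 23, and 29 should be zero is an assumption, not something this page states.

```python
# Reserved bits taken from the field list: bits 6, 23, and 29.
RESERVED_MASK = (1 << 6) | (1 << 23) | (1 << 29)


def is_plausible_descriptor(value: int) -> bool:
    """True if value fits in 32 bits and leaves the reserved bits clear."""
    return 0 <= value <= 0xFFFFFFFF and (value & RESERVED_MASK) == 0


def extract(value: int, lsb: int, width: int) -> int:
    """Read the `width`-bit field whose lowest bit is `lsb`."""
    return (value >> lsb) & ((1 << width) - 1)


raw = 0x08410010
print(is_plausible_descriptor(raw))  # True
print(extract(raw, 17, 6))           # N field: 32
print(extract(raw, 24, 5))           # M field: 8
```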

create

static create[d_type: DType, a_type: DType, b_type: DType, output_shape: Index[2, element_type=uint32], /, *, transpose_a: Bool = False, transpose_b: Bool = False]() -> Self

Create a descriptor for UMMA instructions.

This static method builds the descriptor entirely from compile-time parameters: the data types of matrices D, A, and B, the output shape, and the optional transpose flags.

Parameters:

  • d_type (DType): The data type of matrix D.
  • a_type (DType): The data type of matrix A.
  • b_type (DType): The data type of matrix B.
  • output_shape (Index[2, element_type=uint32]): The shape of the output matrix.
  • transpose_a (Bool): Whether to transpose matrix A.
  • transpose_b (Bool): Whether to transpose matrix B.

Returns:

A UMMAInsDescriptor (Self) whose desc field holds the complete 32-bit descriptor bit layout.