Mojo struct

AMDBufferResource

@register_passable("trivial") struct AMDBufferResource

Fields

  • desc (SIMD[DType.uint32, 4]): 128-bit descriptor for a buffer resource on AMD GPUs. Used for buffer_load/buffer_store instructions.

Implemented traits

AnyType, Copyable, ImplicitlyCopyable, Movable, UnknownDestructibility

Aliases

__copyinit__is_trivial

alias __copyinit__is_trivial = SIMD[DType.uint32, 4].__copyinit__is_trivial

__del__is_trivial

alias __del__is_trivial = SIMD[DType.uint32, 4].__del__is_trivial

__moveinit__is_trivial

alias __moveinit__is_trivial = SIMD[DType.uint32, 4].__moveinit__is_trivial

Methods

__init__

__init__[dtype: DType](gds_ptr: UnsafePointer[Scalar[dtype], address_space=address_space, mut=mut, origin=origin], num_records: Int = Int.__init__[UInt32](SIMD[DType.uint32, 1](max_or_inf[DType.uint32]()))) -> Self

__init__() -> Self

get_base_ptr

get_base_ptr(self) -> Int

Returns:

Int

load

load[dtype: DType, width: Int, *, cache_policy: CacheOperation = CacheOperation(0)](self, vector_offset: Int32, *, scalar_offset: Int32 = 0) -> SIMD[dtype, width]

Returns:

SIMD
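
As a rough illustration of the signature above, the sketch below builds a descriptor from a global pointer and has each thread read four consecutive float32 values. The kernel name `load_sketch`, its arguments, and the import paths are assumptions for illustration, and the exact `UnsafePointer` spelling differs between Mojo releases. It also assumes that `num_records` counts elements; check this against the version of the library you are using.

```mojo
from gpu import thread_idx
from gpu.intrinsics import AMDBufferResource
from memory import UnsafePointer


# Hypothetical kernel, for illustration only (AMD GPUs only).
fn load_sketch(src: UnsafePointer[Float32], n: Int):
    # Build the 128-bit buffer descriptor. `n` is passed as num_records;
    # this sketch assumes it is a count of elements.
    var res = AMDBufferResource(src, n)

    # Per-thread offset in elements: thread t reads elements [4t, 4t + 3].
    var offset = Int32(Int(thread_idx.x) * 4)

    # dtype and width are explicit parameters; cache_policy and
    # scalar_offset keep their defaults.
    var v = res.load[DType.float32, 4](offset)
    _ = v  # a real kernel would consume the loaded vector here
```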

load_to_lds

load_to_lds[dtype: DType, *, width: Int = 1, cache_policy: CacheOperation = CacheOperation(0)](self, vector_offset: Int32, shared_ptr: UnsafePointer[Scalar[dtype], address_space=AddressSpace(3)], *, scalar_offset: Int32 = 0)

Loads data from global memory and stores it to shared memory.

Copies directly from global memory to shared memory (LDS), without staging the data in registers.

Parameters:

  • dtype (DType): The dtype of the data to be loaded.
  • width (Int): The SIMD vector width.
  • cache_policy (CacheOperation): Cache operation policy controlling cache behavior at all levels.

Args:

  • vector_offset (Int32): Vector memory offset in elements (per thread).
  • shared_ptr (UnsafePointer): Shared memory address.
  • scalar_offset (Int32): Scalar memory offset in elements (shared across wave).

store

store[dtype: DType, width: Int, *, cache_policy: CacheOperation = CacheOperation(0)](self, vector_offset: Int32, val: SIMD[dtype, width], *, scalar_offset: Int32 = 0)

Stores a register value to global memory with cache operation control.

Writes a SIMD value held in registers to global memory, with cache behavior selected by cache_policy.

Note:

  • Only supported on AMD GPUs.
  • Provides high-level cache control via CacheOperation enum values.
  • Maps directly to llvm.amdgcn.raw.buffer.store intrinsics.
  • Cache control bits:
      • SC[1:0] controls coherency scope: 0=wave, 1=group, 2=device, 3=system.
      • nt=True: Use streaming-optimized cache policies (recommended for streaming data).

Parameters:

  • dtype (DType): The data type.
  • width (Int): The SIMD vector width.
  • cache_policy (CacheOperation): Cache operation policy controlling cache behavior at all levels.

Args:

  • vector_offset (Int32): Vector memory offset in elements (per thread).
  • val (SIMD): Value to write.
  • scalar_offset (Int32): Scalar memory offset in elements (shared across wave).
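
A minimal sketch of how the two offsets might be combined in a store, assuming a wave-uniform tile base passed in as a kernel argument. The names `store_sketch` and `tile_base`, along with the import paths, are hypothetical, and `num_records` is again assumed to count elements.

```mojo
from gpu import thread_idx
from gpu.intrinsics import AMDBufferResource
from memory import UnsafePointer


# Hypothetical kernel, for illustration only (AMD GPUs only).
fn store_sketch(dst: UnsafePointer[Float32], n: Int, tile_base: Int32):
    # Descriptor for the destination buffer (n assumed to be in elements).
    var res = AMDBufferResource(dst, n)

    # vector_offset differs per thread; scalar_offset must be the same
    # for every thread in the wave, so a kernel argument is a safe choice.
    var offset = Int32(Int(thread_idx.x) * 4)

    # Splat 1.0 across a 4-wide float32 vector.
    var val = SIMD[DType.float32, 4](1.0)

    # Each thread writes 4 elements starting at tile_base + offset.
    res.store[DType.float32, 4](offset, val, scalar_offset=tile_base)
```

Splitting the address this way keeps the part shared by the whole wave in `scalar_offset` and only the per-thread part in `vector_offset`, matching the argument descriptions above.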