Mojo module
memory
This module provides memory operations for NVIDIA GPUs.
Aliases
- `AddressSpace = _GPUAddressSpace`
Structs
Functions
- `async_copy`: Asynchronously copies `size` bytes from the global memory address `src` to the shared memory address `dst`.
- `async_copy_commit_group`: Commits all previously initiated but uncommitted cp.async instructions into a cp.async-group.
- `async_copy_wait_all`: Waits for the completion of all committed cp.async-groups.
- `async_copy_wait_group`: Waits for asynchronous copy operations to complete until at most `n` of the most recently committed cp.async-groups are still pending.
- `cp_async_bulk_tensor_global_shared_cta`: Initiates an asynchronous copy of tensor data from shared CTA memory to global memory.
- `cp_async_bulk_tensor_reduce`: Initiates an asynchronous reduction of tensor data in global memory with tensor data in shared{::cta} memory, using `tile` mode.
- `cp_async_bulk_tensor_shared_cluster_global`: Initiates an asynchronous copy of tensor data from global memory to shared memory.
- `cp_async_bulk_tensor_shared_cluster_global_multicast`: Initiates an asynchronous multicast load of tensor data from global memory to the shared memories of the cluster.
- `external_memory`: Gets a pointer to dynamic shared memory.
- `fence_mbarrier_init`: Fence that applies to the prior mbarrier.init.
- `fence_proxy_tensormap_generic_sys_acquire`: Acquires the system-memory fence for a tensor map object of a given size. Args: `ptr`: pointer to the tensor map object in system memory; `size`: the size of the object.
- `fence_proxy_tensormap_generic_sys_release`: Releases the system-memory fence for a tensor map object.
- `load`
- `tma_store_fence`: Fence for shared memory (SMEM) stores that precede a subsequent TMA store.
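
The copy, commit, and wait functions listed above wrap the PTX cp.async family, and they are normally used together in that order. This page does not show the Mojo signatures, so the sketch below uses CUDA instead, with the pipeline primitives from `<cuda_pipeline.h>`. The kernel name `stage_and_rotate`, the constant `kThreads`, and the correspondence to `async_copy`, `async_copy_commit_group`, and `async_copy_wait_group` noted in the comments are assumptions made for illustration, not the module's documented behavior. The asynchronous path is expected on devices of compute capability 8.0 or newer.

```cuda
#include <cuda_pipeline.h>   // __pipeline_memcpy_async, __pipeline_commit, __pipeline_wait_prior
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

constexpr int kThreads = 128;  // hypothetical block size chosen for this sketch

// Each thread stages one float from global into shared memory with an
// asynchronous copy, then reads the value staged by its neighbour.
__global__ void stage_and_rotate(const float* src, float* dst) {
    __shared__ float tile[kThreads];
    const int t = threadIdx.x;
    const int g = blockIdx.x * kThreads + t;

    // Issue a 4-byte global -> shared copy (assumed analogue of async_copy).
    __pipeline_memcpy_async(&tile[t], &src[g], sizeof(float));
    // Close the current batch of copies (assumed analogue of async_copy_commit_group).
    __pipeline_commit();
    // Block until no committed batch is still pending
    // (assumed analogue of async_copy_wait_group with n = 0).
    __pipeline_wait_prior(0);
    // Completion is tracked per thread, so a block barrier is still required
    // before reading data that other threads copied.
    __syncthreads();

    dst[g] = tile[(t + 1) % kThreads];
}

int main() {
    const int blocks = 4;
    const int n = blocks * kThreads;
    std::vector<float> h_src(n), h_dst(n);
    for (int i = 0; i < n; ++i) h_src[i] = static_cast<float>(i);

    float *d_src = nullptr, *d_dst = nullptr;
    cudaMalloc(&d_src, n * sizeof(float));
    cudaMalloc(&d_dst, n * sizeof(float));
    cudaMemcpy(d_src, h_src.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    stage_and_rotate<<<blocks, kThreads>>>(d_src, d_dst);
    cudaMemcpy(h_dst.data(), d_dst, n * sizeof(float), cudaMemcpyDeviceToHost);

    // Verify that every element holds its in-block neighbour's input value.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        const int base = (i / kThreads) * kThreads;
        const float expected = h_src[base + (i % kThreads + 1) % kThreads];
        if (h_dst[i] != expected) ++mismatches;
    }
    std::printf("mismatches: %d\n", mismatches);

    cudaFree(d_src);
    cudaFree(d_dst);
    return mismatches == 0 ? 0 : 1;
}
```

Passing a nonzero count to `__pipeline_wait_prior` leaves that many of the most recent batches in flight, which is how a multi-stage pipeline overlaps the next tile's copy with computation on the current one. CUDA error checking is omitted here for brevity.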