Mojo module
sync
This module provides intrinsics for NVIDIA GPU synchronization instructions.
Functions
- barrier: Performs a block-level synchronization barrier (equivalent to __syncthreads in CUDA).
- cluster_arrive
- cluster_sync
- cluster_wait
- cp_async_bulk_commit_group: Commits all previously initiated but uncommitted cp.async.bulk instructions into a cp.async.bulk-group.
- cp_async_bulk_wait_group: Causes the executing thread to wait until only N or fewer of the most recent bulk async-groups are pending and all prior bulk async-groups committed by the executing thread are complete. When N is 0, the executing thread waits for all prior bulk async-groups to complete.
- mbarrier: Makes the mbarrier object track all prior copy async operations initiated by the executing thread.
- mbarrier_arrive: Commits the arrival of the thread to a shared memory barrier.
- mbarrier_arrive_expect_tx_shared: Performs an expect-tx operation on a shared memory barrier.
- mbarrier_init: Initializes a shared memory barrier for N threads.
- mbarrier_test_wait: Performs a test wait on the memory barrier.
- mbarrier_try_wait_parity_shared: Waits on a shared memory barrier until the phase completes or the tick count expires.
- syncwarp: Causes the executing threads to wait until all lanes specified by the warp mask reach the syncwarp call.
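For readers coming from CUDA, the following is a minimal sketch of the two simplest primitives in this list, written in CUDA C++ rather than Mojo. It relies on the stated equivalence between barrier and __syncthreads; treating syncwarp as the counterpart of CUDA's __syncwarp is an assumption based on the name and description, not something this page confirms. The kernel names, the tile size kBlock, and the test data are hypothetical and chosen only for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int kBlock = 128;  // hypothetical tile size: threads per block

// Reverses each block-sized tile of `data` through shared memory.
__global__ void reverse_tile(int *data) {
    __shared__ int tile[kBlock];
    int t = threadIdx.x;
    int base = blockIdx.x * kBlock;

    tile[t] = data[base + t];
    // Block-wide barrier (the CUDA call that `barrier` is documented as
    // equivalent to): all stores to `tile` are visible after this point.
    __syncthreads();
    data[base + t] = tile[kBlock - 1 - t];
}

// Swaps each pair of neighbouring elements. Threads t and t ^ 1 always
// belong to the same warp, so a warp-level barrier is sufficient.
__global__ void swap_pairs(int *data) {
    __shared__ int tile[kBlock];
    int t = threadIdx.x;
    int base = blockIdx.x * kBlock;

    tile[t] = data[base + t];
    // Warp-level barrier over all 32 lanes (assumed counterpart of
    // `syncwarp`); it also orders memory among the participating lanes.
    __syncwarp(0xffffffffu);
    data[base + t] = tile[t ^ 1];
}

int main() {
    constexpr int kN = 2 * kBlock;
    int host[kN];
    for (int i = 0; i < kN; ++i) host[i] = i;

    int *dev = nullptr;
    if (cudaMalloc(&dev, sizeof(host)) != cudaSuccess) return 1;
    cudaMemcpy(dev, host, sizeof(host), cudaMemcpyHostToDevice);

    reverse_tile<<<kN / kBlock, kBlock>>>(dev);
    swap_pairs<<<kN / kBlock, kBlock>>>(dev);

    // cudaMemcpy on the default stream waits for both kernels to finish.
    cudaError_t err =
        cudaMemcpy(host, dev, sizeof(host), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return 1;

    // Element i should hold the reversed-tile value of its pair partner.
    int mismatches = 0;
    for (int i = 0; i < kN; ++i) {
        int base = (i / kBlock) * kBlock;
        int expected = base + (kBlock - 1 - ((i ^ 1) - base));
        if (host[i] != expected) ++mismatches;
    }
    std::printf("mismatches: %d\n", mismatches);
    return mismatches == 0 ? 0 : 1;
}
```

The block-wide barrier is needed in the first kernel because a thread reads a slot written by a thread in a different warp. The second kernel only exchanges data between lanes of the same warp, which is why the cheaper warp-level synchronization is enough there.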