Mojo module
warp
This module provides intrinsics for the warp shuffle instructions on NVIDIA GPUs.
Aliases
- FULL_MASK = ((2 ** _resolve_warp_size()) + -1)
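The alias expression can be read as "one bit set per lane in the warp". As a quick arithmetic check, here is a minimal sketch in plain Python (not Mojo). It assumes `_resolve_warp_size()` yields 32 on NVIDIA hardware and 64 on hardware with 64-wide wavefronts; `full_mask` is a hypothetical helper introduced only for illustration.

```python
def full_mask(warp_size):
    # Same arithmetic as the alias: (2 ** warp_size) + -1
    # sets the low `warp_size` bits, one per lane.
    return (2 ** warp_size) + -1

nvidia_mask = full_mask(32)  # assumed warp size of 32
wide_mask = full_mask(64)    # assumed warp size of 64

print(hex(nvidia_mask))  # 0xffffffff
print(hex(wide_mask))    # 0xffffffffffffffff
```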
Structs
- ReductionMethod: Enumerates the supported reduction methods.
Functions
- broadcast: Broadcasts a SIMD value across the warp.
- lane_group_max: Reduces a SIMD value to its maximum within a lane group.
- lane_group_max_and_broadcast: Reduces to the maximum within a lane group and broadcasts the result.
- lane_group_min: Reduces a SIMD value to its minimum within a lane group.
- lane_group_reduce: Takes an input function and uses it to compute a warp-shuffle-based reduction.
- lane_group_sum: Computes the sum within a lane group.
- lane_group_sum_and_broadcast: Computes the sum within a lane group and broadcasts the result.
- max: Computes the maximum value across the warp.
- min: Computes the minimum value across the warp.
- reduce: Takes an input function and uses it to compute a warp-shuffle-based reduction.
- shuffle_down: Copies values from lanes with higher lane IDs in the warp.
- shuffle_idx: Copies a value from a source lane to other lanes in a warp.
- shuffle_up: Copies values from lanes with lower lane IDs in the warp.
- shuffle_xor: Copies values between lanes (butterfly pattern).
- sum: Computes the sum across the warp.
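
The shuffle and reduction entries are easier to follow with a lane-by-lane picture. The following is a minimal plain-Python sketch that models a warp as a list with one element per lane. It is not the Mojo API: the names `WARP_SIZE`, `shuffle_down`, `shuffle_xor`, `warp_sum`, and `warp_sum_broadcast` are hypothetical stand-ins that only mirror the general semantics of shuffle-based reductions. It assumes a 32-lane warp and assumes that a lane whose source would fall outside the warp keeps its own value.

```python
WARP_SIZE = 32  # assumption: 32 lanes per warp

def shuffle_down(vals, offset):
    """Lane i reads the value of lane i + offset.
    Lanes whose source is out of range keep their own value (assumed)."""
    n = len(vals)
    return [vals[i + offset] if i + offset < n else vals[i] for i in range(n)]

def shuffle_xor(vals, mask):
    """Lane i reads the value of lane i ^ mask (butterfly exchange).
    Requires a power-of-two lane count so i ^ mask stays in range."""
    return [vals[i ^ mask] for i in range(len(vals))]

def warp_sum(vals):
    """Tree reduction built on shuffle_down.
    After log2(n) steps, lane 0 holds the sum of all lanes."""
    vals = list(vals)
    offset = len(vals) // 2
    while offset > 0:
        shifted = shuffle_down(vals, offset)
        vals = [a + b for a, b in zip(vals, shifted)]
        offset //= 2
    return vals[0]

def warp_sum_broadcast(vals):
    """Butterfly reduction built on shuffle_xor.
    After log2(n) steps, every lane holds the sum of all lanes."""
    vals = list(vals)
    mask = len(vals) // 2
    while mask > 0:
        partner = shuffle_xor(vals, mask)
        vals = [a + b for a, b in zip(vals, partner)]
        mask //= 2
    return vals

lanes = list(range(WARP_SIZE))          # lane i holds the value i
total = warp_sum(lanes)                 # 0 + 1 + ... + 31
everywhere = warp_sum_broadcast(lanes)  # same total in every lane

print(total)                             # 496
print(shuffle_down([10, 20, 30, 40], 1)) # [20, 30, 40, 40]
print(shuffle_xor([10, 20, 30, 40], 1))  # [20, 10, 40, 30]
```

In this model the `shuffle_down` variant leaves a meaningful result only in lane 0, while the `shuffle_xor` variant leaves the result in every lane, which is the distinction the plain and `_and_broadcast` reduction names suggest.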