Mojo module

shuffle

This module provides intrinsics for NVIDIA GPU warp shuffle instructions.

Aliases

lane_group_max:
lane_group_max_and_broadcast:
lane_group_min:
lane_group_reduce: Takes an input function and computes a warp-shuffle-based reduction operation.
lane_group_sum:
lane_group_sum_and_broadcast:
shuffle_down: Copies values from lanes with higher lane IDs in the warp.
shuffle_idx: Copies a value from a source lane to other lanes in a warp.
shuffle_up: Copies values from lanes with lower lane IDs in the warp.
shuffle_xor: Exchanges values between lanes (butterfly pattern).
warp_broadcast:
warp_max:
warp_min:
warp_reduce: Takes an input function and computes a warp-shuffle-based reduction operation.
warp_sum:
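
To illustrate the data movement behind the shuffle and reduction entries above, here is a minimal CPU-side sketch in Python. It is not the Mojo API: the function names reuse the alias names for readability, but the signatures, the list-of-lanes representation, and the helpers `reduce_with_shuffle_down` and `reduce_with_shuffle_xor` are assumptions made for this example. It assumes a 32-lane warp and the usual NVIDIA convention that a lane whose source falls outside the warp keeps its own value.

```python
# CPU-side simulation of warp shuffle patterns (illustrative only, not the Mojo API).
# A "warp" is modeled as a list with one value per lane.

WARP_SIZE = 32  # assumption: NVIDIA warp width


def shuffle_idx(vals, src_lane):
    """Every lane receives the value held by src_lane (a broadcast)."""
    return [vals[src_lane] for _ in vals]


def shuffle_down(vals, offset):
    """Lane i reads from lane i + offset; out-of-range lanes keep their own value."""
    n = len(vals)
    return [vals[i + offset] if i + offset < n else vals[i] for i in range(n)]


def shuffle_up(vals, offset):
    """Lane i reads from lane i - offset; out-of-range lanes keep their own value."""
    n = len(vals)
    return [vals[i - offset] if i - offset >= 0 else vals[i] for i in range(n)]


def shuffle_xor(vals, mask):
    """Lane i reads from lane i XOR mask (butterfly exchange).

    Assumes len(vals) is a power of two and mask < len(vals).
    """
    return [vals[i ^ mask] for i in range(len(vals))]


def reduce_with_shuffle_down(vals, op):
    """Tree reduction: after log2(n) steps, lane 0 holds the full result."""
    vals = list(vals)
    offset = len(vals) // 2
    while offset > 0:
        shifted = shuffle_down(vals, offset)
        vals = [op(a, b) for a, b in zip(vals, shifted)]
        offset //= 2
    return vals[0]


def reduce_with_shuffle_xor(vals, op):
    """Butterfly reduction: after log2(n) steps, every lane holds the full result."""
    vals = list(vals)
    mask = len(vals) // 2
    while mask > 0:
        partner = shuffle_xor(vals, mask)
        vals = [op(a, b) for a, b in zip(vals, partner)]
        mask //= 2
    return vals


if __name__ == "__main__":
    lanes = list(range(WARP_SIZE))
    print(reduce_with_shuffle_down(lanes, lambda a, b: a + b))
    print(reduce_with_shuffle_xor(lanes, max)[:4])
```

In this sketch the shuffle-down variant leaves the result only in lane 0, while the XOR butterfly leaves it in every lane. That difference is one plausible reading of why the list has both plain and `_and_broadcast` reduction variants, though the source does not describe them.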