Mojo function
tc_reduce_gevm_4x
tc_reduce_gevm_4x[out_type: DType, in_type: DType, simd_width: Int](val1: SIMD[in_type, simd_width]) -> SIMD[out_type, simd_width]
Uses Tensor Cores to perform a warp-level reduction.
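
A reduction can be phrased as a matrix product: multiplying a row vector of ones by a matrix sums each column, which is the kind of operation matrix-multiply hardware is built for. The sketch below is a CPU-side Python illustration of that idea only. It is not the Mojo implementation of `tc_reduce_gevm_4x`, and the helper name `ones_times_matrix` and the lane layout are hypothetical, chosen for illustration.

```python
def ones_times_matrix(lanes):
    """Compute ones(1, n) @ lanes, which yields the column sums of `lanes`.

    `lanes` is a list of equal-length rows; each row stands in for the
    SIMD value held by one lane (an assumption made for this sketch).
    """
    rows = len(lanes)
    cols = len(lanes[0])
    ones = [1.0] * rows  # the all-ones left operand of the product
    return [
        sum(ones[r] * lanes[r][c] for r in range(rows))
        for c in range(cols)
    ]


# Four hypothetical lanes, each holding a 2-wide value.
lane_values = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
    [4.0, 40.0],
]

# Each output element is the sum of that element across all lanes.
reduced = ones_times_matrix(lane_values)
print(reduced)
```

Writing the sum as a product with ones is what lets a multiply-accumulate unit do the work of a reduction tree; how the real function maps lanes onto matrix fragments is not described on this page.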