Mojo function
topk_fused_sampling_gpu
topk_fused_sampling_gpu[type: DType, rank: Int, out_idx_type: DType, //](
    ctx: DeviceContext,
    K: Int,
    input: NDBuffer[type, rank, origin],
    out_idxs: NDBuffer[out_idx_type, rank, origin],
    block_size: OptionalReg[Int] = None,
    num_blocks_per_input: OptionalReg[Int] = None,
    temperature: SIMD[type, 1] = 1,
)
Top-K algorithm with fused sampling. For each row/subvolume of the input tensor, takes the Top-K elements along the innermost dimension, samples from them, and returns the sampled indices in out_idxs.
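To make the description concrete, here is a minimal CPU reference sketch of top-K sampling in plain Python. It is not the Mojo kernel and does not use its API: the function names `top_k_sample` and `top_k_sample_rows` are hypothetical, and the choice to turn the K selected values into sampling weights with a temperature-scaled softmax is an assumption, since the page does not specify how `temperature` is applied.

```python
import math
import random


def top_k_sample(logits, k, temperature=1.0, rng=None):
    """Sample one index from the k largest entries of `logits`.

    Assumption: the k selected values are treated as logits and converted
    to weights with a temperature-scaled softmax before sampling.
    """
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()

    # Indices of the k largest values (ties keep their original order).
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

    # Temperature scaling, then a numerically stable softmax numerator.
    scaled = [logits[i] / temperature for i in top]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    # random.choices normalizes the weights internally.
    return rng.choices(top, weights=weights, k=1)[0]


def top_k_sample_rows(rows, k, temperature=1.0, seed=None):
    """Apply top_k_sample independently to each row (innermost dimension)."""
    rng = random.Random(seed)
    return [top_k_sample(row, k, temperature, rng) for row in rows]
```

With `k=1` the sketch reduces to an argmax per row. Under the softmax assumption, a lower `temperature` concentrates the samples on the largest of the K values, and a higher one spreads them more evenly across the K candidates.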