Mojo module

toppminp_gpu

Aliases

  • DEBUG_FILE = False
  • SEED = 42

Functions

  • min_p_sampling_gpu: GPU implementation of Min-P sampling for token selection. This function applies temperature scaling and softmax, runs a radix sort, and then samples tokens based on the calculated Min-P probability threshold.
  • normalize
  • normalize_u32
  • radix_sort_pairs_kernel: Radix sort kernel for pairs; sorts in descending order by default.
  • run_radix_sort_pairs_gpu
  • top_p_sampling_gpu: GPU implementation of Top-P sampling for token selection. This function applies temperature scaling and softmax, runs a radix sort, and then samples tokens based on the cumulative probability mass (Top-P).
  • topk_wrapper: Copy of Kernels/mojo/nn/topk.mojo:_topk_stage1 with added max_vals and p_threshold arguments, which are used to determine whether sorting is needed for Top-P/Min-P sampling.
  • topp_minp_sampling_kernel: Top-P/Min-P sampling kernel.
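
Example

The pipeline described for top_p_sampling_gpu and min_p_sampling_gpu (temperature scaling, softmax, descending sort, threshold-based sampling) can be illustrated with a small CPU reference in Python. This is a sketch only, not the module's API: every function name and parameter below is hypothetical, Python's built-in sort stands in for the GPU radix sort, and Min-P is assumed to follow its usual definition (keep tokens whose probability is at least min_p times the largest probability).

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Scale logits by 1/temperature and normalize them into probabilities."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sorted_candidates(logits, temperature):
    """Return (probability, token_id) pairs in descending probability order."""
    probs = softmax_with_temperature(logits, temperature)
    # Stand-in for the GPU radix sort of (probability, token id) pairs.
    return sorted(((p, i) for i, p in enumerate(probs)), key=lambda pair: -pair[0])


def top_p_candidates(logits, top_p, temperature=1.0):
    """Keep the shortest descending-order prefix whose mass reaches top_p."""
    kept, mass = [], 0.0
    for p, token in sorted_candidates(logits, temperature):
        kept.append((p, token))
        mass += p
        if mass >= top_p:
            break
    return kept


def min_p_candidates(logits, min_p, temperature=1.0):
    """Keep tokens with probability >= min_p * (largest probability)."""
    pairs = sorted_candidates(logits, temperature)
    threshold = min_p * pairs[0][0]  # assumed Min-P definition, see lead-in
    return [(p, token) for p, token in pairs if p >= threshold]


def sample(candidates, rng):
    """Draw one token id from the kept candidates, weighted by probability."""
    weights = [p for p, _ in candidates]
    tokens = [token for _, token in candidates]
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(42)  # fixed seed for repeatability; purely illustrative
logits = [2.0, 1.0, 0.5, -1.0]
top_p_set = top_p_candidates(logits, top_p=0.9)
min_p_set = min_p_candidates(logits, min_p=0.3)
print("top-p keeps:", [token for _, token in top_p_set])  # [0, 1, 2]
print("min-p keeps:", [token for _, token in min_p_set])  # [0, 1]
print("sampled:", sample(top_p_set, rng), sample(min_p_set, rng))
```

With these logits the probabilities are roughly 0.61, 0.22, 0.14, and 0.03, so Top-P at 0.9 needs the first three tokens to reach the requested mass, while Min-P at 0.3 keeps only the two tokens above about 0.18. Sorting first is what makes the Top-P prefix scan possible; a GPU implementation may skip the sort when a cheaper check shows it is unnecessary, which appears to be the purpose of the max_vals and p_threshold arguments mentioned for topk_wrapper.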