Mojo function
mha_decoding_single_batch_amd
mha_decoding_single_batch_amd[
    output_type: DType,
    q_type: DType,
    k_t: MHAOperand,
    v_t: MHAOperand,
    mask_t: MHAMask,
    group: Int,
    config: MHAConfig,
    sink: Bool = False
](
    output: UnsafePointer[Scalar[output_type]],
    q: UnsafePointer[Scalar[q_type]],
    k: k_t,
    v: v_t,
    exp_sum_ptr: UnsafePointer[Scalar[get_accum_type[q_type]()]],
    qk_max_ptr: UnsafePointer[Scalar[get_accum_type[q_type]()]],
    seq_len: Int,
    num_keys: Int,
    num_partitions: Int,
    scale: Float32,
    batch_idx: Int,
    start_pos: Int,
    mask: mask_t,
    sink_weights: OptionalReg[NDBuffer[q_type, 1, MutableAnyOrigin]]
)