Mojo function
quantize_dynamic_scaled_fp8
quantize_dynamic_scaled_fp8[out_dtype: DType, in_dtype: DType, scales_dtype: DType, //, group_size_or_per_token: Int](scaled_output: NDBuffer[out_dtype, 2, origin, shape, strides], scales: NDBuffer[scales_dtype, 2, origin, shape, strides], input: NDBuffer[in_dtype, 2, origin, shape, strides], scale_ub: SIMD[float32, 1], ctx: DeviceContext)
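The page gives only the signature, so the following is a minimal Python sketch of what dynamic scaled FP8 quantization commonly does, not the Mojo kernel itself. Several points are assumptions and not stated in the source: that the output format is float8 e4m3 with a maximum magnitude of 448, that one scale is computed per group of `group_size` consecutive elements within each row (per-token meaning one group spanning the whole row), that `scale_ub` caps the per-group absolute maximum before the scale is derived, and that an all-zero group falls back to a scale of 1. The name `quantize_dynamic_scaled` is hypothetical, and the sketch omits the final rounding to FP8.

```python
FP8_E4M3_MAX = 448.0  # assumption: largest finite float8 e4m3 magnitude


def quantize_dynamic_scaled(rows, group_size, scale_ub):
    """Reference sketch: returns (scaled_rows, scale_rows) for a 2D list of floats.

    Assumed semantics only; the real kernel writes into NDBuffers on a device
    and casts the scaled values to an FP8 dtype.
    """
    scaled_rows, scale_rows = [], []
    for row in rows:
        scaled, scales = [], []
        for start in range(0, len(row), group_size):
            group = row[start:start + group_size]
            # Dynamic scale: derived from the data, capped by the upper bound.
            max_abs = min(max(abs(x) for x in group), scale_ub)
            # Assumed fallback: avoid dividing by zero for an all-zero group.
            scale = max_abs / FP8_E4M3_MAX if max_abs > 0 else 1.0
            scales.append(scale)
            # Divide by the scale and clamp into the representable FP8 range.
            scaled.extend(
                max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, x / scale)) for x in group
            )
        scaled_rows.append(scaled)
        scale_rows.append(scales)
    return scaled_rows, scale_rows
```

Under these assumptions, calling the sketch with `group_size` equal to the row length yields one scale per row (per-token), while a smaller `group_size` yields several scales per row. Multiplying each scaled value by its group's scale recovers the original input unless the value was clamped.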