
Mojo function

quantize_dynamic_scaled_fp8

quantize_dynamic_scaled_fp8[out_dtype: DType, in_dtype: DType, scales_dtype: DType, input_shape: DimList, //, group_size_or_per_token: Int, input_hidden_size: Int](scaled_output: NDBuffer[out_dtype, 2, MutableAnyOrigin], scales: NDBuffer[scales_dtype, 2, MutableAnyOrigin], input: NDBuffer[in_dtype, 2, origin, input_shape], scale_ub: Float32, ctx: DeviceContext)
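The signature alone does not state the arithmetic, so the following is a minimal reference sketch in plain Python of what a dynamically scaled FP8 quantization commonly computes. It is not the Mojo kernel. Everything here is an assumption inferred from the parameter names: that `group_size_or_per_token == -1` selects one scale per row (per token), that `scale_ub` caps the absolute maximum used to derive each scale, that the output type is `float8_e4m3fn` with a largest finite value of 448, and that each scale is `max_abs / 448`. The function name `quantize_dynamic_scaled_reference` is hypothetical, and the sketch keeps values as Python floats rather than rounding them to FP8.

```python
FP8_E4M3_MAX = 448.0  # assumed output type: largest finite float8_e4m3fn value


def quantize_dynamic_scaled_reference(rows, group_size, scale_ub):
    """Return (scaled_rows, scales) for a 2-D list of floats.

    Assumption: group_size == -1 means one scale per row (per token);
    otherwise each row is split into consecutive groups of group_size.
    """
    scaled_rows, scales = [], []
    for row in rows:
        size = len(row) if group_size == -1 else group_size
        out_row, row_scales = [], []
        for start in range(0, len(row), size):
            group = row[start:start + size]
            # Assumed role of scale_ub: upper bound on the group's max magnitude.
            max_abs = min(max(abs(v) for v in group), scale_ub)
            # Avoid dividing by zero for an all-zero group.
            scale = max_abs / FP8_E4M3_MAX if max_abs > 0 else 1.0
            row_scales.append(scale)
            # Scale, then clamp to the representable FP8 range.
            out_row.extend(
                max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, v / scale)) for v in group
            )
        scaled_rows.append(out_row)
        scales.append(row_scales)
    return scaled_rows, scales
```

Under these assumptions, `scaled_output` would have the same shape as `input`, and `scales` would hold one entry per group per row. Dequantization is then `scaled_output * scale` for the matching group.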
