Mojo function
matmul
matmul[c_type: DType, a_type: DType, b_type: DType, //, use_tensor_core: Bool = False, transpose_b: Bool = False, elementwise_lambda_fn: OptionalReg[fn[DType, Int, Int](Index[2], SIMD[$0, $1]) capturing -> None] = OptionalReg[fn[DType, Int, Int](Index[2], SIMD[$0, $1]) capturing -> None]({:i1 0, 1}), config: OptionalReg[MatmulConfig[a_type, b_type, c_type, transpose_b]] = OptionalReg[MatmulConfig[a_type, b_type, c_type, transpose_b]]({:i1 0, 1}), _trace_description: StringSlice[StaticConstantOrigin] = __init__[__mlir_type.!kgen.string]("")](c: NDBuffer[c_type, 2, origin, shape], a: NDBuffer[a_type, 2, origin, shape], b: NDBuffer[b_type, 2, origin, shape], ctx: DeviceContext)
This implements the matmul kernel for the Blackwell architecture. There are currently no pure Mojo kernels that target Blackwell, so this function calls the cuBLAS library instead.
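As a rough illustration of what the `transpose_b` and `elementwise_lambda_fn` parameters suggest, the following plain-Python sketch computes the same mathematical result on nested lists. It is not the Mojo API and does not call cuBLAS. The names `reference_matmul` and `epilogue` are hypothetical, and the callback behavior (called with the output coordinates and value in place of a store into `c`) is an assumption inferred from the `fn(Index[2], SIMD) -> None` signature, not something this page states.

```python
def reference_matmul(a, b, transpose_b=False, epilogue=None):
    """Reference semantics only: C = A @ B, or C = A @ B^T if transpose_b.

    a is M x K. b is K x N, or N x K when transpose_b is True.
    `epilogue` is a hypothetical stand-in for elementwise_lambda_fn.
    """
    m, k = len(a), len(a[0])
    n = len(b) if transpose_b else len(b[0])
    inner = len(b[0]) if transpose_b else len(b)
    if inner != k:
        raise ValueError("inner dimensions of a and b do not match")

    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                # With transpose_b, row j of b holds column j of B.
                acc += a[i][p] * (b[j][p] if transpose_b else b[p][j])
            if epilogue is not None:
                # Assumption: the callback handles the value itself.
                epilogue((i, j), acc)
            else:
                c[i][j] = acc
    return c


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
b_t = [[5.0, 7.0], [6.0, 8.0]]  # b stored transposed (N x K)

result = reference_matmul(a, b)
result_t = reference_matmul(a, b_t, transpose_b=True)
```

Both calls produce the same product, since `b_t` is `b` stored in transposed layout. The real function takes its operands as rank-2 `NDBuffer` values plus a `DeviceContext` and writes into `c` rather than returning a value.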