Skip to main content
Log in

Mojo function

matmul

matmul[c_type: DType, a_type: DType, b_type: DType, //, use_tensor_core: Bool = False, transpose_b: Bool = False, elementwise_lambda_fn: OptionalReg[fn[DType, Int, Int](Index[2], SIMD[$0, $1]) capturing -> None] = OptionalReg[fn[DType, Int, Int](Index[2], SIMD[$0, $1]) capturing -> None]({:i1 0, 1}), config: OptionalReg[MatmulConfig[a_type, b_type, c_type, transpose_b]] = OptionalReg[MatmulConfig[a_type, b_type, c_type, transpose_b]]({:i1 0, 1}), _trace_description: StringSlice[StaticConstantOrigin] = __init__[__mlir_type.!kgen.string]("")](c: NDBuffer[c_type, 2, origin, shape], a: NDBuffer[a_type, 2, origin, shape], b: NDBuffer[b_type, 2, origin, shape], ctx: DeviceContext)

This implements the matmul kernel for the Blackwell architecture. Note that we do not currently have pure mojo kernels which would utilize blackwell architectures, so in place we just call the CUBLAS library.