
Mojo function

make_swizzle

make_swizzle[num_rows: Int, row_size: Int, access_size: Int]() -> Swizzle

Creates a 2D swizzle that avoids shared memory bank conflicts. The swizzle covers accesses of access_size elements within a num_rows x row_size tile in shared memory. num_rows should describe the minimum access pattern. For example, when storing a 16x8 MMA result to a 64x64 tile, the minimum access pattern is an 8x8 sub-matrix, so num_rows = 8 and row_size = 64. Swizzling the layout at store time avoids bank conflicts when the data is loaded later. That load is most likely 16 bytes wide, i.e. access_size = 4 for fp32 and 8 for bf16.
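To make the bank-conflict argument concrete, here is a minimal Python model of the fp32 example (num_rows = 8, row_size = 64, access_size = 4). It is a sketch, not the Mojo implementation: the helper names (swizzle_params, xor_swizzle, fp32_bank), the mapping from the three parameters to a (bits, base, shift) triple, the CuTe-style XOR formula, and the 32-bank, 4-byte-per-bank shared memory model are all assumptions made for illustration.

```python
def swizzle_params(num_rows: int, row_size: int, access_size: int):
    """Assumed mapping from the three parameters to (bits, base, shift)."""
    bits = num_rows.bit_length() - 1             # log2(num_rows): row bits mixed in
    base = access_size.bit_length() - 1          # log2(access_size): bits left untouched
    shift = (row_size.bit_length() - 1) - base   # distance from vector bits to row bits
    return bits, base, shift


def xor_swizzle(index: int, bits: int, base: int, shift: int) -> int:
    """XOR `bits` row bits into the vector-column bits of a linear index."""
    mask = ((1 << bits) - 1) << (base + shift)
    return index ^ ((index & mask) >> shift)


def fp32_bank(index: int) -> int:
    """Bank of a 4-byte element, assuming 32 banks of 4 bytes each."""
    return index % 32


NUM_ROWS, ROW_SIZE, ACCESS_SIZE = 8, 64, 4  # the fp32 example from the text
BITS, BASE, SHIFT = swizzle_params(NUM_ROWS, ROW_SIZE, ACCESS_SIZE)

# Linear index of the first element of each of the 8 rows,
# before and after swizzling.
plain = [row * ROW_SIZE for row in range(NUM_ROWS)]
swizzled = [xor_swizzle(i, BITS, BASE, SHIFT) for i in plain]

plain_banks = [fp32_bank(i) for i in plain]
swizzled_banks = [fp32_bank(i) for i in swizzled]
```

Under these assumptions, plain_banks is all zeros: the first element of every row lands in bank 0, an 8-way conflict. swizzled_banks is 0, 4, 8, ..., 28, so eight 16-byte accesses (four banks each) together cover all 32 banks exactly once.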

make_swizzle[type: DType, mode: TensorMapSwizzle]() -> Swizzle

Returns a swizzle functor for the given swizzle mode.

The supported modes are 32B, 64B, 128B, and none. The swizzle swaps 16-byte vectors, so the vector size must be converted into a number of elements based on the data type.
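The byte-to-element conversion can be sketched in a few lines of Python. This is illustrative arithmetic only: the tables and helper names below (MODE_BYTES, ELEMENT_BYTES, elements_per_vector, vectors_per_span, elements_per_span) are hypothetical and not part of the Mojo API, and the "none" mode is left out because it performs no swapping.

```python
VECTOR_BYTES = 16  # the swizzle swaps 16-byte vectors

# Hypothetical lookup tables for illustration; not the Mojo types.
MODE_BYTES = {"32B": 32, "64B": 64, "128B": 128}
ELEMENT_BYTES = {"float32": 4, "bfloat16": 2, "float16": 2}


def elements_per_vector(dtype: str) -> int:
    """Number of elements of `dtype` in one 16-byte vector."""
    return VECTOR_BYTES // ELEMENT_BYTES[dtype]


def vectors_per_span(mode: str) -> int:
    """Number of 16-byte vectors covered by one swizzle span."""
    return MODE_BYTES[mode] // VECTOR_BYTES


def elements_per_span(mode: str, dtype: str) -> int:
    """Number of elements of `dtype` covered by one swizzle span."""
    return vectors_per_span(mode) * elements_per_vector(dtype)
```

For instance, elements_per_vector("float32") is 4 and elements_per_vector("bfloat16") is 8, matching the access_size values quoted for the first overload, and a 128B span holds 8 vectors, or 64 bf16 elements.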