Mojo function
make_swizzle
make_swizzle[num_rows: Int, row_size: Int, access_size: Int]() -> Swizzle
Creates a 2D swizzle to avoid bank conflicts. The swizzle targets accesses of access_size elements within a num_rows x row_size region of a shared memory tile. num_rows should describe the minimum access pattern. For example, when storing a 16x8 MMA result to a 64x64 tile, the minimum access pattern is an 8x8 sub-matrix, so num_rows = 8 and row_size = 64. The layout is swizzled so that later loads of the data avoid bank conflicts. Those loads are most likely 16 bytes wide, i.e. access_size = 4 for fp32 and 8 for bf16.
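To make the bank-conflict argument concrete, the following is a minimal Python model of a common XOR-based swizzle. It is an illustration only, not the formula that make_swizzle uses internally. The helper names (swizzle_offset, bank_groups) are hypothetical, and the model assumes 32 shared memory banks of 4 bytes each (128 bytes per bank row), power-of-two sizes, and num_rows no larger than the number of vectors per row.

```python
BANK_ROW_BYTES = 128  # assumption: 32 banks x 4 bytes each
VECTOR_BYTES = 16     # the 16B load width mentioned in the text


def swizzle_offset(row, col, num_rows, row_size, access_size):
    """Element offset after XOR-ing the vector index with the row index.

    Hypothetical helper: a generic XOR swizzle, not the library's formula.
    """
    vec = col // access_size   # which access_size-wide vector in the row
    lane = col % access_size   # position inside that vector
    swizzled_vec = vec ^ (row % num_rows)
    return row * row_size + swizzled_vec * access_size + lane


def bank_groups(col, num_rows=8, row_size=64, access_size=4,
                elem_bytes=4, swizzle=True):
    """16B bank group touched by each of num_rows rows at a given column."""
    groups = []
    for row in range(num_rows):
        if swizzle:
            off = swizzle_offset(row, col, num_rows, row_size, access_size)
        else:
            off = row * row_size + col
        byte = off * elem_bytes
        groups.append((byte % BANK_ROW_BYTES) // VECTOR_BYTES)
    return groups


# fp32 example from the text: num_rows = 8, row_size = 64, access_size = 4.
plain = bank_groups(0, swizzle=False)
swizzled = bank_groups(0, swizzle=True)
print(plain)     # every row lands in the same bank group
print(swizzled)  # each row lands in a different bank group
```

Under these assumptions, each fp32 row of 64 elements spans 256 bytes, so without swizzling the same column of every row maps to the same banks. XOR-ing the vector index with the row index spreads the eight rows over eight distinct 16-byte bank groups.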
make_swizzle[type: DType, mode: TensorMapSwizzle]() -> Swizzle
Returns a swizzle functor based on the input swizzle mode.
The supported modes are 32B, 64B, 128B, or none. Note that the swizzle swaps 16B vectors, so the vector size must be converted into a number of elements based on the data type.
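The conversion from 16B vectors to element counts is plain arithmetic, sketched below in Python. The function names and the byte-size table are hypothetical and exist only for this illustration; they are not part of the Mojo API.

```python
VECTOR_BYTES = 16  # the swizzle swaps 16B vectors

# Assumed element sizes in bytes for a few common data types.
DTYPE_BYTES = {"float32": 4, "bfloat16": 2, "float16": 2}


def elements_per_vector(dtype):
    """Number of elements of `dtype` that fit in one 16B vector."""
    return VECTOR_BYTES // DTYPE_BYTES[dtype]


def vectors_per_swizzle_span(mode_bytes):
    """Number of 16B vectors covered by a 32B, 64B, or 128B swizzle span."""
    return mode_bytes // VECTOR_BYTES


for dtype in DTYPE_BYTES:
    print(dtype, elements_per_vector(dtype))

for mode_bytes in (32, 64, 128):
    print(mode_bytes, vectors_per_swizzle_span(mode_bytes))
```

This matches the access sizes quoted for the first overload: a 16B vector holds 4 fp32 elements or 8 bf16 elements.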