For BF16/FP16, element size is 2 bytes.
The swizzle function maps logical indices (m, n) to physical indices.
You can access the following variables:
num_banks, bank_size, BLOCK_M, BLOCK_N, elem_size, group_height, group_width
Here is an example of a more sophisticated swizzle function (you can copy-paste it into the box below):
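function swizzle(m, n) {
  // Number of elements that fill every bank exactly once.
  const elems_full_banks = (num_banks * bank_size) / elem_size;
  // Number of block rows needed to cover all banks (at least 1).
  const rows_full_banks = Math.max(elems_full_banks / BLOCK_N, 1);
  // Number of group_width-wide groups that cover all banks.
  const height_full_banks = elems_full_banks / group_width;
  // XOR pattern changes every rows_full_banks rows and then repeats.
  const xor_pattern = Math.floor(m / rows_full_banks) % (height_full_banks / rows_full_banks);
  // Permute the column in units of group_width; the row is unchanged.
  const new_n = n ^ (xor_pattern * group_width);
  return [m, new_n];
}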
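
To see what the swizzle above does, here is a minimal standalone sketch that can be run outside the page (for example in Node.js). The concrete values (32 banks of 4 bytes, a 16 x 64 block, group_width of 8) are assumptions chosen for illustration, as are the helper names bankOf, distinctBanksInColumn and isRowPermutation. The row-major layout and the bank formula in bankOf are also assumptions and may differ from what the page uses internally.

```javascript
// Assumed example values; on the page these variables are provided for you.
const num_banks = 32;   // number of banks (assumed)
const bank_size = 4;    // bytes per bank (assumed)
const elem_size = 2;    // BF16/FP16
const BLOCK_M = 16;     // rows in the block (assumed)
const BLOCK_N = 64;     // columns in the block (assumed)
const group_width = 8;  // elements per group (assumed)

// The example swizzle from the text, unchanged in behavior.
function swizzle(m, n) {
  const elems_full_banks = (num_banks * bank_size) / elem_size;
  const rows_full_banks = Math.max(elems_full_banks / BLOCK_N, 1);
  const height_full_banks = elems_full_banks / group_width;
  const xor_pattern = Math.floor(m / rows_full_banks) % (height_full_banks / rows_full_banks);
  const new_n = n ^ (xor_pattern * group_width);
  return [m, new_n];
}

// Hypothetical helper: bank of a physical element, assuming a row-major
// layout and consecutive bank_size-byte words mapping to consecutive banks.
function bankOf(m, n) {
  const byteOffset = (m * BLOCK_N + n) * elem_size;
  return Math.floor(byteOffset / bank_size) % num_banks;
}

// Hypothetical helper: how many distinct banks are touched when the first
// `rows` rows of logical column n are read.
function distinctBanksInColumn(n, rows, useSwizzle) {
  const banks = new Set();
  for (let m = 0; m < rows; m++) {
    const [pm, pn] = useSwizzle ? swizzle(m, n) : [m, n];
    banks.add(bankOf(pm, pn));
  }
  return banks.size;
}

// Hypothetical helper: checks that row m is only permuted, i.e. every
// physical column in [0, BLOCK_N) is hit exactly once.
function isRowPermutation(m) {
  const seen = new Set();
  for (let n = 0; n < BLOCK_N; n++) {
    seen.add(swizzle(m, n)[1]);
  }
  return seen.size === BLOCK_N && [...seen].every((c) => c >= 0 && c < BLOCK_N);
}

console.log(distinctBanksInColumn(0, 8, false)); // 1: all 8 rows share one bank
console.log(distinctBanksInColumn(0, 8, true));  // 8: each row lands in its own bank
```

Under these assumed values, one row of 64 two-byte elements spans all 32 banks exactly once, so reading down a column without swizzling hits the same bank every time. The XOR shifts each row by a different multiple of group_width, which spreads the column across 8 banks while keeping every row a permutation of itself.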