Swizzle and Shuffle Weights before calling FlashInfer's Fused MoE Kernel
We discuss how to swizzle and shuffle the expert weights before calling FlashInfer's fused MoE kernel so that they satisfy its memory layout requirements.
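To make the idea of shuffling concrete, here is a minimal pure-Python sketch of reordering the rows of each expert's weight matrix with a permutation. Everything in it is an illustrative assumption: the helper names (`shuffle_rows`, `interleave_halves_perm`, `shuffle_expert_weights`) are hypothetical, and the interleaving permutation is a toy stand-in, not the layout that FlashInfer's kernel requires. The actual permutation and any swizzling of scale factors are defined by the kernel and should be taken from FlashInfer itself.

```python
def shuffle_rows(matrix, perm):
    """Return a new matrix whose row i is a copy of matrix[perm[i]]."""
    if sorted(perm) != list(range(len(matrix))):
        raise ValueError("perm must be a permutation of the row indices")
    return [list(matrix[src]) for src in perm]


def interleave_halves_perm(num_rows):
    """Toy permutation (an assumption, not FlashInfer's layout):
    interleave rows from the top half with rows from the bottom half."""
    if num_rows % 2 != 0:
        raise ValueError("num_rows must be even")
    half = num_rows // 2
    perm = []
    for i in range(half):
        perm.append(i)         # row i of the top half
        perm.append(half + i)  # matching row of the bottom half
    return perm


def shuffle_expert_weights(expert_weights, perm):
    """Apply the same row permutation to every expert's weight matrix."""
    return [shuffle_rows(w, perm) for w in expert_weights]


# Two toy experts, each with a 4x2 weight matrix.
experts = [
    [[0, 1], [2, 3], [4, 5], [6, 7]],
    [[10, 11], [12, 13], [14, 15], [16, 17]],
]
perm = interleave_halves_perm(4)
shuffled = shuffle_expert_weights(experts, perm)
```

The point of the sketch is only that the reordering is done once, offline, on the weights, so that the kernel can read them in the order it expects. The original matrices are left unmodified because each row is copied.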