[MXFP4] Hard code dynamic_mxfp4_quant from aiter.ops.triton.quant #120

Open
Knarf04 wants to merge 1 commit into gpu-mode:main from Knarf04:main
@Knarf04 commented Mar 11, 2026

The `dynamic_mxfp4_quant` kernel in `fp4_utils.py` uses incorrect round-ties-up rounding. The same bug was reported in ROCm/aiter#974 and fixed in `quant.py`, but the fix has not yet reached `fp4_utils.py`; that is tracked in ROCm/aiter#2249.
Until it does, this PR switches the reference kernel's quantization to call the patched `dynamic_mxfp4_quant` from `aiter.ops.triton.quant` explicitly, and shuffles the scales in a separate step with `e8m0_shuffle`.
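
For reviewers who want to see why the tie rule matters, here is a minimal pure-Python sketch of the two rounding modes on the FP4 (E2M1) value grid. It is not the aiter kernel: the function name `quantize_e2m1` and the `ties` parameter are hypothetical, scaling by the shared E8M0 exponent is omitted, and it assumes the intended behaviour is round-to-nearest with ties to the even code, with "ties up" meaning ties go to the larger magnitude.

```python
# Magnitudes representable in FP4 E2M1, indexed by their 3-bit magnitude code.
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def quantize_e2m1(x: float, ties: str = "even") -> float:
    """Round x to the nearest E2M1 value (hypothetical helper, illustration only).

    ties="even": an exact midpoint goes to the neighbour with the even code.
    ties="up":   an exact midpoint goes to the neighbour with the larger magnitude.
    """
    if ties not in ("even", "up"):
        raise ValueError("ties must be 'even' or 'up'")
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), E2M1_GRID[-1])  # saturate at the largest magnitude

    for code in range(len(E2M1_GRID) - 1):
        lo, hi = E2M1_GRID[code], E2M1_GRID[code + 1]
        if lo <= mag <= hi:
            mid = (lo + hi) / 2
            if mag < mid:
                chosen = code
            elif mag > mid:
                chosen = code + 1
            elif ties == "even":
                # Exactly halfway: keep whichever neighbour has an even code.
                chosen = code if code % 2 == 0 else code + 1
            else:
                # Exactly halfway: always take the larger magnitude.
                chosen = code + 1
            return sign * E2M1_GRID[chosen]

    raise ValueError("input must be a finite number")


if __name__ == "__main__":
    # Midpoints between adjacent grid values are where the two modes can differ.
    for value in (0.25, 0.75, 1.25, 1.75, 2.5, 3.5, 5.0):
        print(value, quantize_e2m1(value, "even"), quantize_e2m1(value, "up"))
```

The two modes disagree only on exact midpoints whose lower neighbour has an even code (0.25, 1.25, 2.5 and 5.0 here), and in each of those cases ties-up picks the larger magnitude.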

CC: @msaroufim @danielhua23

@danielhua23 (Contributor) commented

LGTM @msaroufim
