[AMD] Update Quark Quantization Pass for Quark 0.11 and VitisAI LLM Fusion Model Support#2364

Open
poganesh wants to merge 2 commits into microsoft:main from poganesh:npu_fusion_use_ep_v2

@poganesh
Contributor

Describe your changes

  • Updates the QuarkQuantization (torch) pass from the Quark 0.10 API to the Quark 0.11 API
  • Adds full fusion optimization for LLM models where supported
  • Adds token fusion support for models where full fusion is not yet available
  • Adds support for pre-quantized GPT-OSS models
  • Aligns with the MS-AMD 3D release (2/17/26)

@poganesh changed the title from "[AMD] Update QuarkQuantization Pass (torch) for Quark 0.11 and VitisAI LLM Fusion Model Support" to "[AMD] Update Quark Quantization Pass for Quark 0.11 and VitisAI LLM Fusion Model Support" on Mar 22, 2026
@poganesh
Contributor Author

@devang-ml, @xieofxie, could you please help review this PR?
