New Qwen3.6 27b Autoround Quant (int4) Best Recipe

A Reddit user has shared a new best recipe for the Qwen3.6 27B AutoRound int4 quant. The updated version, referred to as "autorund-best", uses more iterations to improve quality and performance when served with vLLM on an RTX 5090.

The new recipe is available on Hugging Face as webhie/Qwen3.6-27B-int4-AutoRound for the model and webhie/Qwen3.6-27B-int4-AutoRound-Code for the calibration dataset.
The reported token generation rate is 130-160 tokens per second (tps) without MTP (multi-token prediction) and 290-320 tps with MTP 3.
For issues with other Qwen models, the poster recommends trying v11 of the chat templates at froggeric/Qwen-Fixed-Chat-Templates.
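
To put the two throughput ranges in perspective, here is a minimal sketch of the implied speedup from enabling MTP. It assumes both ranges were measured under comparable conditions (same prompts, same hardware), which the post does not state; the helper name `speedup_range` is hypothetical and only used for illustration.

```python
def speedup_range(base, boosted):
    """Return the (min, max) speedup implied by two (low, high) throughput ranges."""
    base_lo, base_hi = base
    boost_lo, boost_hi = boosted
    # Worst case: slowest boosted run vs. fastest baseline run.
    # Best case: fastest boosted run vs. slowest baseline run.
    return boost_lo / base_hi, boost_hi / base_lo


# Throughput figures quoted in the post, in tokens per second.
without_mtp = (130, 160)
with_mtp = (290, 320)

lo, hi = speedup_range(without_mtp, with_mtp)
print(f"MTP speedup: {lo:.2f}x to {hi:.2f}x")
```

Under that assumption, the quoted numbers imply roughly a 1.8x to 2.5x gain from MTP; the true figure depends on how well the ranges line up run for run.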
