New Qwen3.6 27b Autoround Quant (int4) Best Recipe

By AI Maestro May 12, 2026 1 min read

A Reddit user has shared a new best recipe for the Qwen3.6 27B AutoRound int4 quant. The updated version, referred to as "autorund-best", uses more iterations to improve quality and performance, and was run with vLLM on an RTX 5090.

  • The new recipe is available on Hugging Face as webhie/Qwen3.6-27B-int4-AutoRound (the model) and webhie/Qwen3.6-27B-int4-AutoRound-Code (the calibration dataset).
  • Reported token generation speed is 130-160 tokens per second (tps) without MTP and 290-320 tps with MTP 3.
  • Users who run into issues with other Qwen models are advised to try v11 of froggeric/Qwen-Fixed-Chat-Templates.
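
To put the reported throughput figures in perspective, here is a minimal Python sketch that works out the speedup implied by the two ranges above. The helper name `speedup_range` is hypothetical, and pairing the range endpoints this way is an assumption: the post does not say which runs produced which numbers, so the result is only a rough bound.

```python
def speedup_range(base, mtp):
    """Return (lowest, highest) speedup implied by two (min, max) tps ranges."""
    base_lo, base_hi = base
    mtp_lo, mtp_hi = mtp
    # Most pessimistic pairing: slowest MTP figure vs. fastest baseline figure.
    lowest = mtp_lo / base_hi
    # Most optimistic pairing: fastest MTP figure vs. slowest baseline figure.
    highest = mtp_hi / base_lo
    return lowest, highest


# Ranges quoted in the post (tokens per second).
baseline_tps = (130, 160)  # without MTP
mtp3_tps = (290, 320)      # with MTP 3

low, high = speedup_range(baseline_tps, mtp3_tps)
print(f"Implied MTP speedup: {low:.2f}x to {high:.2f}x")
# prints "Implied MTP speedup: 1.81x to 2.46x"
```

In other words, under these assumptions the reported numbers suggest that MTP 3 roughly doubles generation speed on this setup.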
