Now that MTP is merged… What’s the best outputs you’re getting on Qwen 3.6 35B on 2x3090s?

By AI Maestro May 17, 2026 1 min read

**What Happened:**
A post on the subreddit r/LocalLLaMA asks for feedback and output comparisons now that multi-token prediction (MTP) support has been merged. The question focuses on Qwen 3.6 35B running on two RTX 3090 GPUs, and on how its outputs compare to a previous 27B model running on the same dual-3090 setup.

**Why It Matters:**
The post reflects ongoing interest in evaluating different model configurations, particularly the gains from moving from a 27B model to a larger 35B model on the same dual-GPU hardware. Such discussions matter to developers and enthusiasts trying to get the best local-inference performance out of consumer GPUs for their use cases.

**Takeaways:**
– MTP (multi-token prediction) is primarily a decoding-speed feature, so the expected gain is faster token generation rather than a wholesale change in output quality.
– Users are keen to share how the new setup performs compared to previous iterations on the same hardware.
– Side-by-side comparisons between configurations are still needed to establish which offers the best balance of speed and capability.
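For readers running their own comparisons, the raw material is just wall-clock timings. A minimal sketch of turning timed generations into comparable tokens-per-second figures (the numbers below are illustrative placeholders, not measured results from any of the models discussed):

```python
# Hypothetical helpers for comparing model configurations by decode throughput.
# All numeric values are illustrative placeholders, not real benchmark results.

def tokens_per_second(generated_tokens: int, elapsed_seconds: float) -> float:
    """Decode throughput: tokens generated divided by wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return generated_tokens / elapsed_seconds

def speedup(baseline_tps: float, candidate_tps: float) -> float:
    """Relative speedup of a candidate run over a baseline (1.0 = no change)."""
    return candidate_tps / baseline_tps

# Example: compare a baseline run against an MTP-enabled run of the same prompt.
baseline = tokens_per_second(512, 20.0)   # 25.6 tok/s
with_mtp = tokens_per_second(512, 12.8)   # 40.0 tok/s
print(f"speedup: {speedup(baseline, with_mtp):.2f}x")  # prints "speedup: 1.56x"
```

Measuring both runs with the same prompt and generation length keeps the comparison fair; prompt-processing time should be excluded or reported separately from decode throughput.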


Originally published at reddit.com. Curated by AI Maestro.
