Now that MTP is merged... What's the best outputs you're getting on Qwen 3.6 35B on 2x3090s?

**What Happened:**
A thread titled “Now that MTP is merged… What’s the best outputs you’re getting on Qwen 3.6 35B on 2x3090s?” was started in the r/LocalLLaMA subreddit. The post asks what output quality and speed people are getting from Qwen 3.6 35B on two NVIDIA RTX 3090 GPUs now that MTP support has been merged. MTP here most likely refers to multi-token prediction, a decoding feature rather than an earlier model, so the comparison is between results before and after the merge.

**Why It Matters:**
The discussion shows how quickly local inference setups for models like Qwen change. Comparing results before and after the MTP merge on the same hardware lets users see whether the feature improves generation speed without hurting output quality. The thread also shows how much the community relies on shared benchmarks and configurations to find settings that work well on consumer GPUs.

– **Users are seeking insights into the impact of MTP merging on output quality and speed.**
– **There is interest in how this affects optimal configuration, such as layer splits across the two GPUs and throughput figures like prompt-processing (pp) speed in tokens per second.**
– **The discussion emphasizes the need for continued testing and feedback to refine AI models for better performance and utility.**
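
To make the configuration terms above concrete, here is a minimal Python sketch of two calculations such a comparison involves: dividing a model's layers between two GPUs and converting a timed run into tokens per second. It is not taken from the thread. The function names, the 40-layer count, and the timing numbers are hypothetical placeholders, not measured values or the real layer count of Qwen 3.6 35B.

```python
def split_layers(n_layers: int, weights: list[float]) -> list[int]:
    """Assign whole layers to devices in proportion to `weights`.

    Uses largest-remainder rounding so the counts always sum to n_layers.
    `weights` could be free VRAM per GPU; equal weights give an even split.
    """
    if n_layers < 0 or not weights or any(w <= 0 for w in weights):
        raise ValueError("need a non-negative layer count and positive weights")
    total = sum(weights)
    exact = [n_layers * w / total for w in weights]
    counts = [int(x) for x in exact]  # round each share down first
    leftover = n_layers - sum(counts)
    # Give the remaining layers to the devices with the largest fractional share.
    order = sorted(range(len(weights)),
                   key=lambda i: exact[i] - counts[i],
                   reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


def tokens_per_second(n_tokens: int, seconds: float) -> float:
    """Throughput for a run that handled n_tokens in `seconds` of wall time."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return n_tokens / seconds


if __name__ == "__main__":
    # Hypothetical example: 40 layers over two identical 24 GB cards.
    print(split_layers(40, [24, 24]))
    # Hypothetical timing: 512 prompt tokens processed in 2.0 seconds.
    print(tokens_per_second(512, 2.0))
```

Reporting prompt-processing and generation throughput separately, together with the split used, makes results from different posters easier to compare.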

