Best Local LLMs – Apr 2026
We’re back with another Best Local LLMs Megathread!
Since our last thread, we have enjoyed the much-anticipated releases of the Qwen3.5 and Gemma4 series. On top of that, GLM-5.1 is boasting SOTA-level performance, Minimax-M2.7 is the accessible "Sonnet at home", and the PrismML Bonsai 1-bit models actually work.
What You Are Running Right Now
Share what you are running right now and why.
The standard spiel:
- Only open weights models
Please thread your responses under the top-level comment for each Application below to keep the thread readable.
Applications
- General: Includes practical guidance, how-to guides, encyclopedic Q&A, search engine replacements or augmentations.
- Agentic/Agentic Coding/Tool Use/Coding
- Creative Writing/RP
- Speciality
If a category is missing, please create a top-level comment under the Speciality comment.
Notes
A useful breakdown of how people are using LLMs: https://preview.redd.it/i8td7u8vcewf1.png?width=1090&format=png&auto=webp&s=423fd3fe4cea2b9d78944e521ba8a39794f37c8d
Bonus points
For additional insight, please break down/classify your recommendation by model memory footprint:
- Unlimited: >128GB VRAM
- XL: 64 to 128GB VRAM
- L: 32 to 64GB VRAM
- M: 8 to 32GB VRAM
- S: <8GB VRAM
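If you want to tag a recommendation consistently, the tiers above can be sketched as a small lookup. This is only an illustrative helper, not part of the thread rules: the function name `vram_tier` is made up, and because the list does not say which tier the shared boundaries (32GB and 64GB) belong to, the sketch assumes a boundary value falls into the smaller tier.

```python
def vram_tier(vram_gb: float) -> str:
    """Map a model's memory footprint in GB of VRAM to a tier label.

    Assumption (not stated in the thread): a value sitting exactly on a
    shared boundary (32 or 64) is assigned to the smaller tier.
    """
    if vram_gb < 0:
        raise ValueError("VRAM footprint must be non-negative")
    if vram_gb < 8:
        return "S"          # S: <8GB
    if vram_gb <= 32:
        return "M"          # M: 8 to 32GB
    if vram_gb <= 64:
        return "L"          # L: 32 to 64GB
    if vram_gb <= 128:
        return "XL"         # XL: 64 to 128GB
    return "Unlimited"      # Unlimited: >128GB


if __name__ == "__main__":
    # Hypothetical footprints, for illustration only.
    for gb in (6, 24, 48, 96, 192):
        print(f"{gb}GB -> {vram_tier(gb)}")
```

For example, a model that fits on a single 24GB card would be tagged M under these assumptions.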
Key Takeaways
- The landscape of local LLMs continues to evolve with new models like Qwen3.5 and Gemma4.
- GLM-5.1 is showing SOTA-level performance.
- Minimax-M2.7 is the accessible "Sonnet at home".
- The variety of models available across different memory footprints continues to cater to diverse use cases and resource constraints.
Let’s hear from you – what models are you running, and why?
Share your experiences with these new LLMs in the comments below.
Originally published at reddit.com. Curated by AI Maestro.