Best Local LLMs - Apr 2026

We’re back with another Best Local LLMs Megathread!

Since our last thread, we’ve enjoyed the much-anticipated release of the Qwen3.5 and Gemma4 series. Additionally, GLM-5.1 is boasting SOTA-level performance, Minimax-M2.7 is the accessible “Sonnet at home”, and PrismML Bonsai 1-bit models actually work.

What You Are Running Right Now

Share what you are running right now and why.

The standard spiel:

Only open weights models

Please thread your responses under the top-level comment for each Application below to keep things readable.

Applications

General: Includes practical guidance, how-to guides, encyclopedic Q&A, search engine replacements or augmentations.
Agentic/Agentic Coding/Tool Use/Coding
Creative Writing/RP
Speciality

If a category is missing, please create a top-level comment under the Speciality comment.

Notes

A useful breakdown of how people are using LLMs: https://preview.redd.it/i8td7u8vcewf1.png?width=1090&format=png&auto=webp&s=423fd3fe4cea2b9d78944e521ba8a39794f37c8d

Bonus points

For additional insight, please break down/classify your recommendation by model memory footprint:

Unlimited: >128GB VRAM
XL: 64 to 128GB VRAM
L: 32 to 64GB VRAM
M: 8 to 32GB VRAM
S: <8GB VRAM
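
If you want to tag recommendations consistently, the tiers above can be sketched as a small lookup. This is only an illustration: the function name `footprint_tier` is made up, and because the listed ranges share their edges (64GB appears in both XL and L), the sketch assumes a shared boundary value falls into the smaller tier.

```python
def footprint_tier(vram_gb: float) -> str:
    """Map a model's memory footprint (GB of VRAM) to a size tier.

    Assumption: where two tiers share an edge (32 or 64), the value
    is assigned to the smaller tier. S is strictly below 8 and
    Unlimited is strictly above 128, as listed in the thread.
    """
    if vram_gb < 0:
        raise ValueError("memory footprint cannot be negative")
    if vram_gb > 128:
        return "Unlimited"
    if vram_gb > 64:
        return "XL"
    if vram_gb > 32:
        return "L"
    if vram_gb >= 8:
        return "M"
    return "S"
```

For example, a model that needs about 24GB would be tagged M, and one that needs 96GB would be tagged XL.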

Key Takeaways

The landscape of local LLMs continues to evolve with new models like Qwen3.5 and Gemma4.
GLM-5.1 is boasting SOTA-level performance, making it a standout in the field.
Minimax-M2.7 is being described as the accessible “Sonnet at home”.
The variety of models available across different memory footprints continues to cater to diverse use cases and resource constraints.

Let’s hear from you – what models are you running, and why?
