we really all are going to make it, aren’t we? 2×3090 setup.


By AI Maestro May 14, 2026 1 min read

I recently came across a post on r/LocalLLaMA in which the writer describes running a large language model locally on a 2×3090 GPU setup. The key points are that the setup works and that it performs well.

  • The writer reports prompt-processing throughput of over 4000 tokens per second (PP/s) and generation speed of 113 tokens per second (TK/s) without NVLink, and claims that adding NVLink would make it faster still.
  • The model is Qwen 3.6 with 27 billion parameters, running on 48 GB of VRAM. The writer describes it as “almost-sonnet level” and much faster than cloud-based models.
  • The writer is now working on having the local model handle SSH sessions on Linux machines, a practical use of the setup.
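To put the reported numbers in perspective, here is a rough back-of-the-envelope sketch. It assumes both rates are tokens per second, that prompt processing and generation simply add up, and that model size is just parameters times bits per weight. The prompt and output lengths and the bit widths are illustrative values, not figures from the post, which does not state which quantization was used.

```python
def request_seconds(prompt_tokens: int, output_tokens: int,
                    pp_per_s: float = 4000.0, tk_per_s: float = 113.0) -> float:
    """Rough wall-clock time: prompt processing plus token generation.

    The default rates are the figures reported in the post; everything
    else about a real request (batching, warm-up, sampling) is ignored.
    """
    return prompt_tokens / pp_per_s + output_tokens / tk_per_s


def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10**9 bytes).

    Ignores the KV cache, activations, and runtime overhead, so the real
    memory footprint is larger than this number.
    """
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    # Hypothetical request: 8000-token prompt, 500-token reply.
    print(f"{request_seconds(8000, 500):.1f} s")  # about 6.4 s

    # 27B parameters at a few illustrative bit widths.
    for bits in (16, 8, 4):
        print(f"{bits}-bit: {weights_gb(27, bits):.1f} GB")
```

Under these assumptions, 16-bit weights alone come to 54 GB, which is more than the 48 GB available, while 8-bit (27 GB) and 4-bit (13.5 GB) weights leave room for context.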

For the writer, this points to a promising future for local AI setups as models become more efficient. The post also raises the possibility that smaller models could reach frontier-class intelligence in the near term.


### Takeaways
– Local AI setups can now reach high throughput (over 4000 PP/s and 113 TK/s in this post) without relying on cloud services.
– Smaller models like Qwen 3.6 are described as offering “almost-sonnet level” performance, which previously required larger cloud-hosted models.
– Using a local model for tasks such as handling SSH sessions is becoming practical.
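The post does not say how the writer wires the model into SSH, so the following is only a minimal sketch of one cautious way to do it: the model proposes a command, and a small wrapper checks it against an allowlist before running it over `ssh`. The names `ALLOWED_COMMANDS`, `build_ssh_argv`, and `run_remote` are hypothetical, as is the choice of allowed commands.

```python
import shlex
import subprocess

# Hypothetical allowlist: read-only commands the model is permitted to run.
ALLOWED_COMMANDS = {"uptime", "df", "free", "uname"}


def build_ssh_argv(host: str, remote_command: str) -> list[str]:
    """Return the argv for a non-interactive ssh call, or raise ValueError."""
    # Reject empty hosts and anything ssh could mistake for an option.
    if not host or host.startswith("-"):
        raise ValueError(f"suspicious host: {host!r}")
    parts = shlex.split(remote_command)
    if not parts or parts[0] not in ALLOWED_COMMANDS:
        raise ValueError(f"command not allowed: {remote_command!r}")
    # shlex.join re-quotes every token, so shell metacharacters such as ';'
    # reach the remote side as literal arguments, not as extra commands.
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", host, shlex.join(parts)]


def run_remote(host: str, remote_command: str, timeout_s: float = 30.0) -> str:
    """Run an allowlisted command on the host and return its combined output."""
    argv = build_ssh_argv(host, remote_command)
    result = subprocess.run(argv, capture_output=True, text=True,
                            timeout=timeout_s, check=False)
    return result.stdout + result.stderr
```

Keeping validation in `build_ssh_argv` separate from execution means the allowlist logic can be tested without a reachable host, and the model never gets an interactive shell.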
