we really all are going to make it, aren’t we? 2×3090 setup.

By AI Maestro May 14, 2026 1 min read

Reddit user /u/RedShiftedTime shared their excitement about a new setup that allows for running large language models (LLMs) locally, bypassing the need for cloud services. They highlighted improvements in performance and efficiency with local setups compared to previous experiences.

  • The user ran an LLM on a dual-boot machine with Ubuntu installed, alongside an existing WSL2 environment.
  • They reported prompt processing speed (PP/s) rising from around 400 to over 4,000 tokens per second. Generation speed (TK/s, most likely tokens per second rather than a tool-call rate) rose from approximately 113 to a higher figure that this summary does not specify.
  • These gains came without NVLink, which the user had previously expected to improve performance further.
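
To put the prompt processing figures in perspective, here is a minimal Python sketch of the arithmetic. It uses the approximate before and after rates reported above; the prompt length is a hypothetical value chosen only for illustration, and the helper names are not from the original post.

```python
def speedup(before_tps: float, after_tps: float) -> float:
    """Return how many times faster the new throughput is."""
    return after_tps / before_tps


def prompt_seconds(prompt_tokens: int, pp_tps: float) -> float:
    """Seconds spent processing a prompt at a given tokens-per-second rate."""
    return prompt_tokens / pp_tps


# Approximate prompt processing rates reported in the summary (tokens/s).
PP_BEFORE = 400.0
PP_AFTER = 4000.0  # reported as "over 4,000", so this is a lower bound

# Hypothetical prompt length, assumed here only for illustration.
PROMPT_TOKENS = 32_000

print(f"speedup: {speedup(PP_BEFORE, PP_AFTER):.1f}x")
print(f"before:  {prompt_seconds(PROMPT_TOKENS, PP_BEFORE):.0f} s")
print(f"after:   {prompt_seconds(PROMPT_TOKENS, PP_AFTER):.0f} s")
```

Under these assumptions, a prompt that took about 80 seconds to process would take about 8 seconds, a speedup of at least 10x.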

The user sees this as a major step toward making LLMs accessible and fast without relying on cloud infrastructure, and is optimistic that smaller models could reach frontier-level intelligence within the next year.

### Takeaways
- Local setups can deliver large performance gains: here, prompt processing rose from around 400 to over 4,000 tokens per second.
- The user expects that smaller language models may reach frontier-class intelligence within the next 12 months.
- Faster local inference makes LLMs more accessible to users who do not want to rely on cloud infrastructure.
