I have been making a lot of updates to my project, and I wanted to share them here.
TextGen (previously text-generation-webui, also known as my username oobabooga or ooba) has been in development since December 2022, before LLaMa and llama.cpp existed.
In the last two months, the project has evolved from a web UI to a no-install desktop app for Windows, Linux, and macOS with a polished UI. I have created a very minimal and elegant Electron integration for that. (Did you know LM Studio is also a web UI running over Electron? Not sure many people know that.)
You download a portable build from the releases page
Unzip it
Double-click textgen
A window appears
There is no installation, and no files are ever created outside the extracted folder. It’s fully self-contained. All your chat histories and settings are stored in a user_data folder shipped with the build.
There are builds for CUDA, Vulkan, CPU-only, Mac (Apple Silicon and Intel), and ROCm.
Some differentiating features:
Full privacy. Unlike LM Studio, it doesn’t phone home on every launch with your OS, CPU architecture, app version, and inference backend choices. Zero outbound requests.
ik_llama.cpp builds (LM Studio and Ollama only ship vanilla llama.cpp). ik_llama.cpp has new quant types like IQ4_KS and IQ5_KS with SOTA quantization accuracy.
Built-in web search via the ddgs Python library, either through tool-calling with the built-in web_search tool (works flawlessly with Qwen 3.6 and Gemma 4), or through an “Activate web search” checkbox that fetches search results as text attachments.
Tool-calling support through 3 options: single-file .py tools (very easy to create your own custom functions), HTTP MCP servers, and stdio MCP servers. You can enable confirmations so that each tool call shows up with approve/reject buttons before it executes. I have written a guide here.
The ability to create custom characters for casual chats, in addition to regular instruction-following conversations:
OpenAI and Anthropic compliant API with very strict spec compliance. It works with Claude Code: you can load a model and run ANTHROPIC_BASE_URL=http://127.0.0.1:5000 claude and it will work.
Accurate PDF text extraction using the PyMuPDF Python library.
trafilatura for web page fetching, which strips navigation and boilerplate from pages, saving a lot of tokens on agentic tool loops.
Chat templates are rendered through Python’s Jinja2 library, which works for templates where llama.cpp’s C++ reimplementation of jinja sometimes crashes.
I write this as a passion project/hobby. It’s free and open source (AGPLv3) as always:
– TextGen has evolved from a web UI to a no-install desktop app for Windows, Linux, and macOS.
– The project now supports CUDA, Vulkan, CPU-only, Mac (Apple Silicon and Intel), and ROCm builds.
– It includes built-in support for tool-calling via HTTP MCP servers, single-file .py tools, and stdio MCP servers with confirmations.
– TextGen maintains full privacy by not sending data to a server on every launch.
Originally published at reddit.com. Curated by AI Maestro.
Stay ahead of AI. Get the most important stories delivered to your inbox — no spam, no noise.
Manage Consent
To provide the best experiences, we use technologies like cookies to store and/or access device information. Consenting to these technologies will allow us to process data such as browsing behaviour or unique IDs on this site. Not consenting or withdrawing consent, may adversely affect certain features and functions.
Functional
Always active
The technical storage or access is strictly necessary for the legitimate purpose of enabling the use of a specific service explicitly requested by the subscriber or user, or for the sole purpose of carrying out the transmission of a communication over an electronic communications network.
Preferences
The technical storage or access is necessary for the legitimate purpose of storing preferences that are not requested by the subscriber or user.
Statistics
The technical storage or access that is used exclusively for statistical purposes.The technical storage or access that is used exclusively for anonymous statistical purposes. Without a subpoena, voluntary compliance on the part of your Internet Service Provider, or additional records from a third party, information stored or retrieved for this purpose alone cannot usually be used to identify you.
Marketing
The technical storage or access is required to create user profiles to send advertising, or to track the user on a website or across several websites for similar marketing purposes.