oobabooga/textgen
A fully local LLM runner that installs on your own machine and gets out of the way. Load models from the major open-weight families, run text generation, process images, and invoke tools, all without a single API call leaving your network. The OpenAI and Anthropic-compatible API layer means your existing code, prompts, and middleware connect with minimal changes. No cloud subscription, no usage metering, no data leaving the building.
For founders building on top of open-weight models, this removes the serving infrastructure problem entirely during development and prototyping. You get a web UI for quick experimentation alongside the API for programmatic use, which is a practical combination when you are moving fast and switching models often.
The honest reservation is that hardware requirements are real. Serious models need serious VRAM, and setup can still get fiddly depending on your GPU and operating system. Budget time for that.
-> Best for: privacy-conscious builders and indie hackers prototyping LLM features who want local inference without rolling their own serving stack.