Get up and running with Kimi-K2.6, GLM-5.1, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.
A high-throughput and memory-efficient inference and serving engine for LLMs
SOTA Open Source TTS
LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.
Tools for merging pretrained large language models.
Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/
Maid is a free and open-source application for interfacing with llama.cpp models locally, and with Anthropic, DeepSeek, Ollama, Mistral and OpenAI models remotely.
A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device. New: Code Llama support!
aider is AI pair programming in your terminal
Go with your own intelligence - Go applications that directly integrate llama.cpp for local inference using hardware acceleration.
Interface for OuteTTS models.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
MinimalChat is a lightweight, open-source chat application that allows you to interact with various large language models.
A font for writing tiny stories
llama.go is like llama.cpp in pure Golang!
A simple GitHub Actions script that builds a llamafile and uploads it to Hugging Face