amd-gaia
- Ebuilds: 3, Testing: 0.19.0 Description:
GAIA is AMD's open-source agent framework for local AI agents on
Ryzen AI hardware (NPU + iGPU). It orchestrates LLM-driven workflows
over any OpenAI-compatible inference endpoint, with built-in
integrations for Docker, Jira, code-search, RAG, MCP servers, and
Whisper / Kokoro voice pipelines. The reference local backend is
Lemonade Server (sci-ml/lemonade); GAIA itself is hardware-agnostic
so long as the upstream LLM API is OpenAI-compatible.
Homepage:https://github.com/amd/gaia License: MIT
bitnet
- Ebuilds: 1, Snapshot: 9999 Description:
BitNet is the official inference framework for 1-bit LLMs (BitNet b1.58).
It provides optimized kernels for fast and lossless inference of 1.58-bit
quantized models on CPU, with architecture-specific optimizations for
x86 (TL2) and ARM (TL1) platforms.
Homepage:https://github.com/microsoft/BitNet License: MIT
clearml
- Ebuilds: 1, Testing: 1.18.0 Description: Auto-Magical CI/CD to streamline your AI workload
Homepage:https://clear.ml/docs License: Apache-2.0
comfyui
- Ebuilds: 4, Snapshot: 9999 Description: ComfyUI - The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
Homepage:https://github.com/comfyanonymous/ComfyUI License: GPL-3.0
fastflowlm
- Ebuilds: 5, Testing: 0.9.43-r1, Snapshot: 9999 Description:
FastFlowLM (FLM) is a lightweight LLM inference runtime purpose-built
for AMD Ryzen AI NPUs (XDNA2 architecture). It provides an Ollama-style
CLI and OpenAI-compatible server API for running language models entirely
on the NPU with no GPU or CPU compute required.
Supported hardware: Ryzen AI 300-series (Strix Point, Strix Halo),
400-series (Gorgon Point), and Z2 Extreme. XDNA1 (Ryzen AI 7000/8000)
is NOT supported.
The orchestration code and CLI are MIT-licensed. NPU compute kernels
(xclbins) are proprietary binaries, free for commercial use under
$10M annual company revenue.
Homepage:
https://fastflowlm.com/
https://github.com/FastFlowLM/FastFlowLM
License: MIT FastFlowLM-Binary
gaia
- Ebuilds: 2, Testing: 0.17.6-r1 Description:
AMD Gaia is an AI agent framework that provides tools for building
and deploying AI agents. It includes support for various AI models
and frameworks, with integration for AMD hardware acceleration.
Homepage:https://github.com/amd/gaia License: MIT
kokoros
- Ebuilds: 1, Snapshot: 9999 Description:
Kokoros is a Rust implementation of the Kokoro-82M text-to-speech
model. Provides the `koko` CLI and an OpenAI-compatible HTTP server
used as the kokoro:cpu backend by sci-ml/lemonade.
Tracks upstream lucasjinreal/Kokoros directly. The lemonade-sdk
fork only diverges in CI infrastructure plus a bundled espeak-ng-data
copy that ::gentoo already provides via app-accessibility/espeak-ng,
so source-build users get the same binary either way.
Runtime model files (kokoro-v1.0.onnx + voices-v1.0.bin) are not
bundled — see pkg_postinst for a quick fetch recipe.
Homepage:https://github.com/lucasjinreal/Kokoros License: Apache-2.0
lemonade (ambiguous, available in 2 overlays)
- Ebuilds: 11, Testing: 10.6.0, Snapshot: 9999 Description:
Lemonade is a local LLM inference platform built on the Lemonade SDK.
It provides a server daemon (lemond), a command-line client, and an
optional web interface. Supports multiple backends including CUDA,
OpenCL, and Vulkan for efficient model deployment.
Homepage:https://github.com/lemonade-sdk/lemonade License: Apache-2.0
ollama (ambiguous, available in 7 overlays)
- Ebuilds: 17, Testing: 0.23.3, 0.17.7, Snapshot: 9999 Description: Get up and running with Llama 3, Mistral, Gemma, and other language models.
Homepage:https://ollama.com License: MIT
ollama-amd
- Ebuilds: 7, Testing: 0.17.7 Description: Ollama - get up and running with large language models - additional amd package
Homepage:https://ollama.com/ License: MIT
ollama-bin (ambiguous, available in 2 overlays)
- Ebuilds: 5, Testing: 0.24.0, Snapshot: 9999 Description:
Ollama is a tool for running large language models (LLMs) locally on your
machine. It provides a simple interface to download, run, and manage models
like Llama 3.2, Mistral, Gemma, and many others.
This is a binary distribution package that installs pre-built binaries from
the official Ollama releases. The binaries are provided under the MIT license
and include GPU acceleration support for both NVIDIA (CUDA) and AMD (ROCm)
graphics cards.
Key features:
- Easy model management with pull, push, and create commands
- Built-in API server for programmatic access
- GPU acceleration support (CUDA and ROCm)
- Efficient memory management with automatic model loading/unloading
- Support for multiple models and concurrent requests
- Compatible with OpenAI API format
Models are stored in /var/lib/ollama and can range from 2GB (3B parameters)
to 40GB+ (70B parameters) in size. GPU acceleration significantly improves
inference speed but requires compatible hardware.
Security Note: This package installs pre-compiled binaries. Security
hardening features (ASLR, PIE, stack protections) depend on upstream's
build configuration. The service runs as a dedicated 'ollama' user with
restricted permissions for defense in depth.
Homepage:https://ollama.com/ License: MIT
pyannote-audio
- Ebuilds: 1, Testing: 4.0.4 Description:
pyannote.audio is a deep-learning toolkit for speaker diarization,
voice activity detection, overlapped speech detection, and speaker
embedding. The Python package alone does not include any pretrained
models; running pretrained pipelines such as
pyannote/speaker-diarization-3.1 requires accepting the model terms
on HuggingFace and authenticating with a HuggingFace token.
Homepage:
https://github.com/pyannote/pyannote-audio
https://pyannote.github.io/
https://pypi.org/project/pyannote-audio/
License: MIT
sherpa-onnx
- Ebuilds: 1, Testing: 1.13.2 Description:
sherpa-onnx is a speech-stack toolkit from the k2-fsa project:
speech-to-text, text-to-speech, speaker diarization, voice activity
detection, source separation, and keyword spotting, all running on
ONNX Runtime (no PyTorch dependency).
Source build against system sci-libs/onnxruntime. For the prebuilt
-bin alternative (faster install, ships upstream's manylinux wheels)
see sci-ml/sherpa-onnx-bin.
The CMake build vendors a dozen small deps (eigen, asio, cargs, json,
kaldi-{decoder,native-fbank,fst}, openfst, kissfft, simple-sentencepiece,
hclust-cpp, optionally espeak-ng + piper-phonemize + portaudio +
websocketpp + pybind11) via FetchContent. The ebuild pre-fetches them
all via SRC_URI and stages into ${S} for the cmake fallback paths;
no network access during build.
Runtime model files for each task (ASR, diarization, TTS, etc.) live
upstream — see https://k2-fsa.github.io/sherpa/onnx/pretrained_models/
Homepage:
https://k2-fsa.github.io/sherpa/onnx/
https://github.com/k2-fsa/sherpa-onnx
License: Apache-2.0
sherpa-onnx-bin
- Ebuilds: 1, Testing: 1.13.2 Description:
sherpa-onnx is a speech-stack toolkit from the k2-fsa project:
speech-to-text, text-to-speech, speaker diarization, voice activity
detection, source separation, and keyword spotting, all running on
ONNX Runtime (no PyTorch dependency). Suited to CPU-only deployment
and embedded targets.
This -bin ebuild ships upstream's manylinux wheels (sherpa-onnx-core
for the C++ shared libraries plus a per-CPython-ABI wheel for the
Python bindings). Runtime model files are not bundled — see the
post-install message for download pointers.
Homepage:
https://k2-fsa.github.io/sherpa/onnx/
https://github.com/k2-fsa/sherpa-onnx
https://pypi.org/project/sherpa-onnx/
License: Apache-2.0
tensorzero
- Ebuilds: 4, Testing: 2026.5.0, Snapshot: 9999 Description:
TensorZero is an open-source stack for industrial-grade LLM applications.
Gateway: access every LLM provider through a unified API low latency
Observability: monitor your LLM systems, programmatically or with a UI
Optimization: optimize your prompts, models, and inference strategies
Evaluations: benchmark individual inferences or end-to-end workflows
Experimentation: deploy with built-in A/B testing, fallbacks, etc.
Homepage:https://www.tensorzero.com/ License: Apache-2.0
tokenizers (ambiguous, available in 2 overlays)
- Ebuilds: 6, Testing: 0.23.1 Description: Implementation of today's most used tokenizers
Homepage:https://github.com/huggingface/tokenizers License: Apache-2.0
Apache-2.0 Apache-2.0-with-LLVM-exceptions BSD-2 BSD ISC MIT MPL-2.0
Unicode-DFS-2016