bitnet
- Ebuilds: 1
Description:
BitNet is the official inference framework for 1-bit LLMs (BitNet b1.58).
It provides optimized kernels for fast and lossless inference of 1.58-bit
quantized models on CPU, with architecture-specific optimizations for
x86 (TL2) and ARM (TL1) platforms.
Homepage: https://github.com/microsoft/BitNet License: MIT
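The "1.58-bit" in BitNet b1.58 comes from its ternary weights {-1, 0, +1}: a three-valued weight carries log2(3) ≈ 1.58 bits of information. A minimal sketch of that arithmetic (the packing scheme below is purely illustrative, not BitNet's actual storage format):

```python
import math

# Each BitNet b1.58 weight takes one of three values: -1, 0, or +1.
# The information content of a ternary weight is log2(3) bits,
# which is where the "1.58-bit" name comes from.
bits_per_weight = math.log2(3)
print(f"{bits_per_weight:.2f} bits per weight")  # 1.58

# Illustrative packing (an assumption, not BitNet's real on-disk format):
# five ternary weights fit in one byte, since 3**5 = 243 <= 256.
weights = [-1, 0, 1, 1, -1]
packed = 0
for w in weights:
    packed = packed * 3 + (w + 1)  # map {-1, 0, +1} -> {0, 1, 2}
print(packed)  # 51
```

Five weights per byte gives 8/5 = 1.6 stored bits per weight, close to the 1.58-bit information-theoretic floor.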
caffe2 (ambiguous, available in 2 overlays)
- Ebuilds: 3, Testing: 2.11.0-r90 Description: A deep learning framework
Homepage: https://pytorch.org/ License: BSD
clearml
- Ebuilds: 1, Testing: 1.18.0 Description: Auto-Magical CI/CD to streamline your AI workload
Homepage: https://clear.ml/docs License: Apache-2.0
comfyui
- Ebuilds: 4, Testing: 9999 Description: ComfyUI - The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Homepage: https://github.com/comfyanonymous/ComfyUI License: GPL-3.0
fastflowlm
- Ebuilds: 4, Testing: 0.9.41 Description:
FastFlowLM (FLM) is a lightweight LLM inference runtime purpose-built
for AMD Ryzen AI NPUs (XDNA2 architecture). It provides an Ollama-style
CLI and OpenAI-compatible server API for running language models entirely
on the NPU with no GPU or CPU compute required.
Supported hardware: Ryzen AI 300-series (Strix Point, Strix Halo),
400-series (Gorgon Point), and Z2 Extreme. XDNA1 (Ryzen AI 7000/8000)
is NOT supported.
The orchestration code and CLI are MIT-licensed. The NPU compute
kernels (xclbins) are proprietary binaries, free for commercial use
by companies with under $10M in annual revenue.
Homepage:
https://www.fastflowlm.com/
https://github.com/FastFlowLM/FastFlowLM
License: MIT FastFlowLM-Binary
fastprogress
- Ebuilds: 1, Testing: 1.0.3 Description: Simple and flexible progress bar for Jupyter Notebook and console
Homepage: https://fastprogress.fast.ai/ License: Apache-2.0
foxi
- Ebuilds: 1, Stable: 2021.05.27, Testing: 2021.05.27 Description: ONNXIFI with Facebook Extension
Homepage: https://github.com/houseroad/foxi/ License: MIT
kokoros
- Ebuilds: 1
Description:
Kokoros is a Rust implementation of the Kokoro-82M text-to-speech
model. Provides the `koko` CLI and an OpenAI-compatible HTTP server
used as the kokoro:cpu backend by sci-ml/lemonade.
Tracks upstream lucasjinreal/Kokoros directly. The lemonade-sdk
fork only diverges in CI infrastructure plus a bundled espeak-ng-data
copy that ::gentoo already provides via app-accessibility/espeak-ng,
so source-build users get the same binary either way.
Runtime model files (kokoro-v1.0.onnx + voices-v1.0.bin) are not
bundled — see pkg_postinst for a quick fetch recipe.
Homepage: https://github.com/lucasjinreal/Kokoros License: Apache-2.0
lemonade
- Ebuilds: 2, Testing: 10.3.0 Description:
Lemonade is a local AI server that exposes optimized LLMs through
OpenAI / Anthropic / Ollama compatible APIs, running inference on
AMD NPU and GPU. The C++ server core (lemond, lemonade-server) is
packaged here without the Tauri desktop wrapper or the bundled
web frontend; both are CMake-toggleable and can be added behind
USE flags later if needed.
Pairs with sci-ml/fastflowlm to drive the AMD Ryzen AI XDNA2 NPU
backend.
Homepage:
https://lemonade-server.ai/
https://github.com/lemonade-sdk/lemonade
License: Apache-2.0
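Because lemonade-server exposes an OpenAI-compatible API, any standard chat-completions client can talk to it. A hedged stdlib-only sketch: the host, port, `/api/v1` path, and model name below are assumptions for illustration, not values documented in this entry; the request is built but not sent, since sending needs a running server.

```python
import json
import urllib.request

# Assumed local endpoint for a running lemonade-server instance;
# host, port, and path prefix are illustrative assumptions.
BASE_URL = "http://localhost:8000/api/v1"

payload = {
    "model": "some-local-model",  # placeholder; use a model the server has loaded
    "messages": [
        {"role": "user", "content": "Hello from the NPU!"},
    ],
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Sending requires a running lemonade-server; uncomment to try:
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)
#     print(reply["choices"][0]["message"]["content"])
print(req.get_method(), req.full_url)
```

The same request shape works against any OpenAI-compatible backend, which is what lets lemonade drive the fastflowlm NPU backend behind a standard client.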
ollama (ambiguous, available in 7 overlays)
- Ebuilds: 17, Testing: 9999, 0.17.7 Description: Get up and running with Llama 3, Mistral, Gemma, and other language models.
Homepage: https://ollama.com License: MIT
ollama-amd
- Ebuilds: 7, Testing: 0.17.7 Description: Ollama - get up and running with large language models - additional AMD package
Homepage: https://ollama.com/ License: MIT
ollama-bin (ambiguous, available in 2 overlays)
- Ebuilds: 5, Testing: 9999 Description:
Ollama is a tool for running large language models (LLMs) locally on your
machine. It provides a simple interface to download, run, and manage models
like Llama 3.2, Mistral, Gemma, and many others.
This is a binary distribution package that installs pre-built binaries from
the official Ollama releases. The binaries are provided under the MIT license
and include GPU acceleration support for both NVIDIA (CUDA) and AMD (ROCm)
graphics cards.
Key features:
- Easy model management with pull, push, and create commands
- Built-in API server for programmatic access
- GPU acceleration support (CUDA and ROCm)
- Efficient memory management with automatic model loading/unloading
- Support for multiple models and concurrent requests
- Compatible with OpenAI API format
Models are stored in /var/lib/ollama and can range from 2GB (3B parameters)
to 40GB+ (70B parameters) in size. GPU acceleration significantly improves
inference speed but requires compatible hardware.
Security Note: This package installs pre-compiled binaries. Security
hardening features (ASLR, PIE, stack protections) depend on upstream's
build configuration. The service runs as a dedicated 'ollama' user with
restricted permissions for defense in depth.
Homepage: https://ollama.com/ License: MIT
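The built-in API server mentioned above listens on localhost:11434 by default and accepts JSON requests against Ollama's native endpoints such as /api/generate. A stdlib-only sketch (the model name is a placeholder for whatever you have pulled; the request is built but not sent, since that needs the ollama service running):

```python
import json
import urllib.request

# Ollama's API server listens on localhost:11434 by default.
payload = {
    "model": "llama3.2",           # placeholder; e.g. after: ollama pull llama3.2
    "prompt": "Why is the sky blue?",
    "stream": False,               # one JSON object instead of a response stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Requires a running ollama service; uncomment to try:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["response"])
print(req.full_url)
```

For OpenAI-format clients, the same server also exposes a /v1/chat/completions endpoint, which is what the "Compatible with OpenAI API format" feature refers to.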
tensorzero
- Ebuilds: 4, Testing: 9999 Description:
TensorZero is an open-source stack for industrial-grade LLM applications.
Gateway: access every LLM provider through a unified API with low latency
Observability: monitor your LLM systems, programmatically or with a UI
Optimization: optimize your prompts, models, and inference strategies
Evaluations: benchmark individual inferences or end-to-end workflows
Experimentation: deploy with built-in A/B testing, fallbacks, etc.
Homepage: https://www.tensorzero.com/ License: Apache-2.0
tokenizers (ambiguous, available in 2 overlays)
- Ebuilds: 4, Testing: 0.22.1 Description: Implementation of today's most used tokenizers
Homepage: https://github.com/huggingface/tokenizers License: Apache-2.0
Apache-2.0-with-LLVM-exceptions BSD-2 BSD ISC MIT MPL-2.0
Unicode-DFS-2016
torchdata
- Ebuilds: 1, Testing: 0.11.0 Description: A repo for data loading and utilities on PyTorch domain libraries
Homepage:https://github.com/pytorch/data License: BSD