Install this package:
emerge -a sci-misc/llama-cpp
<pkgmetadata>
<maintainer type="person">
<email>iohann.s.titov@gmail.com</email>
<name>Ivan S. Titov</name>
</maintainer>
<use>
<flag name="blis">Build a BLIS backend</flag>
<flag name="flexiblas">Build a FlexiBLAS backend</flag>
<flag name="rocm">Build a HIP (ROCm) backend</flag>
<flag name="wmma">Use rocWMMA to enhance flash attention performance</flag>
<flag name="openblas">Build an OpenBLAS backend</flag>
<flag name="opencl">Build an OpenCL backend, so far only works on Adreno and Intel GPUs</flag>
<flag name="openssl">Use openssl to support HTTPS</flag>
<flag name="sycl">Build an Intel SYCL backend (Arc GPU, Intel CPU via
oneAPI). Requires a -fsycl-capable compiler (Intel icpx or clang++
with SYCL patches) installed separately.</flag>
<flag name="webui">Build the embedded llama-server web UI. Fetches
prebuilt assets from the upstream Hugging Face bucket at configure
time; disable for a server binary with only the HTTP API.</flag>
</use>
<upstream>
<remote-id type="github">ggml-org/llama.cpp</remote-id>
</upstream>
</pkgmetadata>
Manage flags for this package:
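euse -i <flag> -p sci-misc/llama-cpp    # show information about the flag
euse -E <flag> -p sci-misc/llama-cpp    # enable the flag for this package
euse -D <flag> -p sci-misc/llama-cpp    # disable the flag for this package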
| Flag | Description | 9999 | 0_pre9413 | 0_pre9404 |
|---|---|---|---|---|
| avx | ⚠️ | ✓ | ✓ | ✓ |
| avx2 | ⚠️ | ✓ | ✓ | ✓ |
| avx512f | ⚠️ | ✓ | ✓ | ✓ |
| avx512vbmi | ⚠️ | ✓ | ✓ | ✓ |
| blis | Build a BLIS backend | ✓ | ✓ | ✓ |
| bmi2 | ⚠️ | ✓ | ✓ | ✓ |
| cuda | Build the NVIDIA CUDA backend (requires CUDA Toolkit; nvcc is pinned to gcc-15 on this host) ⚠️ | ✓ | ✓ | ✓ |
| examples | ⚠️ | ✓ | ✓ | ✓ |
| f16c | ⚠️ | ✓ | ✓ | ✓ |
| flexiblas | Build a FlexiBLAS backend | ✓ | ✓ | ✓ |
| fma3 | ⚠️ | ✓ | ✓ | ✓ |
| openblas | Build an OpenBLAS backend | ✓ | ✓ | ✓ |
| opencl | Build an OpenCL backend, so far only works on Adreno and Intel GPUs | ✓ | ✓ | ✓ |
| openmp | Use OpenMP for parallel code ⚠️ | ⊕ | ⊕ | ⊕ |
| openssl | Use openssl to support HTTPS | ⊕ | ⊕ | ⊕ |
| rocm | Build a HIP (ROCm) backend | ✓ | ✓ | ✓ |
| sse4_2 | ⚠️ | ✓ | ✓ | ✓ |
| sycl | Build an Intel SYCL backend (Arc GPU, Intel CPU via oneAPI). Requires a -fsycl-capable compiler (Intel icpx or clang++ with SYCL patches) installed separately. | ✓ | ✓ | ✓ |
| vulkan | Enable Vulkan GPU backend ⚠️ | ✓ | ✓ | ✓ |
| webui | Build the embedded llama-server web UI. Fetches prebuilt assets from the upstream Hugging Face bucket at configure time; disable for a server binary with only the HTTP API. | ⊕ | ⊕ | ⊕ |
| wmma | Use rocWMMA to enhance flash attention performance | ✓ | ✓ | ✓ |
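
As an alternative to `euse`, per-package flags can be set by hand in a file under `/etc/portage/package.use`. The sketch below is illustrative only: the helper name `write_llama_use` and the chosen flag combination (`vulkan openssl -webui`, all taken from the table above) are assumptions, and the demonstration writes to a scratch directory rather than to `/etc/portage`.

```shell
# Hypothetical helper: write a per-package USE entry into the given directory.
# On a real Gentoo system the directory would be /etc/portage/package.use
# (root privileges required); here it is passed in as an argument.
write_llama_use() {
    use_dir=$1
    mkdir -p "$use_dir" || return 1
    # Example selection only: enable vulkan and openssl, disable webui.
    printf '%s\n' 'sci-misc/llama-cpp vulkan openssl -webui' > "$use_dir/llama-cpp"
}

# Demonstration against a scratch directory so nothing under /etc is touched.
demo_dir=$(mktemp -d)
write_llama_use "$demo_dir"
cat "$demo_dir/llama-cpp"
```

After writing the real file, rebuilding with `emerge -a sci-misc/llama-cpp` would pick up the new flags.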
| Type | File | Size | Versions |
|---|---|---|---|
| DIST | llama-cpp-0_pre9404.tar.gz | 33964532 bytes | 0_pre9404 |
| DIST | llama-cpp-0_pre9413.tar.gz | 33975722 bytes | 0_pre9413 |
| Type | File | Size |
|---|---|---|
| DIST | ggml-org_models_tinyllamas_stories15M-q4_0-99dd1a73db5a37100bd4ae633f4cfce6560e1567.gguf | 19077344 bytes |
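
The sizes in the tables above can be used as a quick sanity check on a downloaded distfile. The following is a minimal sketch, not Portage's own verification (which also compares checksums from the Manifest); the function name `check_dist_size` is hypothetical.

```shell
# Hypothetical helper: check_dist_size FILE EXPECTED_BYTES
# Returns 0 if FILE is exactly EXPECTED_BYTES long, 1 if the size differs,
# and 2 if the file cannot be read.
check_dist_size() {
    actual=$(wc -c < "$1") || return 2
    [ "$actual" -eq "$2" ]
}
```

Assuming the distfiles directory is `/var/cache/distfiles` (the usual default, but configurable through `DISTDIR`), a check for the newer tarball might look like `check_dist_size /var/cache/distfiles/llama-cpp-0_pre9413.tar.gz 33975722`.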