Install this package:
emerge -a sci-misc/llama-cpp
| Version | EAPI | Keywords | Slot |
|---|---|---|---|
| 9999 | 8 | ~amd64 | 0 |
| 0_pre9733 | 8 | ~amd64 ~arm64 | 0 |
| 0_pre9700 | 8 | ~amd64 ~arm64 | 0 |
| 0_pre9692 | 8 | ~amd64 ~arm64 | 0 |
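Every version in the table above carries only testing keywords (`~amd64`, `~arm64`), so a system that tracks stable keywords will not install the package until that keyword is accepted. The following is a minimal sketch, assuming an amd64 machine and a directory-style `/etc/portage/package.accept_keywords`; the local file name `llama-cpp.accept_keywords` is hypothetical and chosen only for illustration.

```shell
# Write the keyword entry to a local file first so it can be reviewed.
# Assumption: amd64 host; on arm64 use ~arm64 instead.
cat > llama-cpp.accept_keywords <<'EOF'
sci-misc/llama-cpp ~amd64
EOF

# Then, as root, copy it into place (assumes package.accept_keywords is a directory):
#   install -Dm644 llama-cpp.accept_keywords /etc/portage/package.accept_keywords/llama-cpp
```

Staging the file locally keeps the example runnable without root; the final copy is the only privileged step.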
<pkgmetadata>
<maintainer type="person">
<email>iohann.s.titov@gmail.com</email>
<name>Ivan S. Titov</name>
</maintainer>
<use>
<flag name="blis">Build a BLIS backend</flag>
<flag name="flexiblas">Build a FlexiBLAS backend</flag>
<flag name="rocm">Build a HIP (ROCm) backend</flag>
<flag name="wmma">Use rocWMMA to enhance flash attention performance</flag>
<flag name="openblas">Build an OpenBLAS backend</flag>
<flag name="opencl">Build an OpenCL backend, so far only works on Adreno and Intel GPUs</flag>
<flag name="openssl">Use openssl to support HTTPS</flag>
<flag name="sycl">Build an Intel SYCL backend (Arc GPU, Intel CPU via
oneAPI). Requires a -fsycl-capable compiler (Intel icpx or clang++
with SYCL patches) installed separately.</flag>
<flag name="webui">Build the embedded llama-server web UI. Fetches
prebuilt assets from the upstream Hugging Face bucket at configure
time; disable for a server binary with only the HTTP API.</flag>
</use>
<upstream>
<remote-id type="github">ggml-org/llama.cpp</remote-id>
</upstream>
</pkgmetadata>
Manage flags for this package:
euse -i <flag> -p sci-misc/llama-cpp
euse -E <flag> -p sci-misc/llama-cpp
euse -D <flag> -p sci-misc/llama-cpp
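The same per-package flag settings can also be written by hand as a `package.use` entry. The sketch below assumes a directory-style `/etc/portage/package.use`; the local file name `llama-cpp.use` is hypothetical, and the flag selection (`vulkan` enabled, `webui` disabled for a server binary with only the HTTP API) is just an illustration drawn from the flags listed on this page. The ebuild may restrict which flags can be combined.

```shell
# Write the USE entry to a local file first so it can be reviewed.
# Assumption: the chosen flags are an example, not a recommendation.
cat > llama-cpp.use <<'EOF'
sci-misc/llama-cpp vulkan -webui
EOF

# Then, as root, copy it into place (assumes package.use is a directory):
#   install -Dm644 llama-cpp.use /etc/portage/package.use/llama-cpp
# and rebuild so the new flags take effect:
#   emerge -a sci-misc/llama-cpp
```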
| Flag | Description | 9999 | 0_pre9733 | 0_pre9700 | 0_pre9692 |
|---|---|---|---|---|---|
| avx | ⚠️ | ✓ | ✓ | ✓ | ✓ |
| avx2 | ⚠️ | ✓ | ✓ | ✓ | ✓ |
| avx512f | ⚠️ | ✓ | ✓ | ✓ | ✓ |
| avx512vbmi | ⚠️ | ✓ | ✓ | ✓ | ✓ |
| blis | Build a BLIS backend | ✓ | ✓ | ✓ | ✓ |
| bmi2 | ⚠️ | ✓ | ✓ | ✓ | ✓ |
| cuda | Build the CUDA (cuda_v13) llama-server GPU backend via <pkg>dev-util/nvidia-cuda-toolkit</pkg> ⚠️ | ✓ | ✓ | ✓ | ✓ |
| examples | Pull in <pkg>dev-python/jupyter</pkg> to run the bundled starter notebooks ⚠️ | ✓ | ✓ | ✓ | ✓ |
| f16c | ⚠️ | ✓ | ✓ | ✓ | ✓ |
| flexiblas | Build a FlexiBLAS backend | ✓ | ✓ | ✓ | ✓ |
| fma3 | ⚠️ | ✓ | ✓ | ✓ | ✓ |
| openblas | Build an OpenBLAS backend | ✓ | ✓ | ✓ | ✓ |
| opencl | Build an OpenCL backend, so far only works on Adreno and Intel GPUs | ✓ | ✓ | ✓ | ✓ |
| openmp | Use OpenMP for parallel code ⚠️ | ⊕ | ⊕ | ⊕ | ⊕ |
| openssl | Use openssl to support HTTPS | ⊕ | ⊕ | ⊕ | ⊕ |
| rocm | Build a HIP (ROCm) backend | ✓ | ✓ | ✓ | ✓ |
| sse4_2 | ⚠️ | ✓ | ✓ | ✓ | ✓ |
| sycl | Build an Intel SYCL backend (Arc GPU, Intel CPU via oneAPI). Requires a -fsycl-capable compiler (Intel icpx or clang++ with SYCL patches) installed separately. | ✓ | ✓ | ✓ | ✓ |
| vulkan | Build the Vulkan llama-server GPU backend ⚠️ | ✓ | ✓ | ✓ | ✓ |
| webui | Build the embedded llama-server web UI. Fetches prebuilt assets from the upstream Hugging Face bucket at configure time; disable for a server binary with only the HTTP API. | ⊕ | ⊕ | ⊕ | ⊕ |
| wmma | Use rocWMMA to enhance flash attention performance | ✓ | ✓ | ✓ | ✓ |
| Type | File | Size | Versions |
|---|---|---|---|
| DIST | llama-cpp-0_pre9692.tar.gz | 34936959 bytes | 0_pre9692 |
| DIST | llama-cpp-0_pre9700.tar.gz | 34936751 bytes | 0_pre9700 |
| DIST | llama-cpp-0_pre9733.tar.gz | 34959077 bytes | 0_pre9733 |
| Type | File | Size |
|---|---|---|
| DIST | ggml-org_models_tinyllamas_stories15M-q4_0-99dd1a73db5a37100bd4ae633f4cfce6560e1567.gguf | 19077344 bytes |