Install this package:
emerge -a dev-python/vllm
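To preview which USE flags and dependencies Portage would pull in before committing, a pretend/verbose run works:
emerge -pv dev-python/vllm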
If the package is masked (0.20.1 is keyworded ~amd64), you can unmask it with the autounmask tool or with emerge's own options:
autounmask dev-python/vllm
Alternatively, let emerge write the required keyword changes and then apply them with dispatch-conf or etc-update:
emerge --autounmask-write -a dev-python/vllm
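You can also accept the ~amd64 keyword by hand; a minimal sketch, assuming a file under /etc/portage/package.accept_keywords:
# /etc/portage/package.accept_keywords/vllm
dev-python/vllm ~amd64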
| Version | EAPI | Keywords | Slot |
|---|---|---|---|
| 0.20.1 | 8 | ~amd64 | 0 |
<pkgmetadata>
<maintainer type="person">
<email>iohann.s.titov@gmail.com</email>
<name>Ivan S. Titov</name>
</maintainer>
<longdescription lang="en">
vLLM is a fast and easy-to-use library for LLM inference and
serving. It provides a Python API and an OpenAI-compatible HTTP server.
The cpu, cuda, and rocm USE flags each pick a single VLLM_TARGET_DEVICE
for the build and are mutually exclusive. With none of the three set,
the build uses VLLM_TARGET_DEVICE=empty: the Python entrypoints import
cleanly, but the backend kernels fail at the first model load. This is
useful if you only want the API surface for development.
</longdescription>
<use>
<flag name="cpu">Build for CPU inference (VLLM_TARGET_DEVICE=cpu); pull torchaudio + numba</flag>
<flag name="rocm">Build for AMD ROCm inference (VLLM_TARGET_DEVICE=rocm); pull HIP libs + torch{audio,vision}</flag>
</use>
<upstream>
<bugs-to>https://github.com/vllm-project/vllm/issues</bugs-to>
<remote-id type="pypi">vllm</remote-id>
<remote-id type="github">vllm-project/vllm</remote-id>
</upstream>
</pkgmetadata>
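The long description mentions an OpenAI-compatible HTTP server. A minimal sketch of starting it after a cpu, cuda, or rocm build (the model id is a placeholder you would substitute):
python -m vllm.entrypoints.openai.api_server --model <hf-model-id> --port 8000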
Query (-i), enable (-E), or disable (-D) USE flags for this package with euse, from app-portage/gentoolkit:
euse -i <flag> -p dev-python/vllm
euse -E <flag> -p dev-python/vllm
euse -D <flag> -p dev-python/vllm
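For example, to enable CUDA support for just this package:
euse -E cuda -p dev-python/vllm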
| Flag | Description | 0.20.1 |
|---|---|---|
| cpu | Build for CPU inference (VLLM_TARGET_DEVICE=cpu); pull torchaudio + numba | ✓ |
| cuda | Build for NVIDIA CUDA GPU inference (VLLM_TARGET_DEVICE=cuda) | ✓ |
| rocm | Build for AMD ROCm inference (VLLM_TARGET_DEVICE=rocm); pull HIP libs + torch{audio,vision} | ✓ |
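Because the device flags are mutually exclusive, set exactly one of cpu, cuda, or rocm for this package. A sketch of a manual package.use entry (file name assumed):
# /etc/portage/package.use/vllm
dev-python/vllm cuda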
| Type | File | Size |
|---|---|---|
| DIST | vllm-0.20.1.tar.gz | 33519792 bytes |
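To download the distfile listed above without building the package, emerge can fetch it on its own:
emerge --fetchonly dev-python/vllm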