LLM inference with up to 7× longer context. Pure C, zero dependencies. Lossless KV-cache compression in a single-header library.