A faithful clone of Karpathy’s llama2.c (single-file inference, zero dependencies), but fully functional with the LLaMA 3 8B base and instruct models.