Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step