🔬 The most atomic GPT-2 implementation in 265 lines of pure Python & CUDA. A bilingual "Rosetta Stone" for understanding LLM internals from scratch. No dependencies, just math and kernels.
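The description emphasizes "no dependencies, just math and kernels." As a flavor of what that means, here is a hypothetical sketch (not the repository's actual code) of dependency-free scaled dot-product attention over plain Python lists, the kind of primitive such an atomic implementation is built from:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]  # subtract max for numerical stability
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Single-head attention: Q, K, V are lists of d-dimensional vectors."""
    d = len(K[0])
    out = []
    for q in Q:
        # similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # output = attention-weighted sum of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))])
    return out

# tiny example: two tokens with 2-dimensional embeddings
Q = K = V = [[1.0, 0.0], [0.0, 1.0]]
print(attention(Q, K, V))
```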
Teaching transformer models facts about the world through pretraining, and accessing that knowledge through finetuning.
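Both phases use the same next-token-prediction objective; only the data changes. The following is a hedged toy sketch of that two-phase recipe (the model and datasets are hypothetical stand-ins, not this repository's code):

```python
import torch
import torch.nn as nn

# toy stand-in for a transformer language model: embed tokens, project to logits
model = nn.Sequential(nn.Embedding(100, 32), nn.Linear(32, 100))
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

def next_token_step(tokens):
    logits = model(tokens[:-1])                            # predict each next token
    loss = nn.functional.cross_entropy(logits, tokens[1:]) # compare to the actual next tokens
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

pretrain_corpus = [torch.randint(0, 100, (16,)) for _ in range(100)]  # stand-in for raw web text
finetune_corpus = [torch.randint(0, 100, (16,)) for _ in range(10)]   # stand-in for task data

for seq in pretrain_corpus:   # phase 1: absorb broad statistics ("facts") from raw text
    next_token_step(seq)
for seq in finetune_corpus:   # phase 2: adapt the same weights to the downstream task
    next_token_step(seq)
```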
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) model and its training loop.
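For orientation, minGPT's README shows a usage pattern along these lines (recalled from memory, so treat the exact config field names as approximate and check them against the current repository):

```python
from mingpt.model import GPT
from mingpt.trainer import Trainer

# build a GPT-2-sized model from the library's default config
model_config = GPT.get_default_config()
model_config.model_type = 'gpt2'
model_config.vocab_size = 50257  # OpenAI's GPT-2 vocabulary size
model_config.block_size = 1024   # OpenAI's GPT-2 context length
model = GPT(model_config)

# train it on a user-supplied dataset of (input, target) token sequences
train_config = Trainer.get_default_config()
train_config.learning_rate = 5e-4
train_config.max_iters = 2000
trainer = Trainer(train_config, model, train_dataset)  # train_dataset: your torch Dataset
trainer.run()
```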
A Python package to experiment with GPT-like transformer models.
This repository chronicles my journey through fundamental and advanced deep learning concepts. Each project is a battle-tested module combining rigorous theory with practical implementation.