A character-level language diffusion model for text generation. The model is a modified version of the NanoChat GPT implementation and is trained on Tiny Shakespeare! It’s only 10.7 million parameters, so you can try it locally!
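Character-level here means the vocabulary is just the set of unique characters in the corpus, with no subword tokenizer. A minimal sketch of that kind of encoding (illustrative only; the repo's actual tokenizer code may differ):

```python
# Toy corpus standing in for data/tiny_shakespeare.txt
text = "To be, or not to be, that is the question."

chars = sorted(set(text))                     # character vocabulary
stoi = {ch: i for i, ch in enumerate(chars)}  # char -> integer id
itos = {i: ch for ch, i in stoi.items()}      # integer id -> char

def encode(s: str) -> list[int]:
    return [stoi[ch] for ch in s]

def decode(ids: list[int]) -> str:
    return "".join(itos[i] for i in ids)

print(decode(encode("to be")))  # round-trips to "to be"
```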

```bash
# Clone the repository
git clone <repository-url>
cd tiny-diffusion

# Install dependencies (Python 3.10+)
uv sync
```

`training.py` saves its weights to `weights/diffusion_model.pt`; the sample and animation scripts load the model from this file.
Pre-trained weights are already provided! Training for 20,000 steps took me about half an hour on 4x A100s. If you want to retrain the model, run:
```bash
# Train from scratch on Shakespeare
uv run training.py

# Training will save checkpoints to weights/diffusion_model.pt
```

To generate a continuous stream of output (currently 30 context lengths), run:
```bash
# Generate samples using the pre-trained model
uv run sample.py
```
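Under the hood, a masked-diffusion sampler starts from an all-mask sequence and commits a few predicted characters per step until the text is complete. Here is a toy sketch of that loop with a stand-in model; the names and schedule are my own illustration, not the repo's actual API:

```python
import random

MASK_ID = 0                            # hypothetical id for the mask token
VOCAB = " abcdefghijklmnopqrstuvwxyz"  # toy alphabet; real ids 1..len(VOCAB)
SEQ_LEN = 16
STEPS = 8                              # the real model uses 128 diffusion steps

def dummy_model(tokens):
    """Stand-in for the transformer: predicts a random char id per position."""
    return [random.randrange(1, len(VOCAB) + 1) for _ in tokens]

def sample(seq_len=SEQ_LEN, steps=STEPS):
    tokens = [MASK_ID] * seq_len  # start fully masked
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK_ID]
        if not masked:
            break
        preds = dummy_model(tokens)
        # Commit an even share of the remaining masked positions each step.
        n_unmask = max(1, len(masked) // (steps - step))
        for i in random.sample(masked, n_unmask):
            tokens[i] = preds[i]
    # Fill any positions still masked from one final prediction.
    preds = dummy_model(tokens)
    tokens = [p if t == MASK_ID else t for t, p in zip(tokens, preds)]
    return "".join(VOCAB[t - 1] for t in tokens)

print(sample())
```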
To see the diffusion process as a nice animation, run:

```bash
# Watch the denoising process step-by-step
uv run animations/diffusion-process.py

# See Game of Life-inspired sampling (fun little experiment)
uv run animations/game-of-life.py
```

Model details:

- Parameters: 10.7 million
- Layers: 6
- Attention heads: 6
- Embedding dim: 384
- Sequence length: 256 characters
- Diffusion steps: 128
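These numbers line up the way you'd expect in a standard transformer. A hypothetical config object capturing them (field names are my own, not the repo's):

```python
from dataclasses import dataclass

@dataclass
class DiffusionConfig:
    # Values taken from the model details above; names are illustrative.
    n_layers: int = 6
    n_heads: int = 6
    n_embd: int = 384
    seq_len: int = 256          # characters of context
    diffusion_steps: int = 128  # denoising steps per sample

    @property
    def head_dim(self) -> int:
        # Each attention head gets an equal slice of the embedding.
        return self.n_embd // self.n_heads

print(DiffusionConfig().head_dim)  # → 64
```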
```
tiny-diffusion/
├── model.py                  # Core diffusion transformer
├── training.py               # Training script
├── sample.py                 # Text generation
├── data/
│   └── tiny_shakespeare.txt  # Training data
├── weights/
│   └── diffusion_model.pt    # Pre-trained weights
└── animations/
    ├── diffusion-process.py  # Denoising visualization
    └── game-of-life.py       # Game of Life sampling
```
