A machine learning primer built from first principles. Written for engineers who want to reason about ML systems the same way they reason about software systems.
You are a strong engineer. You can sketch a software system on a whiteboard from hard-earned mental models. You understand the tradeoffs – maintainability vs. elegance, performance vs. complexity.
You have those skills in software design. You don't have them for machine learning yet.
You know the tools are there, but you don't yet know when to reach for which one. This primer builds on that intuition.
## 💡 What makes it different
This is not a textbook or a tutorial. It is a set of mental models – so you can reason about ML systems the same way you already reason about software systems.
Every concept is grounded in a physical or engineering analogy:
- Neurons as polarization filters
- Depth as paper folding
- Gated flow as a pipeline valve
- The chain rule as a gear train
- Projection as a shadow
These analogies are not decorative – they are the primary explanation, with the mathematics as supporting detail.
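For a taste of the style, here is a minimal sketch (not from the primer itself) of the first analogy – a neuron as a filter: project the input onto a direction, shift by a bias, and pass or block the result. The weights below are hand-picked for illustration.

```python
import numpy as np

def neuron(x, w, b):
    """A single neuron: dot product (projection onto w), bias shift,
    then a non-linearity (ReLU here) that passes one side and blocks the other."""
    z = np.dot(w, x) + b   # which side of the hyperplane, and how far
    return max(0.0, z)     # the filter

# Illustrative weights: this neuron fires only when x points roughly along w.
w = np.array([1.0, -1.0])
b = -0.5
print(neuron(np.array([2.0, 0.5]), w, b))  # → 1.0
print(neuron(np.array([0.0, 1.0]), w, b))  # → 0.0 (blocked)
```

Everything else in the primer is built by composing this unit: stacking it (depth), widening it (width), and controlling what flows through it (gates).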
The focus is *when to reach for which tool and why* – not just what each tool does, but the design decision it represents and the tradeoffs it implies.
The primer is organized into three parts:
### 🧱 Part 1 – Basics
The neuron, structure (depth and width as paper folding), learning as optimization (derivatives, the chain rule, backprop), generalization, and representation (features as directions, superposition).
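Not from the primer itself, but a minimal sketch of the "learning as optimization" idea from Part 1: one weight, one squared loss, and the chain rule turning like a gear train at each gradient step. All numbers here are illustrative.

```python
# One neuron, one weight, squared loss: L(w) = (w*x - y)^2.
# The chain-rule "gear train": dL/dw = dL/dpred * dpred/dw.
x, y = 2.0, 6.0          # a single training example; the true weight is 3
w = 0.0                  # start from nothing
lr = 0.1                 # learning rate

for _ in range(50):
    pred = w * x
    dL_dpred = 2 * (pred - y)        # outer gear: loss w.r.t. prediction
    dpred_dw = x                     # inner gear: prediction w.r.t. weight
    w -= lr * dL_dpred * dpred_dw    # one step downhill

print(round(w, 4))  # → 3.0
```

The two local derivatives multiply, exactly as backprop does layer by layer; each gear only needs to know its own ratio.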
### 🏗️ Part 2 – Architecture
Families of combination rules (dense, convolution, recurrence, attention, graph ops, SSMs), transformers in depth (self-attention, the FFN as a volumetric lookup, residual connections), encoding, learning rules beyond backprop, training frameworks (supervised, self-supervised, RL, GAN, diffusion), and matching topology to the problem.
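Not from the primer itself, but a minimal NumPy sketch of the self-attention step that Part 2 covers, with toy dimensions and random weights standing in for learned ones:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X of shape (seq_len, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # how much each token attends to each other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # mix values by attention weight

rng = np.random.default_rng(0)
d = 4
X = rng.normal(size=(3, d))                         # 3 tokens, toy embedding dim
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # → (3, 4)
```

Every output token is a data-dependent weighted average of all value vectors – the "routing" is computed from the input itself, which is exactly what distinguishes attention from a fixed combination rule like convolution.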
### 🚦 Part 3 – Gates as Control Systems
Gate primitives (scalar, vector, matrix), soft logic composition, branching and routing, recursion within the forward pass, and the geometric math toolbox (projection, masking, rotation, interpolation).
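Not from the primer itself, but a minimal sketch of the simplest gate primitive from Part 3 – a vector gate as a per-channel valve. The logits here are hand-picked for illustration; in a real network they would be learned.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated(x, gate_logits):
    """A vector gate: a soft valve between 0 (closed) and 1 (open) per channel.
    Because it is soft, gradients flow through it – the same primitive that
    appears in LSTM gates, GLU layers, and MoE routing weights."""
    g = sigmoid(gate_logits)   # each entry squashed into [0, 1]
    return g * x               # elementwise valve on the signal

x = np.array([2.0, -3.0, 1.0])
print(gated(x, np.array([10.0, -10.0, 0.0])))  # ≈ [2.0, -0.0, 0.5]
```

Large positive logits open a channel fully, large negative logits shut it, and values near zero pass the signal at half strength – soft branching without any `if`.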
The primer is a single Markdown file with inline visualizations.
Jump to a specific topic:
| | Topic | What it covers |
|---|---|---|
| ✓ | Neuron | Start here – dot product, bias, non-linearity |
| 📄 | Composition | What depth buys you – the paper folding model |
| 📉 | Learning | Derivatives, the chain rule, backprop, loss landscapes |
| 🎯 | Generalization | Why do overparameterized networks work at all? |
| 🧠 | Representation | Features as directions, superposition |
| 🔀 | Combination rules | Convolution vs. attention vs. recurrence vs. graph vs. SSM |
| 🤖 | Transformers | Self-attention, FFN, residual connections |
| 🏋️ | Training frameworks | Supervised, self-supervised, RL, GAN, diffusion |
| 🗺️ | Topology | Matching architecture to problem – worked examples |
| 🧩 | Design patterns | Common problems → which tool to reach for |
| 🚦 | Gates | A gating and control toolkit for practitioners |
| 🔧 | Debugging | Loss curve shapes, sanity checks, LR tuning |
The complete topic map is in curriculum.md.
This primer was built through conversation – one concept at a time, stress-tested with questions until the mental model held up. You can use it in two ways:
### 📚 Solo reading
Read it front to back, section by section. When something doesn’t click, stop and re-read the section it depends on.
The primer is designed so that each section builds the load-bearing intuition for the next. Don't skip ahead – later sections assume you have not just read but absorbed the earlier ones.
### 💬 Interactive exploration with an AI agent
This is the more powerful approach, and closer to how the primer was actually built. Feed the primer (or part of it) to your favorite AI coding assistant and explore through conversation:
```
Read ml-primer.md. I'm an engineer learning ML fundamentals.
Walk me through the section on [topic]. I want to understand
it well enough to reason about design decisions, not just
recite definitions. Push back if I get something wrong.
```
Ask "why" questions. Propose wrong answers and see if the agent catches them. Ask for concrete examples. Ask what would happen if you changed one thing. Ask how two concepts relate.
The primer gives both you and the agent a shared vocabulary and the right conceptual framework – conversations fill in everything a static document can’t.
The primer is the map. The conversation is the territory.
12 figures covering neurons, activation functions, paper folding, derivatives, the chain rule, attention, the FFN volumetric lookup, residual connections, dot products, loss landscapes, combination rules, and gating operations.
All figures are generated from the Python scripts in `scripts/`. To regenerate them:
```sh
python3 scripts/01_neuron_hyperplane.py
python3 scripts/02_activation_functions.py
# ... etc
```
Requires `matplotlib` and `numpy`.
Built through an extended conversational exploration between a software engineer and Claude, where each concept was stress-tested with questions, analogies were iterated until they landed, and misconceptions were corrected in real time.
The result is closer to a distilled consultation than a reference document.
PRs welcome. The goal is high signal – if you can explain a concept more clearly, fix an error, or add a section that fills a gap, open a PR.
Keep the tone:
- Direct and concrete
- Analogies over notation
- *When to use* over *how it works*
MIT