neural / parameter-space
Don't show one answer. Show the space of all answers.
Up and down the ladder of abstraction.
A neural network with fixed architecture produces different decision boundaries depending on three parameters: weight scale, activation steepness, and output bias. Most visualizations show you one boundary at one setting. This page shows you all of them — the entire space of possible behaviors, so you can see which parameters matter and where the sharp transitions happen.
Bret Victor's key insight: understanding comes from moving between levels of abstraction. At the bottom: a single concrete instance. At the top: the full parameter space. In between: sweeps that show how one parameter changes the result while others stay fixed.
This page has three views. Instance shows one decision boundary — hover to read the network's output at any point. Sweep overlays 20 boundaries across one parameter range, revealing the envelope of possible behaviors. Heat maps accuracy across two parameter axes simultaneously — the full landscape of what this architecture can and cannot learn.
2-layer neural network on XOR data · 4 hidden neurons with tanh activation · three explorable parameters: weight scale (capacity), steepness (how digital the activation is), bias (shifts the decision) · instance view for details, sweep view for envelopes, heat view for the full landscape · presets jump to interesting configurations · hover for live readouts
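The setup above can be sketched in a few lines. The page's actual trained weights aren't published, so the hidden weights below are hand-constructed stand-ins: two tanh units form a "bump" detector along x1 + x2, and the other two are zeroed out to keep the 2-4-1 shape. How exactly scale, steepness, and bias enter the forward pass is also an assumption here — scale multiplies the weights, steepness multiplies each pre-activation inside tanh, and bias shifts the output pre-activation.

```python
import numpy as np

# Hand-picked stand-in weights (NOT the page's trained weights):
# units 1-2 detect x1 + x2 near 0; units 3-4 are padding.
W1 = np.array([[1.0, 1.0], [1.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
b1 = np.array([-1.0, 1.0, 0.0, 0.0])
W2 = np.array([-1.0, 1.0, 0.0, 0.0])

def output(x, scale=1.0, steepness=1.0, bias=0.0):
    """Network output in (-1, 1) at 2-D point(s) x of shape (..., 2)."""
    h = np.tanh(steepness * (x @ (scale * W1).T + b1))      # hidden layer
    return np.tanh(steepness * (scale * (h @ W2) + bias))   # output neuron

# With a negative output bias, opposite-sign inputs land in the
# XOR-true region and same-sign inputs land outside it:
print(output(np.array([1.0, -1.0]), scale=1.0, steepness=4.0, bias=-1.0))  # positive
print(output(np.array([1.0, 1.0]), scale=1.0, steepness=4.0, bias=-1.0))  # negative
```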
Instance: The Illusion of Understanding
One boundary looks clean. Feels comprehensible. But it's one point in a high-dimensional space. Change weight scale by 0.3 and the boundary reshapes entirely. Understanding a network at one configuration is not understanding the network.
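The "change weight scale by 0.3" claim can be checked numerically: classify every point on a dense grid at two nearby scales and count how many change class. This uses the same hand-constructed stand-in weights as above (the page's real weights aren't published), so the exact flip fraction is illustrative, not the page's.

```python
import numpy as np

# Stand-in 2-4-1 weights: a bump detector along x1 + x2, two units zeroed.
W1 = np.array([[1.0, 1.0], [1.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
b1 = np.array([-1.0, 1.0, 0.0, 0.0])
W2 = np.array([-1.0, 1.0, 0.0, 0.0])

def predict(x, scale, steepness=4.0, bias=-1.0):
    """Hard class label (+1 / -1) at point(s) x."""
    h = np.tanh(steepness * (x @ (scale * W1).T + b1))
    return np.sign(scale * (h @ W2) + bias)

# Dense grid over the input square.
xs = np.linspace(-2, 2, 41)
grid = np.stack(np.meshgrid(xs, xs), axis=-1).reshape(-1, 2)

before = predict(grid, scale=1.0)
after = predict(grid, scale=1.3)   # nudge weight scale by 0.3
flipped = np.mean(before != after)
print(f"{flipped:.1%} of grid points change class")
```

A nonzero band of points flips class from a small nudge — the single-instance view at scale 1.0 says nothing about them.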
Sweep: The Envelope of Behavior
Sweeping steepness from 0.2 to 8 reveals a phase transition: below ~1, boundaries are smooth curves. Above ~3, they snap to piecewise-linear decision surfaces. The transition is sharp. A sweep makes it visible. A single instance hides it.
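One way to quantify the smooth-to-digital transition in a sweep like this: measure, at each steepness, what fraction of inputs produce a saturated output (|output| near 1). Same caveat as before — the weights are hand-picked stand-ins, so the numbers sketch the phenomenon rather than reproduce the page's.

```python
import numpy as np

# Stand-in 2-4-1 weights: a bump detector along x1 + x2, two units zeroed.
W1 = np.array([[1.0, 1.0], [1.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
b1 = np.array([-1.0, 1.0, 0.0, 0.0])
W2 = np.array([-1.0, 1.0, 0.0, 0.0])

def output(x, steepness, scale=1.0, bias=-1.0):
    h = np.tanh(steepness * (x @ (scale * W1).T + b1))
    return np.tanh(steepness * (scale * (h @ W2) + bias))

xs = np.linspace(-2, 2, 41)
grid = np.stack(np.meshgrid(xs, xs), axis=-1).reshape(-1, 2)

# Sweep steepness over the page's range: 20 values from 0.2 to 8.
for k in np.geomspace(0.2, 8.0, 20):
    saturated = np.mean(np.abs(output(grid, k)) > 0.95)
    print(f"steepness {k:5.2f}: {saturated:6.1%} of outputs saturated")
```

At low steepness nothing saturates and the boundary is a soft gradient; at high steepness nearly everything saturates except a thin band — the piecewise-linear snap the sweep view makes visible.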
Heat: The Geography of Capacity
The accuracy heatmap reveals that XOR requires both sufficient weight scale AND sufficient steepness — the high-accuracy region is a triangle, not a rectangle. Low weights can't separate XOR regardless of steepness. High steepness can't compensate for low weights. Both axes must clear a threshold. The heat view makes this interaction legible at a glance.
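The heat view reduces to a small computation: XOR accuracy on the four corner points over a grid of (weight scale, steepness) pairs. Again with hand-constructed stand-in weights and an assumed fixed output bias of -1, so the shape of the high-accuracy region here is illustrative; the qualitative claim — both parameters must clear a threshold — still shows up.

```python
import numpy as np

# Stand-in 2-4-1 weights: a bump detector along x1 + x2, two units zeroed.
W1 = np.array([[1.0, 1.0], [1.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
b1 = np.array([-1.0, 1.0, 0.0, 0.0])
W2 = np.array([-1.0, 1.0, 0.0, 0.0])

# XOR on sign inputs: label +1 when the signs differ.
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
y = np.array([-1, 1, 1, -1])

def accuracy(scale, steepness, bias=-1.0):
    h = np.tanh(steepness * (X @ (scale * W1).T + b1))
    out = scale * (h @ W2) + bias
    return np.mean(np.sign(out) == y)

scales = np.linspace(0.1, 2.0, 8)
steeps = np.geomspace(0.2, 8.0, 8)
heat = np.array([[accuracy(s, k) for k in steeps] for s in scales])
# Rows: weight scale; columns: steepness. Chance level is 0.5;
# only cells where BOTH parameters are large enough reach 1.0.
print(heat)
```

Spot checks match the text: a mid-range configuration solves XOR, but low scale fails even at maximum steepness, and low steepness fails even at adequate scale.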
why this matters for interpretability
Most ML interpretability tools show you one model at one checkpoint. Feature visualizations at one layer. Attention maps for one input. Saliency for one prediction. Each is a single point in a vast space of possible configurations.
The ladder of abstraction says: show the point, then show the sweep, then show the space. A feature visualization that also shows how it changes across layers, regularization strengths, or training epochs tells you something fundamentally different from a static snapshot. The space view turns correlation into mechanism.