Three-Way Duel

Three architectures walk the same terrain. Same body, same physics, same ground. Different brains. Auto-rounds escalate from easy to brutal — watch which learning signal wins as the terrain gets harder.

Synthesis — cortex rewires CPG

Preset-Bank — drive recruits modules

Adaptive — perturb & observe

The Experiment

On flat ground, adaptive dominates. Its 120-frame eval window gives a clean signal: did this perturbation help? The answer is binary and the parameter update is direct. The cortex engines must instead train 144 weights (synthesis) or 36 weights (preset-bank) from a scalar reward signal through eligibility traces, so their credit assignment problem is severe.

But hard terrain should change the balance. Gaps arrive faster than an eval window can react. Look-ahead matters: the cortex sees a gap 90px out and adjusts its gait preemptively, while adaptive discovers that it fell only when its 120-frame window closes. Watch the gap survival rate.
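The two weight counts match what you get from the layer sizes listed in the engine descriptions, if each network is fully connected and has no bias terms. That connectivity is an assumption, not something the page states. A quick check, using a hypothetical helper named dense_weight_count:

```python
def dense_weight_count(layer_sizes):
    """Count weights in a fully connected feed-forward net, ignoring biases."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

# Layer sizes taken from the engine descriptions: sensors, hidden, outputs.
synthesis = dense_weight_count([12, 8, 6])    # 12*8 + 8*6
preset_bank = dense_weight_count([8, 4, 1])   # 8*4 + 4*1

print(synthesis, preset_bank)  # 144 36
```

Under that assumption synthesis has four times as many weights to assign credit to as preset-bank, and both learn from the same single scalar reward.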

Synthesis

12 sensors, 8-neuron hidden layer, 6 CPG outputs. Hebbian plasticity with eligibility traces. Curriculum gating. The most degrees of freedom — and the hardest to converge.
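One common way to combine Hebbian plasticity, eligibility traces and a scalar reward is a reward-modulated Hebbian rule. The sketch below shows that generic rule for the sensor-to-hidden layer only. It is not taken from the demo's source: the function name, learning rate, decay constant and tanh activation are all assumptions chosen for illustration.

```python
import math
import random

def hebbian_trace_step(w, trace, pre, post, reward, lr=0.01, decay=0.9):
    """One reward-modulated Hebbian update, applied in place.

    w, trace: n_post x n_pre lists of lists.
    pre, post: activity of the input and output neurons this frame.
    reward: scalar reward for this frame.
    """
    for i, post_i in enumerate(post):
        for j, pre_j in enumerate(pre):
            # Eligibility trace: decaying memory of recent pre/post co-activity.
            trace[i][j] = decay * trace[i][j] + post_i * pre_j
            # The scalar reward gates how much of the trace reaches the weight.
            w[i][j] += lr * reward * trace[i][j]

random.seed(0)
n_pre, n_post = 12, 8  # 12 sensors feeding 8 hidden neurons
w = [[random.gauss(0, 0.1) for _ in range(n_pre)] for _ in range(n_post)]
trace = [[0.0] * n_pre for _ in range(n_post)]

for frame in range(5):
    pre = [random.random() for _ in range(n_pre)]  # stand-in sensor readings
    post = [math.tanh(sum(wij * xj for wij, xj in zip(row, pre))) for row in w]
    reward = 1.0 if frame == 4 else 0.0  # reward arrives only on the last frame
    hebbian_trace_step(w, trace, pre, post, reward)
```

The trace is what lets a reward on frame 4 still adjust weights for activity on frames 0 to 3. It is also why credit assignment is hard: every weight that was recently active shares the same reward.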

Preset-Bank

8 sensors, 4-neuron hidden layer, 1 drive output. Five spinal modules (stand, creep, walk, trot, run) recruited by threshold with hysteresis. Less to learn, faster to commit.
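Recruitment by threshold with hysteresis can be sketched as follows. The module names come from the description above. The threshold values, the hysteresis margin and the recruit function are hypothetical, and the demo may space or tune them differently.

```python
MODULES = ["stand", "creep", "walk", "trot", "run"]
# Assumed values: drive needed to step up from module i to module i + 1.
UP_THRESHOLDS = [0.2, 0.4, 0.6, 0.8]
# Assumed margin: the drive must clear a threshold by this much to switch.
HYSTERESIS = 0.05

def recruit(current, drive):
    """Return the module index for this frame, given the previous index and the drive."""
    # Step up while the drive clears the next threshold plus the margin.
    while current < len(MODULES) - 1 and drive > UP_THRESHOLDS[current] + HYSTERESIS:
        current += 1
    # Step down only once the drive falls below the threshold minus the margin.
    while current > 0 and drive < UP_THRESHOLDS[current - 1] - HYSTERESIS:
        current -= 1
    return current

state = 0
history = []
for drive in [0.1, 0.27, 0.22, 0.18, 0.14, 0.9]:
    state = recruit(state, drive)
    history.append(MODULES[state])

print(history)  # ['stand', 'creep', 'creep', 'creep', 'stand', 'run']
```

Once the drive has pushed the walker into creep at 0.27, it stays there through 0.22 and 0.18, and drops back to stand only at 0.14. That dead band keeps a noisy drive signal from flipping gaits every frame.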

Adaptive

Every 120 frames: perturb, measure, keep or revert. No neural network. The baseline both must beat.
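The perturb, measure, keep-or-revert loop is a form of random-search hill climbing. In the minimal sketch below, one call to evaluate stands in for one 120-frame window. The function names, the Gaussian perturbation, the one-parameter-at-a-time choice and the toy score are all assumptions, since the page does not say how the demo perturbs or scores.

```python
import random

def perturb_and_observe(params, evaluate, rounds, sigma=0.05, rng=None):
    """Hill-climb params in place: perturb one value, keep it only if the score improves."""
    rng = rng or random.Random(0)
    best_score = evaluate(params)
    for _ in range(rounds):
        i = rng.randrange(len(params))
        old = params[i]
        params[i] = old + rng.gauss(0, sigma)  # perturb
        score = evaluate(params)               # measure (one eval window)
        if score > best_score:
            best_score = score                 # keep
        else:
            params[i] = old                    # revert
    return params, best_score

# Toy stand-in for a walking score: closer to TARGET is better.
TARGET = [0.5, -0.2, 0.8]

def toy_score(p):
    return -sum((a - b) ** 2 for a, b in zip(p, TARGET))

start = [0.0, 0.0, 0.0]
initial = toy_score(start)
tuned, final = perturb_and_observe(list(start), toy_score, rounds=200)
```

With a deterministic score the result can never get worse, which fits the clean signal on flat ground. On rough terrain the measured score also depends on which obstacles fell inside the window, so a real run would see noisier keep-or-revert decisions than this toy does.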