Developmental changes in learning resemble stochastic optimization


Analogies to stochastic optimization are common in developmental psychology, describing a gradual reduction in randomness (“cooling off”) over the lifespan. Yet for lack of concrete empirical comparison, it remains ambiguous how this analogy should be interpreted. Using data from n = 281 participants aged 5 to 55, we show that “cooling off” does not apply only to the single dimension of randomness. Rather, development resembles an optimization process along multiple dimensions of learning (i.e., reward generalization, uncertainty-directed exploration, and random temperature). What begins as large tweaks to the parameters that define learning during childhood plateaus and converges in adulthood as we “learn to learn” more effectively. The developmental trajectory of human parameters is strikingly similar to that of several stochastic optimization algorithms, yet we observe intriguing differences in convergence. Notably, none of the optimization algorithms discovered reliably better regions of the strategy space than adult participants, suggesting a remarkable efficiency of human development.