Limitations of gradient descent as a principle of
brain function

Abstract: In this slightly unusual talk, I will raise a meta-level question about building computational theories of the brain. Specifically, I will examine gradient descent as an algorithm for getting from optimality principles (cost functions) to predictions about how the brain finds those optima. I will argue that, a priori, the usual approach (where the time derivative or step in each parameter is proportional to the partial derivative of the cost function with respect to that parameter) is not a particularly natural choice, and that a proper definition of steepest descent requires the choice of a Riemannian metric. I will then suggest possible ways to incorporate this choice into the model-building process.
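
The metric dependence alluded to above can be made concrete with a standard piece of Riemannian optimization (not part of the talk text itself, just the textbook formulation): steepest descent is only defined relative to a positive-definite metric matrix $M(\theta)$, and the usual rule is the special case of the Euclidean metric in the chosen parametrization.

```latex
% Steepest descent of a cost C(\theta) under a Riemannian metric
% with positive-definite matrix M(\theta): the descent direction
% minimizes \nabla C^{\top} v over directions v of fixed metric
% length v^{\top} M(\theta)\, v, which yields the flow
\dot{\theta} \;=\; -\,M(\theta)^{-1}\,\nabla_{\theta} C(\theta),
\qquad M(\theta) \succ 0.
% The familiar rule \dot{\theta} = -\nabla_{\theta} C(\theta)
% is the special case M = I; under a reparametrization
% \phi = f(\theta), the same Euclidean choice in \phi-coordinates
% generally gives a different flow, so "steepest" is not
% parametrization-independent without fixing M.
```

Choosing $M$ as the Fisher information matrix recovers Amari's natural gradient, one well-known way of making the metric choice explicit.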