(Roadmap — content to be written)
This chapter covers:
- Loss functions and gradient descent
- The chain rule as a graph traversal
- Computational graphs and automatic differentiation
- Backpropagation through a convolution layer
- Learning rate, momentum, and Adam
Depends on: Chapter 9
Builds toward: Chapter 11 (CNNs)
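As a preview of the topics listed above, here is a minimal sketch (illustrative only, not the chapter's final code) of a scalar reverse-mode autodiff node: it shows how the chain rule becomes a traversal over a computational graph, and ends with one gradient-descent step. The `Value` class and all names here are invented for this sketch.

```python
# Illustrative sketch: scalar reverse-mode automatic differentiation.
# Each Value records its data, its parents in the computational graph,
# and a closure that applies the local chain rule during backprop.
class Value:
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad       # d(a+b)/da = 1
            other.grad += out.grad      # d(a+b)/db = 1
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad   # d(a*b)/da = b
            other.grad += self.data * out.grad   # d(a*b)/db = a
        out._backward = _backward
        return out

    def backward(self):
        # Topologically order the graph, then apply the chain rule
        # from the output back to the leaves -- this traversal IS
        # backpropagation.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

# Squared-error loss L = (w*x - y)^2 with x=3, y=6, so dL/dw = 2*(w*x - y)*x.
w, x, y = Value(1.0), Value(3.0), Value(6.0)
diff = w * x + y * -1.0        # w*x - y, written with + and *
loss = diff * diff
loss.backward()                 # at w=1: dL/dw = 2*(3-6)*3 = -18
w_new = w.data - 0.1 * w.grad   # one gradient-descent step, lr = 0.1
```

The `_backward` closures each encode one local derivative; composing them in reverse topological order is exactly the chain rule applied over the graph, which is the view the chapter develops.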