Readings: Convolutional Neural Networks
CSC 262 - Computer Vision - Weinman
Summary: Outline of the preparatory readings on learning and
convolutional neural networks.
The reading for this unit comes from Foundations of Computer Vision
by Torralba, Isola, and Freeman (2024).
Although this reading assignment is not substantially longer than
many prior readings (at 28 pages), it spans several chapters in the
textbook on its way toward hitting five key ideas:
- Computer vision tasks can be solved by "fitting" functions to
example data: learning.
- When the function and its loss are differentiable, we learn by
  iteratively taking small steps toward improvement: gradient descent
  (sketched in the first example after this list).
- Multiple "layers" (compositions) of linear and non-linear functions
  can approximate any function, and hence learn any task: neural networks.
- With gradient descent, the chain rule of calculus enables efficient
learning in neural networks: backpropagation.
- Layer types that specifically leverage image properties excel at visual
  tasks: ConvNets (see the convolution sketch after this list).
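To make the first two ideas concrete, here is a minimal sketch, not
from the textbook (the toy data, learning rate, and step count are
illustrative choices), that fits a line to noisy examples by gradient
descent on a squared-error loss:

```python
import numpy as np

# Toy data: noisy samples of y = 2x + 1 (the "examples" we fit to).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=50)
y = 2.0 * x + 1.0 + 0.1 * rng.normal(size=50)

# Model: f(x) = w*x + b, with squared-error loss L = mean((f(x) - y)^2).
w, b = 0.0, 0.0
lr = 0.1  # step size (learning rate)

for step in range(200):
    err = (w * x + b) - y
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2.0 * np.mean(err * x)
    grad_b = 2.0 * np.mean(err)
    # Take a small step "downhill," against the gradient.
    w -= lr * grad_w
    b -= lr * grad_b

print(f"learned w = {w:.2f}, b = {b:.2f}")  # should approach w = 2, b = 1
```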
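And for the last idea, a bare-bones sketch of the operation at the
heart of a convolutional layer. The function name conv2d and the
example filter are my own illustrative choices; real layers add
channels, padding, and learned weights:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid cross-correlation of a 2D image with a 2D kernel, as used
    in convolutional layers (no kernel flipping, no padding)."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # The same small set of weights is applied at every location;
            # this sharing is how the layer leverages image structure.
            out[i, j] = np.sum(image[i:i+kH, j:j+kW] * kernel)
    return out

image = np.arange(25.0).reshape(5, 5)
edge = np.array([[1.0, -1.0]])  # horizontal difference filter
print(conv2d(image, edge))      # responds to horizontal intensity changes
```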
Keeping these key ideas in mind, you should be able to move somewhat
quickly through many sections of the reading that help flesh out the
connections among them. Focus on the text and the figures, rather
than getting too hung up on the equations. (Many illustrations help
to provide good intuitions for the equations; these are worth
studying.)
| Sections | Pages | Topic |
|---|---|---|
| 9.1-9.3 | 137-140 | Learning with examples |
| 10.1-10.3 | 151-152 | Basic gradient descent |
| 12.1-12.5.0 | 175-180 | Multi-layer perceptron (neural network) |
| 14.1-14.4 | 199-204 | Backpropagation |
| 24.1-24.2.1 | 403-407 | Convolutional layers |
| 24.2.5-24.4.4 | 413-417 | Other layers: sampling, pooling, and normalization |
The section on backpropagation is perhaps the one most worth
investing time in to understand the mathematics. The ideas are
relatively simple on the surface, but mastering the nuances will
solidify your intuitions.
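If a concrete trace helps, the following minimal numpy sketch (not
drawn from the textbook; the XOR task, sigmoid activations, and
hyperparameters are all illustrative choices) hand-codes the chain
rule for a tiny two-layer network, mirroring what the reading derives
more generally:

```python
import numpy as np

# Tiny two-layer network trained on XOR, with the chain rule coded by hand.
rng = np.random.default_rng(1)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)   # hidden layer parameters
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)   # output layer parameters
lr = 1.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(10000):
    # Forward pass: a composition of linear maps and non-linearities.
    h = sigmoid(X @ W1 + b1)           # hidden activations
    p = sigmoid(h @ W2 + b2)           # network output
    loss = np.mean((p - y) ** 2)

    # Backward pass: chain rule applied layer by layer, output to input.
    dp  = 2.0 * (p - y) / len(X)       # dL/dp
    dz2 = dp * p * (1 - p)             # through the output sigmoid
    dW2 = h.T @ dz2; db2 = dz2.sum(0)  # layer-2 parameter gradients
    dh  = dz2 @ W2.T                   # propagate error to the hidden layer
    dz1 = dh * h * (1 - h)             # through the hidden sigmoid
    dW1 = X.T @ dz1; db1 = dz1.sum(0)  # layer-1 parameter gradients

    # One gradient-descent step on every parameter.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(np.round(p.ravel(), 2))  # typically approaches [0, 1, 1, 0]
```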