Readings: Convolutional Neural Networks

CSC 262 - Computer Vision - Weinman



Summary:
Outline of the preparatory readings on learning and convolutional neural networks.
The reading for this unit comes from Foundations of Computer Vision by Torralba, Isola, and Freeman (2024).
Although this reading assignment is not substantially longer than many prior readings (28 pages), it spans several chapters of the textbook in order to cover five key ideas:
  1. Computer vision tasks can be solved by "fitting" functions to example data: learning.
  2. When the task function is differentiable, we learn by iteratively taking small steps toward improvement: gradient descent (see the first sketch below).
  3. Multiple "layers" (compositions) of linear and non-linear functions can learn any task: neural networks (see the second sketch below).
  4. With gradient descent, the chain rule of calculus enables efficient learning in neural networks: backpropagation (a sketch follows the note on the backpropagation reading near the end).
  5. Layer types that specifically leverage image properties excel at visual tasks: ConvNets (see the third sketch below).
Keeping these key ideas in mind, you should be able to move somewhat quickly through many sections of the reading that help flesh out the connections among them. Focus on the text and the figures, rather than getting too hung up on the equations. (Many illustrations help to provide good intuitions for the equations; these are worth studying.)
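To make ideas 1 and 2 concrete, here is a minimal sketch in Python/NumPy (not from the textbook; the toy data and step size are made up) that "fits" a line to noisy example data by gradient descent on a squared-error loss:

  import numpy as np

  # Toy "examples": noisy samples of y = 2x + 1.
  rng = np.random.default_rng(0)
  x = rng.uniform(-1, 1, size=50)
  y = 2.0 * x + 1.0 + 0.1 * rng.normal(size=50)

  # Model to fit: f(x) = w*x + b, by minimizing mean squared error.
  w, b = 0.0, 0.0
  lr = 0.1  # learning rate (step size)

  for step in range(200):
      err = w * x + b - y
      grad_w = 2.0 * np.mean(err * x)  # d(MSE)/dw
      grad_b = 2.0 * np.mean(err)      # d(MSE)/db
      w -= lr * grad_w                 # take a small step downhill
      b -= lr * grad_b

  print(f"learned w={w:.2f}, b={b:.2f}")  # should approach w=2, b=1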
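For idea 3, a two-layer perceptron is just a linear map, an element-wise non-linearity, and another linear map composed together. A sketch with hypothetical dimensions and random weights, chosen only for illustration:

  import numpy as np

  rng = np.random.default_rng(1)

  # Illustrative sizes; nothing in the reading fixes these.
  n_in, n_hidden, n_out = 4, 8, 2
  W1 = rng.normal(scale=0.5, size=(n_hidden, n_in))
  b1 = np.zeros(n_hidden)
  W2 = rng.normal(scale=0.5, size=(n_out, n_hidden))
  b2 = np.zeros(n_out)

  def relu(z):
      return np.maximum(z, 0.0)  # element-wise non-linearity

  def mlp(x):
      h = relu(W1 @ x + b1)  # first layer: linear map + non-linearity
      return W2 @ h + b2     # second layer: linear readout

  print(mlp(rng.normal(size=n_in)))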
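For idea 5, a convolutional layer slides one small set of weights over the whole image, so every location is processed the same way. A direct (slow but transparent) sketch of a single filter, as a stand-in for the layer types the reading covers:

  import numpy as np

  def conv2d(image, kernel):
      # "Valid" 2-D cross-correlation of a single-channel image
      # with a single filter: the operation inside a conv layer.
      H, W = image.shape
      kH, kW = kernel.shape
      out = np.zeros((H - kH + 1, W - kW + 1))
      for i in range(out.shape[0]):
          for j in range(out.shape[1]):
              patch = image[i:i + kH, j:j + kW]
              out[i, j] = np.sum(patch * kernel)  # same weights everywhere
      return out

  rng = np.random.default_rng(2)
  image = rng.normal(size=(8, 8))
  kernel = rng.normal(size=(3, 3))
  print(conv2d(image, kernel).shape)  # (6, 6): one response per location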
 
Sections        Pages    Topic
9.1-9.3         137-140  Learning with examples
10.1-10.3       151-152  Basic gradient descent
12.1-12.5.0     175-180  Multi-layer perceptron (neural network)
14.1-14.4       199-204  Backpropagation
24.1-24.2.1     403-407  Convolutional layers
24.2.5-24.4.4   413-417  Other layers: sampling, pooling, and normalization
 
The section on backpropagation is perhaps the one most worth investing time in to understand the mathematics. The ideas are relatively simple on the surface, but mastering the nuances will solidify your intuitions.
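One way to test your grasp of the chain rule at work is to backpropagate through a tiny network by hand and check the result numerically. A minimal sketch (Python/NumPy; the network and values are made up for illustration):

  import numpy as np

  rng = np.random.default_rng(3)

  # Tiny network: loss = 0.5 * (w2 . relu(W1 x) - t)^2
  x = rng.normal(size=3)
  t = 1.0
  W1 = rng.normal(size=(4, 3))
  w2 = rng.normal(size=4)

  def loss(W1):
      h = np.maximum(W1 @ x, 0.0)
      return 0.5 * (w2 @ h - t) ** 2

  # Forward pass, caching intermediate values.
  z = W1 @ x
  h = np.maximum(z, 0.0)
  e = w2 @ h - t

  # Backward pass: the chain rule, one layer at a time.
  dL_de = e                    # d(0.5 e^2)/de
  dL_dh = dL_de * w2           # through the dot product
  dL_dz = dL_dh * (z > 0)      # through ReLU (derivative is 0 or 1)
  dL_dW1 = np.outer(dL_dz, x)  # through the linear layer

  # Numerical check of the analytic gradient on one weight of W1.
  eps = 1e-6
  W1p = W1.copy()
  W1p[0, 0] += eps
  numeric = (loss(W1p) - loss(W1)) / eps
  print(dL_dW1[0, 0], numeric)  # should agree closely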
Copyright © 2010, 2012, 2015, 2019, 2020, 2022 Jerod Weinman.
Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike (CC BY-NC-SA) license.