In this lab, you will investigate some blurring kernels,
write a procedure to build your own Gaussian pyramid, and use it to
do an efficient multi-scale search.
Due:
Mon 3/1
Deliverables
Matlab script(s) used to make your comparisons and generate all figures
Another data type in Matlab for storing arrays of disparate types
and sizes is called the "cell". Like arrays, it's an arbitrary
N-dimensional structure, but the entries can be anything, rather
than just numbers or characters. By anything, we mean any Matlab value:
a number, a vector, a matrix, even another cell. These are created
using curly braces, rather than brackets.
A = [ 1 2 ; 3 4 ]; % Some matrix
b = [ 5 6 7 8 ]; % A vector
C = {A,b}; % A cell containing a matrix and a vector
Cells are also indexed using curly braces, instead of parentheses
like regular arrays.
C{1} % Value in the cell's first slot
C{end} % Value in the cell's last slot
length(C) % Length of the cell
Preparation
Szeliski defines several separable decimation/blurring kernels in
Table 3.4 of Section 3.4 Add these 1-D filter definitions and a version
that conglomerates them into a single cell to your workspace. (Note
that if the lines below wrap in the web browser, they should NOT wrap
in your code definitions.)
Note that the brick image is in color (which is what we'll want).
Exercises
A. Blurring, Subsampling, and Aliasing
Display the original bricks image. Zoom in so you can see the individual
pixels of a region of bricks. Does the image look well-sampled?
Choose one of the filters given above, say the binomial filter, which
is accessible as hBin and F{2}. Apply this filter
to both the rows and the columns of each color channel of the bricks
image to produce a color result. Recall that you can index (or assign)
an entire slice of an image as:
B(:,:,c)
Downsample your resulting filtered image by taking every other pixel
starting at 1. Don't forget to include all the color channels.
Display your image and zoom in so you can see the individual pixels
on a region of bricks. How does the sampling compare to the original
image?
Make a note of the region of the image you are looking at (i.e., the
row and column lower/upper bounds). It should be sufficient to see
individual pixels but also several rows of bricks and a few columns
of their staggered pattern.
Extract a color sub-image using your row and
column bounds.
Now you will generalize all the steps above to apply all the filters.
Write a for loop to iterate over all the filters in F,
applying the blurring and subsampling to the original brick image.
Inside your loop, use imshow to display the entire subsampled
image.
Note: You can use a combination of title and num2str
to help you keep clear which image corresponds to which filter. Alternatively,
you could use subplot.
Inside your loop, use imwrite
to save the color sub-image region you extracted in A.6.
Inspect your images. How do the various
filters do at preserving the image structure? Which do you feel is
best? Worst? Why? Do your comparisons depend on where in the image
you look?
B. Gaussian Pyramid
In this section you will write a procedure for creating a Gaussian
pyramid. If you don't remember the syntax for procedures, revisit
example at the bottom of the Matlab
Tutorial.
Create a file called gausspyr.m to place your function in.
Note, this is a function, not a script. This means it will take parameters
as arguments and/or return values. The body will be a new namespace
that is separate from the variables in your workspace.
Add a declaration line to your function so that it takes two parameters,
an image and the number of levels to compute, and returns one value,
the pyramid (which will be a cell).
Use gkern to create a 1-D Gaussian blurring kernel in the
body of your function.
What scale (standard deviation) should your Gaussian have to prevent
aliasing at the subsampling stage? Justify your choice in terms of
the frequencies it will eliminate.
Hint: You may wish to explore the magnitude of the Fourier
transform of (the two-dimensional version of) your kernel and include
this in your justification. You will not need to take the log
to compress the results.
Note: Recall from the previous lab (where you investigated
the box filter) that you can see the result of a filter on a larger
image by keeping the filter in the upper-left corner of a larger matrix
of zeros before transforming.
Following Forsyth and Ponce (Algorithm 8.1), add a for loop
to your function that continually blurs and downsamples the result
from the previous pyramid layer, adding each successive result to
a cell containing the pyramid. The original image will be the first
entry in your cell. You should arrange for conv2 to give
you back a result the same size as its image argument.
Note: You should assume that the image is grayscale (not
color) and consists of doubles.
Return your cell as the value of the function gausspyr.
Test your pyramid script on a grayscale version of the bricks image
and display some of the levels to be sure it is giving reasonable
results.
C. Multi-Scale Search
If you haven't already noticed, the sunflower image is rather large
and features flowers at many scales. In fact, the largest sunflower
is about 600 pixels across. It took approximately 20 seconds to run
a separable kernel of that size on this image; you probably don't
even want to think about what it would take a square kernel to run.
Thus, we need to use a pyramid to run the search for large structures
at coarser scales.
Recall from the previous lab that the Laplacian of Gaussian (LoG)
kernel can act as a "blob" detector. This is well-suited as
a template for finding sunflowers. Fortunately, it's even a separable
kernel:1
LoG(x,y)=
d2
dx2
G(x,y)+
d2
dy2
G(x,y).
Since convolution is a linear operator, we can take the convolution
of a sum by taking the sum of two convolutions!
Use gkern to create a 1-D Gaussian kernel of variance 42.
This will give a reasonable size for us to detect sunflower blobs.
Use gkern to create a 1-D Gaussian second-derivative kernel
with the same variance.
Use your procedure gausspyr to create a Gaussian pyramid
of the sunflower image with enough levels to render the largest sunflowers
sufficiently small at the coarsest scale to be detected by the LoG
operator.
Choose one level of your pyramid and apply LoG operator to
it. Note that you'll need to apply the Gaussian alone one dimension
and its second derivative along the other dimension, repeat the process
reversing the two kernels, and then adding the two results. You should
arrange for conv2 to give you back results that are the same
size as its image argument.
Display the image from your pyramid level.
In separate figure window, display the results of the LoG operator.
Recall that impixelinfo will show you both the coordinate
and brightness value of a moused-over region in a figure window. Use
this to guess an initial reasonable threshold for detecting a sunflower
"blob" from the LoG responses.
Create a thresholded, binary image by vectorizing a "greater than"
operation over your LoG responses.
Bring your pyramid level image figure window to the front, and then
type hold on to indicate that we want to keep the image in
that window. You can get a contour plot of a function, such as your
thresholded responses using the contour command. We will
use this to identify the regions detected using your threshold. The
command
contour(B, [1 1]);
says to draw a contour of B using only the value 1.
(Since this is a binary image, we only want to do the level that is
"on" at one.) Use this to add a contour of the detected regions
to your image.
Add a for loop to apply the LoG operator to every
level of your pyramid, displaying the image and the contour of the
thresholded version on top of it.
How is your threshold? Adjust it so that you can detect as many flowers
over all scales as you can.
Does your method reliably find sunflowers
at the coarser scales? Finer scales?
How does the false positive rate
change as the scale changes?
Save the figures from within the
loop using print.