Lab: Edge Detection
CSC 295 - Computer Vision - Weinman
- Summary:
- We calculate image gradients to find edges at multiple
scales, strengths, and orientations.
Deliverables
- The Matlab script used to make your comparisons and generate all figures
- (10 points) Horizontal and vertical partial derivatives (A.5,
A.7)
Note: You may wish to place these side-by-side in the same figure
with one caption.
- (10 points) Observations of the horizontal and vertical partial derivatives
(A.6, A.8)
- (10 points) Gradient magnitude image and observations (B.2)
- (10 points) Gradient orientation image and observations (C.4,
C.5)
- (10 points) Multi-scale gradient magnitude and orientation images
(E.2)
Note: Using the same figure with a single caption, you may wish
to place these images together in two rows, with the scale varying
horizontally across the page.
- (10 points) Observations of multi-scale gradient magnitudes (F.1)
- (10 points) Observations of multi-scale gradient orientations (F.2)
- (10 points) Multi-scale edge images (E.4)
Note: You may wish to arrange these images into a grid, with
scale changing along the vertical axis and the threshold changing
along the horizontal. Regardless of how they are arranged, be sure
your captions are clear.
- (10 points) Multi-scale edge observations (F.3)
- (10 points) Professionalism of write-up
Extra
- (5 points) Explanation of Laplacian zero-crossing strategy, Extra
1
- (5 points) Laplacian image, Extra 2
- (10 points) Zero-crossing (edge) image, Extra 3
Preparation
Load the following file from the MathLAN and convert it from 8-bit
to double values:
-
/home/weinman/courses/CSC295/images/bug.png
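One way to sketch this preparation step, assuming the standard imread and im2double functions:

```matlab
% Load the 8-bit image and convert it to double values in [0,1]
img = imread('/home/weinman/courses/CSC295/images/bug.png');
img = im2double(img);
```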
Exercises
A. Gradient Components
- Create a 1-D Gaussian kernel with variance 4 using gkern.
-
gauss = gkern(4);
- We can also create a 1-D first derivative of Gaussian kernel
with variance 4 by giving another argument to gkern that
specifies how many times we want to take the derivative:
-
dgauss = gkern(4,1);
(For completeness, you could have given the argument 0 in the previous
exercise to indicate taking the derivative zero times.)
- Recall that conv2 accepts two separable kernels for filtering:
one to operate along rows and the other to operate along columns.
Use your two Gaussian kernels to calculate the partial derivative
of the bug image along the rows (that is, the horizontal partial derivative).
Have it return an answer only for the valid portions of the convolution
(the last parameter should be 'valid' rather than 'same').
- Display your result as an image, remembering that the partial derivative
may be positive or negative. Thus, you will need to tell imshow
to relax its displaying conventions.
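Steps A.3 and A.4 might be sketched as follows, using the gauss and dgauss kernels defined above (the variable name dx is illustrative):

```matlab
% Horizontal partial derivative: derivative kernel along the rows,
% smoothing kernel along the columns; keep only the valid region
dx = conv2(gauss, dgauss, img, 'valid');

% Passing [] relaxes imshow's conventions: the minimum value maps
% to black and the maximum to white
imshow(dx, []);
```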
-
Save your image using a combination of
imwrite and imadj, a custom procedure that linearly
remaps low (high, resp.) values to 0 (1, resp.). For example,
-
imwrite(imadj(X),'mypic.png');
-
Interpret your result. Where is it bright?
Where is it dark? Where is it gray? In all three cases: why?
-
Calculate the partial derivative of
the image along the columns in a similar fashion. Display and
save your result.
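A sketch of the column-wise (vertical) partial derivative, mirroring the horizontal case by swapping the kernel roles (variable and file names are illustrative):

```matlab
% Vertical partial derivative: derivative along columns, smoothing along rows
dy = conv2(dgauss, gauss, img, 'valid');
imshow(dy, []);
imwrite(imadj(dy), 'dy.png');  % imadj remaps values as described in A.5
```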
-
Inspect your result, making similar observations
as in A.6.
B. Gradient Magnitude
- Create an image representing the magnitude of the gradient at each
pixel location. (Recall that the magnitude of a vector is the square
root of the sum of the squares of its components.)
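Assuming dx and dy hold the partial derivatives from Part A, this is one way to compute the magnitude:

```matlab
% Gradient magnitude: elementwise Euclidean norm of (dx, dy)
mag = sqrt(dx.^2 + dy.^2);
```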
-
Display, save, and inspect your image. Where
are the strongest responses? How do they correspond to the values
of the partial derivatives?
- It is typical to place a threshold on the gradient magnitude so that
the edge detection result is binary. Use a vectorized "greater than"
operation on the gradient magnitude and display your thresholded image.
You may wish to start with a threshold of 0.02 and adjust it to find
what you think is a good threshold for a nice edge image.
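Assuming mag holds the gradient magnitude, a minimal sketch of the thresholding step:

```matlab
% Binary edge image via a vectorized "greater than"; 0.02 is only a
% starting point to adjust by eye
edges = mag > 0.02;
imshow(edges);
```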
C. Gradient Orientation
- As we discussed in class, use atan2 to create an image representing
the orientation/direction of the gradient at each pixel location.
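Assuming dx and dy from Part A, one way to compute the orientation (note that atan2 takes the vertical component first):

```matlab
% Gradient orientation in [-pi, pi] at each pixel
orient = atan2(dy, dx);
```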
- Recall that the gradient orientation may be in the range [-pi,pi].
We can tell imshow to treat these as its bounds for display
(black to white) explicitly, i.e.,
-
imshow(X,[-pi pi]);
Display your orientation image in this fashion.
- Black and white values are not very intuitive for interpreting orientation.
Fortunately, Matlab has a built-in way of changing the way actual
image values are mapped to display values. Much like you've done before
manually, you can explicitly change the map using the command
colormap with a map argument. The procedure hsv
creates a circular color map that is useful for such visualizations.
Apply this to your figure:
-
colormap(hsv); % Change the map of the
% current figure to "hsv"
colorbar; % Add a color bar to the figure
% to aid interpretation
-
Use print to save your orientation
image. (Recall that afterward you may wish to use
-
$ convert -trim in.png out.png
to tighten up the boundaries.)
-
Spend some time analyzing the orientations. Where
do the colors indicate that the gradient points horizontally? Diagonally?
Vertically? Be sure to distinguish direction (i.e., left versus right).
Do these reconcile with the image contents?
D. Gradient Orientation Revisited
Having the orientation displayed as a bright color where there is
no strong edge is rather misleading. Instead, we'd like to be able
to display no colors where there is no edge, and have the colors on
the strong gradients indicate the orientation, as in Part C. We can
do this by using an alternative color representation. Instead of thinking
about the color contributions of red, green, and blue components
(the RGB color model), we can separate the color into three different
components:
- Hue:
- the pure chroma
- Saturation:
- the amount of color present
- Value:
- the perceived brightness
Like RGB, we can model HSV colors with three components, each in the
range 0-1. Matlab knows how to convert an image in HSV colorspace
into an RGB for display. We will use this to encode our orientation
image in a more meaningful fashion.
Color will still be used to encode orientation, but we will use the
saturation channel to encode the strength of the edge at each location.
Thus, if there is no edge, there will be no saturation and thus no
color. We can keep the value at a constant brightness, perhaps white.
- To represent the hue, rescale your previous orientation image so that
instead of the range [-pi,pi] it has
the range [0,1].
Hint: To rescale a quantity x from [a,b] to
[0,1], something you will do quite often, you
can use the transform
-
(x - a) / (b - a)
To represent the saturation, create a rescaled
version of the gradient magnitude image so that the maximum value
in the rescaled version is 1. Hint: Here we can simply divide
by the largest value.
- To represent the value, create an image of all ones the same size
as your other images.
- To create the final HSV image, concatenate these M×N matrices
along the third dimension into one M×N×3 image using
the cat procedure, i.e.,
-
W = cat(3,X,Y,Z);
-
Convert your new HSV image to an RGB image
using hsv2rgb, i.e.,
-
B = hsv2rgb(A);
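Putting the steps of Part D together, a sketch assuming orient and mag from the earlier parts (variable names are illustrative):

```matlab
H = (orient + pi) / (2*pi);   % Rescale orientation from [-pi,pi] to [0,1]
S = mag / max(mag(:));        % Saturation: magnitude rescaled so its max is 1
V = ones(size(mag));          % Constant brightness (white)
hsvImg = cat(3, H, S, V);     % Stack into one M x N x 3 HSV image
rgbImg = hsv2rgb(hsvImg);     % Convert to RGB for display
imshow(rgbImg);
```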
E. Edge Detection and Scale
Now that you have analyzed all the parts, we want to investigate how
detecting edges depends on the scale (i.e., Gaussian standard deviation)
used to calculate the gradients.
- Place your commands for creating the Gaussian kernels, partial derivatives,
gradient magnitude, and weighted orientation (the RGB version from
D.5) inside a for loop that calculates
these for the following Gaussian variances: 1, 2, 4, 16, and 32.
-
Inside your loop, add commands to save
your rescaled magnitude image (from D.2) and
color orientation image as PNG files.
Hint: To automatically create appropriate file names, you can
use num2str to convert from numbers to strings
as done in the image formation lab.
- We also need to threshold our edges so that we have binary detections.
Inside your loop, add a for loop over several
gradient magnitude thresholds: 2/256, 4/256,
8/256, 12/256.
Hint: For an easier to read file name, you may wish to loop
over the numerators and use the denominator when calculating the threshold.
-
The command subplot(m,n,p) breaks
a figure window into an m×n array of axes
and sets the pth axis as the current one. For instance, the following
table shows the values of p when m is 2 and n
is 3 (numbering proceeds along each row):
-
1 2 3
4 5 6
Inside your inner loop (over thresholds), add commands to
- activate an appropriate subplot in a 1×4 array of axes
- display the (binary) thresholded image
After your inner loop, use print to save the figure's
array of (binary) thresholded images as a PNG file.
You should now have a total of 15 = 5 (magnitude) + 5 (orientation)
+ 5 (edge) images.
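The nested loops of Part E might be sketched as follows; the kernel, gradient, and display commands come from Parts A-D, while the file names and the abbreviated saving of the color orientation image are illustrative assumptions:

```matlab
variances = [1 2 4 16 32];
numers = [2 4 8 12];              % Threshold numerators (over 256)
for v = variances
  gauss  = gkern(v);              % Smoothing kernel at this scale
  dgauss = gkern(v, 1);           % Derivative kernel at this scale
  dx  = conv2(gauss, dgauss, img, 'valid');
  dy  = conv2(dgauss, gauss, img, 'valid');
  mag = sqrt(dx.^2 + dy.^2);

  % Save the rescaled magnitude; the color orientation image (Part D)
  % would be built and saved here as well
  imwrite(mag / max(mag(:)), ['mag' num2str(v) '.png']);

  figure;
  for t = 1:length(numers)
    subplot(1, 4, t);             % Activate the t-th of 4 axes
    imshow(mag > numers(t)/256);  % Binary thresholded image
  end
  print('-dpng', ['edges' num2str(v) '.png']);
end
```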
F. Analysis
There should not be any more Matlab work for you to do. All that remains
is some analysis of your results.
-
How do the magnitude images change
as the scale increases?
-
How do the orientation images change
as the scale increases?
-
How do the detection images change with
scale and threshold? Note: these are not independent; consider them
together. What happens as each changes? (For example, consider the
four extreme combinations of scale and threshold.)
Extra Credit: Thinning Edges
-
Szeliski suggests that the resulting
edges can be thinned by finding the zero-crossing of the Laplacian,
which is the sum of second derivatives in the x and y directions.
Explain in your own words why this strategy works.
-
Recalling that you can get an arbitrary Gaussian
derivative with gkern,
-
filter = gkern(scale, deriv)
calculate the Laplacian of the image at some appropriate scale.
-
Find the zero crossings of the image Laplacian
by generalizing the following 1-D zero-crossing strategy:
-
Aneg = A<=0;                          % Non-positive elements
Apos = A>0;                           % Positive elements
Aneg2pos = [0 Aneg(1:end-1)] & Apos;  % (Right-shifted non-positive) & positive
Apos2neg = Aneg & [0 Apos(1:end-1)];  % Non-positive & (right-shifted positive)
Azcross = Aneg2pos | Apos2neg;        % Crossing in either direction
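One way to generalize this to 2-D is to apply the same shift-and-compare test along both the rows and columns of the Laplacian, then combine the results; this is a sketch under the assumption that L holds the image Laplacian (variable names are illustrative):

```matlab
Lneg = L <= 0;                    % Non-positive elements of the Laplacian
Lpos = L > 0;                     % Positive elements
zr = false(1, size(L,2));         % Row of padding for vertical shifts
vert = ([zr; Lneg(1:end-1,:)] & Lpos) | (Lneg & [zr; Lpos(1:end-1,:)]);
zc = false(size(L,1), 1);         % Column of padding for horizontal shifts
horz = ([zc, Lneg(:,1:end-1)] & Lpos) | (Lneg & [zc, Lpos(:,1:end-1)]);
Lzcross = vert | horz;            % Zero crossing along either axis
```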
Acknowledgments
The bug image was captured by Jerod Weinman (with much regret,
in his garden) and is Copyright 2007, licensed under a Creative
Commons Attribution-Noncommercial-Share Alike 3.0 United States License.
Copyright © 2010, 2012 Jerod
Weinman.
This work is licensed under a Creative
Commons Attribution-Noncommercial-Share Alike 3.0 United States License.