Lab: Pyramids and Wavelets

CSC 262 - Computer Vision - Weinman



Summary:
You will use a Laplacian pyramid for image compression and explore the steerable pyramid representation.

Preparation

Load the venerable cameraman image and convert it to doubles for processing.
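
A minimal sketch of this step, assuming the image is available under the name cameraman.tif (the copy that ships with Matlab's Image Processing Toolbox) and that you want intensities scaled to the range [0, 1]. If you would rather keep the original 0-255 range, double(imread(...)) is the alternative.

```matlab
% Assumption: cameraman.tif is on the Matlab path (it ships with the
% Image Processing Toolbox).
img = im2double(imread('cameraman.tif'));  % uint8 -> double in [0, 1]
imgSize = size(img)                        % image dimensions, needed in Part A
```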

Exercises

A. Building a Pyramid

The command
[pyrValues, pyrDims] = buildLpyr(img, height);
from the pyramid toolbox builds a Laplacian pyramid of image img having height levels (including the low-pass band). The result pyrValues is a single 1-D vector containing the entire image pyramid, and pyrDims is a height×2 array giving the sizes of the images in the pyrValues vector.
  1. What should the values of the dimensions be for the cameraman image when the pyramid height is 4? (Note that you will have to know the size of the image.)
  2. How long should the pyramid vector be?
  3. Use buildLpyr to construct a 4 level Laplacian pyramid of the cameraman image and verify your answers to the previous two questions.
  4. The toolbox also features ways to extract the image corresponding to an individual level from the pyramid, calculate the indices into the pyramid vector, and display the entire pyramid. The command
    showLpyr(pyrValues, pyrDims)
    renders the pyramid on the current figure. Use this to display your Laplacian pyramid.
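
One way to check your reasoning before calling the toolbox is to predict the dimensions by hand. The sketch below does this for a hypothetical 64×64 image and a height of 3 (not the cameraman image), assuming that each level has half the rows and columns of the previous one, rounded up. The variable names are illustrative only.

```matlab
% Hypothetical example: predicted Laplacian pyramid layout for a 64x64 image.
imSize = [64 64];
height = 3;                          % number of levels, including the low-pass band

pyrDims = zeros(height, 2);
for lev = 1:height
    % Assumption: each level halves both dimensions (rounding up).
    pyrDims(lev, :) = ceil(imSize ./ 2^(lev - 1));
end

% Every level is stored end to end in one vector, so the predicted
% length is the total number of pixels over all levels.
pyrLength = sum(prod(pyrDims, 2));
```

With the toolbox loaded, you can compare these predictions against the pyrDims matrix and length(pyrValues) returned by buildLpyr.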

B. Taking a Pyramid Apart

  1. Which row in the pyramid dimensions corresponds to the low-pass band? How big is this image? How many pixels does it have?
  2. The last N entries in the pyramid vector, where N is the size (in pixels) of the low-pass band, correspond to the low-pass band itself. Use this information to create two separate vectors from your pyramid: one containing only the entries in the low-pass band, and the other containing all of the rest. The latter represents all of the high-pass bands of the various scales.
    Hint: You should probably use length and add the two vectors' dimensions to make sure you get back the length of the original pyramid values.
  3. What do you expect the histogram of the high-pass bands' values to look like?
  4. Use hist to create a histogram of your high-pass band vector. You will likely want to use more than the default number of bins.
  5. Does it have the shape you expect? Explain why or why not.
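
The splitting step can be sketched on a toy pyramid. The values below are stand-ins (not real pyramid coefficients), and the names lowBand and highBands are illustrative; the only assumption carried over from the text is that the low-pass band occupies the last entries of the vector.

```matlab
% Toy two-level pyramid: one 4x4 high-pass band followed by a 2x2 low-pass band.
pyrDims   = [4 4; 2 2];
pyrValues = (1:20)';                      % stand-in values, not real coefficients

nLow      = prod(pyrDims(end, :));        % pixels in the low-pass band
lowBand   = pyrValues(end - nLow + 1:end);
highBands = pyrValues(1:end - nLow);

% Sanity check suggested in the hint: the two pieces account for everything.
assert(length(lowBand) + length(highBands) == length(pyrValues));

% With output arguments hist returns the bin counts instead of plotting;
% call hist(highBands, nBins) with no outputs to see the figure.
nBins = 8;
[counts, centers] = hist(highBands, nBins);
```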

C. Compressing the Pyramid

In image compression, it is often useful to keep only the strongest visual responses. This typically means ignoring small changes because they are difficult to see. For the Laplacian pyramid, this means getting rid of values that are sufficiently close to zero by making them exactly zero.
  1. The first step in eliminating values is figuring out what "sufficiently close to zero" means. Thus, it is useful to order the pyramid coefficients by their magnitude (absolute value). Use abs and sort to get a sorted version of your high-pass band coefficients' magnitude.
  2. What do these values look like when sorted? Use plot to investigate.
  3. Let us discard 80% of our coefficients. That means we need to find the value that is 80% of the way through our sorted vector to use as a threshold. Find this threshold. What is it?
    Hint: The length and round commands may be helpful.
  4. Use your threshold to set to zero any high-pass band coefficient whose magnitude is less than the threshold.
    Note: Think carefully! This will include both positive and negative numbers.
    Hint: You may wish to recall some fancy indexing work you did in the first Matlab lab.
  5. Use the command whos to determine how many bytes your high-pass band vector occupies. Note that this should be 8 bytes for every double.
  6. Matlab has a built-in sparse vector representation. Rather than storing every value, it stores only the non-zero values with an index that says where they are. Use the command
    sparseVec = sparse(fullVec);
    to transform your thresholded high-pass band vector into a sparse version. (You should give it a new name rather than overwrite the old value.)
  7. Use the command whos to determine how many bytes your new sparse vector occupies. How do the two compare? As a ratio, how much space did you save? Is it as much as your method of choosing a threshold would suggest? Why or why not?
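
The steps in this part can be sketched on a small made-up coefficient vector. This is only an illustration under the assumptions stated in the comments; the variable names are not prescribed by the lab, and your real high-pass vector will be much longer.

```matlab
% Made-up high-pass coefficients (both signs), standing in for the real ones.
highBands = [0.5; -3; 0.1; 2; -0.2; 0.05; -1; 0.3; 4; -0.01];

% Sort the magnitudes and pick the value 80% of the way through.
sortedMag = sort(abs(highBands));
k         = round(0.8 * length(sortedMag));
thresh    = sortedMag(k);

% Logical indexing on the magnitude zeroes small positive AND negative values.
compressed = highBands;
compressed(abs(compressed) < thresh) = 0;

% Sparse copy under a new name; compare storage of the two versions.
sparseVec = sparse(compressed);
whos compressed sparseVec
```

Note that comparing with a strict "less than" keeps the coefficient that equals the threshold, so slightly more than 20% of the entries survive in this toy case.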

D. Reconstructing the Pyramid

Of course, it is one thing to compress an image, but another to make sure the compression preserves meaningful structure. We should probably reconstruct the image from the pyramid to see how it compares. The command
img = reconLpyr(pyrValues, pyrDims);
inverts the buildLpyr operation by reconstructing the image from the pyramid.
  1. Take your thresholded high-pass band coefficients (the full version, not the sparse version) and concatenate them with the low-pass band vector you separated in B.2. (Be sure you put the low-pass band at the end as in the original.)
  2. Use the command reconLpyr to reconstruct the image from your compressed pyramid representation. As a baseline, you should also reconstruct the image from the original, unmodified Laplacian pyramid representation.
  3. Display both images.
  4. How does the reconstructed version look? Is it reasonable? Where is the reconstruction strongest? Where does it deviate most significantly from the original? Why?
  5. Beyond our qualitative visual impressions, it is also useful to have a quantitative measure of reconstruction. One possible metric is called the root mean-square (RMS) error, so called because it is the square root of the average squared difference between a true value $x_i$ and a corresponding estimated value $y_i$:
    $$\mathrm{RMS} = \left( \frac{1}{N} \sum_{i=1}^{N} (x_i - y_i)^2 \right)^{1/2}$$
    Compute the RMS error between the original image and both reconstructed versions (the thresholded and non-thresholded).
  6. How do the RMS errors compare? How many 8-bit gray levels does the average error correspond to? Using the results from your image formation lab, how does this correspond to the average and worst-case noise of our cameras? Does this seem tolerable?
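
The RMS error is a one-line computation once both images are flattened to vectors. The sketch below uses two tiny made-up matrices in place of the original and reconstructed images; the names x, y, and rmsErr are illustrative.

```matlab
% Made-up 2x2 "images" standing in for the original (x) and a reconstruction (y).
x = [10 20; 30 40];
y = [12 18; 30 44];

% Square root of the mean squared difference over all N pixels.
d      = x(:) - y(:);
rmsErr = sqrt(mean(d .^ 2));
```

In the lab, x would be the cameraman image and y the output of reconLpyr. If your image is scaled to [0, 1] rather than 0-255, multiplying the RMS error by 255 expresses it in 8-bit gray levels.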

E. Steerable Pyramids

Recall that there are a few varieties of steerable pyramids. Some use both even and odd basis functions (so that the resulting coefficients are complex numbers), while another uses only even functions. For simplicity, we will use the latter for now, avoiding complex numbers. The command
[pyrValues, pyrDims] = buildSFpyr(img, height, order);
builds a steerable pyramid representation of height levels (excluding the low-pass band), where order is one less than the number of orientations used.
  1. What should the values of the dimensions be for the cameraman image when the pyramid height is 3? (Note that you will have to know the size of the image.)
  2. How long should the pyramid values vector be?
  3. Use buildSFpyr to build a 3 level steerable pyramid with 4 orientations from the cameraman and verify your answers to the previous two questions.
  4. The command
    showSpyr(pyrValues, pyrDims)
    renders the pyramid on the current figure. Use this to display your steerable pyramid. Note that it omits one band.
  5. You can retrieve the image for a particular band number (as ordered in the pyramid's pyrDims matrix) using the command
    bandImage = pyrBand(pyrValues, pyrDims, bandNum);
    Use this to extract the (non-oriented) high-pass band that is missing from the original display.
  6. What do you expect it to look like?
  7. Display and inspect the high-pass band image for completeness. Does it meet your expectations? Why or why not?
  8. Describe the filter responses in your steerable pyramid. What structures stand out for each orientation? Where are the fine scale responses strongest? Where are the coarse scale responses strongest?
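
As with the Laplacian pyramid, it can help to predict the band layout by hand. The sketch below does so for a hypothetical 64×64 image with height 2 and order 3 (four orientations). It assumes the ordering is: one non-oriented high-pass band at full size, then the oriented bands of each level (the first level at full size, each later level halved), then the low-pass band. Treat this layout as an assumption to verify against the pyrDims matrix that buildSFpyr actually returns.

```matlab
% Hypothetical example: predicted steerable pyramid layout for a 64x64 image.
imSize  = [64 64];
height  = 2;
order   = 3;
nOrient = order + 1;                       % number of orientations

pyrDims = imSize;                          % non-oriented high-pass band
for lev = 1:height
    levSize = ceil(imSize ./ 2^(lev - 1)); % assumed size of this level's bands
    pyrDims = [pyrDims; repmat(levSize, nOrient, 1)];
end
pyrDims = [pyrDims; ceil(imSize ./ 2^height)];   % low-pass band

pyrLength = sum(prod(pyrDims, 2));         % total number of coefficients

% Under this ordering, orientation k at level lev sits in this row of pyrDims:
lev = 2; k = 3;
bandNum = 1 + (lev - 1) * nOrient + k;
```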

Extras: Pyramid Tweaks

  1. Suppose you zeroed out the coefficients of all the horizontal and vertical oriented bands. What do you expect the image to look like? Why?
  2. You can extract the specific indices for a band using the command
    bandIndices = pyrBandIndices(pyrDims, bandNum)
    Use this to set the pyramid values for the horizontally- and vertically-oriented bands at all scales to zero.
  3. Display the resulting reconstructed image. In what ways does it meet or not meet your expectations? Explain the results you do see (i.e., how it differs from the original).
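
The zeroing step can be sketched without the toolbox by computing each band's index range directly from the dimensions matrix, which is the same bookkeeping pyrBandIndices performs for you. The pyramid below is a toy, and bandsToZero is a hypothetical list: which band numbers are the horizontal and vertical ones in your pyramid is something to determine from your own display.

```matlab
% Toy pyramid: three 4x4 bands followed by a 2x2 low-pass band, all ones.
pyrDims   = [4 4; 4 4; 4 4; 2 2];
pyrValues = ones(sum(prod(pyrDims, 2)), 1);

bandsToZero = [2 3];                       % hypothetical band numbers

% Bands are stored end to end, so each band starts where the previous ends.
bandSizes  = prod(pyrDims, 2);
bandStarts = cumsum([1; bandSizes(1:end - 1)]);

for b = bandsToZero
    idx = bandStarts(b) : bandStarts(b) + bandSizes(b) - 1;
    pyrValues(idx) = 0;                    % wipe every coefficient in band b
end
```

With the toolbox, the modified pyrValues and the unchanged pyrDims can then be passed to reconSFpyr to obtain the image for display.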

Acknowledgments

We gratefully acknowledge use of Eero Simoncelli's Steerable Pyramid tools for Matlab.
Copyright © 2010, 2012, 2015, 2019 Jerod Weinman.
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 4.0 International License.