Seam Carving

Josh Hug
UC Berkeley
hug@cs.berkeley.edu
Special thanks to: Kevin Wayne and Maia Ginsburg (Princeton)

Assignment materials:

Overview

Given an image, it is sometimes desirable to scale that image in only one dimension, for example, making it narrower without changing the height. As an example, consider the 505 pixel wide x 287 pixel tall image below. Suppose that we'd like to resize it to 357 x 287 pixels.

The easiest approaches are to rescale the image or crop the image. However rescaling ruins the aspect ratio, making everything look squashed, whereas cropping removes image content (like the poor guys on the right).

Scaled Cropped

By contrast, seam carving (published in 2007 by Shai Avidan and Ariel Shamir) resizes images in a content aware manner, resulting in much better results for some types of images. Unimportant pixels (like the ocean separating the two people in the middle) are thrown away, while interesting pixels like the people and the island on the left are preserved.

Original Seam Carved

Despite its power and modernity, the seam carving algorithm is elegant and easy to describe. We start by converting an image to a 2D "energy matrix" (see spec for details) using a very simple function that gives high energy values to important pixels, and low energy to unimportant ones. For example, the energy of the original image on this page is shown below to the left. An exaggerated version of the energies is shown below and to the right.

Energy of Original Image Energy (Exaggerated)

We then find the path from the top to the bottom such that the total energy of the path is minimized. This can be done either using a simple dynamic programming approach or by casting it as a shortest paths problem on a directed graph:


Minimum Energy Vertical Seam

We then "shrink" the image by creating a copy of the original image, but excluding all of the seam pixels (highlighted in image above). Since the seam is exactly one pixel per row, the resulting image is one pixel narrower. To shrink the image further, we repeat the entire process as many times as we'd like, reducing the width by one each time.

Assignment Tasks

The students will perform three tasks, listed below:
  1. Energy Calculation. Given an image, calculate the energy matrix: A 2D array (double[][] in Java) representing the "importance" or "energy" of each pixel.
  2. Seam Identification. Given an energy matrix, find the lowest energy contiguous path from top to bottom (or left to right). Lends itself naturally to dynamic programming, though can be solved using a shortest paths algorithm (e.g. Dijkstra's algorithm). This is the meat of the assignment.
  3. Seam Removal. Given an image and a seam, generate a new image without the pixels in that seam.

Metadata

Summary Students resize images using a state-of-the-art but easy to understand algorithm that feels like magic.
Topics
2D and 3D arrays, dynamic programming or shortest paths in a weighted directed graph, image processing
Audience
Late CS1, CS2
Difficulty
Easy to moderate. Length of assignment can be reduced by providing energy calculation and/or seam removal code.
Strengths

  • Modern: Students implement a state of the art image processing algorithm that was invented in 2007.
  • Real Data: Students can use the algorithm to modify any image they'd like, including pictures of themselves, friends, house, pets, teacher, etc. The results on human faces can be... memorable.
  • Non-contrived 2D Arrays Puzzle: Students must iterate over 2D arrays in a practical, but non-trivial manner.
  • Tricky Code-Reuse Problem: Shrinking images horizontally vs. vertically should involve lots of code-reuse, but it is somewhat tricky to find the right way to avoid rewriting the same methods twice for the vertical v. horizontal case.
Weaknesses

  • Boring Graph Structure: If utilized as a shortest paths problem, the digraph is highly regular and thus the dynamic programming using 2D arrays is more natural.
  • Hard to Test: It is hard for students to know that their algorithm is working properly except on trivially small images. Debugging can be difficult.
Library Dependencies
Requires a library which can read in images as 3D arrays, and can display (or at least save) 3D arrays as images.
Variants

Students can try to find a way to expand images instead of shrinking them. They can do this either on their own (for stronger students), or by looking at the original SIGGRAPH paper or video, which provide key ideas.

The original SIGGRAPH publications (and subsequent followups) provides other ideas, including targeted removal of objects. By selecting a region of the image and forcing its energy to zero, objects can be removed during the SeamCarving process.

Default implementation involves recalculation of entire image energies everytime a seam is removed, resulting in quadratic performance. Using fancier data structures, it is possible to achieve a linear time seam removal.