This project explores gradient-domain processing, a simple technique
with a broad set of applications including blending, tone-mapping, and
non-photorealistic rendering. For the core project, we will focus on
"Poisson blending"; tone-mapping and NPR can be investigated as bells
and whistles.
The primary goal of this assignment is to seamlessly blend an object or
texture from a source image into a target image. The simplest method
would be to just copy and paste the pixels from one image directly into
the other. Unfortunately, this will create very noticeable seams, even
if the backgrounds are well-matched. How can we get rid of these seams
without doing too much perceptual damage to the source region?
The insight is that people often care much more about the gradient of an
image than the overall intensity. So we can set up the problem as
finding values for the target pixels that maximally preserve the
gradient of the source region without changing any of the background
pixels. Note that we are making a deliberate decision here to ignore
the overall intensity! So a green hat could turn red, but it will still
look like a hat.
We can formulate our objective as a least squares problem. Given the
pixel intensities of the source image "s" and of the target image "t",
we want to solve for new intensity values "v" within the source region
"S":
$$v = \arg\min_v \sum_{i \in S,\, j \in N_i \cap S} \big((v_i - v_j) - (s_i - s_j)\big)^2 + \sum_{i \in S,\, j \in N_i \cap \neg S} \big((v_i - t_j) - (s_i - s_j)\big)^2$$
Here, each "i" is a pixel in the source region "S", and each "j" is a
4-neighbor of "i" (N_i denotes the set of 4-neighbors of "i"). Each
summation guides the gradient values to match those of the source
region. In the first summation, the gradient is over two variable
pixels; in the second, one pixel is variable and one is in the fixed
target region.
The method presented above is called "Poisson blending". Check out the Perez et al. 2003 paper
to see sample results, or to wallow in extraneous math. This is just
one example of a more general set of gradient-domain processing
techniques. The general idea is to create an image by solving for
specified pixel intensities and gradients.
The implementation for gradient domain processing is not complicated,
but it is easy to make a mistake, so let's start with a toy example.
Reconstruct this image from its gradient values, plus one pixel
intensity. Denote the intensity of the source image at (x, y) as s(x,y)
and the value to solve for as v(x,y). For each pixel, then, we have
two objectives:
1. minimize (v(x+1,y)-v(x,y) - (s(x+1,y)-s(x,y)))^2
2. minimize (v(x,y+1)-v(x,y) - (s(x,y+1)-s(x,y)))^2
Note that these objectives are unchanged if any constant value is added to v, so the solution is not unique. To pin it down, we add one more objective:
3. minimize (v(1,1)-s(1,1))^2
For 20 points, solve this in Matlab as a least squares problem. If your
solution is correct, then you should recover the original image.
Implementation Details
The first step is to write the objective function as a set of least
squares constraints in the standard matrix form: ||Av-b||^2. Here, "A" is
a sparse matrix, "v" is the vector of variables to be solved for, and "b"
is a known vector. It is helpful to keep a matrix "im2var" that maps each
pixel to a variable number, such as:
[imh, imw, nb] = size(im);       % image height, width, number of channels
im2var = zeros(imh, imw);
im2var(1:imh*imw) = 1:imh*imw;   % variable number of pixel (y,x), column-major
Then, you can write objective 1 above as:
e=e+1;
A(e, im2var(y,x+1))=1;
A(e, im2var(y,x))=-1;
b(e) = s(y,x+1)-s(y,x);
Here, "e" is used as an equation counter. Note that the y-coordinate is
the first index in Matlab convention. As another example, objective 3
above can be written as:
e=e+1;
A(e, im2var(1,1))=1;
b(e)=s(1,1);
To solve for v, use
v = A\b; or
v = lscov(A, b);
Then, copy each solved value to the appropriate pixel in the output image.
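To see how the pieces above fit together, here is a minimal sketch that assembles all three objectives and solves them. It is only an illustration under stated assumptions: the 4x4 test image built with magic(4) and the names "neq", "v_img", and "max_err" are made up for this example and are not part of the starter code.

```matlab
% Minimal sketch: reconstruct a tiny grayscale image from its gradients
% plus one pixel intensity. The test image is hypothetical.
s = magic(4) / 16;
[imh, imw] = size(s);
im2var = zeros(imh, imw);
im2var(1:imh*imw) = 1:imh*imw;          % pixel (y,x) -> variable number

% One x-gradient and one y-gradient equation per interior pair, plus one
% equation that fixes the intensity of pixel (1,1).
neq = imh*(imw-1) + (imh-1)*imw + 1;
A = sparse(neq, imh*imw);               % all-zero sparse matrix
b = zeros(neq, 1);
e = 0;                                  % equation counter

for y = 1:imh
    for x = 1:imw
        if x < imw                      % objective 1: gradient in x
            e = e + 1;
            A(e, im2var(y,x+1)) = 1;
            A(e, im2var(y,x)) = -1;
            b(e) = s(y,x+1) - s(y,x);
        end
        if y < imh                      % objective 2: gradient in y
            e = e + 1;
            A(e, im2var(y+1,x)) = 1;
            A(e, im2var(y,x)) = -1;
            b(e) = s(y+1,x) - s(y,x);
        end
    end
end

e = e + 1;                              % objective 3: fix one intensity
A(e, im2var(1,1)) = 1;
b(e) = s(1,1);

v = A \ b;                              % least squares solution
v_img = reshape(v, imh, imw);           % variable numbers are column-major
max_err = max(abs(v_img(:) - s(:)));    % should be close to zero
```

Assigning into a sparse matrix one entry at a time is fine for a tiny image like this one; for larger images it may be slow, so preallocating storage is worth considering.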
Step 1: Select source and target regions. Select the boundaries of a
region in the source image and specify a location in the target image
where it should be blended. Then, transform (e.g., translate) the
source image so that indices of pixels in the source and target regions
correspond. I've provided starter code (getMask.m, alignSource.m) to
help with this. You may want to augment the code to allow rotation or
resizing into the target region. You can be a bit sloppy about
selecting the source region -- just make sure that the entire object is
contained. Ideally, the background of the object in the source region
and the surrounding area of the target region will be of similar color.
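Step 2: Solve the blending constraints:
$$v = \arg\min_v \sum_{i \in S,\, j \in N_i \cap S} \big((v_i - v_j) - (s_i - s_j)\big)^2 + \sum_{i \in S,\, j \in N_i \cap \neg S} \big((v_i - t_j) - (s_i - s_j)\big)^2$$
where N_i is the set of 4-neighbors of pixel "i".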
Step 3: Copy the solved values into your target image. For RGB images,
process each channel separately. Show at least three results of Poisson
blending. Explain any failure cases (e.g., weird colors, blurred
boundaries, etc.).
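Steps 2 and 3 might be sketched for a single channel as follows. This is a hedged illustration, not the reference solution: the 6x6 images "s" and "t", the logical "mask", and the variable names are all invented for the example, and it assumes the source is already aligned with the target and that the mask does not touch the image border.

```matlab
% Minimal single-channel sketch of Steps 2-3 on hypothetical inputs.
t = 0.2 * ones(6, 6);                   % target: flat dark background
s = 0.7 * ones(6, 6);                   % source: brighter background
s(3:4, 3:4) = 0.9;                      % source "object"
mask = false(6, 6);
mask(2:5, 2:5) = true;                  % source region S

[imh, imw] = size(t);
nvar = nnz(mask);
im2var = zeros(imh, imw);
im2var(mask) = 1:nvar;                  % only pixels in S are variables

[ys, xs] = find(mask);                  % same column-major order as above
dy = [0 0 1 -1];
dx = [1 -1 0 0];                        % offsets of the 4 neighbors

neq = 4 * nvar;                         % one equation per (i, j) pair
A = sparse(neq, nvar);
b = zeros(neq, 1);
e = 0;

for p = 1:nvar
    y = ys(p); x = xs(p);
    for n = 1:4
        ny = y + dy(n); nx = x + dx(n);
        e = e + 1;
        A(e, im2var(y,x)) = 1;
        b(e) = s(y,x) - s(ny,nx);       % source gradient s_i - s_j
        if mask(ny,nx)
            A(e, im2var(ny,nx)) = -1;   % neighbor is also a variable
        else
            b(e) = b(e) + t(ny,nx);     % neighbor is a fixed target pixel
        end
    end
end

v = A \ b;
out = t;
out(mask) = v;                          % Step 3: copy solved values
```

In this toy case the blended region keeps the 0.2 step between object and background from the source, but its overall intensity drops to match the target, which is the intended trade-off. For an RGB image the same system matrix can be reused, with "b" rebuilt for each channel.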
Tips
1. Initialize your sparse matrix with
sparse([], [], [], M, N, nzmax)
for a matrix with M equations and N variables and at most nzmax non-zero entries.
2. Before trying new examples, try something that you know should work,
such as the included penguins on top of the snow in the hiking image.
3. Object region selection can be done very crudely, with lots of room around the object.
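As a small illustration of tip 1, the sketch below preallocates the sparse matrix for a gradient-only system. The image size and the estimate of two non-zero entries per equation are assumptions made for this example.

```matlab
% Minimal sketch of tip 1 with hypothetical sizes.
imh = 4; imw = 5;                       % hypothetical image size
M = imh*(imw-1) + (imh-1)*imw + 1;      % gradient equations plus one anchor
N = imh * imw;                          % one variable per pixel
nz_est = 2 * M;                         % at most two non-zeros per equation
A = sparse([], [], [], M, N, nz_est);   % empty sparse matrix with room reserved

A(1, 2) = 1;                            % entries are then filled in as usual
A(1, 1) = -1;
```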
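Follow the same steps as Poisson blending, but use the gradient in
source or target with the larger magnitude as the guide, rather than the
source gradient:
$$v = \arg\min_v \sum_{i \in S,\, j \in N_i \cap S} \big((v_i - v_j) - d_{ij}\big)^2 + \sum_{i \in S,\, j \in N_i \cap \neg S} \big((v_i - t_j) - d_{ij}\big)^2$$
Here "d_ij" is the value of the gradient from the source or the target
image with larger magnitude: if |s_i - s_j| > |t_i - t_j|, then
d_ij = s_i - s_j; otherwise d_ij = t_i - t_j. Show at least one result of
blending using mixed gradients. One possibility is to blend a picture of
writing on a plain background onto another image.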
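The only change from Poisson blending is how the guide value on the right-hand side is chosen. A minimal sketch of that selection for horizontal neighbor pairs is shown below; the small arrays "s" and "t" and the names "ds", "dt", "use_s", and "d" are made up for illustration.

```matlab
% Minimal sketch: pick the larger-magnitude gradient for each pair of
% horizontally adjacent pixels. The inputs are hypothetical.
s = [0.5 0.5 0.9; 0.5 0.5 0.5];         % source patch
t = [0.1 0.6 0.6; 0.1 0.1 0.1];         % target patch, same size

ds = s(:, 1:end-1) - s(:, 2:end);       % source gradients s_i - s_j
dt = t(:, 1:end-1) - t(:, 2:end);       % target gradients t_i - t_j

use_s = abs(ds) > abs(dt);              % where the source gradient is larger
d = dt;
d(use_s) = ds(use_s);                   % d_ij for each horizontal pair
```

In the blending loop, "d" would replace the source gradient when filling in "b"; vertical pairs are handled the same way with the row and column roles swapped.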
To turn in your assignment, please email a summary web page and any supporting media (code, readme, etc.) to sbu590@gmail.com. If the images are too large to send over email, you can include smaller versions with your webpage and link to larger images you've hosted elsewhere online.
Use both words and images to show us what you've done. Please:
The core assignment is worth 100 points, as follows:
Special thanks to Derek Hoiem for allowing me to use this assignment.