CS180 Project 1

🛠

Vivek Bharati

Goal

The goal of this assignment is to take the digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically produce a color image with as few visual artifacts as possible. A cool explanation of how the Library of Congress created the color images on their site is available here.

Image Preprocessing

For preprocessing, I divided each input image into three equal-height segments, one per color channel (R, G, B). For each channel, I then cropped 10 percent of the channel height from both the top and the bottom. The cropping reduces the effect of the black borders, which could otherwise throw off the vertical alignment (i.e. drive the similarity score down unnecessarily).
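The preprocessing step above can be sketched roughly as follows. This is a minimal illustration, not the exact project code: the function name `split_and_crop` and the `crop_frac` parameter are hypothetical, and the top-to-bottom plate order (blue, green, red) is an assumption about the input scans.

```python
import numpy as np

def split_and_crop(plate, crop_frac=0.10):
    """Split a stacked glass-plate scan into three channels and crop each one.

    Assumption: the three exposures are stacked top to bottom as
    blue, green, red. Swap the unpacking below if the input differs.
    """
    h = plate.shape[0] // 3  # height of one channel; leftover rows are dropped
    channels = [plate[i * h:(i + 1) * h] for i in range(3)]
    margin = int(h * crop_frac)  # rows removed from the top AND from the bottom
    b, g, r = (c[margin:h - margin] for c in channels)
    return r, g, b
```

Only the rows are cropped here, matching the description above; cropping columns as well would be a separate design choice.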

Naïve Solution

For my naïve solution with the lower resolution (.jpg) inputs, I used SSIM (skimage.metrics.structural_similarity) to score candidate alignments. To generate the candidates, I searched over a window of [-30, 30] x [-30, 30] pixels. In other words, I used np.roll to shift the red and green channels horizontally and vertically by up to 30 pixels in either direction. For each of the red and green channels, I selected the shift that maximized the SSIM index with the blue channel. My results, along with the corresponding red/green shifts, are as follows:

monastery.jpg, R: (2, 3), G: (2, -3)

tobolsk.jpg, R: (3, 6), G: (3, 3)

cathedral.jpg, R: (3, 12), G: (2, 5)
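A rough sketch of the exhaustive search described above is shown below. To keep it dependent on NumPy only, it scores alignments with normalized cross-correlation instead of SSIM; that is a swapped-in metric, not the one used for the results above. The names `ncc` and `align_exhaustive` are hypothetical, and the returned tuple is (row shift, column shift), which may not match the ordering used in the list of results.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation (a stand-in for SSIM in this sketch)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def align_exhaustive(channel, reference, radius=30, score=ncc):
    """Try every shift in [-radius, radius]^2 and keep the best-scoring one."""
    best_shift, best_score = (0, 0), -np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps pixels around the edges (circular shift)
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            s = score(shifted, reference)
            if s > best_score:
                best_score, best_shift = s, (dy, dx)
    return best_shift  # (row shift, column shift)
```

Any function that takes two equally sized arrays and returns a higher-is-better score can be passed as `score`, so a thin wrapper around skimage.metrics.structural_similarity would slot in the same way.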

Pyramid

For my image pyramid technique, I started at the coarsest level, with the channels scaled down by a factor of 0.125, and searched a window of [-30, 30] x [-30, 30] pixels (similar to the naïve approach). For each further level of the pyramid, I doubled the scale factor and searched over a window of only [-2, 2] x [-2, 2] pixels, adjusting the shift for each channel at each level of recursion. I continued to use SSIM as my alignment metric. For the bells & whistles addition, I also compared the scaled-down channels after applying sk.filters.sobel, so that the alignment is based on the edges present in each channel. My results, along with the corresponding red/green shifts, are as follows:

monastery.jpg, R: (0, -4), G: (2, 2)

tobolsk.jpg, R: (2, 2), G: (4, 6)

cathedral.jpg, R: (2, 4), G: (4, 12)

church.tif, R: (-4, 58), G: (4, 26)

emir.tif, R: (40, 106), G: (22, 50)

harvesters.tif, R: (14, 122), G: (16, 60)

icon.tif, R: (22, 90), G: (16, 42)

lady.tif, R: (12, 120), G: (8, 56)

onion_church.tif, R: (36, 108), G: (24, 52)

sculpture.tif, R: (-26, 140), G: (-10, 34)

self_portrait.tif, R: (36, 176), G: (28, 78)

three_generations.tif, R: (8, 106), G: (12, 50)

melons.tif, R: (12, 178), G: (10, 80)
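The coarse-to-fine search from the Pyramid section can be sketched roughly as below. Several things here are assumptions made to keep the example NumPy-only: block averaging stands in for whatever rescaling routine was actually used, normalized cross-correlation stands in for SSIM, the Sobel edge step is left out, and all function names are hypothetical. Shifts are (row, column) tuples.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation (a stand-in for SSIM in this sketch)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def downscale(img, factor):
    """Shrink by an integer factor using block averaging."""
    h = img.shape[0] // factor * factor
    w = img.shape[1] // factor * factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def search(channel, reference, center, radius):
    """Best shift within `radius` pixels of `center`."""
    cy, cx = center
    best_shift, best_score = center, -np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            s = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if s > best_score:
                best_score, best_shift = s, (dy, dx)
    return best_shift

def align_pyramid(channel, reference, factors=(8, 4, 2, 1),
                  coarse_radius=30, fine_radius=2):
    """Wide search at the coarsest level, then small refinements."""
    shift, prev = (0, 0), None
    for f in factors:
        if prev is None:
            radius = coarse_radius
        else:
            # Resolution grew by prev // f, so the shift estimate scales too.
            ratio = prev // f
            shift = (shift[0] * ratio, shift[1] * ratio)
            radius = fine_radius
        shift = search(downscale(channel, f), downscale(reference, f),
                       shift, radius)
        prev = f
    return shift
```

A factor of 8 corresponds to the 0.125 scale mentioned above. Because a 30-pixel window at that scale covers roughly 240 pixels at full resolution, the finer levels only need the small 2-pixel refinement, which is what keeps the large .tif inputs tractable.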