Colorizing the Prokudin-Gorskii Photo Collection

Color matching technique.

Linji (Joey) Wang

Last updated on May 9, 2023 5 min read Project

With CMEAS

Table of Contents

Colorizing the Prokudin-Gorskii Photo Collection

Overview

This project requires you to use image processing techniques to create a color image from the digitized Prokudin-Gorskii glass plate photographs, with the goal of producing an image with as few visual artifacts as possible. A digital picture of a glass plate, with its three channels arranged from top to bottom as BGR (see Figure 1), serves as the process’s input. Our task is to isolate the three channels and properly align them.

Figure 1. Digitized Glass Plate: Cathedral

Matching Metric

Image matching means to manipulate image to ensure the best results for your similarity metrics. Thus, it is important to choose an efficient and accurate image matching metrics. For this assignment, I explored Sum of Squared Differences (SSD) and Normalized Cross Correlation (NCC) functions. The objective is to maximize or minimize one of these functions by searching and manipulating the images.

Sum of Squared Differences SSD is calculated based on the following equation:

where I and H are function of two images and x,y are pixels positions.

Figure 2. Cathedral: SSD, Shift:[0, -25],[7, -25] time cost: 2.13s

Normalized Cross Correlation NNC is calculated based on the following equation:

where \mu I and\mu H are the luminance of each image.

Figure 3. Cathedral: NCC, Shift:[0, -25],[7, -25] time cost: 9.59s

Efficiency

In the above implementation, we construct two for loops to search for the best position in a small window. Comparing and searching image in a small image might work; however, it will take minutes to process a 3500 by 3500 image. We need additional algorithms to pursue efficiency:

Image Pyramid Search This is an efficient search algorithm that generally yields good results in less than 60 seconds for high resolution images—up to 9000 pixels. As mentioned above, this algorithm scales down the image by scale_factor times, being scale_factor an automated parameter based on the input image resolution. Given an initial window search, the algorithm searches from coarse to fine images, within the reduced window size, which decreases by a hyperparameter of window_size as the images become larger.

Let’s first see how long will it take if we process a 3000 by 3000 image without pyramid search.

Figure 4. Train: SSD w/o Pyramid, time cost: 1857s

Figure 5. Train: NCC w/o Pyramid, time cost: 4486s

Let’s apply the pyramid search to see how it affect the searching.

Figure 6. Train: SSD with Pyramid, time cost: 19.03s

Figure 7. Train: NCC with Pyramid, time cost: 39.41s

Wow, I believe that is a huge performance increase. But the image quality seems not good enough. We will introduce a few of methods in the following part.

The following are the results with the Pyramid method.

Bells & Whistles

Auto Border Crop

The borders seem very annoying: 1. They don’t contribute to the final results. 2. They can even mess our matching metric.

Let’s try to build a simple border corp algo to handle it.

1. Define what is a border. Border usually is a long vertical or horizontal distinction between two parts. We can use this attribute as a start.
1. Detect the distinctions. Since the distinction is a huge difference between two parts, they can surely be detected by the edge detector. We utilize Canny Edge Detector for this problem, and define the longgest horizontal/vertical edge as the border. If the edge counts more than 90% of the height or width, we define it as the border and use the ratio as the crop ratio.

Let’s see some images after we implementing the auto border crop.

Figure 18. Train: SSD with Pyramid, CROP, time cost: 34.05s

Figure 19. Train: NCC with Pyramid, CROP, time cost: 44.67s

The border got cropped before and after the image matching to ensure a clean image. However, the channels are still not matched very well.

Edge Detector

Maybe pixel intensity is not enough for the matching, other features like edges are also important. This time, we use Canny edges for SSD and NCC for image matching.

Figure 20. Train: SSD with Pyramid, CROP, Canny, time cost: 18.30s

Figure 21. Train: NCC with Pyramid, CROP, Canny, time cost: 43.35s

Extra Credit

Evolution Method: Covariance matrix adaptation evolution strategy (CMA-ES) The Exhaustive Search can only work for small images and small window size, due to its time complexity of O(windowsize)^2. I propose a novel optimization approach to search the best alignment with CMA-ES. This black-box optimization algorithm consists on treating the search space as a multivariate Gaussian, with varying mean and covariance matrix. The algorithm starts with a population P, initial x and y movement, and initial covariance. We select elites by keeping a fraction of the population of 10% at each iteration, and noise epsilon is added to covariance at each step. We let the algorithm run for several iterations based on image size and set an initial Gaussian variance corroesponding to the windowsize. We optimize using the following equations:

Let’s see this cool GIF to have an idea how the evolution method find the optimal values.

Figure 22. CMEAS parameter update process

Figure 23. CMEAS image update process

Figure 24. Train: SSD with Pyramid, Canny, CMEAS, CROP, time cost: 12.89s

Figure 25. Train: NCC with Pyramid, Canny, CMEAS, CROP, time cost: 25.77s

The CMEAS Evolution method brings 35% performance increase with more clear results.

Final Results

![cathedral](./data/cathedral.jpg_square_limit10_CROPTrue_CannyTrue_PyramidTrue_CMEASTrue_methodNCC_shift[[5, 2],[12, 1]]_time_cost 1.9317255020141602.jpg)

Figure 26. cathedral: shift[[5, 2],[12, 1]]_time_cost 1.93s