sun, 13-mar-2016, 08:27

# Introduction

There are now 777 photos in my photolog, organized in reverse chronological order (or chronologically if you append /asc/ to the URL). With that much data, it occurred to me that there ought to be a way to organize these photos by color, similar to the way some people organize their books. I didn’t find a way of doing that, unfortunately, but I did spend some time experimenting with image similarity analysis using color.

The basic idea is to generate histograms (counts of the pixels in the image that fall into pre-defined bins) for the red, green, and blue channels of each image. Once we have these values for each image, we use the chi-square distance between them as a distance metric that measures color similarity between photos.
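
As a toy illustration of the idea (synthetic numbers, not data from the photolog), here's what binning a single channel and comparing normalized histograms looks like:

```
import numpy as np

# three tiny "images", each a flat array of single-channel pixel values
img_a = np.array([10, 20, 200, 210, 220])   # some dark and some bright pixels
img_b = np.array([15, 25, 195, 205, 215])   # similar distribution to img_a
img_c = np.array([5, 10, 15, 20, 25])       # all dark

def hist4(pixels):
    """ four-bin histogram over 0-255, normalized to sum to 1 """
    counts, _ = np.histogram(pixels, bins=4, range=(0, 256))
    return counts / float(counts.sum())

def chi2_distance(a, b, eps=1e-10):
    """ chi-square distance; eps avoids division by zero in empty bins """
    return 0.5 * np.sum((a - b) ** 2 / (a + b + eps))

# images with similar color distributions have a smaller distance
d_ab = chi2_distance(hist4(img_a), hist4(img_b))
d_ac = chi2_distance(hist4(img_a), hist4(img_c))
assert d_ab < d_ac
```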

# Code

I followed the tutorial *Building your first image search engine in Python*, which uses code like this to generate 3D RGB histograms (all the code from this post is on GitHub):

```
import os

import cv2


def get_histogram(image, bins):
    """ calculate a 3d RGB histogram from an image """
    if os.path.exists(image):
        # read the image and compute a joint histogram
        # across the three color channels
        imgarray = cv2.imread(image)
        hist = cv2.calcHist([imgarray], [0, 1, 2], None,
                            [bins, bins, bins],
                            [0, 256, 0, 256, 0, 256])
        hist = cv2.normalize(hist, hist)

        return hist.flatten()
    else:
        return None
```

Once you have the histograms, you need to calculate all the pairwise distances using a function like this:

```
import numpy as np


def chi2_distance(a, b, eps=1e-10):
    """ chi-square distance between two histograms (a, b) """
    d = 0.5 * np.sum([((x - y) ** 2) / (x + y + eps)
                      for (x, y) in zip(a, b)])

    return d
```
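
Putting the two together, the pairwise loop itself is short. Here's a sketch of that step (the dictionary of histograms is made up for illustration; the real script builds them from the image files):

```
import itertools

import numpy as np

def chi2_distance(a, b, eps=1e-10):
    """ chi-square distance between two histograms (a, b) """
    return 0.5 * np.sum((a - b) ** 2 / (a + b + eps))

# hypothetical flattened histograms, keyed by filename
histograms = {
    'creek_1.jpg': np.array([0.5, 0.5, 0.0]),
    'creek_2.jpg': np.array([0.4, 0.6, 0.0]),
    'aurora.jpg': np.array([0.0, 0.0, 1.0]),
}

# distance for every unordered pair of photos
distances = {
    pair: chi2_distance(histograms[pair[0]], histograms[pair[1]])
    for pair in itertools.combinations(sorted(histograms), 2)
}
closest_pair = min(distances, key=distances.get)
```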

Getting histogram data using OpenCV in Python is pretty fast: even with 32 bins, it only took about 45 seconds for all 777 images. Computing the distances between histograms was a lot slower, depending on how the code was written.

With 8 bin histograms, a Python script using the function listed above took just under 15 minutes to calculate all the pairwise comparisons (see the rgb_histogram.py script).

Since the photos are all in a database so they can be displayed on the Internet, I figured a SQL function to calculate the distances would make the most sense: the OpenCV Python code could generate a histogram and add it to the database when each photo was inserted, and a SQL function could compute the distances.
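
The schema details below are illustrative (my actual table and column names may differ), but the idea is a single array column holding each photo's flattened histogram:

```
-- hypothetical schema: names are illustrative
ALTER TABLE photos ADD COLUMN histogram numeric[];

-- the Python insert script stores each photo's flattened
-- histogram in that column when the photo is added
```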

Here’s the function:

```
CREATE OR REPLACE FUNCTION chi_square_distance(a numeric[], b numeric[])
RETURNS numeric AS $_$
DECLARE
    sum numeric := 0.0;
    i integer;
BEGIN
    FOR i IN 1 .. array_upper(a, 1)
    LOOP
        IF a[i] + b[i] > 0 THEN
            sum = sum + (a[i] - b[i])^2 / (a[i] + b[i]);
        END IF;
    END LOOP;

    RETURN sum / 2.0;
END;
$_$
LANGUAGE plpgsql;
```

Unfortunately, this is incredibly slow. Instead of the 15 minutes the Python script took, it took just under two hours to compute the pairwise distances on the 8 bin histograms.

When your interpreted code is slow, the solution is often to re-write it in a compiled language. I found some C code on Stack Overflow showing how to write PostgreSQL array functions. The PostgreSQL interface isn’t exactly intuitive, but here’s the gist of the code (full code):

```
#include <postgres.h>
#include <fmgr.h>
#include <utils/array.h>
#include <utils/lsyscache.h>

/* From intarray contrib header */
#define ARRPTR(x) ( (float8 *) ARR_DATA_PTR(x) )

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(chi_square_distance);
Datum chi_square_distance(PG_FUNCTION_ARGS);

Datum chi_square_distance(PG_FUNCTION_ARGS) {
    ArrayType *a, *b;
    float8 *da, *db;

    float8 sum = 0.0;
    int i, n;

    /* fetch the two array arguments and the number of elements */
    a = PG_GETARG_ARRAYTYPE_P(0);
    b = PG_GETARG_ARRAYTYPE_P(1);
    n = ArrayGetNItems(ARR_NDIM(a), ARR_DIMS(a));

    da = ARRPTR(a);
    db = ARRPTR(b);

    /* Generate the sums; skipping equal values also avoids
       dividing by zero when both bins are empty. */
    for (i = 0; i < n; i++) {
        if (*da - *db) {
            sum = sum + ((*da - *db) * (*da - *db) / (*da + *db));
        }
        da++;
        db++;
    }

    sum = sum / 2.0;

    PG_RETURN_FLOAT8(sum);
}
```
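
To make this callable from SQL, the C file has to be compiled into a shared library (PGXS handles the build) and the function declared, replacing the pl/pgsql version. Something like the following, where the library name is illustrative; note that this version operates on float8[] arrays rather than numeric[]:

```
CREATE OR REPLACE FUNCTION chi_square_distance(float8[], float8[])
RETURNS float8
AS 'chi_square_distance', 'chi_square_distance'
LANGUAGE C STRICT IMMUTABLE;
```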

This takes 79 seconds to do all the distance calculations on 8 bin histograms. That kind of improvement is well worth the effort.

# Results

After all that, the results aren’t as good as I was hoping. For some photos, such as those I took while re-raising the bridge across the creek, sorting by histogram distance does actually identify other photos taken of the same process. For example, these two photos are the closest to each other by 32 bin histogram distance:

But there are certain images, such as the middle image in the three below, that are very close to many of the photos in the database even though they’re really not all that similar. I think this is because images with a lot of black (or white) in them wind up being similar to each other because of the large areas without color. Performing the same sort of analysis in the HSV color space, but restricting the histogram to regions with high saturation and high value, might yield results that make more sense.
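
Here's a numpy-only sketch of that idea (the function name and thresholds are hypothetical, and a real version would use OpenCV's HSV conversion): mask out pixels with low saturation or value before histogramming the hue.

```
import numpy as np

def hue_histogram(hsv_pixels, bins=8, s_min=0.25, v_min=0.25):
    """ histogram of hue, using only pixels with saturation and
        value above the thresholds; hsv_pixels is an (N, 3) array
        with H, S and V scaled to [0, 1] """
    mask = (hsv_pixels[:, 1] >= s_min) & (hsv_pixels[:, 2] >= v_min)
    counts, _ = np.histogram(hsv_pixels[mask, 0], bins=bins,
                             range=(0.0, 1.0))
    total = counts.sum()
    return counts / float(total) if total else counts.astype(float)

# a bright red pixel, a bright cyan pixel, and a near-black pixel
pixels = np.array([[0.01, 0.9, 0.9],
                   [0.50, 0.9, 0.9],
                   [0.60, 0.9, 0.05]])
hist = hue_histogram(pixels, bins=2)
# the near-black pixel is excluded from the histogram
```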

sun, 18-mar-2012, 14:58

A week ago I set up a Tumblr site with the idea I’d post photos on there, hopefully at least once a day. Sort of a photo-based microblog. After a week of posting stuff I realized it would be really easy to set something like this up on my own site. It works like this:

• Email myself a photo where the subject line has a keyword in it that procmail recognizes, sending the email off to a Python script.
• The Python script processes the email: it extracts the photo name and caption from the text portion of the message, rotates and resizes the photo, and saves it in a place accessible from my web server. The photo metadata (size, when the photo was taken, caption, and path) is stored in a database.
• A Django app generates the web page from the data in the database.
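
The procmail piece of the first step is a short recipe that matches the keyword in the subject line and pipes the message to the script; something along these lines (the keyword and script path are illustrative):

```
:0
* ^Subject:.*photolog
| $HOME/bin/photolog_email.py
```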

There are a few tricky bits here. First is handling the rotation of the photos. At least with my phone, the image data is always stored in landscape format, but there’s an EXIF tag that indicates how the data should be rotated for display. So I read that tag and rotate appropriately, using the Python Imaging Library (PIL):

```
import StringIO
import Image
import ExifTags

orientation_key = 274

# the orientation tag lives in the EXIF data attached to the image
exif = image_data._getexif()
if exif and orientation_key in exif:
    orientation = exif[orientation_key]
    if orientation == 3:
        image_data = image_data.rotate(180, expand=True)
    elif orientation == 6:
        image_data = image_data.rotate(270, expand=True)
    elif orientation == 8:
        image_data = image_data.rotate(90, expand=True)
```

For simplicity, I hard coded the orientation_key above, but it’s probably better to get the value from the ExifTags library. That can be done using this list comprehension:

```
orientation_key = [key for key, value in
                   ExifTags.TAGS.iteritems()
                   if value == 'Orientation'][0]
```

Resizing the image is relatively easy:

```
(x, y) = image_data.size
if x > y:
    if x > 600:
        image_600 = image_data.resize(
            (600, int(round(600 / float(x) * y))),
            Image.ANTIALIAS
        )
    else:
        image_600 = image_data
else:
    if y > 600:
        image_600 = image_data.resize(
            (int(round(600 / float(y) * x)), 600),
            Image.ANTIALIAS
        )
    else:
        image_600 = image_data
```
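
The two branches differ only in which dimension is the longer one, so the size calculation can also be written once as a helper (the function name is mine, not from the actual script):

```
def scaled_size(x, y, limit=600):
    """ scale (x, y) so the longer side is at most limit,
        preserving the aspect ratio """
    longest = max(x, y)
    if longest <= limit:
        return (x, y)
    scale = limit / float(longest)
    return (int(round(x * scale)), int(round(y * scale)))
```

Then `image_600 = image_data.resize(scaled_size(*image_data.size), Image.ANTIALIAS)` handles both orientations.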

And the whole thing is wrapped up in the code that parses the pieces of the email message:

```
import sys
import email

msg = email.message_from_file(sys.stdin)
body = []
for part in msg.walk():
    if part.get_content_maintype() == 'multipart':
        continue
    content_type = part.get_content_type()
    if content_type == "image/jpeg":
        image_data = Image.open(StringIO.StringIO(
            part.get_payload(decode=True)))
    elif content_type == "text/plain":
        charset = get_charset(part, get_charset(msg))
        text = unicode(part.get_payload(decode=True), charset, "replace")
        body.append(text)

body = u"\n".join(body).strip()
```

The get_charset function is:

```
def get_charset(message, default="ascii"):
    """Get the message charset"""

    if message.get_content_charset():
        return message.get_content_charset()

    if message.get_charset():
        return message.get_charset()

    return default
```

Once these pieces are wrapped together, called via procmail, and integrated into Django, it looks like this: photolog. There’s also a link to it in the upper right sidebar of this blog if you’re viewing this page in a desktop web browser.
