To preface this post, I'll say that what I'm looking to do is purely in the theoretical stages at the moment, so I don't have any code to share yet.

At its most basic, what I'm attempting to do is compare Image A with Image B to determine whether they are the same image.

The catch comes in the fact that Image B may be at a different scale or orientation from Image A (could be higher resolution or landscape vs portrait).

I've already determined that I am going to be breaking down the initial image(s) into base patterns (skeletons -> line ends) and comparing the patterns from there (the skeletons/line ends are going to be derived from pixel groupings and variants to try to give 'base' shape characteristics from the original). What I'm stuck on at that point is what the best method would be to determine 'similarity' between the 2 images.

In a perfect world the 2 images would be similar enough that a straight comparison would be possible (for example, in the image I'm attaching, square 1 would compare directly to another square of the same size and orientation). In the not-so-perfect world there would be scale or orientation differences (square 1 to square 2, 3 or 4). In either of these worlds, a straight comparison between points, with scale/orientation adjustments as needed, would yield a true/false on whether the images match.

However, we don't live in a perfect world and there are other factors to take into consideration, such as differences in cropping and the like. What I'm trying to figure out is this: once I've developed a base 'pattern' from an image (kind of like grabbing key elements from a fingerprint in CSI), how do I match that pattern to another image that may not be cropped/rotated/sized the same as the original?

Again, I'm just talking 'in theory' here but as I said, the part I get stuck on is where the 2 images don't share the same dimensions/positioning and so a straight array comparison wouldn't fit the need.

Any thoughts? Sorry for the primitiveness of the attached images; I threw them together in Photoshop to try to illustrate better what I'm talking about.

Edit: Ruh Roh... I just realized I entered AD's domain *runs and hides* just kidding AD :twisted:

All 10 Replies

Well, I did a bit of image processing, so I know the basic theory. I can't explain it all but I can give some keywords and methods for you to look up.

First, the problem of resolution, cropping and geometric disturbance (rotation, translation, scaling, etc.) is pretty trivial to solve really, at least once you know the transformation between the two images. What you need to do is re-sample the image with the new resolution and a geometry correction. The re-sampling is done with a bi-linear interpolation (or better) of the surrounding pixels. Say you double the resolution: for each 2x2 square of pixels you need to make a 4x4 square, so for each of the new pixels you figure out where the four closest original pixels are (the four corners around it) and do a bi-linear interpolation to get the color (or gray-scale value) of the new pixel. The geometric correction is just a mapping from the position of a pixel in the original image to the position of that pixel in the new image. Usually the mapping is: constant terms = translation; linear terms (x_new = A*x_old) = scaling; linear cross terms (like x_new = B*y_old) = rotation and shear, combined with scaling; and it goes on with higher-order terms for things like correcting a fish-eye effect.
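To make the re-sampling step a bit more concrete, here is a minimal sketch in plain Python. It assumes a gray-scale image stored as a list of rows, and the function names (bilinear_sample, resample, affine_inverse) are made up for this illustration; they are not from OpenCV or any other library. The mapping is applied backwards (from each new pixel to its position in the original), which is one common way to avoid holes in the output.

```python
import math

def bilinear_sample(img, x, y):
    """Sample a gray-scale image (list of rows) at fractional (x, y).
    Coordinates outside the image are clamped to the border."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Blend the two top corners, the two bottom corners, then top with bottom.
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def affine_inverse(scale, angle_deg, tx, ty):
    """Inverse of the forward mapping new = scale * R(angle) * old + (tx, ty).
    Returns a function taking a new-image (x, y) to an original-image (x, y)."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    def inverse(x, y):
        dx, dy = x - tx, y - ty            # undo the translation
        return ((c * dx + s * dy) / scale,  # undo the rotation and the scaling
                (-s * dx + c * dy) / scale)
    return inverse

def resample(img, out_w, out_h, inverse_map):
    """Build a new out_w x out_h image by looking up each new pixel in the original."""
    return [[bilinear_sample(img, *inverse_map(x, y)) for x in range(out_w)]
            for y in range(out_h)]

# Doubling the resolution of a 2x2 image: scale 2, no rotation, no translation.
small = [[0, 10],
         [20, 30]]
big = resample(small, 4, 4, affine_inverse(2.0, 0.0, 0.0, 0.0))
```

In a real comparison the scale, angle and translation would have to be estimated first; this sketch only shows what to do once they are known.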

Then for the comparison, people normally use cross-correlation to compare two images. The basic idea is pretty simple. You multiply the pixel values (color or gray-scale) at the same position in the two images together and add them all up. If the two images are the same, you get a really big number, the largest you could get from any image with the same overall energy (sum of squared pixel values). So to compare two images, you compare how large the number is for IMAGE1 * IMAGE1 and for IMAGE1 * IMAGE2; if they are close, it means IMAGE2 is almost the same as IMAGE1. Of course, it's more complicated than that in reality (the raw sum also grows when an image is simply brighter, which is why a normalized version is normally used), so look up cross-correlation as a starting point.
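As a rough illustration of the normalized form of that idea, here is a small sketch in plain Python. It assumes two gray-scale images of the same size stored as lists of rows, and the function name ncc is just a label chosen for this example. Subtracting the mean and dividing by the magnitudes puts the score in the range -1 to 1, so a uniform change in brightness or contrast does not affect it.

```python
import math

def ncc(a, b):
    """Zero-mean normalized cross-correlation of two same-sized gray-scale images.
    Returns a score in [-1, 1]; 1 means identical up to brightness/contrast."""
    fa = [p for row in a for p in row]
    fb = [p for row in b for p in row]
    if len(fa) != len(fb):
        raise ValueError("images must have the same number of pixels")
    mean_a = sum(fa) / len(fa)
    mean_b = sum(fb) / len(fb)
    da = [p - mean_a for p in fa]
    db = [p - mean_b for p in fb]
    # Sum of products of matching pixels, after removing each image's mean.
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    # A perfectly flat image has no structure to correlate with; report 0.
    return num / den if den else 0.0

image1 = [[10, 20],
          [30, 40]]
brighter = [[2 * p + 5 for p in row] for row in image1]  # same picture, more light
score_same = ncc(image1, image1)
score_brighter = ncc(image1, brighter)
```

Both scores come out at 1 (up to rounding), whereas the raw sum of products would be much larger for the brighter copy.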

Finally, the feature detection that you talked about is essentially the same thing on a smaller scale. You take what they call a window, which just means a small square of pixels (like 3x3 or 8x8 or whatever) that contains the feature you are looking for (like an edge or a bright spot or whatever), and you move it all over the image while performing the cross-correlation and checking how good the result is. You record every point where the result was good enough (of course things are always fuzzy with image processing; you will never find a method that works 100% of the time and outputs a clean true or false. As you said, we live in the real world).
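The sliding-window search described above could be sketched like this in plain Python. Again this assumes gray-scale images stored as lists of rows; the names window_score and find_matches and the 0.9 threshold are assumptions picked for the example, not anything standard. It is a brute-force version meant to show the idea, not a fast one.

```python
import math

def window_score(image, template, top, left):
    """Normalized cross-correlation between the template and the image patch
    whose top-left corner is at row `top`, column `left`."""
    th, tw = len(template), len(template[0])
    patch = [image[top + r][left + c] for r in range(th) for c in range(tw)]
    tmpl = [v for row in template for v in row]
    mean_p = sum(patch) / len(patch)
    mean_t = sum(tmpl) / len(tmpl)
    num = sum((p - mean_p) * (t - mean_t) for p, t in zip(patch, tmpl))
    den = math.sqrt(sum((p - mean_p) ** 2 for p in patch) *
                    sum((t - mean_t) ** 2 for t in tmpl))
    return num / den if den else 0.0

def find_matches(image, template, threshold=0.9):
    """Slide the template over every position where it fits entirely inside
    the image and record (top, left, score) wherever the score is good enough."""
    th, tw = len(template), len(template[0])
    hits = []
    for top in range(len(image) - th + 1):
        for left in range(len(image[0]) - tw + 1):
            score = window_score(image, template, top, left)
            if score >= threshold:
                hits.append((top, left, score))
    return hits

# A tiny diagonal feature hidden in an otherwise empty 5x5 image.
template = [[0, 9],
            [9, 0]]
image = [[0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 9, 0, 0],
         [0, 9, 0, 0, 0],
         [0, 0, 0, 0, 0]]
hits = find_matches(image, template)
```

Here the only position that clears the threshold is row 2, column 1, where the feature actually sits. With real images the threshold needs tuning, and a window like this only matches features at the same scale and orientation as the template.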

One final thought: I understand you want to do this yourself, but maybe you want to take a look at OpenCV (Open Source Computer Vision Library). It has all the methods you need for your problem and much, much more.

Thanks mike,

I'll definitely look into OpenCV as well. I was actually thinking (again, this is all theory; I haven't started any solid planning) of working with Magick++ to deconstruct the images into base components I could use for comparison and going from there, but perhaps an easier method exists in OpenCV :)

That VTK page looks like it might have some useful resources... I'll have to look into it in more detail, thanks again :)

Ok, so after some testing it would appear that ImageMagick would not be a good kernel to use, particularly on full-colour images.

At least, not in the way I had hoped to use it... the initial step of reducing an image to skeleton form doesn't work correctly on non-greyscale images and takes FOREVER on colour images.

Guess I'll have to look into the other suggested methods in more detail.

In the end I'm trying to find a way to break down an image into a recognizable 'pattern' form that would be unique to each image, so if anyone else has any suggestions I'd be happy to look into those as well. The pattern needs to be visually representative of some aspect of the image though, as patterns derived from the 'whole' would rule out comparing differently cropped images with the same content.

Before I mark this solved as a matter of housekeeping, does anyone else have any suggestions to add?

My only suggestion is to not get frustrated - this is a very difficult problem and it is definitely not "solved". People still devise clever tricks tailored to their particular data sets, so you will certainly not be able to find any "one size fits all" algorithm to do something like this.

Good luck!

so you will certainly not be able to find any "one size fits all" algorithm to do something like this.

Oh I know I won't find a 'one size fits all' but I'm still collecting 'ideas' at this stage is all :)

As for not getting 'solved', I'm only 'solving' the thread as a matter of housekeeping (probably by tomorrow) so people who did toss in their helpful advice at least get credit for it :twisted:

Haha what I meant was not "mark as solved" on DaniWeb, but a "solved problem" in the research field (i.e. everyone agrees no further work is necessary).

Do you have any idea how to compare images using Qt?

I'm no good at programming, but I'm trying to create something that compares a query image to the images in a database. How do I start this?

I was planning to use MS Access to store the images, but do you have any other suggestions?
