Hello all, I'm a Jr. in college and I haven't studied machine learning yet to any extent. I would like to at least know how to specify this project I have in mind though.

Basically I'd like to have a number of input images to "train" the software with, and then use the software to recognize new images that contain a rectangular region similar to the inputs. I've heard a little about this type of thing involving artificial neural networks, but I would like to know what this type of task is called and whether there is a suitable C++ library that may help with its development.

A quick Google search returned these results:

Any information you may have would be helpful, I find this to be a very interesting problem to solve and I just wish I knew more about it.

You might want to take a look at quad-trees.


Well my first question is:
It says Q-trees "decompose space into adaptable cells". How are the cells "adaptable", and what does that mean?

It seems that structure's usage in "Image representation" (based on the example) is to recursively classify rectangular regions of the image based on a color (grey for mixed, black for solid black, etc.), which is simple for the example because it is black and white. I suppose one could apply a "weight" to that consisting of a floating point or integral value.
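To make the recursive classification concrete, here is a minimal sketch of a quadtree built over a greyscale image. It also shows one reading of "adaptable": a uniform region stays as a single large cell, while a mixed region is split into four smaller ones. The names `GrayImage`, `QuadNode`, `buildQuadTree` and `countLeaves` are hypothetical and not taken from any library, and the sketch assumes a square image whose side is a power of two.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical types for illustration only.
struct GrayImage {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels;  // row-major, 0 = black, 255 = white

    std::uint8_t at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

struct QuadNode {
    int x = 0, y = 0, size = 0;  // square region covered by this node
    double mean = 0.0;           // average intensity of the region
    bool leaf = true;            // true when the region was not subdivided
    std::unique_ptr<QuadNode> child[4];
};

// Recursively split a square region until it is nearly uniform
// (max - min <= tolerance) or it reaches minSize pixels per side.
// Assumes size is a power of two so that halves cover the region exactly.
std::unique_ptr<QuadNode> buildQuadTree(const GrayImage& img, int x, int y,
                                        int size, int tolerance, int minSize) {
    assert(size > 0 && x >= 0 && y >= 0);
    assert(x + size <= img.width && y + size <= img.height);

    int lo = 255, hi = 0;
    long long sum = 0;
    for (int r = y; r < y + size; ++r) {
        for (int c = x; c < x + size; ++c) {
            int v = img.at(c, r);
            lo = std::min(lo, v);
            hi = std::max(hi, v);
            sum += v;
        }
    }

    auto node = std::make_unique<QuadNode>();
    node->x = x;
    node->y = y;
    node->size = size;
    node->mean = static_cast<double>(sum) / (static_cast<double>(size) * size);
    node->leaf = (hi - lo <= tolerance) || (size <= minSize);

    if (!node->leaf) {
        int half = size / 2;
        node->child[0] = buildQuadTree(img, x,        y,        half, tolerance, minSize);
        node->child[1] = buildQuadTree(img, x + half, y,        half, tolerance, minSize);
        node->child[2] = buildQuadTree(img, x,        y + half, half, tolerance, minSize);
        node->child[3] = buildQuadTree(img, x + half, y + half, half, tolerance, minSize);
    }
    return node;
}

// Count the cells the image was finally decomposed into.
int countLeaves(const QuadNode& node) {
    if (node.leaf) return 1;
    int total = 0;
    for (const auto& c : node.child) total += countLeaves(*c);
    return total;
}
```

With `tolerance = 0` and `minSize = 1`, an all-white 4x4 image stays a single cell, while the same image with one black corner pixel decomposes into 7 cells: three untouched 2x2 quadrants plus four 1x1 cells in the quadrant that contains the black pixel.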

Originally I did intend my training data to be greyscale (what luck!), so could the rectangular region's average distance from the color white be used as its "weight"?

Now with this simple neural network that we're building here (gotta love Daniweb), could we average the values of many sets of input data to produce mean values to use in recognition?
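As a rough sketch of the "weight" and averaging ideas: the first function below turns a region's pixels into a single weight measuring its average distance from white, and the second averages those per-region weights across several training images. The function names and the 0.0 (white) to 1.0 (black) convention are assumptions made for illustration. Note that this computes a mean; a median would need the values sorted per region instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Weight of one region: average distance from white, scaled so that
// 0.0 = pure white and 1.0 = pure black (an assumed convention).
double regionWeight(const std::vector<std::uint8_t>& pixels) {
    assert(!pixels.empty());
    double sum = 0.0;
    for (std::uint8_t p : pixels) sum += 255.0 - p;
    return sum / (255.0 * pixels.size());
}

// Element-wise mean of per-region weights across training images.
// perImage[i][k] is the weight of region k in training image i; every
// image must have been decomposed into the same number of regions.
std::vector<double> meanWeights(const std::vector<std::vector<double>>& perImage) {
    assert(!perImage.empty());
    std::vector<double> mean(perImage.front().size(), 0.0);
    for (const auto& weights : perImage) {
        assert(weights.size() == mean.size());
        for (std::size_t k = 0; k < weights.size(); ++k) mean[k] += weights[k];
    }
    for (double& m : mean) m /= static_cast<double>(perImage.size());
    return mean;
}
```

The averaged vector could then serve as the "template" that regions of a test image are compared against.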

I think this could be used to solve my problem excellently, specifically when the section of image to be matched varies in size.

But then once we have these images decomposed and represented internally by the computer, how do we do proper matching? I mean is it going to work properly if I make a match between Depth 5 of my training input image with depth 1005 of my test image? Obviously this won't work properly when it comes down to a group of 4 pixels versus a rather large section of the test image.

To put it differently, how can I determine if I want to match a region of my test image that is 1 inch across with a region of my training input data that is 5 inches across?

Wouldn't a weight value be required to account for the distance between the depths being matched (depths in the data structure)? I have a bad feeling that if this isn't incorporated in some way, false positives will be generated.
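One possible way to express that depth weighting, purely as a sketch: score two cells by how close their weights are, then scale the score down as the gap between their depths grows. The formula and the names `cellSide` and `matchScore` are assumptions made for illustration, not an established method, and the weights are assumed to lie in [0, 1].

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Side length, in pixels, of a quadtree cell at the given depth
// (depth 0 is the whole image; each level halves the side).
double cellSide(int imageSide, int depth) {
    assert(imageSide > 0 && depth >= 0);
    return imageSide / std::pow(2.0, depth);
}

// Similarity of two cells, discounted by how far apart their depths are.
// Returns 1.0 for identical weights at the same depth and shrinks toward
// 0.0 as either the weights or the depths diverge.
double matchScore(double weightA, int depthA, double weightB, int depthB) {
    assert(weightA >= 0.0 && weightA <= 1.0);
    assert(weightB >= 0.0 && weightB <= 1.0);
    double similarity = 1.0 - std::fabs(weightA - weightB);
    double penalty = 1.0 / (1.0 + std::abs(depthA - depthB));
    return similarity * penalty;
}
```

Under this scheme, a match between depth 5 and depth 1005 would score close to zero regardless of how similar the weights are, which is the kind of false positive the penalty is meant to suppress. An acceptance threshold on the score would still have to be chosen by experiment.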

Another point of interest: how do I determine which value between -1.0 and 1.0 is acceptable for matching a depth?

Should I be doing this, or should I be comparing the value of each node in the structure?

As it stands, if I'm not entirely mistaken and this would actually work, it would still need, at the very least, a user to specify a maximum depth for matching. For example, a group of 128x128 pixels could be the minimum size matchable against the training input data.
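The link between a minimum matchable block size and a maximum depth is simple arithmetic, since each level of the tree halves the cell side. A small sketch, with a hypothetical helper name and assuming a square image:

```cpp
#include <cassert>

// Deepest quadtree level whose cells are still at least minBlockSide
// pixels across, for a square image of imageSide pixels.
int maxDepthFor(int imageSide, int minBlockSide) {
    assert(imageSide > 0 && minBlockSide > 0);
    int depth = 0;
    while (imageSide / 2 >= minBlockSide) {
        imageSide /= 2;
        ++depth;
    }
    return depth;
}
```

For a 1024x1024 image and a 128x128 minimum block, this gives a maximum depth of 3 (1024, 512, 256, 128).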

Now, I'm not sure whether this process would ignore matches that straddle the edges of the rectangles. I think it would, so another approach might be more suitable. Although this method has a lot of benefit in the performance department, another method might be preferable.

I'm sure there are a lot of things here that I have missed but please try to bear with me and my mild retardation.