
Save Images by Checksums

Hello everyone. I have a problem with my checksum and logic. I wrote a program that downloads and reads bitmaps and tries to generate a checksum for each one. The checksum is used as the file name the bitmap is saved under, so that I don't save the same bitmap every time. The problem is that the RGB values of the images change ever so slightly every time (unnoticeable to my eyes). Because of this, I decided to use the alpha values to generate unique values in order to not save the same images over and over. The thing is, with this method, two images whose alpha values add up to the same total will have the same checksum, and that's bad for me.

My checksum is as follows:

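#include <stddef.h>   // size_t
#include <stdint.h>   // uint32_t
#include <windows.h>  // DWORD

// Sums the alpha byte of every pixel. For 24-bit bitmaps there is no alpha
// byte, so the result is always 0.
DWORD CheckSum(const void* Data, size_t Width, size_t Height, uint32_t BitsPerPixel)
{
    DWORD CheckSum = 0;

    const unsigned char* BuffPos = static_cast<const unsigned char*>(Data);
    // Height is a size_t (unsigned), so a "Height < 0" test can never be true.
    // A negative (top-down) bitmap height must be made positive by the caller.

    // Note: BuffPos still starts at the first row, so this visits the first
    // Height - 12 rows of the buffer rather than skipping 12 rows.
    for (size_t I = 12; I < Height; I++)  //Has to start at 12.
    {
        for (size_t J = 0; J < Width; J++)
        {
            BuffPos += 3;  // skip the three colour bytes
            CheckSum += (BitsPerPixel > 24 ? *(BuffPos++) : 0);  // add alpha if present
        }
        if (BitsPerPixel == 24)
            BuffPos += Width % 4;  // skip the row padding of a 24-bit bitmap
    }

    return CheckSum;
}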

Is there any way I can still generate unique checksums but at the same time keep them stable per image, so that I don't end up downloading and saving the same image more than once? I was thinking that I'd have to find the position of each alpha pixel in the image and add those, but I cannot figure out how to find their positions.
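One way to bring the pixel position into the sum could look like the sketch below. It is only an illustration, not code from the original program: `PositionalAlphaSum` is a hypothetical name, and it assumes tightly packed 32-bit pixels with the alpha byte as the fourth byte of each pixel. Each alpha value is multiplied by its pixel index plus one, so the same alpha values in a different order no longer add up to the same number. A weighted sum like this can still collide (an alpha of 2 at pixel 0 gives the same result as an alpha of 1 at pixel 1), so it reduces the problem rather than removing it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of the original program.
// Assumes tightly packed 32-bit pixels with alpha as the 4th byte of each pixel.
std::uint32_t PositionalAlphaSum(const void* data, std::size_t width,
                                 std::size_t height, std::uint32_t bitsPerPixel)
{
    assert(bitsPerPixel == 32);  // only 32-bit pixels carry an alpha byte here

    const unsigned char* p = static_cast<const unsigned char*>(data);
    std::uint32_t sum = 0;

    for (std::size_t i = 0; i < width * height; ++i)
    {
        const std::uint32_t alpha = p[i * 4 + 3];
        // Weight the alpha value by (pixel index + 1) so that position matters.
        // Unsigned arithmetic wraps modulo 2^32 on overflow.
        sum += alpha * static_cast<std::uint32_t>(i + 1);
    }
    return sum;
}
```

For two 2x1 images whose alpha bytes are {10, 20} and {20, 10}, a plain sum gives 30 for both, while this sketch gives 50 and 40.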

triumphost

I don't understand. The checksum lets you know if the pix are identical.
1) Identical pix have the same checksum.
2) Different pix have different checksums.
3) Different alpha values produce different pix, therefore different checksums.
4) Same alpha values on different pix do not produce same checksum (pix are different)

So I don't understand the problem at all. Unless it is that visually identical pictures have different bitmaps because they are in fact different (larger, darker, watermarked, etc). You can't use a checksum to detect identical pictures that are internally different.

WaltP

WaltP's remark 2) isn't true. Because a very simple checksum formula is used (just add up the values) different pictures with the same values in it will produce the same checksum. Just like 3+8 = 8+3 (addition is commutative).
If you want guaranteed different checksums when the pictures are different, use a cryptographic hash function, e.g. SHA-1 (160 bits) will do for most purposes.
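The difference between an additive checksum and an order-sensitive hash can be sketched as follows. The C++ standard library has no SHA-1, so this sketch swaps in 32-bit FNV-1a, which is a simple non-cryptographic hash and only a stand-in to show the order sensitivity. The function names `AdditiveSum` and `Fnv1a32` are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Plain additive checksum: byte order does not affect the result.
std::uint32_t AdditiveSum(const unsigned char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < size; ++i)
        sum += data[i];
    return sum;
}

// 32-bit FNV-1a: each byte is XORed into the state, then the state is
// multiplied by the FNV prime, so the result depends on byte order.
// This is NOT a cryptographic hash; it stands in for one in this sketch.
std::uint32_t Fnv1a32(const unsigned char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    std::uint32_t hash = 2166136261u;  // FNV-1a 32-bit offset basis
    for (std::size_t i = 0; i < size; ++i)
    {
        hash ^= data[i];
        hash *= 16777619u;             // FNV 32-bit prime
    }
    return hash;
}
```

With the byte sequences {3, 8} and {8, 3}, `AdditiveSum` returns 11 for both, while `Fnv1a32` returns two different values. For real deduplication, a cryptographic hash from an established library would be the safer choice.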

Kwetal

"WaltP's remark 2) isn't true."

Yes it is.

"Because a very simple checksum formula is used (just add up the values) different pictures with the same values in it will produce the same checksum."

If different pictures have all the same values, it's the same picture. Can you provide any bitmaps that are different pictures but have the same values inside? I'd really like to see this. Now, if you're complaining about the fact that addition is commutative, so what?

There is an inherent problem with checksums to begin with. There are only so many checksums available based on storage size of the checksum.

Let's take a simple example. You have a checksum that's one byte. There are at most 256 different checksum values. So, if you have 257 pictures, at least 2 of them will have the same checksum.

When you use a 32-bit word, 4,294,967,296 different checksums are possible. A collision is less likely for a small sample of pix (but still possible). But with 4,294,967,297 pix at least 2 will have the same checksum.

The above is true for any simple checksum and any cryptographic hash function.
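The one-byte case above can be checked with a small sketch. `ByteChecksum` and `FirstCollisionIndex` are hypothetical names made up for this illustration, and the XOR-fold used as the checksum is just one arbitrary choice. Any function that returns a single byte would behave the same way once it is fed 257 distinct inputs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical one-byte checksum: XOR of the four bytes of a 32-bit value.
std::uint8_t ByteChecksum(std::uint32_t value)
{
    return static_cast<std::uint8_t>(
        (value ^ (value >> 8) ^ (value >> 16) ^ (value >> 24)) & 0xFFu);
}

// Returns the index of the first value whose checksum was already produced
// by an earlier value, or count if every checksum was distinct.
std::size_t FirstCollisionIndex(const std::uint32_t* values, std::size_t count)
{
    assert(values != nullptr || count == 0);
    std::array<bool, 256> seen{};  // one slot per possible one-byte checksum
    for (std::size_t i = 0; i < count; ++i)
    {
        const std::uint8_t c = ByteChecksum(values[i]);
        if (seen[c])
            return i;  // all 256 slots can be filled at most once
        seen[c] = true;
    }
    return count;
}
```

Feeding it the values 0 through 255 finds no collision, because each value is its own checksum. Adding the value 256 as a 257th input produces checksum 1, which was already taken by the value 1.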

WaltP
