Hello everyone. I have a problem with my checksum logic. I wrote a program that downloads and reads bitmaps and tries to generate a checksum for each one. The checksum is used as the file name when saving the bitmap, so that I don't save the same bitmap every time. The problem is that the RGB values of the images change ever so slightly every time (unnoticeable to my eyes). Because of this, I decided to use the alpha values to generate unique values in order to not save the same images over and over. The thing is, with this method, two images whose alpha values add up to the same total will have the same checksum, and that's bad for me.

My checksum is as follows:

DWORD CheckSum(const void* Data, size_t Width, size_t Height, uint32_t BitsPerPixel)
{
    DWORD CheckSum = 0;

    const unsigned char* BuffPos = static_cast<const unsigned char*>(Data);
    // Note: Height is a size_t (unsigned), so this test is always false and the line has no effect.
    Height = (Height < 0 ? -Height : Height);

    for (size_t I = 12; I < Height; I++)  //Has to start at 12.
    {
        for (size_t J = 0; J < Width; J++)
        {
            BuffPos+=3;
            CheckSum += (BitsPerPixel > 24 ? *(BuffPos++) : 0);
        }
        if (BitsPerPixel == 24)
            BuffPos += Width % 4;
    }

    return CheckSum;
}

Is there any way I can still generate unique checksums while keeping them stable per image, so that I don't end up downloading and saving the same image more than once? I was thinking that I'd have to find the position of each alpha pixel in the image and add those, but I cannot figure out how to find their positions.

Edited 4 Years Ago by triumphost

I don't understand. The checksum lets you know if the pix are identical.
1) Identical pix have the same checksum.
2) Different pix have different checksums.
3) Different alpha values produce different pix, therefore different checksums.
4) Same alpha values on different pix do not produce same checksum (pix are different)

So I don't understand the problem at all, unless it is that visually identical pictures have different bitmaps because they are in fact different (larger, darker, watermarked, etc.). You can't use a checksum to match pictures that look identical but are internally different.

WaltP's remark 2) isn't true. Because a very simple checksum formula is used (just add up the values) different pictures with the same values in it will produce the same checksum. Just like 3+8 = 8+3 (addition is commutative).
If you want guaranteed different checksums when the pictures are different, use a cryptographic hash function, e.g. SHA-1 (160-bit) will do for most purposes.
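
To make the commutativity point concrete, here is a minimal sketch comparing a plain additive alpha sum with an order-sensitive hash. It is not the original poster's code: the function names are made up for illustration, the buffer is assumed to be tightly packed 32-bit BGRA (alpha in every 4th byte, no row padding), and the "start at row 12" offset from the question is left out. FNV-1a is used only because it is short enough to show in full; it is not a cryptographic hash and can still collide.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Order-insensitive: adds up every alpha byte, the same idea as the
// CheckSum function in the question. Assumes packed 32-bit BGRA data.
std::uint32_t AdditiveAlphaSum(const std::vector<std::uint8_t>& bgra)
{
    std::uint32_t sum = 0;
    for (std::size_t i = 3; i < bgra.size(); i += 4)  // alpha is every 4th byte
        sum += bgra[i];
    return sum;
}

// Order-sensitive: 32-bit FNV-1a over the alpha bytes. Each byte is mixed
// into the running hash before the multiply, so swapping two alpha values
// changes the result in general.
std::uint32_t Fnv1aAlphaHash(const std::vector<std::uint8_t>& bgra)
{
    std::uint32_t hash = 2166136261u;                 // FNV-1a 32-bit offset basis
    for (std::size_t i = 3; i < bgra.size(); i += 4)
    {
        hash ^= bgra[i];
        hash *= 16777619u;                            // FNV-1a 32-bit prime
    }
    return hash;
}
```

For two 2-pixel images whose alpha values are {10, 200} and {200, 10}, AdditiveAlphaSum returns 210 for both, while Fnv1aAlphaHash returns different values. Because the position of each alpha byte now affects the result, there is no need to track pixel coordinates separately.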

Edited 4 Years Ago by Kwetal

"WaltP's remark 2) isn't true."

Yes it is.

"Because a very simple checksum formula is used (just add up the values) different pictures with the same values in it will produce the same checksum."

If different pictures have all the same values, it's the same picture. Can you provide any bitmaps that are different pictures but have the same values inside? I'd really like to see this. Now, if you're complaining about the fact that addition is commutative, so what?

There is an inherent problem with checksums to begin with. There are only so many checksums available based on storage size of the checksum.

Let's take a simple example. You have a checksum that's one byte. There are at most 256 different checksum values. So, if you have 257 pictures, at least 2 of them will have the same checksum.

When you use a 32-bit word, 4,294,967,296 different checksums are possible. There is less chance of a collision for a small sample of pix (though it is still possible), but with 4,294,967,297 pix at least 2 will have the same checksum.

The above is true for any simple checksum and any cryptographic hash function.
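
To put rough numbers on "less of a chance", here is a small sketch using the standard birthday approximation. It assumes checksums are spread uniformly over all 2^bits possible values, which a simple additive sum will not achieve in practice, so treat the result as a best case. The function name is made up for illustration.

```cpp
#include <cmath>

// Approximate probability that at least two of n distinct images share a
// checksum, assuming checksums are uniformly distributed over 2^bits values.
// Birthday approximation: p = 1 - exp(-n * (n - 1) / (2 * 2^bits)).
double CollisionProbability(double n, int bits)
{
    const double space = std::ldexp(1.0, bits);              // 2^bits as a double
    // -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -std::expm1(-n * (n - 1.0) / (2.0 * space));
}
```

Under that assumption, a 32-bit checksum reaches roughly a 50% chance of at least one collision at around 77,000 images, long before the 4,294,967,297 needed to force one. A 160-bit hash pushes the same estimate so low that it is negligible for any realistic image collection, although it never reaches exactly zero.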
