I wrote my own SHA implementations, but I can't get them to read files correctly. I know to use

open(file,'rb').read()

as the input into the algorithm, but

import hashlib
hashlib.sha1(open(file,'rb').read()).hexdigest()

and my code

sha_1(open(file,'rb').read())

return two different hashes.
Any idea why? I know that a normal string gives the correct output, so why would the string read from a file be any different?

You said that when you hash a string (not a file), your algorithm and the sha1 library come back with the same results? That's awesome! You're definitely close.

What length of string are we talking about here? As I recall, SHA-1 processes its input in fixed-size blocks (512 bits each). Perhaps you should write a loop that keeps making a string longer and longer and reports back when your hash differs from SHA-1's.
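
The loop suggested above could look something like the sketch below. It is written for Python 3, and `first_mismatch` and its `candidate` argument are names made up for illustration; `candidate` stands for any function that takes bytes and returns a hex digest (for example, the poster's own `sha_1`). The test data deliberately includes byte values above 127, since files often contain them and plain text strings usually don't.

```python
import hashlib

def first_mismatch(candidate, max_len=300):
    """Return the shortest input length at which `candidate` disagrees
    with hashlib.sha1, or None if every length up to max_len matches.

    `candidate` is a hypothetical stand-in for a home-made SHA-1 function
    that accepts bytes and returns a hexadecimal digest string.
    """
    for n in range(max_len + 1):
        # Cycle through all 256 byte values so non-ASCII bytes get tested too.
        data = bytes(i % 256 for i in range(n))
        if candidate(data) != hashlib.sha1(data).hexdigest():
            return n
    return None
```

If this returns a number close to a multiple of 64, the block or padding handling is a likely suspect; if it returns a length where the first byte above 127 appears, the way bytes are converted to integers may be worth a look.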

Any length of string returns the correct hash. I wrote it according to NIST's paper, which pads the input to suit SHA-1. And yet it won't run properly for files, even after I changed the program to read files.

Should I post the code? Too bad I can't just put my site's URL so that people can download it.

When you reply you should find an option to put in a link. You can also attach a file to your post so that you don't have to make a gigantic post with all of the code in it.

On my system (which is running Ubuntu Linux) your code is working as expected for small files.

From the command line I created a small file using the touch command and put a small amount of data in the file.

touch myfile
echo "what" > myfile

Then I used the sha1sum command, which is included with most Linux distributions, to find out what the hash of the file should be according to a known good application.

`kevin@kevin-laptop:~/Downloads$ sha1sum myfile`
`c4f606d775c7f99d172548df49a2109a656a7b11  myfile`

Then I opened up a terminal window and ran your code.

>>> import sha1

>>> infile = open('myfile','rb')
>>> print sha1.SHA_1(infile.read())
    c4f606d775c7f99d172548df49a2109a656a7b11

When I ran the code against a larger file, the results were not the same. This leads me to believe there might be something wrong with your implementation.
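
For checking larger files without relying on sha1sum, a reference hash can be computed with hashlib by feeding the file in chunks. This is only a sketch in Python 3; the function name `reference_sha1` and the chunk size are arbitrary choices for illustration, not part of the code under test.

```python
import hashlib

def reference_sha1(path, chunk_size=65536):
    """Hash a file with hashlib.sha1, reading it chunk by chunk.

    The result should equal hashlib.sha1(open(path, 'rb').read()).hexdigest(),
    but the whole file is never held in memory at once.
    """
    h = hashlib.sha1()
    with open(path, "rb") as f:
        # f.read() returns b"" at end of file, which stops the iteration.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Comparing this value with the home-made function's output for files of increasing size should narrow down the size at which the two start to disagree.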

Any idea what it is? I got rid of the part that would have messed up the calculation if the input size was a factor of 448 or 512. I can't find a good reason for why it's messing up.
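
For reference, here is a minimal Python 3 sketch of the padding step as the SHA-1 standard describes it: append a single 1 bit (the byte 0x80), then zero bytes until the length is 56 mod 64 bytes (448 mod 512 bits), then the original length in bits as a 64-bit big-endian integer. The name `sha1_pad` is made up for illustration; this is not the code from the attachment.

```python
import struct

def sha1_pad(message: bytes) -> bytes:
    """Pad `message` the way SHA-1 expects before splitting into 64-byte blocks."""
    bit_len = len(message) * 8
    # A single 1 bit followed by seven 0 bits.
    padded = message + b"\x80"
    # Zero bytes until the length is congruent to 56 modulo 64.
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    # Original message length in bits, 64-bit big-endian.
    padded += struct.pack(">Q", bit_len)
    return padded
```

Note the edge case: a 55-byte message fits in one 64-byte block, but a 56-byte message needs a second block because there is no room left for the 0x80 byte plus the 8-byte length.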

Unfortunately, I don't understand the hashing algorithm well enough to figure out where your code is going wrong.

Have you considered looking at the source code for some other implementation of SHA1?

I took a look at the hashlib.py file (which is where Python exposes sha1), and it looks to me like it is just calling the C implementation. Then I looked around and found the source code for a C and C++ implementation. It won't be as easy to understand, but it might help you identify whether there is an error in your code.
http://www.packetizer.com/security/sha1/
