I'm reading in a file and sending the data (once encrypted) to a dictionary, along with a hash of the data before and after encryption. I then pickle the dictionary, but the pickled file is massive compared to the source file. If I write the encrypted data straight to a file, the size is identical to the source. Any idea why my pickled file is so large?

import hashlib
import pickle

from Crypto.Cipher import AES

    # Methods of the poster's class (class statement not shown)

    # Encrypt data and get hashes
    def encryptAndExportFile(self, key, inFile, outFile):
        # Read the source file as a list of binary lines
        with open(inFile, "rb") as openInFile:
            inFileData = openInFile.readlines()

        # Initialise cipher
        cipher = AES.new(key, AES.MODE_CFB)

        # Initialise MD5
        m = hashlib.md5()  # hash of the plain data
        h = hashlib.md5()  # hash of the encrypted data

        encryptedData = []

        for data in inFileData:
            m.update(data)
            encData = cipher.encrypt(data)
            h.update(encData)
            encryptedData.append(encData)

        hashResult = m.digest()
        encHashResult = h.digest()

        return hashResult, encryptedData, encHashResult

    def storeEncryptedObject(self, obj, path):
        # The with block closes the file even if pickling fails
        with open(path, 'wb') as outFile:
            pickle.dump(obj, outFile)

All 2 Replies

Switching to pickle protocol 2 greatly reduced the file size; the pickled file is now only about 5% larger than the source in some cases.

    def storeEncryptedObject(self, obj, path):
        with open(path, 'wb') as outFile:
            pickle.dump(obj, outFile, protocol=2)

Thanks for letting the rest of us know!
Most folks pass protocol=-1 (the same as pickle.HIGHEST_PROTOCOL), which selects the highest protocol available.
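For anyone who wants to check this on their own data, here is a rough sketch that pickles the same list of byte strings with every protocol and prints the sizes. The original code looks like Python 2, where the default is the ASCII protocol 0 and non-printable bytes get written as escape sequences, which is most likely where the bloat came from. The sketch below is written for Python 3, which already defaults to a binary protocol, so the exact numbers will differ from what the original poster saw. The `chunks` list and the `pickled_sizes` helper are made up for illustration; they are not from the code above.

```python
import pickle


def pickled_sizes(obj):
    """Return {protocol: pickled size in bytes} for every supported protocol."""
    return {p: len(pickle.dumps(obj, protocol=p))
            for p in range(pickle.HIGHEST_PROTOCOL + 1)}


# Hypothetical stand-in for the encrypted chunks: 64 distinct 1 KiB byte
# strings that cover every byte value, including non-printable ones.
chunks = [bytes((i + j) % 256 for j in range(1024)) for i in range(64)]
raw_size = sum(len(c) for c in chunks)

sizes = pickled_sizes(chunks)
for protocol, size in sorted(sizes.items()):
    print("protocol %d: %d bytes (%.2fx raw)" % (protocol, size, size / raw_size))

# protocol=-1 selects pickle.HIGHEST_PROTOCOL
best = pickle.dumps(chunks, protocol=-1)
assert pickle.loads(best) == chunks
```

With the highest protocol the overhead should be only a few bytes per chunk, so the pickled size stays close to the raw size. Keep in mind that a file written with a newer protocol can't be read by an older Python that doesn't know that protocol.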
