Text Encryption/Decryption with XOR (Python) Page 2

kylealanhale 0 Newbie Poster

14 Years Ago

You would lose speed, though, due to the function calls, as we saw from the tests on Nezachem's version. And he still had to zip the results together, so the whole thing was loaded into memory, anyway. There might be a way to combine the best of both worlds, though.
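One way to read "the best of both worlds" is to XOR the file chunk by chunk, so that only one chunk is in memory at a time while the inner loop stays a plain comprehension. A minimal sketch, assuming Python 3 and bytes input; crypt_stream and read_chunks are hypothetical names, not code from this thread:

```python
def crypt_stream(chunks, key):
    """Yield each chunk XORed with the repeating key (both are bytes)."""
    if not key:
        raise ValueError("key must not be empty")
    key_length = len(key)
    offset = 0  # position in the overall stream, carried across chunks
    for chunk in chunks:
        yield bytes(b ^ key[(offset + i) % key_length]
                    for i, b in enumerate(chunk))
        offset += len(chunk)


def read_chunks(file_obj, size=64 * 1024):
    """Read a binary file object in fixed-size pieces until EOF."""
    while True:
        chunk = file_obj.read(size)
        if not chunk:
            break
        yield chunk
```

The running offset is what keeps the password aligned across chunk boundaries; without it every chunk would restart at the first password character.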

TrustyTony 888 ex-Moderator

14 Years Ago

> That ends the debate.
Well, as programmers we must realize that the debate never ends. Besides, neither approach looks pythonic enough to me. Since you already have the performance test set up, could you add the following?
def loop(text):
    def looper(t):
        while True:
            for c in t:
                yield c
    return looper(text)

def crypt(text, passwd):
    crypto = []
    for (t, p) in zip(text, loop(passwd)):
        crypto.append(chr(ord(t) ^ ord(p)))
    return ''.join(crypto)

Timing:

crypt3 took 1272 ms.
File length: 89729 

         897306 function calls in 2.529 CPU seconds

This code is a bit funny: it is my DIY version of looping (itertools already provides cycle for this):

def crypt8(text, passwd):
    def loopord(text):
        while True:
            for c in text:
                yield ord(c)
    return ''.join([chr(ord(t) ^ p) for (t, p) in zip(text, loopord(passwd))])

crypt8 took 1019 ms.
File length: 89729 

         717846 function calls in 1.993 CPU seconds

The itertools implementation did a little better still:

crypt5 took 986 ms.
File length: 89729 


         717844 function calls in 1.976 CPU seconds
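The source of crypt5 is not shown in this post. A minimal sketch of what an itertools.cycle version could look like (an assumption, not the function that was timed above; written so that it also runs on Python 3):

```python
from itertools import cycle


def crypt_cycle(text, passwd):
    """XOR each character of text with the endlessly repeated password."""
    return ''.join(chr(ord(t) ^ ord(p)) for t, p in zip(text, cycle(passwd)))
```

Because zip stops at the shorter input, the endless cycle is cut off at the end of the text, and applying the function twice with the same password gives back the original.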

Edited 14 Years Ago by TrustyTony because: correct itertools time

TrustyTony 888 ex-Moderator

14 Years Ago

I have attached the updated testing script if you want to see it.

I only got the announcement print from the first test on the terminal, so I put the messages inside the test and redirected them to sys.stderr.

import sys
import cProfile as profile
from time import clock

__all__ = ['Tester', 'do_tests']

class Tester(object):
    '''Decorator for all tested functions'''
    def __init__(self,fn):
        self.name = fn.__name__
        self.fn = fn
    def __call__(self, *args):
        t = clock()
        ret = self.fn(*args)
        tt = clock()-t  # stop the clock before printing, so the print is not timed
        print >>sys.stderr,"Testing", self.fn.__name__
        try:
            assert self.fn(ret,args[1]) == args[0]
        except AssertionError:
            print self.fn.__name__, "failed the decrypt test."
            return

        print self.fn.__name__, "took %i ms." % (tt * 1000)
        print "File length:",len(ret),"\n\n"


def do_tests(file_name, password, tests):
    '''tests must be a list with functions to test in it (list items are function type)'''
    f = open(file_name,"r")
    ftext = f.read()
    f.close()
    for test in tests:
        sys.stdout = open("{0}_{1}.txt".format(test.name,"results"),"w")
        profile.runctx("tests[{funcindex}]({filetext},'{pw}')".format(funcindex=tests.index(test),
                                                              filetext="'''{0}'''".format(ftext),
                                                              pw=password),globals(),locals())
        sys.stdout = sys.__stdout__
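
For anyone rerunning this on Python 3: time.clock was removed in Python 3.8 and print is a function there. A rough sketch of the same decorator idea using time.perf_counter; the elapsed_ms attribute is my own addition and not part of the attached script:

```python
import sys
import time


class Tester:
    """Wrap a crypt function: time one call and verify the round trip."""

    def __init__(self, fn):
        self.fn = fn
        self.name = fn.__name__
        self.elapsed_ms = None  # filled in by the most recent call

    def __call__(self, text, password):
        start = time.perf_counter()
        result = self.fn(text, password)
        # Stop the clock before any printing so I/O is not measured.
        self.elapsed_ms = (time.perf_counter() - start) * 1000.0
        if self.fn(result, password) != text:
            raise AssertionError(self.name + " failed the decrypt test.")
        print("%s took %.1f ms." % (self.name, self.elapsed_ms),
              file=sys.stderr)
        return result
```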

Edited 14 Years Ago by TrustyTony because: stderr

jcao219 18 Posting Pro in Training

14 Years Ago

> I got only announcement print from the first test to terminal

That means you are using IDLE or something, so the regular stdout isn't a console window.

TrustyTony 888 ex-Moderator

14 Years Ago

You are right: the original works if run directly, and as usual the execution is a little faster that way, though the difference is no bigger than on some other occasions.

Here would be a nice piece to actually test, today's announcement: the Assembly code module. Are you handy with SSE instructions?
(http://www.tahir007.com/?view=examples)

I know x86 assembly only very superficially (enough to know it is a mess). I knew Z80 and ARM better, though maybe those are rusty by now.

Of course, in that case it would be better to improve the AES functions (whose current state I know nothing about) or something similar.

By the way, for these xor functions the effect of psyco looks minimal.

Edited 14 Years Ago by TrustyTony because: ASM dialects

jcao219 18 Posting Pro in Training

14 Years Ago

> You are right: the original works if run directly, and as usual the execution is a little faster that way, though the difference is no bigger than on some other occasions.
> Here would be a nice piece to actually test, today's announcement: the Assembly code module. Are you handy with SSE instructions?
> (http://www.tahir007.com/?view=examples)
> Of course, in that case it would be better to improve the AES functions (whose current state I know nothing about) or something similar.
> By the way, for these xor functions the effect of psyco looks minimal.

Eh.. I barely know anything about asm.

kylealanhale 0 Newbie Poster

14 Years Ago

Just in case any future reader of this thread wants a slightly more readable version of tonyjv's winning XOR crypt function:

def crypt(text, password):
    password_length = len(password)
    password = [ord(character) for character in password]
    text = [ord(character) ^ password[index % password_length] for (index, character) in enumerate(text)]
    
    return ''.join([chr(character_code) for character_code in text])

And, for fun:

def crypt(t, p):
    l = len(p)
    p = [ord(c) for c in p]
    t = [ord(c) ^ p[i % l] for (i, c) in enumerate(t)]
    return ''.join([chr(c) for c in t])
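
For files read in binary mode on Python 3, the same function can work on bytes directly, since iterating over bytes already yields integers. A sketch under that assumption (crypt_bytes is a hypothetical name, not one of the functions timed in this thread):

```python
def crypt_bytes(data, password):
    """XOR bytes with a repeating bytes password; returns new bytes."""
    password_length = len(password)
    return bytes(byte ^ password[index % password_length]
                 for index, byte in enumerate(data))
```

Calling it as crypt_bytes(crypt_bytes(data, pw), pw) returns the original data, just like the string version.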

TrustyTony 888 ex-Moderator

14 Years Ago

With my modified version of the test function, which also prints the running time to the terminal, I reran the tests after renaming the slow function to crypt1. So this last crypt is crypt.

At first I thought your code was a little slower than crypt6 and crypt7 in my file, but after I moved your function so that it was not the first one tested, the timing changed. So it looks like this timing function is not accurate for the first function tested.
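
One common way around a slow first measurement is to run each function several times and keep the best time. A sketch using the standard timeit module (best_of_ms is a hypothetical helper, written in Python 3 syntax; it is not part of the test script used for the results in this thread):

```python
import timeit


def best_of_ms(fn, *args, repeat=5):
    """Call fn(*args) `repeat` times and return the fastest run in ms."""
    timer = timeit.Timer(lambda: fn(*args))
    return min(timer.repeat(repeat=repeat, number=1)) * 1000.0
```

Taking the minimum rather than the mean discards runs that were slowed down by start-up costs or by other processes.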

Interestingly, the cryptic version, crpt, got a worse time!

I got the following results:

Testing crypt1 took 11281 ms.
Testing crypt2 took 716 ms.
Testing crypt3 took 1195 ms.
Testing crypt4 took 1161 ms.
Testing crypt5 took 953 ms.
Testing crypt6 took 488 ms.
Testing crypt7 took 483 ms.
Testing crypt took 477 ms.
Testing crpt took 491 ms.
Testing crypt_loop took 958 ms.
Testing crypt_oneliner took 737 ms.
Enter

Good coding!

Edited 14 Years Ago by TrustyTony because: n/a

Tahir007 10 Newbie Poster

14 Years Ago

> You are right: the original works if run directly, and as usual the execution is a little faster that way, though the difference is no bigger than on some other occasions.
> Here would be a nice piece to actually test, today's announcement: the Assembly code module. Are you handy with SSE instructions?
> (http://www.tahir007.com/?view=examples)
> I know x86 assembly only very superficially (enough to know it is a mess). I knew Z80 and ARM better, though maybe those are rusty by now.
> Of course, in that case it would be better to improve the AES functions (whose current state I know nothing about) or something similar.
> By the way, for these xor functions the effect of psyco looks minimal.

Here is a trivial implementation of crypt in Tdasm. :-)
Try crypting files of different sizes to see the difference in speed. :-)
On my machine:
600 KB pdf file - ~3-4 ms Tdasm implementation
600 KB pdf file - ~240 ms your implementation

Here is source:

<pre>
from tdasm import runtime
import array
import timeit

CRYPT_ASM = """
#DATA

uint32 len_pass, addr_pass
uint32 len_text, addr_text

#CODE

xor eax, eax
xor ebx, ebx ; clear eax and ebx registers
mov ecx, dword [len_text]

loop1:
dec ecx
mov eax, ecx
mov edx, 0
div dword [len_pass]
mov eax, dword [addr_pass]
mov al, byte [eax + edx] ; load byte from password
mov ebx, dword [addr_text]
xor byte [ebx + ecx], al

cmp ecx, 0
jnz loop1

#END
"""

r = runtime.Runtime()
r.create("crypt", CRYPT_ASM)

def crypt_asm(text, password):
ds = r.get_datasection("crypt")
pass_arr = array.array("c", password)
text_arr = array.array("c", text)

address, length = pass_arr.buffer_info()
ds["len_pass"] = length
ds["addr_pass"] = address
address, length = text_arr.buffer_info()
ds["len_text"] = length
ds["addr_text"] = address
r.run("crypt")
return text_arr.tostring()

def crypt(text, password):
password_length = len(password)
password = [ord(character) for character in password]
text = [ord(character) ^ password[index % password_length] for (index, character) in enumerate(text)]
return ''.join([chr(character_code) for character_code in text])

if __name__ == "__main__":
pa = "123456"
text = "ovo je samo za testiranje"

fi = open("test1.pdf", "rb")
text = fi.read()

t = timeit.Timer(lambda : crypt(text, pa))
print "time", t.timeit(1)
</pre>

Edited 14 Years Ago by Tahir007 because: n/a

Beat_Slayer commented: Nice code implementation +1

TrustyTony 888 ex-Moderator

14 Years Ago

> Here is a trivial implementation of crypt in Tdasm. :-)
> Try crypting files of different sizes to see the difference in speed. :-)
> On my machine:
> 600 KB pdf file - ~3-4 ms Tdasm implementation
> 600 KB pdf file - ~240 ms your implementation

> Here is source:

Looks like you have more practice with ASM than with (CODE) tags (maybe assembler is not even there) ;)

Your code with proper tags (a good showcase for your module, like I said, isn't it?):

from tdasm import runtime
import array
import timeit

CRYPT_ASM = """
    #DATA

    uint32 len_pass, addr_pass
    uint32 len_text, addr_text

    #CODE

    xor eax, eax
    xor ebx, ebx ; clear eax and ebx registers
    mov ecx, dword [len_text]


    loop1:
    dec ecx
    mov eax, ecx
    mov edx, 0
    div dword [len_pass]
    mov eax, dword [addr_pass]
    mov al, byte [eax + edx] ; load byte from password 
    mov ebx, dword [addr_text]
    xor byte [ebx + ecx], al 

    cmp ecx, 0
    jnz loop1

    #END
    """

r = runtime.Runtime()
r.create("crypt", CRYPT_ASM)

def crypt_asm(text, password):
    ds = r.get_datasection("crypt")
    pass_arr = array.array("c", password)
    text_arr = array.array("c", text)

    address, length = pass_arr.buffer_info()
    ds["len_pass"] = length
    ds["addr_pass"] =  address
    address, length = text_arr.buffer_info()
    ds["len_text"] = length
    ds["addr_text"] = address
    r.run("crypt")
    return text_arr.tostring()

def crypt(text, password):
    password_length = len(password)
    password = [ord(character) for character in password]
    text = [ord(character) ^ password[index % password_length] for (index, character) in enumerate(text)]
    return ''.join([chr(character_code) for character_code in text])

if __name__ == "__main__":
    pa = "123456"
    text = "ovo je samo za testiranje"


    fi = open("test1.pdf", "rb")
    text = fi.read()

    t = timeit.Timer(lambda : crypt(text, pa))
    print "time", t.timeit(1)

Edited 12 Years Ago by mike_2000_17 because: Fixed formatting

Tahir007 10 Newbie Poster

14 Years Ago

> Looks like you have more practice with ASM than with [CODE] tags (maybe assembler is not even there) ;)

> Your code with proper tags (a good showcase for your module, like I said, isn't it?):

You are right, I know ASM better than the code tags. It's a good show for my module. :-)
Now maybe I will implement AES and SHA1 for the show. :-)

Edited 12 Years Ago by Reverend Jim because: Fixed formatting

TrustyTony 888 ex-Moderator

14 Years Ago

Just to add results for selected versions and ASM version (crypt is the last posted version):

crypt2 took 750 ms.
crypt6 took 514 ms.
crypt7 took 505 ms.
crypt_loop took 1026 ms.
crypt_oneliner took 781 ms.
crypt took 507 ms.
crpt took 497 ms.
crypt_asm took 2 ms.

Edited 14 Years Ago by TrustyTony because: n/a

jcao219 18 Posting Pro in Training

14 Years Ago

Wow! That's some pretty good asm coding.
All I could understand was the

xor eax, eax
xor ebx, ebx

I'm impressed how fast it is.
I wonder how something in C/C++ would compare.

Edited 14 Years Ago by jcao219 because: n/a

TrustyTony 888 ex-Moderator

14 Years Ago

That is quite good, Jcao219, because those xors are a hackish way of clearing a register without needing the constant 0 (x xor x == 0).

>>> x=23423
>>> x ^ x
0
>>> x=192423472389
>>> x ^ x
0L

I believe the ASM is not a very optimized version, as it does not load a full register and xor 4 or 8 bytes at a time, but only one byte at a time ( xor byte [ebx + ecx], al ). It is a good basic version, though.
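
The "more bytes at a time" idea can be tried even from Python by letting a big-integer XOR do all the work in one operation. This is only an illustration of the principle on Python 3, not a translation of the ASM, and crypt_wide is a hypothetical name:

```python
def crypt_wide(data, key):
    """XOR bytes with a repeating key using a single big-integer XOR."""
    if not data:
        return b""
    if not key:
        raise ValueError("key must not be empty")
    repeats = -(-len(data) // len(key))       # ceiling division
    stream = (key * repeats)[:len(data)]      # key stretched to len(data)
    mixed = int.from_bytes(data, "big") ^ int.from_bytes(stream, "big")
    return mixed.to_bytes(len(data), "big")
```

The price is memory: the stretched key and both integers are as large as the data itself.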

I would like to write some functions that prepare ASM instructions in a slightly more readable way.

Edited 14 Years Ago by TrustyTony because: n/a

Tahir007 10 Newbie Poster

14 Years Ago

    #DATA

    uint32 len_pass  ; length of the password in bytes
    uint32 addr_pass ; address of the first byte of the password
    uint32 len_text  ; length of the text in bytes
    uint32 addr_text ; address of the first byte of the text

    #CODE

    xor eax, eax  ; eax = 0
    xor ebx, ebx  ; ebx = 0
    mov ecx, dword [len_text] ; ecx = number of characters to crypt

    ; I crypt one character at a time because of the password: it can be
    ; 3 or 5 or ... characters long, which is why the loop goes byte by byte.
    ; If we padded the password so that it is 4, 8, 12, ... bytes long,
    ; it would be very easy to implement an MMX or SSE version
    ; that would be much faster.
    loop1:            ; we crypt backwards, from the last character to the first
    dec ecx           ; ecx = ecx - 1; the array is indexed 0..n-1, so we decrement the index first
    mov eax, ecx      ; put the current character index in eax
    mov edx, 0        ; needed by div; we could also use xor edx, edx :-)
    div dword [len_pass]       ; eax = edx:eax / len_pass, edx = edx:eax % len_pass
    mov eax, dword [addr_pass] ; eax = address of the first byte of the password
    mov al, byte [eax + edx]   ; al = password[edx], edx = index into the password
    mov ebx, dword [addr_text] ; ebx = address of the first byte of the text to crypt
    xor byte [ebx + ecx], al   ; text[ecx] ^= al, ecx = index of the current byte to crypt

    cmp ecx, 0  ; exit the loop once the index reaches zero
    jnz loop1

    #END

Here is the assembly code with a few more comments.
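
For readers who do not read assembly, here is roughly the same loop in Python 3 on a bytearray, with the matching instructions in the comments. It is a reading aid, not the Tdasm code; the function name and the empty-input guard are additions:

```python
def crypt_backwards(buf, key):
    """XOR a bytearray in place, walking from the last byte to the first."""
    if not buf:
        return                      # guard added here, not in the ASM
    i = len(buf)                    # mov ecx, dword [len_text]
    while True:
        i -= 1                      # dec ecx
        k = key[i % len(key)]       # div len_pass: the remainder picks the byte
        buf[i] ^= k                 # xor byte [ebx + ecx], al
        if i == 0:                  # cmp ecx, 0 / jnz loop1
            break
```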

kylealanhale 0 Newbie Poster

14 Years Ago

Very impressive!

TrustyTony 888 ex-Moderator

14 Years Ago

One short manual I found about x86 is http://www.acm.uiuc.edu/sigwin/old/workshops/winasmtut.pdf

Maybe time to refresh memories from around 1986....

jcao219 18 Posting Pro in Training

14 Years Ago

Very interesting! But I have no use for ASM right now.
I should learn C before learning that stuff.

TrustyTony 888 ex-Moderator

14 Years Ago

I did a C version; C++ I do not know so well.

Unfortunately I had no energy to restudy the sweetness of C memory allocation, so this is a command-line program that works file to file.

Everything looks to be working now that I fixed the obvious thing: the character read in must be declared as int, not char.

D:\test>python xorcryptp.py "Cold Roses" text_100kb.txt textp.txt
Running program took 131 ms

D:\test>xorcrypt "Cold Roses" text_100kb.txt textp.txt

102071 chars.
The total time taken by the system is: 15 ms.

D:\test>

I did a version of main that takes the same parameters, reads the file in and writes it out.

So, because file IO is so slow, I added the psyco module and got:

D:\test>python xorcryptp.py "Cold Roses" text_100kb.txt textp.txt
Running program took 56 ms

test.zip (6.22 KB)

Edited 14 Years Ago by TrustyTony because: n/a

jcao219 18 Posting Pro in Training

14 Years Ago

I see.
I might make a C# version sometime, to test .NET's speed.

jcao219 18 Posting Pro in Training

14 Years Ago

I've created a C# version.

Results:
100kb file took 3ms,
1mb file took 28ms.

Edited 14 Years Ago by jcao219 because: n/a

Tahir007 10 Newbie Poster

14 Years Ago

I see that you are trying to achieve better times. I couldn't resist writing another version of crypt.
Here is a version that is even simpler than before but twice as fast. :-)

CRYPT_ASM2 = """
    #DATA

    uint32 len_pass, addr_pass
    uint32 len_text, addr_text

    #CODE

    mov edi, dword [addr_text]  ; edi = pointer to the first character of the text
    mov edx, dword [len_text]   ; edx = length of the text

    loop2:
    mov ecx, dword [len_pass]  ; ecx = length of the password
    mov esi, dword [addr_pass] ; esi = pointer to the first character of the password

    loop1:
    mov al, byte [esi]  ; al = *esi   - for C programmers
    inc esi             ; esi++       - advance to the next password character
    xor byte [edi], al  ; *edi ^= al

    inc edi             ; edi++       - we just increment pointers
    dec edx             ; one less character of text left to crypt
    cmp edx, 0          ; edx started as the length of the text
    jz end1             ; if all the text is crypted, we are finished
    dec ecx             ; one less password character left in this pass
    jne loop1           ; password not exhausted: process the next character
    jmp loop2           ; password exhausted: start again from its beginning

    end1:
    #END
    """

jcao219 18 Posting Pro in Training

14 Years Ago

How fast, exactly?

By the way, tonyjv, I think you forgot to close the infile and outfile in your C program.

Tahir007 10 Newbie Poster

14 Years Ago

On my machine I achieve these times.
File ~1 MB - 2.3 ms
File ~150 KB - 0.39 ms
File ~26MB - 75 ms
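
The second ASM version gets rid of the division by walking the password with a pointer and restarting it when it runs out. The same control flow in Python 3, as a reading aid only (hypothetical function name; the empty-key check and the empty-text guard are additions not present in the ASM):

```python
def crypt_no_division(buf, key):
    """XOR a bytearray in place using two nested loops and no modulo."""
    if not key:
        raise ValueError("key must not be empty")
    remaining = len(buf)            # edx = len_text
    pos = 0                         # edi = pointer into the text
    while remaining:                # also skips empty input
        for k in key:               # loop2 resets esi/ecx, loop1 walks the key
            buf[pos] ^= k           # xor byte [edi], al
            pos += 1                # inc edi
            remaining -= 1          # dec edx
            if remaining == 0:      # cmp edx, 0 / jz end1
                return
```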

jcao219 18 Posting Pro in Training

14 Years Ago

That's very good.

Assembly is definitely the fastest,
and then well-written C/C++,
and then C#,
and probably Java is next,
and finally Python.

Edited 14 Years Ago by jcao219 because: n/a

TrustyTony 888 ex-Moderator

14 Years Ago

> How fast, exactly?
> By the way, tonyjv, I think you forgot to close the infile and outfile in your C program.

Thanks, I almost never use them in Python. So I put this at the end of the program:

  fclose(infile);
  fclose(outfile);
  return 0;

The test case was too small to measure the new version, so here are the tests with one 3 MB+ file (only one Python version, for obvious reasons):

crypt_asm took 118 ms.
File length: 3919433

crypt_asm2 took 30 ms.
File length: 3919433

crypt took 22352 ms.
File length: 3919433

My C code, file to file (not comparable with the above, which do not include file IO time):

D:\test\XorCrypting_SpeedTests>mycopy "cold roses" estonian.txt est.txt

3919433 chars.
The total time taken by the system is: 656 ms.

D:\test\XorCrypting_SpeedTests>xorcrypt "cold roses" estonian.txt est.txt

3919433 chars.
The total time taken by the system is: 765 ms.

I also did a version that only copies without the xor; the difference between them is 109 ms.

Edited 14 Years Ago by TrustyTony because: n/a

jcao219 18 Posting Pro in Training

14 Years Ago

So we are up to 3 MB file crypto speed testing?
I'll do some more tomorrow.

Edited 14 Years Ago by jcao219 because: n/a

vegaseat 1,735 DaniWeb's Hypocrite

14 Years Ago

This method is great for basic cryptography in Python,
however advanced and secure encryptions such as AES offer the best degree of security.
For those of you interested in that, PyCrypto is for you.

To be honest, I was more interested in this aspect of the project:

The password is looped against the file, but you can get tricky and spell forward then backward, odd/even, or every odd character twice and every even character once. This will make it harder for grandma to decipher your secret files.

It doesn't take much of a genius to recommend a compiled language, if you want to go for speed alone. Actually, the original program was written in C with some inline assembler thrown in. Later a Delphi version gave the C version a run for the money.
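
The key variations described above can be sketched as small helpers that expand the password before it is looped against the file. The names are hypothetical, and "odd" is taken here to mean the 1st, 3rd, 5th... character, which is an assumption:

```python
def forward_backward(password):
    """Spell the password forward, then backward."""
    return password + password[::-1]


def odd_twice_even_once(password):
    """Repeat the 1st, 3rd, 5th... characters twice, the others once."""
    return ''.join(c * 2 if i % 2 == 0 else c
                   for i, c in enumerate(password))
```

Either result is then used in place of the plain password. It only lengthens the repeating key, so the scheme is still a repeating-key XOR.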

Edited 14 Years Ago by vegaseat because: n/a

TrustyTony 888 ex-Moderator

14 Years Ago

> To be honest, I was more interested in this aspect of the project:

> It doesn't take much of a genius to recommend a compiled language, if you want to go for speed alone. Actually, the original program was written in C with some inline assembler thrown in. Later a Delphi version gave the C version a run for the money.

OK, fine. Go ahead and use this nice crypto, as those are easy to m for your files but please crypt this small file attached with same password and function and save as tonyjv.txt.

Don't need to look it with text editor after crypting, just mail it to me :twisted:
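
The catch, presumably, is that XOR is its own inverse: whoever knows the plaintext of a file and receives its crypted form can XOR the two and read off the password. The contents of key.txt are not shown here, so this sketch uses made-up data (Python 3, hypothetical names):

```python
def recover_keystream(plaintext, ciphertext):
    """XOR known plaintext with ciphertext to expose the key bytes used."""
    return bytes(p ^ c for p, c in zip(plaintext, ciphertext))


def shortest_period(stream):
    """Return the shortest prefix that, repeated, reproduces the stream."""
    for length in range(1, len(stream) + 1):
        if all(stream[i] == stream[i % length] for i in range(len(stream))):
            return stream[:length]
    return stream


# Made-up example: the "victim" crypts a file whose contents the sender knows.
known = b"please crypt this small file and mail it back to me, thanks a lot"
secret = b"cold roses"
crypted = bytes(b ^ secret[i % len(secret)] for i, b in enumerate(known))
recovered = shortest_period(recover_keystream(known, crypted))
```

Here recovered comes out equal to secret, because the known file is longer than the password.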

key.txt (0.5 KB)

jcao219 commented: Clever +1

jcao219 18 Posting Pro in Training

14 Years Ago

What surprises me is the .NET code seems to be faster.
(Measured from the creation of a stream for the input file,
to the closing of the output file stream after writing)

Size: 3928779
Elapsed milliseconds: 79

Edited 14 Years Ago by jcao219 because: n/a
