A question about SCASB, using it in decoding.

Question

Ninjikiran 0 Newbie Poster

16 Years Ago

Hmm, I understand that with repne scasb you point DI at the base64 string and it searches that string for the input character, but I don't see how to use it for decoding.
Anyway, I've been looking at the idea using the same example, Man, and the same approach as for encoding. We have:

TWFu
010011 | 010110 | 000101 | 101110
19 22 5 46

edit: A side note to clear up my text below. As we see above, the 6-bit T = 010011 and W = 010110. What I was trying to do was merge 010011 and 01 using those shifts.

These 6-bit numbers are what my base64string table maps to characters while encoding. For decoding, I was thinking of just stealing the first 2 bits of the next input using left and right shifts, like I did in encoding, but that does not seem to work at all, no matter how hard I try. I also checked whether some addition or subtraction would map input to output by trying different inputs, but the differences were all erratic.

Now I see that repne scasb might be important here, but how would I use it to convert back to normal ASCII, turning TWFu into Man? I obviously don't want to use the base64 table to convert TWFu to Man, since the unencoded file is not supposed to be base64.

I looked up scasb and I am a little confused about what it does. I get the basic idea of scanning the string pointed to by DI up to CX times, but its actual purpose and output are unclear to me, and nothing I found online about the instruction cleared that up.

I am continuing to work on it as I type this post, but I have hit another block.

assembly


Duoas 1,025 Postaholic

16 Years Ago

Originally, you had an integer, which we used as an index into a table of ASCII values. Basically, we said: output_char = ascii_values[ input_index ]; Now you want to go the other way, that is, convert the ASCII value back into an index into the table. That's where scasb comes in. Point ES:DI at the table, put the length of your table in CX and the value you are looking for in AL, clear the direction flag (CLD), and execute REPNE SCASB. The scan stops with ZF set when a byte matches, leaving DI one byte past the match; if CX runs out first, ZF is clear and the character is not in the table. So subtract the address of the beginning of the table from DI, and then one more, to recover the original index:

if ZF == 0 then not found
else DX = DI - OFFSET ascii_values - 1
DX is the original input value
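
For anyone who finds the register juggling hard to picture, here is a rough C model of that lookup. It is only a sketch: the table is assumed to be the standard Base64 alphabet, and the names base64_table and scasb_index are made up for illustration. The variables di, cx and zf stand in for the registers and flag that REPNE SCASB updates.

```c
#include <stddef.h>

/* The 64-character table from the encoder (standard Base64 alphabet assumed). */
static const char base64_table[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Rough model of: mov al, ch / mov cx, 64 / lea di, table / repne scasb.
   Returns the table index of ch, or -1 when the scan runs out (ZF clear). */
int scasb_index(char ch)
{
    const char *di = base64_table;   /* DI: current scan position      */
    size_t cx = 64;                  /* CX: bytes left to compare      */
    int zf = 0;                      /* ZF: set by the last comparison */

    while (cx != 0) {
        zf = (*di == ch);            /* scasb compares AL with ES:[DI] ... */
        di++;                        /* ... then always advances DI        */
        cx--;
        if (zf)
            break;                   /* repne stops once a match sets ZF   */
    }
    if (!zf)
        return -1;                   /* not found, e.g. '=' or a line break */
    return (int)(di - base64_table) - 1;   /* DI is one past the match */
}
```

With this model, the four characters of TWFu come back as 19, 22, 5 and 46, and '=' comes back as not found.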

Bit shift DX into the right place in the output and add it in. After four characters are converted into indices, and shifted into the 3-byte array, you can then output it.
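
The shifting step can be sketched in C as follows. This is an illustration under the assumption that the four indices are already in the range 0..63; pack_quad is a hypothetical helper name, not something from the posted program.

```c
#include <stdint.h>

/* Pack four 6-bit indices (each 0..63) into three output bytes. */
void pack_quad(const uint8_t idx[4], uint8_t out[3])
{
    /* all 6 bits of idx[0], then the top 2 bits of idx[1] */
    out[0] = (uint8_t)((idx[0] << 2) | (idx[1] >> 4));
    /* low 4 bits of idx[1], then the top 4 bits of idx[2] */
    out[1] = (uint8_t)((idx[1] << 4) | (idx[2] >> 2));
    /* low 2 bits of idx[2], then all 6 bits of idx[3] */
    out[2] = (uint8_t)((idx[2] << 6) | idx[3]);
}
```

Feeding it the indices 19, 22, 5, 46 (the characters TWFu) gives the bytes 77, 97, 110, which are the ASCII codes for Man.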

Hope this helps.

Duoas 1,025 Postaholic

16 Years Ago

I don't understand what you mean about spaces. Are you saying that spaces in your original file are producing spaces in your encoded file?

The encoded file will naturally contain whitespace (MIME inserts line breaks), but it is not part of the data, and should be ignored when decoding.
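
One way to do that, sketched in C, is to filter the input before grouping it into fours, so a line break never lands inside a 4-character group. The function name strip_whitespace is hypothetical, and the sketch assumes the whole encoded text fits in memory.

```c
#include <ctype.h>
#include <stddef.h>

/* Copy src to dst, dropping whitespace (spaces, tabs, CR, LF).
   dst must have room for strlen(src) + 1 bytes. Returns the new length. */
size_t strip_whitespace(const char *src, char *dst)
{
    size_t n = 0;
    for (; *src != '\0'; src++) {
        if (!isspace((unsigned char)*src))
            dst[n++] = *src;         /* keep only data characters */
    }
    dst[n] = '\0';
    return n;
}
```

A decoder that reads the file in fixed 4-byte chunks would need the same filtering applied as it fills each chunk.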

Duoas 1,025 Postaholic

16 Years Ago

Well, I've been playing with it and I can't match your exact description. I presume you are using MIME encoding.

To preserve spacing, I'll continue in a code block:

"Man A" should encode as "TWFuIEE="
"Man   A" (with three spaces) encodes as "TWFuICAgQQ==".
Both decode for me correctly.

Now, if you are getting the first to encode as the second, then there is an error in your encoding algorithm.

If you are getting the second to decode with only two spaces, then there is an error in your decoding algorithm.
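
If you want to reproduce those two test strings yourself, a minimal C encoder along these lines should do. It is a sketch, assuming the standard Base64 alphabet with '=' padding and no line wrapping; b64_alphabet and b64_encode are names invented here.

```c
#include <stddef.h>
#include <stdint.h>

static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode len bytes from src into dst as Base64 with '=' padding.
   dst needs 4 * ((len + 2) / 3) + 1 bytes. No line breaks are inserted. */
void b64_encode(const uint8_t *src, size_t len, char *dst)
{
    size_t i = 0, o = 0;
    while (i < len) {
        size_t left = len - i;                  /* bytes still to encode */
        uint32_t v = (uint32_t)src[i] << 16;    /* build a 24-bit group  */
        if (left > 1) v |= (uint32_t)src[i + 1] << 8;
        if (left > 2) v |= src[i + 2];

        dst[o++] = b64_alphabet[(v >> 18) & 63];
        dst[o++] = b64_alphabet[(v >> 12) & 63];
        dst[o++] = (left > 1) ? b64_alphabet[(v >> 6) & 63] : '=';
        dst[o++] = (left > 2) ? b64_alphabet[v & 63] : '=';
        i += (left < 3) ? left : 3;
    }
    dst[o] = '\0';
}
```

Encoding the 5 bytes of "Man A" and the 7 bytes of "Man   A" this way yields the two strings quoted above.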

If you haven't figured out what it is yet, please post it and I'll take a look at it.

Hope this helps.


Ninjikiran 0 Newbie Poster · Answer 1 · 2007-12-13T09:26:13+00:00

Well, I made a base64 table for my encoding program. Would I need to make an ASCII table for my decoding? That is basically my issue and train of thought at the moment, since the ASCII table is fairly large in comparison to the 64-character base64 table I made.

I understand now about the numbers: the binary number is indeed very different between input and output, so I was thinking about it wrong~ Anyway, back to the drawing board, and I will take into consideration what you've said here.

Ninjikiran 0 Newbie Poster · Answer 2 · 2007-12-13T15:14:01+00:00

I figured it out and it's working for almost every case, but it doesn't work with odd spacing. I don't understand why yet, but it should be fixable~

(Like pressing the space bar 3 times produces this: [space]#[space], with [space] being, well, just a space.)

Ninjikiran 0 Newbie Poster · Answer 3 · 2007-12-14T04:38:44+00:00

Hmm, basically like this:

Let's say I encode "Man A". It encodes properly, of course, as "TWFuICAgQQ==", but when I decode it I get the output "Man # A". There are 2 spaces between Man and A; the forum just doesn't seem to like them.

Ninjikiran 0 Newbie Poster · Answer 4 · 2007-12-15T06:22:34+00:00

My encoding algorithm is OK~ I'm getting the proper encoding that you have up there; the forum just screwed up my spacing a little. My decode algorithm is what is causing me an issue~
"TWFuICAgQQ==" is decoding as "Man # A"

I'll post my code down here, though rather than using SCASB I used normal looping, which I am more comfortable with logic-wise~ I've been trying some stuff like: if output[0] and output[2] == 20h then output[1] = 20h, but that messes up the code even more, and I still have other minor bugs with decoding that just do not seem to make sense.

dencode64:
	;Read Input File (4 bytes)
	mov ah, 3fh
	mov bx, input_handle
	mov cx, buff4b
	mov dx, offset input_pointer
	int 21h
	jc end_program ;quit if error
	
	cmp ax,0
	je done2


	out1:
		mov al, input_pointer[0]
		call compare
		mov base64_index[0], bx

		mov al, input_pointer[1]
		call compare
		mov base64_index[1], bx

		mov ax,base64_index[0]
		xchg al,ah
		shl al,2
		shr ax,6
		mov output_pointer[0],al
		
	out2:
		mov al, input_pointer[1]
		call compare
		mov base64_index[0], bx

		mov al, input_pointer[2]
		call compare
		mov base64_index[1], bx
		
		mov ax,base64_index[0]
		xchg al,ah
		shl al, 2
		shr ax, 4
		shl ax, 2
		shr ax, 2
		mov output_pointer[1],al
		
	out3:
		mov al, input_pointer[2]
		call compare
		mov base64_index[0], bx

		mov al, input_pointer[3]
		call compare
		mov base64_index[1], bx
		
		mov ax,base64_index[0]
		xchg al,ah
		shl al, 2
		shl ax, 6
		shr ax, 8
		mov output_pointer[2],al
	
	.if input_pointer[3] == "="
		mov output_pointer[2]," "
	.endif
	.if input_pointer[4] == "="
		mov output_pointer[2]," "
	.endif
;;;;;;;;;;;;;;;;;;;;;;write buffer to file
		mov cx, buff3b ;set number of bytes to write
		mov ah, 40h ;write the file
		mov bx, output_handle ;point to the output file
		mov dx, offset output_pointer ;write from buffer
		int 21h ;call dos
		jc end_program ;quit if error
	jmp dencode64

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;#                Retrieve Base64 index number		    #;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;	
	Compare PROC
		mov cx,0
		LC:
		mov bx,cx
		inc cx
		cmp al,base64string[bx]
		JNE LC
		ret
	Compare ENDP

Duoas 1,025 Postaholic Featured Poster · Answer 5 · 2007-12-15T08:58:41+00:00

Your code is somewhat verbose, so it is a little difficult to follow...

I got as far as out2. You are shifting bits off the end of AX. I'm still not sure how you are getting your outputs. But, assuming that input_buffer contains four bytes each in the range 0..63:

; re-arrange bits into their proper slots
        mov     ax, input_buffer[ 0 ]  ; al,ah = [0],[1]
        shl     al, 2
        shr     ah, 4
        add     al, ah
        mov     output_buffer[ 0 ], al

        mov     ax, input_buffer[ 1 ]  ; al,ah = [1],[2]
        shl     al, 4
        shr     ah, 2
        add     al, ah
        mov     output_buffer[ 1 ], al

        mov     ax, input_buffer[ 2 ]  ; al,ah = [2],[3]
        shl     al, 6
        add     al, ah
        mov     output_buffer[ 2 ], al

Another error is converting '=' into spaces in the output. Don't. Just reduce the number of bytes you actually write to the output.

In other words, you would normally write 3 bytes to output. For each '=' decrement that by one:

; cx <-- number of bytes to write
        mov cx, 3
        mov ax, input_buffer[ 2 ]

        cmp al, '='
        jnz ok1
        dec cx

ok1:    cmp ah, '='
        jnz ok2
        dec cx

        ; write cx bytes to file
ok2:    mov ah, 40h
        mov bx, output_handle
        lea dx, output_pointer
        int 21h
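
The same counting rule, written as a small C sketch for comparison. It assumes quad holds the four raw characters of one group, before they are turned into indices; decoded_len is a hypothetical name.

```c
#include <stddef.h>

/* Number of decoded bytes carried by one 4-character Base64 group. */
size_t decoded_len(const char quad[4])
{
    size_t n = 3;                    /* a full group decodes to 3 bytes   */
    if (quad[2] == '=') n--;         /* "xx==" carries a single byte      */
    if (quad[3] == '=') n--;         /* "xxx=" carries two bytes          */
    return n;
}
```

So "TWFu" writes 3 bytes, "IEE=" writes 2, and "QQ==" writes 1. The '=' characters themselves are not in the 64-character table, so treating them as index 0 when shifting keeps the remaining bytes intact.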

The x86 string opcodes are very powerful and very useful.
Here's something along the lines of C's memchr():

; prereqs:
        push ds
        pop  es
        cld

        ; find index of char_to_find
        mov al, char_to_find
        mov cx, length_of_string
        lea di, string_to_search
        mov dx, di  ; need to remember this

        repne scasb
        jne not_found   ; ZF clear: no byte matched

        dec di          ; DI stops one past the match
        sub di, dx      ; di <-- index into string
        jmp found

Of course, this is just an example...

Hope this helps.

Ninjikiran 0 Newbie Poster · Answer 6 · 2007-12-15T09:21:45+00:00

It helps a lot~ I tend to understand things more easily from examples.
