Hi, so in essence I have two little scripts: a server-side script and a client-side script. My client script sends a request to the server, and the server sends a string back to the client. Once the client receives the string, I assign it to the variable myString (the value is "20,17,18,19,"). Afterwards, I run the following code:

myList = []
parts = myString.split(",")
for x in parts:
    myList.append(x)
print myList

The following is returned:

I'm not sure, but is this hex or something? And why am I getting this kind of behavior? Thanks in advance.

Featured Replies
  • It's not a string, it's a unicode string. That could be the problem. Try to send str(allLists).

1

It seems that your communication protocol adds null characters "\x00" in the string. The first thing you could do is remove these characters with

myString = myString.replace("\x00", "")

The other question, as you said, is to understand why your server sends these null chars. Without the server and client's code, it's difficult to guess.
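As a rough illustration of the workaround above, here is a small sketch using a shortened version of the byte pattern quoted later in this thread. It uses bytes literals so that it also runs on Python 3; the names received, cleaned and parts are placeholders, not taken from the original scripts.

```python
# Shortened sample of the data as it arrived on the client: each visible
# character is followed by three null bytes.
received = b"2\x00\x00\x000\x00\x00\x00,\x00\x00\x001\x00\x00\x007\x00\x00\x00,\x00\x00\x00"

# Drop every null byte, leaving only the visible characters.
cleaned = received.replace(b"\x00", b"")

# split(",") yields an empty last item because of the trailing comma,
# so empty pieces are filtered out here.
parts = [p for p in cleaned.split(b",") if p]
print(parts)
```

Note that this only hides the symptom. It happens to work here because the data is plain ASCII; with other characters, stripping null bytes may well garble the text.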

0

Okay, removing the null characters from the string worked, but I'm still curious as to why this happens and how I can avoid having to remove the null characters in the first place.

Here is my code for the server:

#!/usr/bin/python
# ConnHandler.py

import sys
import socket
import asyncore
from asynchat import async_chat
import DatabaseHandler

"""
Overview of script:

The script contains three main classes: CommandHandler, MainChannel, and MainServer. MainServer is the raw server
that sits and waits for connections from clients. Once a client is connected, they move to the "MainChannel", where
they can then send commands, which are all interpreted by CommandHandler.
"""

class CommandHandler():
    
    def __init__(self, session, data):
        'This method is called whenever the server receives a possible command.'
        if not data.strip(): return
        databaseHandle = DatabaseHandler.Database()
        if data.strip() == "/getlists":
            allLists = databaseHandle.getLists() # Grab available list numbers from database
            session.push(allLists) # Sends a list containing the list numbers available to the client

class MainChannel(async_chat):
    
    def __init__(self, server, sock):
        async_chat.__init__(self, sock)
        self.server = server
        self.set_terminator("\r\n")
        self.data = []
        self.name = None
    
    def collect_incoming_data(self, data):
        self.data.append(data)

    def found_terminator(self):
        'When data is received, it is caught by this method and then is sent to the "CommandHandler"'
        line = ''.join(self.data)
        self.data = []
        CommandHandler(self, line)

class MainServer(asyncore.dispatcher):

    def __init__(self, port):
        asyncore.dispatcher.__init__(self)
        self.port = port
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.bind(("", port))
        self.listen(5)
        #print "listening on port", self.port

    def handle_accept(self):
        'When a client connects to the server, they are recognized here and they are sent to the "MainChannel"'
        conn, addr = self.accept()
        MainChannel(self, conn)

port = 5019
server = MainServer(port)
asyncore.loop()

The code contains a summary and some comments which should hopefully prevent confusion. My "DatabaseHandler" import is simply another file which solely interacts with the database, and so I don't think it should be a problem, but correct me if I am wrong.

Thanks again.

0

The \x00 characters look like characters added by a serialization protocol such as marshal.dumps.
What is the type of the variable allLists that you send to async_chat.push? If it's not a string, I think you should serialize it with pickle:

session.push(pickle.dumps(allLists))

and then on the client side

data = pickle.loads(myString)
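The pickle suggestion above can be sketched as a simple round trip. This is only an illustration under assumptions: all_lists is a hypothetical stand-in for whatever databaseHandle.getLists() returns, and no network transfer is shown.

```python
import pickle

# Hypothetical stand-in for the value returned by databaseHandle.getLists().
all_lists = [20, 17, 18, 19]

# Server side: turn the object into bytes before sending it.
payload = pickle.dumps(all_lists)

# Client side: rebuild the original object from the received bytes.
data = pickle.loads(payload)
print(data)
```

Keep in mind that pickle.loads should only be used on data from a trusted source, since unpickling can execute arbitrary code. For a plain list of numbers, a simple text format may be enough.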

0

Hmm. Well, the 'allLists' variable is indeed a string; to be more exact, its value is "17,20,18,19,". I have no idea what could be going on here, as I wrote a full-fledged instant messaging program with this exact same library and never had a similar problem.

Thanks again.

0

Did you print repr(allLists) before sending it, to see whether it contains the null chars?

0

Sadly, that doesn't seem to hint at anything either. It simply returns u'20,17,18,19,'. Is it time to just let the asynchat module win? I could just cope with the null characters if necessary.

0

Oh wow, well you just hit it right on the head. I had no idea that it would be so picky about the encoding. Well, thank you for all your wonderful help. Problem solved.

1

I think I've found what async_chat did with your unicode string: it encoded it in UTF-32 before transmitting it. Here are a few tests with Python 2.6:

>>> s = '2\x00\x00\x000\x00\x00\x00,\x00\x00\x001\x00\x00\x007\x00\x00\x00,\x00\x00\x001\x00\x00\x008\x00\x00\x00,\x00\x00\x001\x00\x00\x009\x00\x00\x00,\x00\x00\x00'
>>> t = s.decode('utf32')
>>> t
u'20,17,18,19,'
>>> t.encode('utf32')
'\xff\xfe\x00\x002\x00\x00\x000\x00\x00\x00,\x00\x00\x001\x00\x00\x007\x00\x00\x00,\x00\x00\x001\x00\x00\x008\x00\x00\x00,\x00\x00\x001\x00\x00\x009\x00\x00\x00,\x00\x00\x00'

When encoding, a 4-byte header (the byte order mark) is added. This is systematic: if you encode the empty unicode string, you get

>>> empty = u''
>>> empty.encode('utf32')
'\xff\xfe\x00\x00'

So it seems async_chat removed this header.
An alternative to pushing str(allLists) is therefore to write myString = myString.decode('utf32') in the client. This allows you to pass unicode strings (but in that case, you should always pass unicode strings).
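The behaviour described above can be reproduced, and sidestepped, with explicit encoding. The sketch below is an illustration under assumptions rather than code from this thread: to_wire and from_wire are hypothetical helper names, little-endian byte order is assumed for the UTF-32 sample, and UTF-8 is just one reasonable choice of wire encoding. It is written so that it also runs on Python 3.

```python
text = u"20,17,18,19,"

# UTF-32 without a byte order mark: every ASCII character becomes one
# byte followed by three null bytes (little-endian order assumed here).
utf32_bytes = text.encode("utf-32-le")
print(repr(utf32_bytes[:8]))

# Decoding with the same codec recovers the original text.
assert utf32_bytes.decode("utf-32-le") == text


def to_wire(value):
    """Hypothetical helper: encode text to UTF-8 bytes before pushing it."""
    return value.encode("utf-8")


def from_wire(raw):
    """Hypothetical helper: decode UTF-8 bytes received by the client."""
    return raw.decode("utf-8")


wire = to_wire(text)
# ASCII text encoded as UTF-8 contains no null bytes.
assert b"\x00" not in wire
print(from_wire(wire).split(","))
```

On the server this would amount to something like session.push(to_wire(allLists)), with the client calling from_wire on what it receives. Choosing the encoding explicitly on both ends avoids depending on how the library happens to treat unicode objects.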

0

Wow, thank you for the wonderful follow-up. That's a great explanation of how async_chat handles encoding.

Thank you again.
