Hi, so in essence I have two little scripts: a server side script and a client side script. My client script sends a request to the server, and the server sends a string to the client. Once I get the string back into the client, I set it to the variable myString (the value is "20,17,18,19,"). After, I run the following code:

myList = []
parts = myString.split(",")
for x in parts:
    myList.append(x)
print myList

The following is returned:

I'm not sure, but is this hex or something? And why am I getting this kind of behavior? Thanks in advance.

It seems that your communication protocol adds null characters "\x00" in the string. The first thing you could do is remove these characters with

myString = myString.replace("\x00", "")

The other question, as you said, is to understand why your server sends these null chars. Without the server and client's code, it's difficult to guess.
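As a stopgap, the cleanup and the split can be done in one place. This is only a minimal sketch: `parse_list_numbers` is a made-up helper name, the sample value assumes three null characters after each real character, and the helper also drops the empty entry left by the trailing comma.

```python
def parse_list_numbers(raw):
    """Strip stray null characters, then split on commas.

    Hypothetical helper, not part of the original client script.
    """
    cleaned = raw.replace("\x00", "")
    # The trailing comma would otherwise leave an empty last element.
    return [part for part in cleaned.split(",") if part]


# Assumed sample: each character is followed by three null characters.
received = "2\x00\x00\x000\x00\x00\x00,\x00\x00\x001\x00\x00\x007\x00\x00\x00,\x00\x00\x00"
print(parse_list_numbers(received))
```

This only hides the symptom; the nulls are still sent over the wire.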

Okay, removing the null characters from the string worked, but I'm still curious as to why this happens and how I can avoid having to remove the null characters in the first place.

Here is my code for the server:

#!/usr/bin/python
# ConnHandler.py

import sys
import socket
import asyncore
from asynchat import async_chat
import DatabaseHandler

"""
Overview of script:

The script contains three main classes: CommandHandler, MainChannel, and MainServer. MainServer is the raw server
that sits and waits for connections from clients. Once a client is connected, they move to the "MainChannel", where
they can then send commands, which are all interpreted by CommandHandler.
"""

class CommandHandler():
    
    def __init__(self, session, data):
        'This method is called whenever the server receives a possible command.'
        if not data.strip(): return
        databaseHandle = DatabaseHandler.Database()
        if data.strip() == "/getlists":
            allLists = databaseHandle.getLists() # Grab available list numbers from database
            session.push(allLists) # Sends a list containing the list numbers available to the client

class MainChannel(async_chat):
    
    def __init__(self, server, sock):
        async_chat.__init__(self, sock)
        self.server = server
        self.set_terminator("\r\n")
        self.data = []
        self.name = None
    
    def collect_incoming_data(self, data):
        self.data.append(data)

    def found_terminator(self):
        'When data is received, it is caught by this method and then is sent to the "CommandHandler"'
        line = ''.join(self.data)
        self.data = []
        CommandHandler(self, line)

class MainServer(asyncore.dispatcher):

    def __init__(self, port):
        asyncore.dispatcher.__init__(self)
        self.port = port
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.bind(("", port))
        self.listen(5)
        #print "listening on port", self.port

    def handle_accept(self):
        'When a client connects to the server, they are recognized here and they are sent to the "MainChannel"'
        conn, addr = self.accept()
        MainChannel(self, conn)

port = 5019
server = MainServer(port)
asyncore.loop()

The code contains a summary and some comments which should hopefully prevent confusion. My "DatabaseHandler" import is simply another file which solely interacts with the database, and so I don't think it should be a problem, but correct me if I am wrong.

Thanks again.

The \x00 characters look like characters added by a serialization protocol like marshal.dumps.
What is the type of the variable allLists that you send to async_chat.push? If it's not a string, I think you should serialize it with pickle:

session.push(pickle.dumps(allLists))

and then on the client side

data = pickle.loads(myString)

Hmm. Well, the 'allLists' variable is indeed a string; to be more exact, its value is "17,20,18,19,". I have no idea what could be going on here, as I wrote a full-fledged instant messaging program with this exact same library and never had a similar problem.

Thanks again.

Did you print repr(allLists) before sending it to see if it contains the null chars?

Sadly, that doesn't seem to hint at anything either. It simply returns u'20,17,18,19,'. Is it time to just let the asynchat module win? I could just cope with the null characters if necessary.

It's not a string, it's a unicode string. That could be the problem. Try to send str(allLists).
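One way to make the conversion explicit on the server side is to encode the value before pushing it. This is only a sketch: `to_wire` is a hypothetical helper, and UTF-8 is my assumption for the wire encoding. For plain ASCII such as these list numbers, it produces the same bytes as str(allLists) in Python 2.

```python
def to_wire(value, encoding="utf-8"):
    """Return a byte string suitable for sending over the socket.

    Hypothetical helper; the encoding is an assumption, so use whatever
    the client expects.
    """
    if isinstance(value, bytes):
        # Already a byte string (a plain str in Python 2): send unchanged.
        return value
    # Unicode text: encode it explicitly before it reaches the socket layer.
    return value.encode(encoding)


payload = to_wire(u"20,17,18,19,")
print(repr(payload))
```

In the handler above, this would be used as session.push(to_wire(allLists)).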

Oh wow, well you just hit it right on the head. I had no idea that it would be so picky about the encoding. Well, thank you for all your wonderful help. Problem solved.

I think I've found what async_chat did with your unicode string: it encoded it in UTF-32 before transmitting it. Here are a few tests with Python 2.6:

>>> s = '2\x00\x00\x000\x00\x00\x00,\x00\x00\x001\x00\x00\x007\x00\x00\x00,\x00\x00\x001\x00\x00\x008\x00\x00\x00,\x00\x00\x001\x00\x00\x009\x00\x00\x00,\x00\x00\x00'
>>> t = s.decode('utf32')
>>> t
u'20,17,18,19,'
>>> t.encode('utf32')
'\xff\xfe\x00\x002\x00\x00\x000\x00\x00\x00,\x00\x00\x001\x00\x00\x007\x00\x00\x00,\x00\x00\x001\x00\x00\x008\x00\x00\x00,\x00\x00\x001\x00\x00\x009\x00\x00\x00,\x00\x00\x00'

Encoding adds a 4-byte header (the byte order mark). This is systematic: even if you encode the empty unicode string, you get

>>> empty = u''
>>> empty.encode('utf32')
'\xff\xfe\x00\x00'

so async_chat removed this header.
So an alternative to pushing str(allLists) is to write myString = myString.decode('utf32') in the client. This allows you to pass unicode strings (but in that case, you should always pass unicode strings).
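Here is a small sketch of that client-side alternative, assuming the bytes arrive in little-endian order and without the 4-byte header, as in the session above. Naming the byte order explicitly with 'utf-32-le' is my own choice, so that the result does not depend on the machine running the client.

```python
import codecs

text = u"20,17,18,19,"

# Plain "utf-32" writes a byte order mark (BOM) first, in native byte order.
with_bom = text.encode("utf-32")
assert with_bom.startswith(codecs.BOM_UTF32)

# The "-le" / "-be" variants fix the byte order and write no BOM.
without_bom = text.encode("utf-32-le")
assert len(with_bom) - len(without_bom) == 4

# Client side: decode the raw bytes first, then split as before.
# Assumes the sender used little-endian byte order.
my_string = without_bom.decode("utf-32-le")
my_list = [part for part in my_string.split(",") if part]
print(my_list)
```

If the server and client could ever run on machines with different byte orders, sending an explicitly encoded byte string from the server is probably the safer route.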

Wow, thank you for the wonderful follow up. That's a great explanation of the logistics of async_chat encoding.

Thank you again.
