Note: This is for a server that will be used on a build farm, so third-party libraries are not an option here.

I've been working on a Python script that can POST files to a Tomcat server (Java). I finished writing the servlet, deployed it, and tested it with cURL. The server works fine: I was able to upload multiple files of various sizes, from 10 MB to over 1 GB. After that I started on the Python module. I'm fairly new to Python and to this whole server thing; my Python experience is about 2 weeks, so I'm still learning the tricks.

Here is the problem I'm facing: when I send the file to the server I receive an HTTP 200, which should mean the upload was successful, but the file is not in the uploads directory. My servlet accepts multipart/form-data, so after some research I managed to write an encoder for it. I also use mmap to handle large file uploads.

After hours of looking at the code I decided to add a few print statements to the server so that I know what's happening. The upload in the servlet takes place in a for loop, so I made the servlet print a message when it enters this loop. When I upload a file using cURL I get this output, but when I use Python I don't get anything.

So here is my Python code. Keep in mind that I'm still learning, so there might be mistakes.

import urllib2
import httplib
import sys
import os
import optparse
import mmap
import mimetools, mimetypes




def do_upload(options, args):

    host = '127.0.0.1:80'
    selector = '/test_server/upload'
    url = 'http://127.0.0.1:80/test_server/upload'

    if len(args) > 1:
        print "invalid format"

    path = args[0].replace("\\", "/").rsplit("/", 1)[0]
    file = args[0].replace("\\", "/").rsplit("/", 1)[1]

    print path 
    if not os.access(args[0], os.F_OK):
        print "Directory Doesn't exist"
        exit(1)       

    os.chdir(path)
    f = open(file)

    content_type, body = encode_multipart_formdata([file])
    h = httplib.HTTP(host)
    h.putrequest('POST', selector)
    h.putheader('content-type', content_type)
    h.putheader('content-length', str(len(body)))
    h.endheaders()
    h.send(body)
    errcode, errmsg, headers = h.getreply()
    print errcode
    print errmsg
    print headers
    for l in h.getfile():
        print l
    return h.file.read()

    f = open(file, 'rb')
    sys.stdout.write(f.read())
    mmapped_file_as_string = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    request = urllib2.Request(url, mmapped_file_as_string)
    request.add_header('Content-Type', content_type)
    response = urllib2.urlopen(request)

    mmapped_file_as_string.close()
    f.close()


def encode_multipart_formdata(files):

    BOUNDARY = '----------ThIs_Is_tHe_bouNdaRY_$'
    CRLF = '\r\n'
    L = []

    for (filename) in files:
        L.append('--' + BOUNDARY)
        L.append('Content-Disposition: form-data;  filename="%s"' % (filename))
        L.append('Content-Type: %s' % get_content_type(filename))
        L.append('')
        #L.append(value)
        L.append('')
        content = ""
        for line in open(filename, 'rb'):
            content += line
        L.append(content)
    L.append('--' + BOUNDARY + '--')

    body = CRLF.join(L)
    print body
    content_type = 'multipart/form-data; boundary=%s' % BOUNDARY
    return content_type, body

def get_content_type(filename):
    return mimetypes.guess_type(filename)[0] or 'application/octet-stream'

I'm also including some console screenshots and Wireshark results.
This is the console output after running cURL:
[Screenshot: curl]

This is the output from my program:
[Screenshot: python]

As you can see, the Python run stops at "Entering for loop", and the empty brackets should contain the parts converted to a string, as in the cURL console.

My initial suspicion was the encoding, since the servlet expects multipart/form-data, so I used Wireshark to check this. Here are the results.

Wireshark capture of cURL:
[Screenshot: wireshark_curl]

Wireshark capture of my program:
[Screenshot: wireshark_python1]

As you can see, they both look similar, except that cURL uses HTTP/1.1 and my Python script uses HTTP/1.0. I'm not sure why that is happening.
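
On the HTTP/1.0 point: in Python 2, `httplib.HTTP` is a legacy compatibility class that speaks HTTP/1.0, while `httplib.HTTPConnection` speaks HTTP/1.1. Below is a minimal sketch of the same POST using `HTTPConnection`. The function name `post_body` is hypothetical, and the import fallback is only there so the snippet also runs on Python 3. Whether the protocol version has anything to do with the missing upload is not established here.

```python
try:
    import httplib  # Python 2
except ImportError:
    import http.client as httplib  # same module under its Python 3 name


def post_body(host, selector, content_type, body):
    """POST an already-encoded body; return (status, reason, response_bytes)."""
    conn = httplib.HTTPConnection(host)  # HTTPConnection sends HTTP/1.1
    try:
        # For a string/bytes body, request() fills in Content-Length itself.
        conn.request('POST', selector, body, {'Content-Type': content_type})
        resp = conn.getresponse()
        return resp.status, resp.reason, resp.read()
    finally:
        conn.close()
```

It would be called with the same values as in the script, for example `post_body('127.0.0.1:80', '/test_server/upload', content_type, body)`.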

So if anyone can guide me in the right direction I'd be more than happy. I hope I have provided enough information; if anyone wants to see the servlet code, I'll post it on request.


Unfortunately I have no experience in this area of programming, but as I browsed your code I wondered whether you can mmap a file that you have just dumped to stdout and that is at end of file after the read. Then I saw that those lines are never reached because of the return at line 45. Could you explain what you are trying to do?

Edited 4 Years Ago by pyTony

Yikes. I went through lots of different methods to achieve file upload. One of them was using mmap to upload large files, so what you see is actually me trying to use that. I should have also mentioned that everything from line 33 to 45 is one method and lines 47-56 are a different method (using mmap).
When I work I comment them out to test each one individually. It was my fault for not mentioning this. I'll do a quick update.

So far I've gotten no help on this subject and I'm a newbie to python and client/server side development :(

Edited 4 Years Ago by cyberbemon

content = ""
for line in open(filename, 'rb'):
    content += line

This looks like a long way to say:

content = ''.join(open(filename, 'rb'))
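
A variation on the same idea, as a minimal sketch: `read()` returns the whole file in one call, and a `with` block closes the file handle afterwards. The helper name `read_file_bytes` is hypothetical. This still loads the entire file into memory, so on its own it does not help with the 1 GB+ uploads mentioned in the question.

```python
def read_file_bytes(filename):
    """Return the full contents of filename as a byte string."""
    # 'rb' avoids any newline translation, which matters for binary uploads.
    with open(filename, 'rb') as f:
        return f.read()
```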

I cannot test your code against the server's responses, as we don't have access to one. Which line is producing the []? I don't see you printing a list without joining it. Also, is the 'Entering for loop' message from the server?

Which line is producing the []?

That is being printed by the server:

out.print(parts.toString()+"\n")

That is the line that prints the empty brackets. It's supposed to print the parts as a string (which is nothing but random gibberish!), but the brackets being empty means the server is not receiving anything. The 'Entering for loop' message is also printed by the server; I added it to see whether execution enters the loop.

Edited 4 Years Ago by cyberbemon
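
One thing that may be worth checking, given the empty `[]`: in the Python code from the question, the `Content-Disposition` line carries only `filename=`, with no `name=` parameter. RFC 7578 requires a `name` parameter on every form-data part, so a parser may skip a part that lacks one; whether that is what happens in this servlet is an assumption and is not confirmed anywhere in the thread. The question's encoder also appends two empty strings between the part headers and the content, which puts an extra CRLF at the start of the file data. Below is a minimal sketch of an encoder that sets both parameters and uses a single blank line. The function name `encode_multipart` and the field name `file` are hypothetical; the servlet may expect a different field name.

```python
import mimetypes
import os


def encode_multipart(paths, field_name='file',
                     boundary='----------ThIs_Is_tHe_bouNdaRY_$'):
    """Build (content_type, body) for a multipart/form-data file upload."""
    crlf = b'\r\n'
    delimiter = ('--' + boundary).encode('ascii')
    chunks = []
    for path in paths:
        filename = os.path.basename(path)
        ctype = mimetypes.guess_type(filename)[0] or 'application/octet-stream'
        # Both name= and filename= are present; field_name is an assumed value.
        disposition = ('Content-Disposition: form-data; name="%s"; filename="%s"'
                       % (field_name, filename))
        with open(path, 'rb') as f:
            content = f.read()
        chunks.extend([
            delimiter,
            disposition.encode('utf-8'),
            ('Content-Type: %s' % ctype).encode('ascii'),
            b'',      # exactly one empty line separates headers from content
            content,
        ])
    chunks.append(delimiter + b'--')  # closing delimiter
    chunks.append(b'')                # so the body ends with a final CRLF
    body = crlf.join(chunks)
    return 'multipart/form-data; boundary=%s' % boundary, body
```

The returned pair can be sent the same way as in the question: the first value goes in the `content-type` header and `len(body)` in `content-length`. Like the original, this builds the whole body in memory.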
