Hi,
I am trying to write an app to automatically reorganize my mail (Thunderbird).
The first thing I want to do is re-create my folder tree, once for each year.
So I need to read mbox files and rewrite them:

import mailbox

mbx = mailbox.mbox("./in_mbox")
mbx.lock()
of = open("out_mbox", "w")
for k, m in mbx.iteritems():
    # as_string() writes the headers and body, but not the "From " separator
    of.write(m.as_string())
mbx.unlock()
of.close()

My problem is that during this process, I lose the "From " line that separates the mails in the original file.

in_mbox :

From - Mon Jun 16 08:54:05 2008
X-Account-Key: account2
X-UIDL: 919-1206101190
X-Mozilla-Status: 0001
X-Mozilla-Status2: 00000000
Return-path: <adress@prv.com>
Received: from ...

out_mbox :

X-Account-Key: account2
X-UIDL: 919-1206101190
X-Mozilla-Status: 0001
X-Mozilla-Status2: 00000000
Return-path: <adresse@prv.fr>
Received: from ...

"From " line is missing.

I tried to get it with everything I could find in the email.Message object (get_all, get_unixfrom...) but I couldn't find a solution.

Does anyone know what I have missed?
Thanks.


As the file is generated by Thunderbird, I have no way to change its format in my situation.
The only alternative is to not retrieve the "From " line and to generate one myself (which should be OK but is not very clean).

0

Also, mbox is not a real format:
http://homepage.ntlworld.com./jonathan.deboynepollard/FGA/mail-mbox-formats.html

Thanks for your links...
I knew this one, but not the previous one.
In fact, with my little piece of code, I can read a Thunderbird file and navigate through the mails. So the most important part is done.
My only problem is that I can't retrieve the data stored in the "From " line: the mailbox.mbox class recognises the line but doesn't seem to expose its data.

As far as I can see, the mailbox.mbox and email.Message modules won't give me the answer, so I'll see whether I can subclass the mbox class to deal with Thunderbird's format or, if that's too complicated for me, I'll create a brand new "From " line in the same format (but not with exactly the same data).
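
For the fallback of building a brand new "From " line, a minimal sketch could look like the following. It assumes the separator has the form "From - <asctime>" as in the in_mbox sample, takes the time from the Date header, and renders it in UTC (Thunderbird may well use local time instead). The names make_from_line and fallback_epoch are hypothetical.

```python
import time
from email.message import Message
from email.utils import mktime_tz, parsedate_tz


def make_from_line(msg, fallback_epoch=0):
    """Build a 'From - <asctime>' separator from the message's Date header.

    Assumption: UTC is good enough for the timestamp. If the Date header is
    missing or unparsable, fallback_epoch (seconds since 1970) is used.
    """
    parsed = parsedate_tz(msg.get("Date", ""))
    epoch = mktime_tz(parsed) if parsed else fallback_epoch
    return "From - " + time.asctime(time.gmtime(epoch))


# Example with a hand-built message
example = Message()
example["Date"] = "Mon, 16 Jun 2008 08:54:05 +0000"
print(make_from_line(example))
```

The result only looks like the original separator; it is not guaranteed to match what Thunderbird wrote.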

Any help is welcome.
Thanks for having taken time to read my posts and help.

0

If you got something usable, you might publish it in the code snippets.
Google cannot answer this question...

1

By exploring the mailbox module code, I finally found what I was looking for.
It is the "get_from" method, which is also mentioned in the module documentation...
So it works all right now.
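
To illustrate, here is a small sketch that lists what get_from() returns for each message in a file. The helper name read_from_lines is hypothetical; the value returned by get_from() is the separator line without its leading "From ".

```python
import mailbox


def read_from_lines(path):
    """Return the text after "From " for every message in the mbox at path."""
    box = mailbox.mbox(path, create=False)  # create=False: do not create a missing file
    try:
        return [msg.get_from() for msg in box]
    finally:
        box.close()
```

For the in_mbox sample from the first post, this should give something like "- Mon Jun 16 08:54:05 2008" for the first message.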

I don't want to convert my mbox files because what I want to do is:
- automatically reorganize my mails (processed with Thunderbird) by creating one folder per year
- automatically detach attached files and replace each with an HTML file that keeps a link to it, to reduce the mailbox size.

So, in the end, I want my mailbox to have exactly the same format as before.
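
The one-folder-per-year step could be sketched roughly as below. This is only an outline under some assumptions: the year is taken from the Date header, each year becomes one mbox file named after it inside an existing directory, and Thunderbird's own folder layout is not handled. The names year_of and split_by_year are hypothetical. Writing through mailbox.mbox.add() keeps the original "From " line of each message.

```python
import mailbox
import os
from email.utils import parsedate_tz


def year_of(msg):
    """Return the year from the Date header as a string, or "unknown"."""
    parsed = parsedate_tz(msg.get("Date", ""))
    return str(parsed[0]) if parsed else "unknown"


def split_by_year(src_path, dest_dir):
    """Copy every message of src_path into dest_dir/<year>; return the counts."""
    counts = {}
    dests = {}  # one open output mailbox per year
    src = mailbox.mbox(src_path, create=False)
    try:
        for msg in src:
            year = year_of(msg)
            if year not in dests:
                dests[year] = mailbox.mbox(os.path.join(dest_dir, year))
            # add() on a message read from an mbox writes its "From " line too
            dests[year].add(msg)
            counts[year] = counts.get(year, 0) + 1
    finally:
        for box in dests.values():
            box.close()
        src.close()
    return counts
```

The source file is left untouched, so the result can be checked before anything is replaced.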

Thanks

Comments
good! post your mbox parsing code when it's finished!
0

So, the working code would be:

import mailbox

mbx = mailbox.mbox("./in_mbox")
mbx.lock()
try:
    with open("out_mbox", "w") as of:
        for k, m in mbx.iteritems():
            # get_from() returns the separator line without its "From " prefix
            of.write("From %s\n" % m.get_from())
            of.write(m.as_string())
finally:
    mbx.unlock()
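
A possible variation on the loop above, for Python 3: it works on bytes, so messages with 8-bit content are not pushed through a text encoding, and it writes a blank line after each message, as mailbox.mbox itself does when it writes a file. This is a sketch, not a tested Thunderbird tool; copy_mbox is a hypothetical name, and the "From " line is assumed to be plain ASCII.

```python
import mailbox


def copy_mbox(src_path, dst_path):
    """Rewrite src_path to dst_path message by message; return the count."""
    count = 0
    src = mailbox.mbox(src_path, create=False)
    try:
        with open(dst_path, "wb") as out:
            for msg in src:
                data = msg.as_bytes()
                # Assumption: the separator line contains only ASCII characters
                out.write(b"From " + msg.get_from().encode("ascii") + b"\n")
                out.write(data)
                if not data.endswith(b"\n"):
                    out.write(b"\n")
                out.write(b"\n")  # blank line before the next "From " separator
                count += 1
    finally:
        src.close()
    return count
```

Usage would be something like copy_mbox("./in_mbox", "out_mbox").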