Wednesday, October 03, 2007

Migrating a tree of mbox files to Mozilla Thunderbird

We are transitioning away from Novell's venerable Netmail server at work. We have asked people with IMAP accounts to download their mail, but this does not work for users with extremely large mailboxes. For those users, we have developed a workaround: make a raw copy of their mail, which can then be accessed directly from the Thunderbird mail client's "Local Folders".

Your old IMAP server stores things like this:

INBOX/
INBOX.box
STUFF/
STUFF.box

Inside the INBOX directory, there are files like this:

TODO.box
DONE.box
LATER.box

Inside STUFF, there are files like this:

STUFF.box
STUFF1/
BOXA.box
BOXB.box
STUFF1.box
STUFF2.box

Basically, the suffix ".box" indicates an mbox-formatted email file, while directories are named without an extension.

Let's suppose that you're retiring your old mail server and you want to copy all of this data to "Local Folders" inside of Thunderbird. But, let's assume that the old IMAP server has died and you only have access to the raw directories and mail files. You know from researching on the web that you can import mbox-formatted mail into Thunderbird. So, how do you import whole directory structures? Here's how we did it.

Directories in Thunderbird have the suffix ".sbd" (SuBDirectory?). Mailboxes (the mbox-formatted files above) have no suffix. So, you need to rename all of the directories from no extension to an ".sbd" extension and you need to remove the ".box" extension from all of the mbox files.

First we get a list of all of the mail directories:

$ find . -type d >/tmp/dirlist

Rename each of the directories to end in ".sbd". This cries out for a short program that walks the directory tree and renames one level of directories at a time. I did it in a way that's too crude to be shared here.
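For readers who want a starting point, here is a minimal sketch of one way to do the directory rename. It is not the program I used. The function name add_sbd_suffix is made up for this example, and -mindepth is a GNU/BSD find extension rather than strict POSIX, so treat both as assumptions. Instead of going one level at a time from the top, it lets find work from the bottom up.

```shell
#!/bin/sh
# Sketch: append ".sbd" to every directory below ROOT (ROOT itself is left alone).
# add_sbd_suffix is a hypothetical helper name, not part of any tool.
add_sbd_suffix() {
  root=$1
  # -depth lists a directory's contents before the directory itself, so
  # children are renamed before their parents and no pending path goes stale.
  # -mindepth 1 skips ROOT; ! -name '*.sbd' makes a second run harmless.
  find "$root" -mindepth 1 -depth -type d ! -name '*.sbd' -exec sh -c '
    for d do
      mv -- "$d" "$d.sbd"
    done
  ' sh {} +
}
```

To use it, cd into the copy of the mail tree and run "add_sbd_suffix ." there. As with the mailbox rename, it is worth changing "mv" to "echo mv" for a first dry run.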

Then we get a list of all the mailbox names:

find . -name \*.box >/tmp/boxlist

Then we rename all of the mailboxes:

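Test first, by doing this:
while IFS= read -r i; do echo mv "$i" "$(dirname "$i")/$(basename "$i" .box)"; done </tmp/boxlist

If that echoes the proper commands, remove the 'echo' part and run the real command. Reading the list one line at a time and quoting "$i" keeps mailbox names that contain spaces in one piece.
while IFS= read -r i; do mv "$i" "$(dirname "$i")/$(basename "$i" .box)"; done </tmp/boxlist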

'dirname' returns the directory portion of a path, while 'basename' returns the file name portion. In this case, we've also asked basename to strip off the ".box" suffix.
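A quick illustration of the two commands on one of the sample paths from above. The variable names path and new are only for this example.

```shell
#!/bin/sh
# Take one sample mailbox path apart and put it back together without ".box".
path=./STUFF/STUFF1/BOXA.box

dirname "$path"          # prints ./STUFF/STUFF1
basename "$path"         # prints BOXA.box
basename "$path" .box    # prints BOXA

# The rename target is the directory part plus the stripped file name.
new="$(dirname "$path")/$(basename "$path" .box)"
echo "$new"              # prints ./STUFF/STUFF1/BOXA
```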

You should be left with a set of folders and mailboxes that work the way that Thunderbird wants them to work.
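Before pointing Thunderbird at the result, a quick check for stragglers can help. This is only a sketch: the function name leftovers is invented for this example, and -mindepth is again a GNU/BSD find extension. It lists any directory that still lacks ".sbd" and any file that still ends in ".box".

```shell
#!/bin/sh
# Sketch: print anything under ROOT that still uses the old naming.
# leftovers is a hypothetical helper name. No output means nothing was missed.
leftovers() {
  root=$1
  find "$root" -mindepth 1 \( -type d ! -name '*.sbd' -o -type f -name '*.box' \) -print
}
```

Run it as "leftovers ." from the top of the copied tree.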
