Outlook 2011 for Mac still adding arbitrary line breaks into plaintext emails

Outlook 2011 on Mac OS X, v14.1.3, for whatever reason, still does not properly support the “format=flowed” Content-Type parameter or the “quoted-printable” transfer encoding for plaintext emails. This causes plaintext emails to be sent as mangled messes, full of arbitrarily inserted line breaks. As far as I recall, this is a regression from Entourage, which never handled plaintext quite this badly, and it comes despite Microsoft’s promises to have “implemented format=flowed”.

This is the last straw. I’ve been a loyal MS Entourage / MS Outlook user since the days of Outlook Express for Mac and Office 2001. But at this point, this software has actively impeded my communications with my friends and colleagues. We’re done.

The Problem

Here’s a really simple illustration of the problem, from the receiver’s end:

See how the URL, which was composed as one plaintext line, gets split up into two lines?

Here is another example, purely from the editor UI (and not even being sent yet). I start with a perfectly good reply saved as a draft:

I make a small wording change and resave:

See that third line? Thanks to the hard line breaks inserted by Outlook (even at composition stage), the line wrap has been mangled. This draft has to be re-wrapped manually, by the tedious process of deleting the newline-based hard line breaks from every line following in the paragraph. That was a short paragraph. Imagine doing that in a long paragraph, from the first line.

To add insult to injury, there is not even a “re-wrap” functionality in the editor, to at least solve this user-interface level problem (as opposed to the protocol level problem). Obviously no one at Microsoft sends plaintext emails anymore.

The Issue

Back when email was first devised, servers didn’t have a lot of memory, and people had pretty tiny terminals with fixed line widths and not a whole lot of processing power to deal with it. The Internet standard for email messages, RFC 2822 (http://www.ietf.org/rfc/rfc2822.txt), makes these recommendations in Section 2.1.1 for email body text transferred over SMTP:

There are two limits that this standard places on the number of
characters in a line. Each line of characters MUST be no more than
998 characters, and SHOULD be no more than 78 characters, excluding
the CRLF.

The 998 character limit is due to limitations in many implementations
which send, receive, or store Internet Message Format messages that
simply cannot handle more than 998 characters on a line. Receiving
implementations would do well to handle an arbitrarily large number
of characters in a line for robustness sake…

The more conservative 78 character recommendation is to accommodate
the many implementations of user interfaces that display these
messages which may truncate, or disastrously wrap, the display of
more than 78 characters per line…

…it is encumbant upon implementations which display messages
to handle an arbitrarily large number of characters in a line
(certainly at least up to the 998 character limit) for the sake of
robustness.

Basically, the SMTP server can count on lines of at most 1000 characters (998 plus the CRLF) and can usually expect about 80 (78 plus the CRLF), and email clients built to that recommendation only have to render up to the 78th column of text. This limitation is hardly useful in the modern age, but persists since it’s part of the standard. And it’s a fine, conservative design model. But now we write some pretty long lines without breaking them ourselves, so something magical has to happen in the email client itself, like Outlook 2011.

The naive solution, of course, is to slap arbitrary line breaks into the user’s email message at every 78 characters, which is what ye olde email clients (looking at you, pine — how did I ever put up with you…) from yesteryears did (and Outlook 2011 still does). It’s a matter of personal preference whether this is a reasonable solution. Proponents argue that the email will “always look the same” on all devices, including those limited to 78 chars per line.

I (and many others), on the other hand, think the spirit of the RFC is to let the displaying client decide where to break lines. With the exception of source code, it is almost always better for the email client to use the full width of its display, however many characters that might be. Even source code should not be mangled by arbitrary line breaks: what if newlines are meaningful in the language, and the author used more than 78 characters per line? The URI example illustrates the problem: the URI got an arbitrary newline in the middle, destroying its meaning. Users who copy-paste the two lines will end up with a 404 because of that stupid inserted newline. This should not be allowed to happen.
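The damage is easy to reproduce. The sketch below is my own stand-in for the naive approach, not Outlook’s actual algorithm (which I have no way of inspecting): it simply cuts every line after a fixed number of characters. The URL is a made-up example.

```python
def naive_hard_wrap(text, width=78):
    """Insert a hard line break after every `width` characters of each line."""
    wrapped = []
    for line in text.split("\n"):
        if not line:
            wrapped.append("")  # keep blank lines as they are
            continue
        for start in range(0, len(line), width):
            wrapped.append(line[start:start + width])
    return "\n".join(wrapped)


# Hypothetical 99-character URL with no whitespace to break on.
url = "http://example.com/" + "a" * 80
broken = naive_hard_wrap(url)
```

Here `broken` contains a newline in the middle of the URL, and nothing in the message tells the receiving client that the break was not typed by the author.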

Because this naive solution was not perfect, an extension was proposed as RFC 2646. This format of email is characterized by the content-type:

Content-type: text/plain; charset=US-ASCII; format=flowed

In format=flowed emails, the sending and receiving email clients are allowed to reflow the text around the line breaks the user actually typed. The rules are simple: the sender wraps long paragraphs for the wire by ending each wrapped line with a space (a “soft” break), a line that does not end with a space is a hard break the user inserted, and the receiver joins the soft-broken lines back together on receipt and display. Modern email clients like Thunderbird, designed for user comfort and the generous system limitations of the year 2011, implement this standard.
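A rough Python sketch of that soft-break idea follows. It is deliberately incomplete: it ignores the space-stuffing, quoting and signature-separator rules of RFC 2646, and it assumes words are separated by single spaces with no leading or trailing spaces on a paragraph. The function names and the default width of 72 are my own choices.

```python
def encode_flowed(text, width=72):
    """Wrap each paragraph for the wire; a trailing space marks a soft break."""
    lines = []
    for paragraph in text.split("\n"):
        current = ""
        for word in paragraph.split(" "):
            if current and len(current) + 1 + len(word) > width:
                lines.append(current + " ")  # soft break: line ends with a space
                current = word
            elif current:
                current += " " + word
            else:
                current = word
        lines.append(current)  # no trailing space: the user's hard break
    return "\r\n".join(lines)


def decode_flowed(wire):
    """Rejoin soft-broken lines into the paragraphs the user typed."""
    paragraphs = []
    current = ""
    for line in wire.split("\r\n"):
        if line.endswith(" "):
            current += line  # soft break: keep joining
        else:
            paragraphs.append(current + line)
            current = ""
    if current:
        paragraphs.append(current)
    return "\n".join(paragraphs)
```

Note that a word longer than the width, such as a URL, is never split: it goes out on a line of its own, which is still legal as long as it stays under the 998-character limit.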

Guess what format Outlook 2011 sends?

Content-type: text/plain; charset="US-ASCII"

Not even an option to change that behavior. It does not appear that Outlook 2011 deals with any of this. It just inserts some line breaks and calls it a day.

An alternative, implemented by Apple’s Mail.app, is to send messages with the Content-Transfer-Encoding header set to “quoted-printable”, as per RFC 2045. In this model, soft line breaks are sent explicitly: a “=” at the very end of a line marks the break, and encoded lines are limited to 76 characters. On the receiving end, the client drops the “=” along with the line break that follows it and joins the line back together for display.
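Python’s standard quopri module implements this encoding, so the round trip can be tried directly. The sample text is made up; everything else is the stock library.

```python
import quopri

# One long line, as the user typed it (hypothetical text, 199 characters).
typed = " ".join(["plaintext"] * 20).encode("ascii")

wire = quopri.encodestring(typed)     # what travels over SMTP
restored = quopri.decodestring(wire)  # what the receiving client displays

# Each wire line except the last ends in "=", the soft line break.
for wire_line in wire.split(b"\n"):
    print(len(wire_line), wire_line)
```

The receiving side gets back exactly the single line that was typed, which is the whole point.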

Outlook doesn’t do that either. It just wants to mangle your emails.

Conclusion

The world moved on and adopted HTML emails, which don’t have this newline problem. For those of us who do think HTML emails are an atrocity to be used sparingly, if at all, the idiosyncrasies of plaintext email have to be addressed. Outlook 2011 appears to do even worse than Entourage 2008 at this problem, by not dealing with it at all, and apparently by getting a bunch of Microsoft “MVPs” on their forums to cloud the issue with promises of support and unrelated commentary.

Given the sad state of email clients on the Mac, I believe Thunderbird is now my only option for sane plaintext messaging.

A wishlist for a native Mac email client

As the tech-savvy sort, I often do some things that the typical consumer might not care about. However, it is strange that I simply cannot find a native Mac email client on the market that fulfills what I consider to be very basic features for decent email management. While everyone is gushing over the latest social network to be jammed into an email client, I just want my email to work in a sane way.

The features I’m looking for:

  • POP3/IMAP + SSL support
  • Full-text search, or indexable by OS X’s Spotlight
  • Multiple account support
  • Archival — if I delete an email account, it should not wipe out all emails from local storage that belonged to this account, especially if said account was POP3
  • Plaintext composition — I’m taking the side of format=flowed in plaintext composition. Manually formatting hard line breaks at char 78 is an insane holdover from a bygone age when dumb ASCII terminals were still the primary user interface. At least allow this as an option, or support the quoted-printable transfer encoding and soft line breaks during composition.
  • Filtering rules — Basic filters that let me decide where to put messages based on mail headers and subject are enough
  • mbox import/export — I need to be able to import mail from my previous client. Similarly, in the scenario that this app is no longer supported, I need the option of moving to another client.
  • Bonus: auto-bcc to an arbitrary email address — I’ve never solved my problem with archiving sent mail. All you really need to do is auto-populate the BCC field on any “New Mail” composition window with this address. That’s all I really need to be happy here.
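The auto-bcc item in particular is not much work. As a sketch of the idea in Python (not any existing client’s implementation), the archive address only needs to be added to the envelope recipients, which is all a BCC really is. The function name and the example.com addresses are hypothetical.

```python
from email.message import EmailMessage
from email.utils import getaddresses


def envelope_recipients(msg, archive_address):
    """Collect To/Cc/Bcc addresses and always add the archive address."""
    headers = []
    for name in ("To", "Cc", "Bcc"):
        headers.extend(str(value) for value in msg.get_all(name, []))
    recipients = [addr for _display, addr in getaddresses(headers) if addr]
    if archive_address not in recipients:
        recipients.append(archive_address)  # the automatic blind copy
    return recipients


# Hypothetical outgoing message.
msg = EmailMessage()
msg["To"] = "a@example.com"
msg["Cc"] = "B <b@example.com>"
recipients = envelope_recipients(msg, "archive@example.com")
```

The resulting list could then be handed to something like smtplib’s send_message as its to_addrs argument, leaving the visible headers of the message untouched.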

Amazingly enough, there is not a single Mac mail client that fulfills all of these basic conditions. Especially egregious problems for the top three mail clients:

  • Microsoft Outlook Mac 2011 — mangles all outgoing plaintext emails by inserting hard line breaks. No support for flowed plaintext, despite promises to the contrary, and no support for the quoted-printable transfer encoding and soft line breaks. No provision for auto-bcc.
  • Apple Mail (Snow Leopard) — deleting an old email account deletes ALL mail belonging to this account, even for downloaded POP3 mail. WTF. Can only auto-bcc “myself”, which is a fixed email address corresponding to the sending account; if you allow autobcc, why not let the user pick the email to autobcc to?
  • Mozilla Thunderbird 8 — deleting an account deletes all downloaded mail belonging to this account, unless messages are stored to the “local folder” rather than its inbox abstractions.

For fear of accidental data loss, I’ve stuck to the Microsoft offering, despite its incredible inability to keep plaintext mail intact. The newer social clients are all Thunderbird-based and leave little else worth examining: they seem far more interested in integrating ever more social network APIs than in supporting more basic email features. The rest of my friends are all sworn Gmail webmail users and think I’m a strange luddite for even considering native clients. I’ll leave the webmail vs native client debate aside for fear of going on a page-long rant; suffice to say that the user experience between the two options is not comparable at all.

It almost makes me want to start writing a new email client for myself, or hire someone to start some skeleton code at least, because obviously no one else is going to scratch this itch of mine. That kind of spare time and money, however, would probably never come unless I manage to actually sell a company or two (rather than just founding unsuccessful ones).

Email servers and the MAIL FROM syntax

If you’re into chatting with SMTP servers via telnet ( only for debugging purposes, I swear :p ), be aware that some newer email servers interpret RFC 2821 strictly and appear to reject MAIL FROM and RCPT TO addresses of the bare form [email protected]. Instead, they return a 555 5.5.2 Syntax error. The proper solution is to enclose the address in angle brackets, as in <[email protected]>.

A redacted transcript:

HELO example.com
250 Hello
MAIL FROM: [email protected]
555 5.5.2 Syntax error.
MAIL FROM: <[email protected]>
250 2.1.0 OK
RCPT TO: [email protected]
555 5.5.2 Syntax error.
RCPT TO: <[email protected]>
250 2.1.5 OK
DATA

....

I’m sure this is intended so that clients that send “Joe Schmoe <[email protected]>” are correctly interpreted. On my admittedly cursory reading, I’m not sure the RFC asks for this strict enforcement, and yet a fair number of servers are rejecting the address-only syntax. Wacky.
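For what it’s worth, the command grammar in RFC 2821 does write the path with angle brackets around the mailbox, and with no space after the colon, so the strict servers have the text on their side. If you script these sessions, a small helper avoids the problem. This is a sketch with names of my own choosing and a made-up address, built on the standard library’s email.utils.parseaddr.

```python
from email.utils import parseaddr


def smtp_path(address):
    """Return the bare address wrapped in angle brackets.

    Accepts "user@host", "<user@host>" or "Display Name <user@host>".
    """
    _display_name, addr = parseaddr(address)
    if not addr:
        raise ValueError(f"no address found in {address!r}")
    return f"<{addr}>"


def mail_from(address):
    """Build a MAIL FROM command line (without the trailing CRLF)."""
    return f"MAIL FROM:{smtp_path(address)}"


def rcpt_to(address):
    """Build a RCPT TO command line (without the trailing CRLF)."""
    return f"RCPT TO:{smtp_path(address)}"
```

Passing in “Joe Schmoe <joe@example.com>” yields just the bracketed address, since the display name belongs in the message headers and not in the SMTP envelope.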

Entourage sent-mail archival, episode 2


Previously, on The Sarth Repository

I had this setup going on to automatically redirect most messages I send to a repository for later search and retrieval…A month later, by pure chance, I realized that Entourage wasn’t quite deactivating the CC field on the [redirected] archival email. In essence, all the people I cc’ed on anything got spammed with a duplicate every time I sent a message… 

And now, the continuation…

So Google finally enabled IMAP for my accounts on thallos.org, which allowed me to test a new strategy for archiving sent mail. Again, the goal is to have a copy archived straight from Entourage, whenever I send a new email, to my mail repository. With proper IMAP access, however, this became much easier.

First, configure Entourage for IMAP access to Gmail / Google Apps. This is surprisingly non-trivial, since Entourage is not a supported client as of the time of this post. Rather strange, considering that Entourage must be at least second or third place in terms of install-base for Mac email clients. Follow the generic instructions for IMAP setup, and you should do okay. If you’re on Google Apps, the username is your_name@your_domain.tld, as per this configuration instruction.

You should have an IMAP structure for your Gmail boxes once this is complete. Simply set a rule in Rules -> Outgoing, for all messages, to copy the message to the Gmail/Sent Mail folder. In fact, this is the exact same approach you would use if you were backing up to any IMAP-enabled mail server.

Unfortunately, it broke for me on a couple of messages. Gmail servers reported inconsistent failure messages, such as “Connection to the server failed or was dropped” and “The message could not be copied.” Some message headers also seemed to be mangled in transit, with the sender’s name dropped and so forth. The messages themselves were innocuous, text-only messages with no attachments, HTML, or any other random nonsense, so I find it very curious that the copy fails on them. Will have to look into it a bit more.

UPDATED Nov 22, 2007
See the exciting (yet depressing) episode 3 of my adventures in email archival.

Entourage: thwarting archival strategies since 2004

So I use Microsoft Entourage as my main email client, and had been wanting for some time to get my messages exported off my local drive. As much as I trust my laptop and my backups, one good earthquake and all of that would be futile.

Getting my message archives preserved (with all metadata intact, like Sent and Received dates, etc) was the easy part. Grabbing all future messages was the hard one. Of course, Microsoft, in its infinite wisdom, didn’t include an auto-bcc for Entourage.

I had this setup going on to automatically redirect most messages I send to a repository for later search and retrieval. I had a process set up where, except for select messages that I mark as confidential, the above rule gets triggered.

A month later, by pure chance, I realized that Entourage wasn’t quite deactivating the CC field on the redirect for archival. There is a bug that resends the message to all CC’ed emails on redirect. For example, if I were sending to [email protected], cc’ed to [email protected], and redirecting to [email protected]:

1. The first copy goes out to a and b.
2. Then, the redirected copy is sent to archive and b, as b appears on the CC list.
3. End result: a receives 1 copy, b receives 2 copies, and archive receives 1 copy.
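The arithmetic of those three steps can be written down as a tiny Python model. This is only a simulation of the behavior described above, with names of my own invention; it does not touch Entourage or any mail server.

```python
from collections import Counter


def copies_received(to, cc, archive, redirect_keeps_cc=True):
    """Count how many copies each address receives for one sent message."""
    counts = Counter()
    counts.update(to)         # original send: To recipients
    counts.update(cc)         # original send: CC recipients
    counts.update([archive])  # redirected copy: the archive address
    if redirect_keeps_cc:
        counts.update(cc)     # the bug: CC list is resent on redirect
    return dict(counts)


buggy = copies_received(["a"], ["b"], "archive")
fixed = copies_received(["a"], ["b"], "archive", redirect_keeps_cc=False)
```

With the CC list retained, `buggy` gives b two copies; with it cleared, as one would expect from a redirect, everyone in `fixed` gets exactly one.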

In essence, all the people I cc’ed on anything got spammed with a duplicate every time I sent a redirected copy via Entourage’s Outgoing rule. This is stupid, and Microsoft’s website doesn’t warn you about this. Try it for yourself if you don’t believe me.

Had I been more diligent at searching the web or even just testing out this archival strategy, this wouldn’t have happened. Plus, I would have noticed one fellow complaining that all contacts on the CC list, for every email, received a copy of his archived messages. Ouch. I’m glad I didn’t try redirecting all of my sent box (there is another strategy, which I will outline sometime, that is far easier, but it can’t do real-time auto-bcc).

To all the people whom I inadvertently spammed, I’m awfully sorry. This won’t happen again.