Outlook 2011 for Mac still adding arbitrary line breaks into plaintext emails

Outlook 2011 on Mac OS X, v14.1.3, for whatever reason, still does not properly support “format=flowed” content-type or “quoted-printable” extensions for plaintext emails. This causes plaintext emails to be sent as mangled messes, full of arbitrarily inserted linebreaks. This appears to be a regression from Entourage, as far as I recall, which never handled plaintext quite this badly, and this is also despite Microsoft’s promises to have “implemented format=flowed”.

This is the last straw. I’ve been a loyal MS Entourage / MS Outlook user since the days of Outlook Express for Mac and Office 2001. But at this point, this software has actively impeded my communications with my friends and colleagues. We’re done.

The Problem

Here’s a really simple illustration of the problem, from the receiver’s end:

See how the URL, which was composed as one plaintext line, gets split up into two lines?

Here is another example, purely from the editor UI (and not even being sent yet). I start with a perfectly good reply saved as a draft:

I make a small wording change and resave:

See that third line? Thanks to the hard line breaks inserted by Outlook (even at composition stage), the line wrap has been mangled. This draft has to be re-wrapped manually, by the tedious process of deleting the newline-based hard line breaks from every line following in the paragraph. That was a short paragraph. Imagine doing that in a long paragraph, from the first line.

To add insult to injury, there is not even a “re-wrap” functionality in the editor, to at least solve this user-interface level problem (as opposed to the protocol level problem). Obviously no one at Microsoft sends plaintext emails anymore.

The Issue

Back when email was first devised, servers didn’t have a lot of memory, and people had pretty tiny terminals with fixed line widths and not a whole lot of processing power to deal with it. The Internet standards for email messages http://www.ietf.org/rfc/rfc2822.txt, RFC2822 Section 2.1.1, defines recommendations for email body text transferred over SMTP:

There are two limits that this standard places on the number of
characters in a line. Each line of characters MUST be no more than
998 characters, and SHOULD be no more than 78 characters, excluding
the CRLF.

The 998 character limit is due to limitations in many implementations
which send, receive, or store Internet Message Format messages that
simply cannot handle more than 998 characters on a line. Receiving
implementations would do well to handle an arbitrarily large number
of characters in a line for robustness sake…

The more conservative 78 character recommendation is to accommodate
the many implementations of user interfaces that display these
messages which may truncate, or disastrously wrap, the display of
more than 78 characters per line…

…it is encumbant upon implementations which display messages
to handle an arbitrarily large number of characters in a line
(certainly at least up to the 998 character limit) for the sake of
robustness.

Basically, the SMTP server can count on messages that come in 80 characters per line (and always less than 1000 characters per line), and email clients can trust that they only have to render up to the 78th column of text. This limitation is hardly useful in the modern age, but persists since it’s part of the standard. And it’s a fine, conservative design model. But now we write some pretty long lines without linebreaking ourselves, so something magical has to happen in the email client itself, like Outlook 2011.

The naive solution, of course, is to slap arbitrary line breaks into the user’s email message at every 78 characters, which is what ye olde email clients (looking at you, pine — how did I ever put up with you…) from yesteryears did (and Outlook 2011 still does). It’s a matter of personal preference whether this is a reasonable solution. Proponents argue that the email will “always look the same” on all devices, including those limited to 78 chars per line.

I (and many others), on the other hand, think the spirit of the RFC is to allow the actual handling client to decide where to break lines. With the exception of source code, it is almost always better for the email client to use the full width of their display, however many characters that might be. Even in the case of source code, it should also not be mangled by the insertion of arbitrary line breaks in them — what if newlines are meaningful in this language, and the author used more than 78 characters per line? The example with the URI is illustrative of this problem — the URI got an arbitrary newline in the middle, destroying its meaning. Users who copy-paste the two lines will end up getting a 404, due to that stupid inserted newline in the middle of it. This should not be allowed to happen.

Because this naive solution was not perfect, an extension was proposed as RFC 2646. This format of email is characterized by the content-type:

Content-type: text/plain; charset=US-ASCII; format=flowed

In format=flowed emails, the sending and receiving email clients are allowed to reflow the text based on user linebreaks. It follows some simple reflowing rules, but in short it will preserve user-inserted hard line breaks while adjusting the rest of the message for the proper line length while the message is “on the wire”, and recombining the lines on receipt and display. Modern email clients like Thunderbird, designed for user comfort and the generous system limitations of the year 2011, implement this standard.

Guess what format Outlook 2011 sends?

Content-type: text/plain; charset="US-ASCII"

Not even an option to change that behavior. It does not appear that Outlook 2011 deals with any of this. It just inserts some line breaks and calls it a day.

An alternative, implemented by Apple’s Mail.app, is to send messages with the Content-Transfer-Encoding header set to “quoted-printable”, as per RFC 2045. In this model, soft line breaks are sent explicitly with the character “=” representing it, breaking at the usual 70-odd character column. On the receiving end, the client processes this character as a no-op and concats the line back together for display.

Outlook doesn’t do that either. It just wants to mangle your emails.

Conclusion

The world moved on and adopted HTML emails, which doesn’t have this newline problem. For those of us who do think HTML emails are an atrocity to be used sparingly, if at all, the idiosyncrasies of plaintext email have to be addressed. Outlook 2011 appears to do even worse than Entourage 2008 at this problem, by not dealing with it at all. And apparently getting a bunch of Microsoft “MVPs” on their forums to cloud the issue with promises of support and unrelated commentary.

Given the sad state of email clients on the Mac, I believe Thunderbird is now my only option for sane plaintext messaging.

DocPreview – browser plug-in to view Microsoft Word documents in Safari

9/26/2016
Needless to say, DocPreview 2 is now deprecated and nonfunctional, given Safari 10’s changes to how extensions work. Back to the drawing board.
5/12/2015
Four years later, Safari on OS X still does not have native .doc preview capability.

But I have made a breakthrough! I have managed to make DocPreview 2, a Safari extension + NPAPI-based plugin that allows in-browser previews of Microsoft .doc and .docx files, for Safari 8.0.6.

I will be writing another blog post of this as soon as I clean it up enough for release, as DocPreview 2. For all the good it’ll do until NPAPI support is removed.

UPDATE:
7/21/2011 – DocPreview DOES NOT work with Safari 5.1 on OS X 10.6+. This is because the WebKit Plugin APIs it depended on are deprecated by Apple. This has broken a number of .webplugin extensions, including some of my favorites like the current (as of today) version of XML View. If you have upgraded to Safari 5.1, on OS X 10.6 or 10.7, then please uninstall DocPreview 0.1 (by removing it from the /Library/Internet Plugins or ~/Library/Internet Plugins directories, depending on where you initially installed it). For those still on Safari 5.0 and below, this will still work.

Only NPAPI plugins are apparently allowed from 5.1 forward. This API does not give native access to the browser window as before; therefore the old methods of converting .doc files to HTML no longer works. Because I still need Word preview capability for myself, I have a few ideas on how to make DocPreview work for 5.1+. However, this requires a complete re-write of the entire code. Apple obviously already has .doc preview capability on iOS but hasn’t shared that with us on OS X. I’m not sure it’ll be worth it to make these modifications, only for Apple to release .doc preview as a native capability to Safari a few months later.

12/1/2010 – As promised, DocPreview is now open sourced at Google Code.

01/24/2010 – DocPreview updated for 64-bit Safari and 10.6. Still seems to work.

early 2009 – Schubert|IT’s Word Browser Plugin has been made a Universal binary. Therefore, DocPreview is no longer needed, as I originally wrote this plugin to fill in the gap when Word Browser Plugin was PPC-only. Word Browser Plugin offers a more advanced user interface than DocPreview. You should try it out first. I will continue to tweak DocPreview, more as a challenge to myself. I will probably open source the package — at soon as I fix a hack or two and no longer feel ashamed of my code.

DocPreview is a lightweight WebKit browser plugin I wrote to display a text-only preview of Microsoft Word .doc documents, inline and within Safari 3.x (and, apparently, 4.x) on both Intel and PowerPC Macs. This behavior is much like the functionality provided by the PPC-only Word Browser Plugin from Schubert|IT. DocPreview, of course, is a universal binary plug-in — since Word Browser’s lack of Intel support is/was really why I wrote this plug-in.

DocPreview only supports WebKit-based browsers that can use .webplugin files. Therefore, this includes Safari, Shiira, etc., but not Chrome (as of the time of this update). I am currently unable to support other browsers, since DocPreview is built against the WebKit API instead of the Netscape plug-in API. Frankly, the Netscape API is a mess, and I haven’t had the time to figure it out. A NSAPI guru who could point me in the right direction would be much appreciated.

Download
DocPreview.zip (Safari 5.0 and below on 10.5, 10.6) – v0.1
DocPreview.zip (10.4) – v0.1

This is a work-in-progress. It passes the “dogfooding” test; as in, I’m using this plug-in daily (eating my own dogfood, as they say). While I believe it functions correctly — at least on my machines, I make no guarantees as to stability and usefulness, and am not liable for blowing up your browser or any harm that might befall you through your adventurous use of this plug-in. I do want this thing to work for everyone, so please leave me a note here if it doesn’t work.

DocPreview features & limitations:

  • Universal binary support, for use on both PPC and Intel Macs. No Windows Safari support — and plus, there are already good solutions for inline doc browsing on the Windows side. If you’re on PPC, I suggest you use Word Browser Plugin instead unless you desperately need find-on-page using Safari’s built-in facilities with Word docs.
  • Tested extensively on Safari 4.x, 3.x, and somewhat with Shiira 2.x. May also work on earlier Safari versions, but I simply do not have the ability to test the plugin against them. Does not work on Chrome, but I’m sure there are better solutions using Chrome extensions.
  • Uses OS X’s internal engine for opening and processing Word files. DocPreview performs as well (or as badly, depending on your opinion) as OS X itself for the same task.
  • Supported on OS X 10.5 for .doc, .docx, and .odt files. On OS X 10.5, the plugin can parse Microsoft’s new OpenXML (.docx) and OO.org‘s ODF Text (.odt) documents. On 10.4, the plugin will still work, but only for .doc files.
  • Support for full document view mode or embedded mode (if the page author uses object and embed tags, like with Flash objects — although, who actually does that with .doc files? )
  • In full document view mode, DocPreview uses Safari’s built-in Find and Text Zoom abilities. Any command that can be run on a normal Safari webpage can be performed on a DocPreview rendering.
  • In embedded mode, Find on Page and Text Zoom support are implemented separately. I’m still thinking about how I can hook into Safari’s built-in system from within an embedded plugin. Since no one ever uses embedded .doc files anyway, the point is fairly moot for most users.


To install, drag DocPreview.webplugin into
/Users/<your name>/Library/Internet Plug-ins
or
/Library/Internet Plug-ins.

Screenshots of the plug-in in operation can be seen to the right of this post. The first pair of images are .doc files in full document mode in the browser and in Word 2008. The second pair of images are two documents embedded inline using object tags, and one of them shown in Word 2008. As you can see, the conversion fidelity is fairly decent — this is the same level of fidelity that you would have gotten by importing Word data into TextEdit, or using textutil on the command-line.

DocPreview serves the same purpose as Word Browser. It’s intended as a quick preview (much like how Google indexes the text in .doc files), so you don’t blindly download any Word files that actually don’t interest you.

( As an aside, I cannot understand people who want to disable built-in PDF support in Safari :p . That feature has singlehandedly improved my productivity/research output by a magnitude. )

I expect Apple to support .doc previews with the next version(s) of Safari, very soon (ok, so not in 4.x, but surely in 5.x. Anyone on the Apple Safari team reading this: come on, this feature is trivial — even I can do it ). MobileSafari on the iPhone already provides .doc viewing support (and .xls, if I remember correctly). On Windows, MS provides read support for .doc files within the browser. DocPreview is a temporary solution for Safari users until official Apple support (or better yet, better Microsoft Office integration) arrives.

If the thing doesn’t work for you, let me know via the comment thread. If it does work, let me know too. Any suggestions and comments welcome.

Changing keyboard shortcut for "Find Next" in Word 2008

changing Find Next... shortcut in Microsoft Word 2008
Pierre Igot over at Betalogue writes of a way to reassign the keyboard shortcut for the “Find Next” command in Word 2008.

This has bothered me for a very long time. In essence, Office 2008 maps Command-G, the customary Mac OS keyboard shortcut for “Find Next” (in essence, to find the next instance of a string match and part of the set of standard Find/Replace [All] functions in any good text editor / word processor), to “Go To”.

To its credit, Word allows customization of its keyboard shortcuts. Unfortunately, I just could not locate the “FindNext” keyboard shortcut in Tools -> Customize Keyboard to apply customizations to. Turns out it’s actually named “RepeatFind”, as opposed to “EditFindNext”, which you’d expect when the “Find…” command is simply “EditFind”.

Further, it turns out that wouldn’t have mattered anyway. Microsoft overrides the keyboard shortcut selection behavior in View -> Customize Toolbars and Menus. It does so in a ridiculously arcane way, by which the menu name “&Go To” fixes the shortcut key to Command-G. Therefore, no matter what shortcut is set in Customize Keyboard, Command-G will always map to Go To and not Find Next. ARGH.

Even knowing this, it was not obvious how to edit the name either (screenshot). Apparently the solution is to use the fake menu bars that pop up after selecting Customize Toolbars and Menus, and then right-click or Control-click the Go To menu item. That will pop up an edit box with the name of the menu item, at which point you can remove the gratuitous &, such that Command-G will map correctly based on Customize Keyboard.