Archive for December 2008

DocPreview – browser plug-in to view Microsoft Word documents in Safari

UPDATE: As early 2009, Schubert|IT’s Word Browser Plugin has been made a Universal binary. Therefore, DocPreview is no longer needed. This post remains for historical curiosity. The Word Browser Plugin offers a more polished interface for the same task. Go try it out first. I will tweak DocPreview ( more as a challenge to myself to see if I can do what the good folks at SchubertIt have done ), though the original purpose of this plugin (filling the gap for Word Browser Plugin) has been served. Thanks for trying it out.

01/24/2010 – DocPreview updated for 64-bit Safari and 10.6. Still seems to work.

DocPreview is a lightweight WebKit browser plugin I wrote to display a text-only preview of Microsoft Word .doc documents, inline and within Safari 3.x (and, apparently, 4.x) on both Intel and PowerPC Macs. This behavior is much like the functionality provided by the PPC-only Word Browser Plugin from Schubert|IT. DocPreview, of course, is a universal binary plug-in — since Word Browser’s lack of Intel support is really why I wrote this plug-in.

DocPreview only supports WebKit-based browsers that can use .webplugin files. I am currently unable to support other browsers, since DocPreview is built against the WebKit API instead of the Netscape plug-in API. Frankly, the Netscape API is a mess, and I haven’t had the time to figure it out. A NSAPI guru who could point me in the right direction would be much appreciated.

Download
DocPreview.zip (10.5+) – v0.1
DocPreview.zip (10.4) – v0.1

This is a proof-of-concept release. It passes the worked-for-me test — as in, I’m using this plug-in daily. While I believe it functions correctly, I make no guarantees as to stability and usefulness, and am not liable for blowing up your browser or any harm that might befall you through your adventurous use of this plug-in. Although, I do want this thing to work for everyone, so please leave me a note here if it doesn’t work.

(UPDATE 02/24/2009) – I see a few downloads already in my server logs, but no comments at all on this post. Can I assume it works for all of you out there? Or was it so broken that no one’s browser lived to tell the tale? :)

DocPreview features & limitations:

  • Universal binary support, for use on both PPC and Intel Macs. No Windows Safari support — and plus, there are already good solutions for inline doc browsing on the Windows side. If you’re on PPC, I suggest you use Word Browser Plugin instead unless you desperately need find-on-page using Safari’s built-in facilities with Word docs.
  • Tested extensively on Safari 4.0b1, 3.2.x, and somewhat with Shiira 2.x. May also work on earlier Safari versions, but I simply do not have the ability to test the plugin against them.
  • Uses OS X’s internal engine for opening and processing Word files. DocPreview performs as well (or as badly, depending on your opinion) as OS X itself for the same task.
  • Supported on OS X 10.5 for .doc, .docx, and .odt files. On OS X 10.5, the plugin can parse Microsoft’s new OpenXML (.docx) and OO.org’s ODF Text (.odt) documents. On 10.4, the plugin will still work, but only for .doc files.
  • Support for full document view mode or embedded mode (if the page author uses object and embed tags, like with Flash objects — although, who actually does that with .doc files? )
  • In full document view mode, DocPreview uses Safari’s built-in Find and Text Zoom abilities. Any command that can be run on a normal Safari webpage can be performed on a DocPreview rendering.
  • In embedded mode, Find on Page and Text Zoom support are implemented separately. I’m still thinking about how I can hook into Safari’s built-in system from within an embedded plugin. Since no one ever uses embedded .doc files anyway, the point is fairly moot for most users.


To install, drag DocPreview.webplugin into
/Users/<your name>/Library/Internet Plug-ins
or
/Library/Internet Plug-ins.

Screenshots of the plug-in in operation can be seen to the right of this post. The first pair of images are .doc files in full document mode in the browser and in Word 2008. The second pair of images are two documents embedded inline using object tags, and one of them shown in Word 2008. As you can see, the conversion fidelity is fairly decent — this is the same level of fidelity that you would have gotten by importing Word data into TextEdit, or using textutil on the command-line.

DocPreview serves the same purpose as Word Browser. It’s intended as a quick preview (much like how Google indexes the text in .doc files), so you don’t blindly download any Word files that actually don’t interest you.

( As an aside, I cannot understand people who want to disable built-in PDF support in Safari :p . That feature has singlehandedly improved my productivity/research output by a magnitude. )

In any case, this is a quick hack, since I expect Apple to support .doc previews with the next version(s) of Safari, very soon. MobileSafari on the iPhone already provides .doc viewing support (and .xls, if I remember correctly). On Windows, MS provides read support for .doc files within the browser. DocPreview is just a temporary solution for Safari users until official Apple support (or better yet, better Microsoft Office integration) arrives.

If the thing doesn’t work for you, let me know via the comment thread. If it does work, let me know too :) Any suggestions and comments welcome.

Change IP address via (ab)use of DHCP client id

Nowadays customers using cable modems tend to be re-assigned the same IP address once they get one. This is apparently done based on the client MAC address — unless that IP was already taken from the pool when the client connects, the DHCP server will tend to re-issue the old IP. Fairly nice, in that even for “dynamic IP addresses”, the actual IP address assigned rarely changes unless you go offline for an extended period of time.

Problem comes when you actually want to change to a new IP, and can’t wait around for days without a network connection. Usually this is accomplished by changing the MAC address on your machine or router, in a process called spoofing or cloning. Most routers will also have a MAC cloning feature. Swap out the MAC address, and you will be assigned a new IP the next time you connect to the DHCP server.

Except, of course, the Apple AirPort and AirPort Express routers don’t actually have this feature. This is understandable, considering that MAC addresses are supposed to be globally unique so as to avoid network problems — allowing the user to change it willy-nilly will probably not serve that goal. Nevertheless, it is quite annoying in this case, as you won’t be able to change IP with an AirPort router using the MAC cloning method.

There’s one trick that might work for you here, though. A few ISPs’ DHCP servers are set up to accept DHCP Client IDs — an optional field that identifies a DHCP client. While they default to use MAC address as a client identifier by default, they will treat you as a different DHCP client for IP assignment if you start using or change a Client ID. All that is required to obtain a new IP address, then, is to add a Client ID or change it. There is an interface for this in the AirPort Utility, as seen in the screenshot. This is far easier than trying to clone a MAC address on AirPort routers.

Not all ISPs support use of the DHCP Client ID, so this may not work for everyone. Since there are uniqueness requirements involved with these client IDs, it is a good way to screw up DHCP assignments if two clients claim the same ID. If your ISP does support this, make sure to pick a unique client ID.

This trick appears to work for Cox Cable at my current location. It does not appear to work for Comcast.

Bad Google cookie kills Safari

In a bizarre case of digital food poisoning, I experienced a series of mysterious, persistent, reproducible crashes with Safari 3.2.1 this morning, traceable to a bad Google cookie.

The symptoms

Google has a nifty query suggestion feature that is turned on by default on its homepage search box. Whenever I typed in a phrase query (e.g. +"query suggestion" +"Google features") with the suggestion feature turned on, the browser crashed with a SIGSEGV around 30% of the time.

Excerpt from the crash log:

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x000000001bdca240
Crashed Thread:  0

Thread 0 Crashed:
0   ???                            0x16619e75 0 + 375496309
1   com.apple.WebCore              0x94325ea0 WebCore::AutoTableLayout::fullRecalc() + 704
2   com.apple.WebCore              0x9432581a WebCore::AutoTableLayout::calcPrefWidths(int&, int&) + 26
3   com.apple.WebCore              0x943252b8 WebCore::RenderTable::calcPrefWidths() + 56
4   com.apple.WebCore              0x9431c04b WebCore::RenderBox::minPrefWidth() const + 27
5   com.apple.WebCore              0x9432507c WebCore::RenderTable::calcWidth() + 124
6   com.apple.WebCore              0x943241a8 WebCore::RenderTable::layout() + 392
...

In the remainder of the cases, when it does not crash immediately, a JavaScript error is logged to the browser error console (to access, go to Develop -> Show Error Console)
SyntaxError: Invalid regular expression: nothing to repeat
http://www.google.com/extern_js/f/CgJlbhICdXMrMAc4AiwrMAo4EywrMA44AywrMBg4Ayw/nMD0sKnpeG0.js (line 21)

for every letter that I type into the search box. During this time, no query suggestion is made.

Diagnostics

  • I have never used an InputManager or “plug-in” to Safari
  • The same crash does NOT happen under a fresh new user account created for diagnostic purposes
  • Clearing the browser cache, temp files, hidden cache files ( getconf DARWIN_USER_CACHE_DIR ), etc. did not help.
  • Deleting Safari preferences did not help.

Solution

After applying a divide-and-conquer strategy to the entire ~/Library directory (not made any easier by Finder’s obstinate resistance to my attempt to move subdirectories within the Library directory, despite having the appropriate permissions — had to drop to Terminal for this), I traced it to the ~/Library/Cookies directory. Moving away the Cookies.plist file contained within cured the crash, the lack of query suggestions, and the Javascript error. More specifically, deleting all Google-related cookies within the Cookies file also accomplished the same thing.

Remarks

Some combination of a bad cookie and bad regexes appears to have triggered a crash bug in this version of WebKit / WebCore. You wouldn’t think a bad cookie could take down a browser. But apparently it does.

I dearly hope this is not a potential buffer overflow or other security problem within WebKit.