A Twitter timeline to Atom feed proxy

So Twitter is retiring version 1.0 of its API in March 2013. In its ongoing quest to become (more) evil, Twitter has decided that open syndication standards like Atom are no longer worth supporting. This, in addition to the gratuitously byzantine OAuth system (even for 2-legged auth between my own client and Twitter itself), makes consuming Twitter content anywhere else except on Twitter (and its official apps) an increasingly annoying task. An intended effect, perhaps.

I’m one of the few holdouts who believe current Web standards work just fine. I consume a lot of content from the Web in my native RSS client, as part of my ordinary daily workflow, without using 50 different apps and dozens of notifications. This includes my Twitter home timeline (where I follow just under 70 people/orgs of interest), which under API v1.0 was provided as a simple, standard RSS feed, like other open streams of data on the Web.

I’d been hacking on a Python-based Twitter JSON API to Atom feed proxy for some time, but became increasingly disillusioned with the stupidity of the API — and that it takes two libraries and tons of code to even get the OAuth dance started.

So I thought, in this entire WWW, there must be someone else as annoyed by Twitter’s obstinacy as I am. Sure enough, Russell Beattie developed a single-file PHP script that accesses the authenticated Twitter stream and outputs an Atom feed.

I abandoned my Python code, forked that codebase, and made some minor modifications to suit my personal needs. This patched version is available here:

https://gist.github.com/yimingliu/4735445

All you need to do to use this script is to create your own Twitter app over at https://dev.twitter.com/apps (I called mine “TimelineProxy”), create an OAuth token, and fill in the blanks in the script. Since the new API also has a 100,000 user per application limit, it’s probably best for every user to have his own proxy app with its own token, instead of relying on a central one.

There are some minor differences between my version and the upstream original. Basically this version uses full php tags instead of the less-well-supported short tags, and replaces t.co shortened URLs with their full original URLs. I also take advantage of the html content type in the Atom entry to allow links in the entry text, so in most feed readers any links are “clickable”. Finally, this version returns proper HTTP error codes instead of 200 OK in case of Twitter API errors (like when you hit the rather draconian rate limits on each OAuth token).

However, it preserves the simplicity of the original, which is that you can drop this in the web directory of any PHP-enabled web server (no need for root access or installation of any libraries) and enable your own timeline proxy.

I want to eventually modify this script so that a single server can host any arbitrary user’s timeline as an Atom proxy, provided they give the proper auth tokens. This means having to deal with the OAuth dance at some point. Ugh.

In any case, this disturbing “enclosure movement” of the open Web — taking previously free streams of information and fencing them into walled gardens of content — is a trend that should be opposed whenever possible. My thanks to Mr. Beattie for making the original script.

DocPreview – browser plug-in to view Microsoft Word documents in Safari

9/26/2016
Needless to say, DocPreview 2 is now deprecated and nonfunctional, given Safari 10’s changes to how extensions work. Back to the drawing board.
5/12/2015
Four years later, Safari on OS X still does not have native .doc preview capability.

But I have made a breakthrough! I have managed to make DocPreview 2, a Safari extension + NPAPI-based plugin that allows in-browser previews of Microsoft .doc and .docx files, for Safari 8.0.6.

I will be writing another blog post of this as soon as I clean it up enough for release, as DocPreview 2. For all the good it’ll do until NPAPI support is removed.

UPDATE:
7/21/2011 – DocPreview DOES NOT work with Safari 5.1 on OS X 10.6+. This is because the WebKit Plugin APIs it depended on are deprecated by Apple. This has broken a number of .webplugin extensions, including some of my favorites like the current (as of today) version of XML View. If you have upgraded to Safari 5.1, on OS X 10.6 or 10.7, then please uninstall DocPreview 0.1 (by removing it from the /Library/Internet Plugins or ~/Library/Internet Plugins directories, depending on where you initially installed it). For those still on Safari 5.0 and below, this will still work.

Only NPAPI plugins are apparently allowed from 5.1 forward. This API does not give native access to the browser window as before; therefore the old methods of converting .doc files to HTML no longer works. Because I still need Word preview capability for myself, I have a few ideas on how to make DocPreview work for 5.1+. However, this requires a complete re-write of the entire code. Apple obviously already has .doc preview capability on iOS but hasn’t shared that with us on OS X. I’m not sure it’ll be worth it to make these modifications, only for Apple to release .doc preview as a native capability to Safari a few months later.

12/1/2010 – As promised, DocPreview is now open sourced at Google Code.

01/24/2010 – DocPreview updated for 64-bit Safari and 10.6. Still seems to work.

early 2009 – Schubert|IT’s Word Browser Plugin has been made a Universal binary. Therefore, DocPreview is no longer needed, as I originally wrote this plugin to fill in the gap when Word Browser Plugin was PPC-only. Word Browser Plugin offers a more advanced user interface than DocPreview. You should try it out first. I will continue to tweak DocPreview, more as a challenge to myself. I will probably open source the package — at soon as I fix a hack or two and no longer feel ashamed of my code.

DocPreview is a lightweight WebKit browser plugin I wrote to display a text-only preview of Microsoft Word .doc documents, inline and within Safari 3.x (and, apparently, 4.x) on both Intel and PowerPC Macs. This behavior is much like the functionality provided by the PPC-only Word Browser Plugin from Schubert|IT. DocPreview, of course, is a universal binary plug-in — since Word Browser’s lack of Intel support is/was really why I wrote this plug-in.

DocPreview only supports WebKit-based browsers that can use .webplugin files. Therefore, this includes Safari, Shiira, etc., but not Chrome (as of the time of this update). I am currently unable to support other browsers, since DocPreview is built against the WebKit API instead of the Netscape plug-in API. Frankly, the Netscape API is a mess, and I haven’t had the time to figure it out. A NSAPI guru who could point me in the right direction would be much appreciated.

Download
DocPreview.zip (Safari 5.0 and below on 10.5, 10.6) – v0.1
DocPreview.zip (10.4) – v0.1

This is a work-in-progress. It passes the “dogfooding” test; as in, I’m using this plug-in daily (eating my own dogfood, as they say). While I believe it functions correctly — at least on my machines, I make no guarantees as to stability and usefulness, and am not liable for blowing up your browser or any harm that might befall you through your adventurous use of this plug-in. I do want this thing to work for everyone, so please leave me a note here if it doesn’t work.

DocPreview features & limitations:

  • Universal binary support, for use on both PPC and Intel Macs. No Windows Safari support — and plus, there are already good solutions for inline doc browsing on the Windows side. If you’re on PPC, I suggest you use Word Browser Plugin instead unless you desperately need find-on-page using Safari’s built-in facilities with Word docs.
  • Tested extensively on Safari 4.x, 3.x, and somewhat with Shiira 2.x. May also work on earlier Safari versions, but I simply do not have the ability to test the plugin against them. Does not work on Chrome, but I’m sure there are better solutions using Chrome extensions.
  • Uses OS X’s internal engine for opening and processing Word files. DocPreview performs as well (or as badly, depending on your opinion) as OS X itself for the same task.
  • Supported on OS X 10.5 for .doc, .docx, and .odt files. On OS X 10.5, the plugin can parse Microsoft’s new OpenXML (.docx) and OO.org‘s ODF Text (.odt) documents. On 10.4, the plugin will still work, but only for .doc files.
  • Support for full document view mode or embedded mode (if the page author uses object and embed tags, like with Flash objects — although, who actually does that with .doc files? )
  • In full document view mode, DocPreview uses Safari’s built-in Find and Text Zoom abilities. Any command that can be run on a normal Safari webpage can be performed on a DocPreview rendering.
  • In embedded mode, Find on Page and Text Zoom support are implemented separately. I’m still thinking about how I can hook into Safari’s built-in system from within an embedded plugin. Since no one ever uses embedded .doc files anyway, the point is fairly moot for most users.


To install, drag DocPreview.webplugin into
/Users/<your name>/Library/Internet Plug-ins
or
/Library/Internet Plug-ins.

Screenshots of the plug-in in operation can be seen to the right of this post. The first pair of images are .doc files in full document mode in the browser and in Word 2008. The second pair of images are two documents embedded inline using object tags, and one of them shown in Word 2008. As you can see, the conversion fidelity is fairly decent — this is the same level of fidelity that you would have gotten by importing Word data into TextEdit, or using textutil on the command-line.

DocPreview serves the same purpose as Word Browser. It’s intended as a quick preview (much like how Google indexes the text in .doc files), so you don’t blindly download any Word files that actually don’t interest you.

( As an aside, I cannot understand people who want to disable built-in PDF support in Safari :p . That feature has singlehandedly improved my productivity/research output by a magnitude. )

I expect Apple to support .doc previews with the next version(s) of Safari, very soon (ok, so not in 4.x, but surely in 5.x. Anyone on the Apple Safari team reading this: come on, this feature is trivial — even I can do it ). MobileSafari on the iPhone already provides .doc viewing support (and .xls, if I remember correctly). On Windows, MS provides read support for .doc files within the browser. DocPreview is a temporary solution for Safari users until official Apple support (or better yet, better Microsoft Office integration) arrives.

If the thing doesn’t work for you, let me know via the comment thread. If it does work, let me know too. Any suggestions and comments welcome.

A NAT-PMP client library for Python

Something that I’ve put together and might be somewhat useful to others. I wrote a NAT-PMP (Network Address Translation Port Mapping Protocol) library and testing client in Python. The client allows you to set up dynamic port mappings on NAT-PMP compatible routers. Thus this is a means for dynamic NAT traversal with routers that talk NAT-PMP. In practical terms, this is basically limited to the newer Apple AirPort base stations and the AirPort Express, which have support for this protocol. I’m currently unable to support UPnP (the dominant port mapping protocol in non-Apple routers), though I’m sure either someone has written bindings for Python, or I’ll have to do it eventually once my AirPort Express dies.

In any case, this library puts a thin layer of Python abstraction over the NAT-PMP protocol, version 0, as specified by the NAT-PMP draft standard.  The purpose of this is simple.  I needed to establish port forwarding without rebooting my AirPort router.

Normally, when you need to have a public port on your router forwarded to a private port on your box behind it (say, if you write a server that listens on a port, or have a P2P client running on a port) you’d have to fire up AirPort Utility.app -> Advanced -> Port Mappings, set your mappings, and then hit Update…which reboots the router, killing your network connection and anything you might be doing. There has been no good way to do this programmatically with AirPort routers, when your program needs to negotiate a port forwarding with no manual intervention. Further, there isn’t a command-line tool to open up a port when you’re SSH’ed into your machine but don’t have a GUI.

Creating the port mappings dynamically via NAT-PMP allows port forwarding to happen without the reboot. When my server runs, it can make a call to the router to start forwarding a port.  Furthermore, when I’m done with testing my server, I stop renewing the port mapping, and it expires when its lifetime runs out.  Thus, I won’t forget to delete the mapping later (and reboot the router yet again) when I want that port secured behind the NAT.

Files

py-natpmp repository on Github. If you want the unpackaged source code, you can find the latest versions there.

py-natpmp-0.2.1.tar.gz – a proper Python setuptools package for py-natpmp, if you’re rather take a tarball.

(To enable NAT-PMP on the AirPort router, go to AirPort Utility.app -> Internet -> NAT -> Enable NAT Port Mapping Protocol. Requires Mac OS X 10.4 to succeed. Older AirPort utility software may have this option hidden elsewhere.)

The code is BSD licensed, so feel free to take it. I’d love knowing where this code ends up (just out of personal curiosity), if you drop me a line, but that’s quite optional.

Client

To use the client, grab it and the above library. Make sure you have the library in the same directory as the client script or otherwise on your Python instance’s sys.path. Invoke the client on the command-line (Terminal.app) as python natpmp-client.py [-u] [-l lifetime] [-g gateway_addr] public_port private_port.

For example:

python natpmp-client.py -u -l 1800 60009 60009
Create a mapping for the public UDP port 60009 to the private UDP port 60009 for 1,800 seconds (30 minutes)
python natpmp-client.py 60010 60010
Create a mapping for the public TCP port 60010 to the private TCP port 60010
python natpmp-client.py -g 10.0.1.1 60011 60022
Explicitly instruct the gateway router 10.0.1.1 to create the TCP mapping from 60010 to 60022

Remember to turn off your firewall for those ports that you map.

Library

The library provides a set of high-level and low-level functions to interact via the NAT-PMP protocol. The functions map_port and get_public_address provide the two high-level functions offered by NAT-PMP. Responses are stored as Python objects.

The code is fairly well-documented, so consult the NATPMP.py file for usage details.

Disclaimer

This is an incomplete implementation of the specification.  When the router reboots, all dynamic mappings are lost.  The specification provides for notification packets to be sent by the router to each client when this happens.  There is no support in this library and client to monitor for such notifications, nor does it implement a daemon process to do so.  The specification recommends queuing requests – that is, all NAT-PMP interactions should happen serially.  This simple library does not queue requests – if you abuse it with multithreading, it will send those requests in parallel and possibly overwhelm the router.

The library will attempt to auto-detect your NAT gateway. This is done via a popen to netstat on BSDs/Darwin and ip on Linux. This is likely to fail miserably, depending on how standard the output is. In the library, a keyword argument is provided to override the default and specify your own gateway address. In the client, use the -g switch to manually specify your gateway.

Conclusion

This is a relatively simple library that I am using in aiding my development work, and provides for port forwarding and NAT traversal in my personal Python apps.  It may also come in handy for you in your Python code — if nothing else, as a reference to see how to interact with the router via the protocol. It is not elegant/well tested, and not meant for daily use by a normal user.  If you use the library in your code and it does not work for your configuration, I would be happy to take bug reports (or even better, patches!).

The client is intended for demonstration purposes only, though I personally use it as a command-line sysadmin tool for when I’m away from my machine but need to open up a port.

If you are a typical user looking for port forwarding tools, this is not for you. For an actual, well-designed, user-friendly program to perform dynamic port mappings on the Mac, try the great shareware package Lighthouse, which provides a user-friendly menu bar extra, persistent port mapping profiles, and UPnP support for your non-Apple routers.

Updated Feb 10, 2010
– Repackaged with setuptools. Fixed major bugs on systems that fail to detect the gateway automatically. Experimental Windows 7 support.
Updated Feb 05, 2008
– Removed broken mutexes, pending work to implement a decent request queue. Allow me to reiterate that this library is currently non-compliant with the specification’s recommendation to queue NAT-PMP requests. Try not to abuse your router when using this library in multithreaded Python code.
– Added some gateway autodetection on NT systems via netstat. Thanks to roee shlomo for the regex.

Again, comments and patches (or better thought-out implementations) are welcomed and encouraged.

Extracting Address Book information

If you’ve left information in the Address Book in OS X, any other program on the machine can use Cocoa APIs to extract information automatically.

#import <Foundation/Foundation.h>
#import <AddressBook/AddressBook.h>
#import <AppKit/NSWorkspace.h>

void sayHello(void)
{
   ABPerson *currUser = [[ABAddressBook sharedAddressBook] me];
   NSString *firstName = [currUser valueForProperty:kABFirstNameProperty];
   NSString *lastName = [currUser valueForProperty:kABLastNameProperty];
   [[NSWorkspace sharedWorkspace] 
                           openURL:[NSURL URLWithString:
                                                     [NSString stringWithFormat:
                                                     @"http://www.google.com/search?hl=en&q=%@+%@", 
                                                          firstName,
                                                          lastName]]];
}

int main (int argc, const char * argv[])
{
   NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
   sayHello();
   [pool release];
   return 0;
} 

This will take the name labelled as the currently logged in user and hit up Google, placing the first and last name in the search box and run a search with it. Other pieces of information are also accessible via the AddressBook API.

As you can well see, this can be used for many purposes: nifty automagical hacks in the OS’s built-in suite of personal information and communcation apps, for example. But I can think of less beneficial uses for this API. You should be able to trust the programs you run.

duplicates and invoking LaunchServices

If you deal with duplicate files a lot, then fdupes is a very quick command-line solution. Among other things, it also does an MD5 comparison of file signatures, can recurse directories and following symlinks, and can be used for scripting as well as directly from standard input.


Bonaparte-Prime:/tmp $ fdupes -r -d ./
[1] ./foo.mp3
[2] ./bar/baz.mp3

Set 1 of 4, preserve files [1 - 2, all]:

However, sometimes I forget what the hell foo.mp3 and bar/baz.mp3 really were. Then I have to fire up the Finder (or another Terminal window) and go chase down the file. What I really wanted would be if there were an “open one of these files and let me see what it is, then I’ll make a decision” option.

The beauty of open source is that I can patch fdupes myself and scratch this itch.

Finding the right app to hand off to might be annoying. On OS X, if you want to use the nice automagical way that Finder opens files (meaning that it generally knows what applications are appropriate to open an arbitrary file), you’ll have to link against LaunchServices, a part of the ApplicationServices framework. The code to do that is actually quite trivial. Simply add the ApplicationServices.h header, and then a function such as:


OSStatus open_target_file(const char* filepath)
{
FSRef ref;
OSStatus err;
Boolean isDirectory;

err = FSPathMakeRef((const UInt8 *)filepath, &ref, &isDirectory);

err = LSOpenFSRef(&ref, NULL);

return err;
}

will do the job. Obviously this is simplistic and without error-checking, but you get the idea. LaunchServices will then handle the tedium of finding the right system application to launch the file.

The other half of the patch involves modifying fdupes’s command parser to take an additional variant. I chose a scheme by which if I precede the choice numbers I make with the character ‘o’, it should then hand off the filepath to the aforementioned open_target_file function and return to its previous state. Since the parser already has such a case to handle mistaken input, the patch is also quite simple. Simply check for ‘o’ in the first byte of the preservestr array at the appropriate place, and then should it match, have it execute open_target_file() rather than preserve[number] = 1; If you’re the portability stickler type, wrap these new lines under #ifdef __APPLE__ statements to check for OS X at compile-time.

Source code to simple utilities like these are really quite helpful. They let someone with just a modicum of programming ability tweak the behavior and address some common annoyances, without having re-invent the thing from scratch or to pester the original dev, who probably doesn’t care nearly enough about little features for specific platforms that only specific people might find useful. Write the extension code, create a patch, and keep your patch around in case the utility is ever updated. Scratch your own itch as free software intended.