A NAT-PMP client library for Python

Here’s something I’ve put together that might be useful to others: a NAT-PMP (Network Address Translation Port Mapping Protocol) library and testing client, written in Python. The client lets you set up dynamic port mappings on NAT-PMP-compatible routers, which makes it a means of dynamic NAT traversal with routers that speak NAT-PMP. In practical terms, that is basically limited to the newer Apple AirPort base stations and the AirPort Express, which support the protocol. I don’t currently support UPnP (the dominant port mapping protocol in non-Apple routers); I’m sure someone has written Python bindings for it, or I’ll have to do so eventually, once my AirPort Express dies.

In any case, this library puts a thin layer of Python abstraction over version 0 of the NAT-PMP protocol, as specified in the draft standard.  The purpose is simple: I needed to establish port forwarding without rebooting my AirPort router.
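For the curious, the wire protocol itself is tiny. Here’s a rough sketch – based on my reading of the version 0 draft, not on this library’s internals – of asking the gateway for its public address over raw UDP. The gateway address 10.0.1.1 is just an example.

# Rough sketch of a raw NAT-PMP "public address" request (opcode 0),
# per my reading of the version 0 draft; error handling omitted.
import socket
import struct

GATEWAY = "10.0.1.1"   # example gateway address; substitute your own
NATPMP_PORT = 5351     # UDP port the draft assigns to NAT-PMP

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.settimeout(2.0)

# The request is two bytes: protocol version (0) and opcode (0 = public address).
sock.sendto(struct.pack("!BB", 0, 0), (GATEWAY, NATPMP_PORT))

# Response: version, opcode (128 + 0), result code, seconds since epoch,
# then the four octets of the gateway's external IPv4 address.
data, _ = sock.recvfrom(16)
version, opcode, result, epoch = struct.unpack("!BBHI", data[:8])
print("result code:", result)  # 0 means success
print("public address:", ".".join(str(b) for b in data[8:12]))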

Normally, when you need a public port on your router forwarded to a private port on the box behind it (say, because you’ve written a server that listens on a port, or have a P2P client running on one), you’d have to fire up AirPort Utility.app -> Advanced -> Port Mappings, set your mappings, and then hit Update… which reboots the router, killing your network connection and anything you might be doing. There has been no good way to do this programmatically with AirPort routers when your program needs to negotiate port forwarding without manual intervention. Nor is there a command-line tool to open up a port when you’re SSH’ed into your machine but don’t have a GUI.

Creating the port mappings dynamically via NAT-PMP allows port forwarding to happen without the reboot. When my server runs, it can make a call to the router to start forwarding a port.  Furthermore, when I’m done with testing my server, I stop renewing the port mapping, and it expires when its lifetime runs out.  Thus, I won’t forget to delete the mapping later (and reboot the router yet again) when I want that port secured behind the NAT.

Files

py-natpmp repository on GitHub. If you want the unpackaged source code, you can find the latest versions there.

py-natpmp-0.2.1.tar.gz – a proper Python setuptools package for py-natpmp, if you’d rather take a tarball.

(To enable NAT-PMP on the AirPort router, go to AirPort Utility.app -> Internet -> NAT -> Enable NAT Port Mapping Protocol. Requires Mac OS X 10.4 to succeed. Older AirPort utility software may have this option hidden elsewhere.)

The code is BSD licensed, so feel free to take it. I’d love to know where this code ends up (just out of personal curiosity), so drop me a line if you feel like it – but that’s quite optional.

Client

To use the client, grab it and the above library. Make sure you have the library in the same directory as the client script or otherwise on your Python instance’s sys.path. Invoke the client on the command-line (Terminal.app) as python natpmp-client.py [-u] [-l lifetime] [-g gateway_addr] public_port private_port.

For example:

python natpmp-client.py -u -l 1800 60009 60009
Create a mapping for the public UDP port 60009 to the private UDP port 60009 for 1,800 seconds (30 minutes)
python natpmp-client.py 60010 60010
Create a mapping for the public TCP port 60010 to the private TCP port 60010
python natpmp-client.py -g 10.0.1.1 60011 60022
Explicitly instruct the gateway router 10.0.1.1 to create the TCP mapping from public port 60011 to private port 60022

Remember to turn off your firewall for those ports that you map.

Library

The library provides a set of high-level and low-level functions for interacting with routers via the NAT-PMP protocol. The functions map_port and get_public_address cover the two operations NAT-PMP offers: creating a port mapping, and retrieving the gateway’s public IP address. Responses are stored as Python objects.

The code is fairly well-documented, so consult the NATPMP.py file for usage details.
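For a quick flavor, usage looks roughly like the following. This is a from-memory sketch – the constant names, argument order, and response fields may not match exactly, so treat NATPMP.py as the authority.

# From-memory sketch; see NATPMP.py for the real signatures,
# constant names, and response object fields.
import NATPMP

# Ask the gateway for its public-facing IP address.
print(NATPMP.get_public_address())

# Map public UDP port 60009 to private port 60009 for 30 minutes (1800 s).
resp = NATPMP.map_port(NATPMP.NATPMP_PROTOCOL_UDP, 60009, 60009, 1800)
print(resp)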

Disclaimer

This is an incomplete implementation of the specification.  When the router reboots, all dynamic mappings are lost.  The specification provides for notification packets to be sent by the router to each client when this happens; neither this library nor the client monitors for such notifications, nor does either implement a daemon process to do so.  The specification also recommends queuing requests – that is, all NAT-PMP interactions should happen serially.  This simple library does not queue requests – if you abuse it with multithreading, it will send those requests in parallel and possibly overwhelm the router.
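If you must call into the library from multiple threads, one possible stopgap until a real request queue exists is to serialize the calls yourself, for example with a single lock around every NAT-PMP call. A minimal sketch, assuming the map_port usage from the example above:

# Stopgap for multithreaded callers: funnel all NAT-PMP traffic through
# one lock so requests reach the router strictly one at a time.
import threading
import NATPMP

_natpmp_lock = threading.Lock()

def map_port_serialized(protocol, public_port, private_port, lifetime):
    # Only one thread talks to the router at any given moment.
    with _natpmp_lock:
        return NATPMP.map_port(protocol, public_port, private_port, lifetime)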

The library will attempt to auto-detect your NAT gateway. This is done via a popen to netstat on BSDs/Darwin and to ip on Linux, and is likely to fail miserably, depending on how standard the output is. In the library, a keyword argument is provided to override the default and specify your own gateway address. In the client, use the -g switch to specify your gateway manually.
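For the curious, the detection amounts to something like the following simplified sketch – the actual parsing in NATPMP.py differs in the details, which is exactly why it can fail on non-standard output:

# Simplified sketch of default-gateway detection on BSD/Darwin: scan
# `netstat -rn` output for the default route. Not the exact library code.
import re
import subprocess

def guess_gateway():
    output = subprocess.check_output(["netstat", "-rn"], text=True)
    for line in output.splitlines():
        if line.startswith("default"):
            match = re.search(r"\d{1,3}(?:\.\d{1,3}){3}", line)
            if match:
                return match.group(0)
    return None  # detection failed; pass the gateway address explicitly

print(guess_gateway())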

Conclusion

This is a relatively simple library that I use to aid my development work; it provides port forwarding and NAT traversal for my personal Python apps.  It may also come in handy in your own Python code – if nothing else, as a reference for how to interact with the router via the protocol. It is neither elegant nor well tested, and it is not meant for daily use by a normal user.  If you use the library in your code and it does not work for your configuration, I would be happy to take bug reports (or even better, patches!).

The client is intended for demonstration purposes only, though I personally use it as a command-line sysadmin tool for when I’m away from my machine but need to open up a port.

If you are a typical user looking for port forwarding tools, this is not for you. For an actual, well-designed, user-friendly program to perform dynamic port mappings on the Mac, try the great shareware package Lighthouse, which provides a user-friendly menu bar extra, persistent port mapping profiles, and UPnP support for your non-Apple routers.

Updated Feb 10, 2010
– Repackaged with setuptools. Fixed major bugs on systems that fail to detect the gateway automatically. Experimental Windows 7 support.
Updated Feb 05, 2008
– Removed broken mutexes, pending work to implement a decent request queue. Allow me to reiterate that this library is currently non-compliant with the specification’s recommendation to queue NAT-PMP requests. Try not to abuse your router when using this library in multithreaded Python code.
– Added some gateway autodetection on NT systems via netstat. Thanks to roee shlomo for the regex.

Again, comments and patches (or better thought-out implementations) are welcomed and encouraged.

download servers and the Web Developer extension


One nifty thing that the Web Developer extension for Firefox can do is live HTML editing on the currently loaded page. The feature is activated from the toolbar, under the Miscellaneous button, as the item “Edit HTML”. It pops up a text box containing the current page’s HTML. Edit to your heart’s content, and hit the Apply button (the blue-with-green-arrow button beside the search box – not exactly the most obvious icon for “Apply”, but that’s a UI critique for another time). The currently loaded web page will reflect your changes.

Obviously it will stick around only until you load some other page, since you are not actually editing the web page on the remote server itself. So how is this useful?

So MegaShares is one of those sketchy file hosting and download sites, akin to Rapidshare, MegaUpload, etc. I had a problem where some files were served from storage machine #21, which was apparently overloaded or just not configured right – it would start the download fine, but the transfer would gradually stall before completion. Wacky. There appears to be some redundancy, however, and I wondered if I could grab the file from another server by changing the machine number in the URL.

Unfortunately, as most of these places do, they prohibit direct access to a file without going through their UI, so I can’t just take the download URL, change the machine number, and pop it into the browser. I assumed they were checking referrers, so I spoofed the Referer field. No luck.

You can see where I’m going with this. Enter the Web Developer extension. I used the Edit HTML feature to change the URL on the page directly, and clicked through the changed link. Success! Their script accepted the action, and the download started from machine #3. Whatever referrer check or scripting magic they use to enforce their no-direct-access policy is still intact, since the rest of the page has not changed.

Obviously this is a specific example – if there were no storage redundancy at MegaShares, this trick would have been useless. Nevertheless, it demonstrates the power of live-editing a loaded page in your browser. Extensions like Greasemonkey are the pinnacle of this kind of editing, but for a one-off adjustment, you don’t really need the power of a full scripting environment.

Not quite a real Read/Write Web, but an interesting trick to keep in mind.

transcode compile error – preprocessor macros vs attributes

Had a problem compiling transcode 1.0.4 on my OS X Tiger (PPC) machine.

In file included from /usr/include/math.h:26,
from pvnglobals.h:26,
from pvn.h:15,
from import_pvn.c:39:
/usr/include/architecture/ppc/math.h:179: error: parse error before '__attribute__'
/usr/include/architecture/ppc/math.h:179: error: parse error before 'inline'
/usr/include/architecture/ppc/math.h:180: error: parse error before '__attribute__'
/usr/include/architecture/ppc/math.h:180: error: parse error before 'inline'

… and so forth. The whole math.h business is a red herring; or rather, it’s a symptom, not the cause.

The issue appears to stem from a preprocessor macro problem in src/transcode.h. Around line 85 or so, find the block

#ifndef always_inline
#if defined(__GNUC__) && (__GNUC__ > 3 || __GNUC__ == 3 && __GNUC_MINOR__ > 0)
# define always_inline __attribute__((always_inline)) inline
#else
# define always_inline inline
#endif
#endif

This is a rather odd and potentially problematic thing to do, because over in math.h, at line 83, we find:

#define __MATH_H_ALWAYS_INLINE__ __attribute__ ((always_inline))

which later gets used in math.h:179, 180, and so forth.

Now, because transcode.h has already redefined always_inline, the expansion of __MATH_H_ALWAYS_INLINE__ becomes something like __attribute__((__attribute__ ((always_inline)) inline)) – unlikely to compile cleanly.

One possible solution is to swap the header order around so that math.h is included before transcode.h clobbers the always_inline definition.

Another workaround: comment out the inner #if and preserve only the else branch in transcode.h:

#ifndef always_inline
# define always_inline inline
#endif

This fixes the compilation problem, presumably because the redefined always_inline no longer injects a nested __attribute__ into math.h’s macro. Unfortunately, it still changes the meaning of __attribute__ ((always_inline)) in math.h. The first solution is preferable, even though it might be more tedious to trace the include blowups that result.

The moral of the story is that it’s usually bad to screw with built-in keywords via preprocessor macros.

GUI cues for block-level copy in Disk Utility

In Mac OS X 10.4 Tiger (PPC edition), the only UI difference between a block-level copy and a file-level copy when using Disk Utility’s Restore (a.k.a. disk cloning) mode is that the progress bar label reports "Copying Blocks..." for the former and just plain "Copying..." for the latter. The difference is significant, especially for full-disk cloning operations.

To invoke block-level copy, you:

  • must not be booted from either the source or the destination partition – the Mac OS X DVD is good for this, if you do not have a different partition.
  • must tick the checkbox for Erase Destination when setting up Restore mode in Disk Utility
  • may or may not need to select “Skip Checksum”.

Unfortunately, the progress bar message confirming whether a file-level or a block-level copy is underway appears only after one has already invoked the Restore procedure, and there isn’t a “Cancel” button anywhere in sight. As if the potential for hours of wasted time (and quite possibly loss of metadata, since asr in file-copy mode doesn’t bother preserving such trivial things as file creation dates) weren’t enough to warrant a button for expressing one’s regret.

One wonders why Apple did not simply put in a checkbox for “Block-level copy”, warn you about the requirements for block-level copying, and, if you don’t meet them, ask whether you wish to proceed with a file-level copy anyway.

Probably because they thought (in their infinite Apple wisdom) that it would scare and confuse the “normals” who can’t tell the difference. Never mind that the people who would try to clone their entire drives for backup (as opposed to copying files here and there to a USB key) are probably savvy enough to care.

Wonder if this changed in Leopard.

Update
Many seem to arrive at this post wondering what a block-level copy is. Here it is in a nutshell. A typical hard disk is divided into a linear set of n logical blocks, m bytes each. In short, your files are recorded within these blocks. To keep track of things, the filesystem is responsible for maintaining more metadata on top of this. This lets it create such niceties as “folders”, and forms the overall tree of folders nested within folders that you see in Finder.

A file-level copy means that the copying program loads in the directory tree and walks the tree. When it finds a directory, it’ll load metadata to find all the files contained in it. When it finds a file, it’ll go look for the blocks that contain the file’s data, and start copying. This has a lot of overhead, since the program has to load in the nice tree abstraction first, descend into each folder looking for files to copy, and then go find the proper blocks to copy, and then finally copy the data and any metadata associated with it. A block-level copy, on the other hand, recognizes that if you literally want to copy everything from one disk to another, it’s a lot easier and faster to just start copying at block 1 until you get to block n at the end — rather than running up and down that directory tree.

A block-level copy is a literal byte-for-byte (well, one would hope) copy of one disk to another, while file-level copying creates a copy of each file and folder from one disk to another. The distinction here is subtle but important. A file-level copy from disk A to B does not necessarily result in A == B, while a block-level copy (for all intents and purposes) does. As for performance, for a full-disk clone, block-level copying should be dramatically faster than file-by-file copying.
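If it helps to see the idea in code, a block-level copy is conceptually nothing more than the following loop over raw device nodes – an illustrative sketch only, with made-up device paths, and both volumes would need to be unmounted first.

# Illustrative sketch of a block-level copy: read the source device from
# start to finish and write the bytes straight to the destination.
# Device paths are examples only; both volumes should be unmounted first.
BLOCK_SIZE = 1024 * 1024  # copy one megabyte at a time

with open("/dev/disk1", "rb") as src, open("/dev/disk2", "wb") as dst:
    while True:
        chunk = src.read(BLOCK_SIZE)
        if not chunk:
            break
        dst.write(chunk)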

Cisco VPN behind a NAT


Useful if you’re:

  • on OS X
  • using the Cisco VPN Client 4.9.01 or below
  • behind a router/NAT
  • having intermittent connectivity issues with the Berkeley Campus Full Tunnel VPN

You might also be able to use this info if you have a similar network setup and are having similar problems, but I’m not going to claim that.

Basically, the problem for me was that three connections out of four would get an IP address from the VPN, but the actual network would be unreachable – no IP could be pinged successfully. The VPN GUI would report “Bytes In: 0, Bytes Out: xxxx”, and the VPN log would be stuck in a loop of:

Sending DPD request to xx.xx.xx.xx, our seq# = 1234
...
Received DPD ACK from xx.xx.xx.xx, seq# received = 1234, seq# expected =
1234
...

The solution I’ve found is to switch on Enable Transparent Tunneling -> IPSec over UDP ( NAT / PAT ). This can be done by hitting Modify in the GUI for the appropriate Connection Entry, then going to the Transport tab and ticking the appropriate box. For good measure, I also forwarded ports 500 and 4500 on my router’s NAT, to ensure that the conventional Cisco VPN ports are open to the network (and just to do some irrational voodoo). The IPSec over TCP option, by the way, does not appear to work, despite what Berkeley IT says on its instructions page. The client refuses to connect with that option active, though in theory it should have worked. Perhaps I’m not forwarding the right ports for it.

In any case, finally, after 1.5 years of this nonsense, the Berkeley VPN doesn’t choke on me anymore (too bad I’ll be leaving here in 6 months. Argh.). Every connection I make gets through on the first time, rather than on the fourth or fifth time. It still doesn’t make sense how I was able to connect to the VPN before, though. Why would it fail intermittently, and not always?

This is why I am not a network engineer. It already gives me a headache.

Entourage sent-mail archival, episode 2


Previously, on The Sarth Repository

I had this setup going on to automatically redirect most messages I send to a repository for later search and retrieval…A month later, by pure chance, I realized that Entourage wasn’t quite deactivating the CC field on the [redirected] archival email. In essence, all the people I cc’ed on anything got spammed with a duplicate every time I sent a message… 

And now, the continuation…

So Google finally enabled IMAP for my accounts on thallos.org, which allowed me to test a new strategy for archiving sent mail. Again, the goal is to have a copy archived straight from Entourage, whenever I send a new email, to my mail repository. With proper IMAP access, however, this became much easier.

First, configure Entourage for IMAP access to Gmail / Google Apps. This is surprisingly non-trivial, since Entourage is not a supported client as of the time of this post – rather strange, considering that Entourage must be at least second or third place in install base among Mac email clients. Follow the generic instructions for IMAP setup and you should do okay. If you’re on Google Apps, the username is your_name@your_domain.tld, as per this configuration instruction.

You should have an IMAP structure for your Gmail boxes once this is complete. Simply set a rule in Rules -> Outgoing, for all messages, to copy the message to the Gmail/Sent Mail folder. In fact, this is the exact same approach you would use to back up to any IMAP-enabled mail server.

Unfortunately, it broke for me on a couple of messages. Gmail servers reported inconsistent failure messages, such as “Connection to the server failed or was dropped” and “The message could not be copied.” Some message headers also seemed to be mangled in transit, with the sender’s name dropped and so forth. The messages themselves were innocuous, text-only messages with no attachments, HTML, or any other random nonsense, so I find it very curious that these are the ones failing. Will have to look into it a bit more.

UPDATED Nov 22, 2007
See the exciting (yet depressing) episode 3 of my adventures in email archival.

Getting custom HTTP variables out of PHP

PHP 5.0 stores HTTP request headers in the $_SERVER superglobal as key-value pairs. It mangles their field names, however, by:

  • prepending “HTTP_” to the key
  • replacing “-” with “_” in the key
  • uppercasing all letters

Say that your custom HTTP client sends X-Hello: World as a header. To retrieve the value (“World”) from PHP, the correct key to use is $_SERVER["HTTP_X_HELLO"].
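In other words, the transformation from header name to $_SERVER key is purely mechanical; expressed in Python here just for illustration:

# The mangling PHP applies to a header field name before it lands in
# $_SERVER, expressed in Python purely for illustration.
def php_server_key(header_name):
    return "HTTP_" + header_name.replace("-", "_").upper()

print(php_server_key("X-Hello"))     # HTTP_X_HELLO
print(php_server_key("User-Agent"))  # HTTP_USER_AGENT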

This does fit the existing access pattern (User-Agent: is retrieved via $_SERVER['HTTP_USER_AGENT']), but it was not well documented on the corresponding page for reserved variables (as of today, October 7, 2007). It took a bit of trial and error for me to figure this out.

I’m sure that amongst the insanely numerous and ill-organized set of functions that PHP provides, there is one to do this exact task without reverse-engineering its key-mangling algorithm. But this way works too.

Entourage: thwarting archival strategies since 2004

So I use Microsoft Entourage as my main email client, and had been wanting for some time to get my messages exported off my local drive. As much as I trust my laptop and my backups, one good earthquake later and all of that would be futile.

Getting my message archives preserved (with all metadata intact, like Sent and Received dates, etc.) was the easy part. Grabbing all future messages was the hard one. Of course, Microsoft, in its infinite wisdom, didn’t include an auto-bcc option in Entourage.

I had a setup going to automatically redirect most messages I send to a repository for later search and retrieval: except for select messages that I mark as confidential, an outgoing rule redirects a copy of each message to the archive.

A month later, by pure chance, I realized that Entourage wasn’t quite deactivating the CC field on the redirect for archival. There is a bug that resends the message to all CC’ed addresses on redirect. For example, say I send a message to address a, cc it to address b, and redirect a copy to my archive address:

1. The first copy goes out to a and b.
2. Then, the redirected copy will be sent to archive and b, as b appears on the CC list.
3. End result: a receives 1 copy, b receives 2 copies, and archive receives 1 copy.

In essence, all the people I cc’ed on anything got spammed with a duplicate every time I sent a redirected copy via Entourage’s Outgoing rule. This is stupid, and Microsoft’s website doesn’t warn you about this. Try it for yourself if you don’t believe me.

Had I been more diligent at searching the web, or even just at testing this archival strategy, this wouldn’t have happened. Plus, I would have noticed one fellow complaining that all contacts on the CC list, for every email, received a copy of his archived messages. Ouch. I’m glad I didn’t try redirecting all of my sent box. (There is another strategy, which I will outline sometime, that is far easier – but it can’t do real-time auto-bcc.)

To all the people whom I inadvertently spammed, I’m awfully sorry. This won’t happen again.

Windows IE 6 ignores text/plain mimetype

A fairly edge-case scenario that probably rarely comes up, but it appears to be another gotcha. Apparently IE 6 for Windows, on occasion, decides it knows better than the web server what format a file is. Instead of using the mimetype supplied by the web server, as all good browsers tend to do, IE performs some heuristics on the file and overrides the mimetype with its own guess. The type text/plain is one such stupid circumstance.

Annoyingly, IE will insist on downloading plaintext files in some cases, instead of rendering them in the browser. This usually occurs when a script is generating a text/plain document on the fly, but it can also happen under other circumstances if IE’s hard-coded heuristics come up with a different result than the server-proclaimed mimetype.

A client-side workaround for text/plain is possible. You’d need to edit the Windows Registry (oh joy). In HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings, add a DWORD value named IsTextPlainHonored and set it to 0x1. This will make IE behave correctly for the text/plain mimetype. This solution comes from the MS Knowledge Base article “Text/Plain” Content-Type Header Field Is Ignored. There are also some further explanations of how mimetypes are resolved in the MSDN article on mimetype detection in IE.
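For machines you control, the registry tweak can also be scripted instead of clicked through by hand. A hedged sketch using Python’s winreg module (run it as the affected user, since the value lives under HKEY_CURRENT_USER):

# Sketch: create IsTextPlainHonored = 1 under the current user's
# Internet Settings key so that IE honors the text/plain mimetype.
import winreg

KEY_PATH = r"Software\Microsoft\Windows\CurrentVersion\Internet Settings"

with winreg.OpenKey(winreg.HKEY_CURRENT_USER, KEY_PATH, 0,
                    winreg.KEY_SET_VALUE) as key:
    winreg.SetValueEx(key, "IsTextPlainHonored", 0, winreg.REG_DWORD, 1)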

Unfortunately, this is not a solution if this behavior comes up in a web-based tool for external use, as every client machine’s registry would have to be modified. This change may also carry security implications (actually, I’m completely guessing here, because I don’t quite see why the IE team decided to “not honor” mimetypes for text/plain…).

The context:

A PHP script in a project I maintain pulls a text file from a remote location, and then prints it to the browser as Content-type: text/plain. A hack, to be sure, but simple enough to get the job done. This works fine in Firefox and friends, but not in Windows IE. IE insists that this is a PHP script file that must be downloaded. Of course, once it’s downloaded, you can fire up Notepad and see that it’s bloody plaintext. Firefox et al. will render it in the browser as expected.

In this case, the script was only used for internal testing, so I switched all the test machines to honor plaintext mimetypes. A longer-term workaround would probably involve porting the output to XML instead.

Fixing FilePlanet’s stupidity on the Mac


Lately I haven’t been able to download files from fileplanet.com on my Mac. It’s inane: downloads apparently require an ActiveX control. I’m appalled at the utter stupidity of excluding all non-Windows users from your download service just to set up a download queue. Can’t you put up a Flash control instead? Just as shiny and unusable, but actually compatible with other operating systems.

It gets better. The good news: the designer had the foresight to set up a fallback mechanism, a plain old HTML queue. The bad news: it simply presents you with a 403 Forbidden when clicked.

As it turns out, I found a post that contained a possible solution. Actually, that post is a bit unnecessarily complicated. Apparently, they’re blocking all browsers without a Windows user agent – on the fallback mechanism that was supposed to work for all platforms. Argh.

Until FilePlanet addresses this issue (which could be tomorrow, or never), the simplest solution (the one that worked for me) was to switch my user agent in Firefox (via the aforementioned and highly recommended User Agent Switcher) to a Windows browser (try the default Opera XP user agent). Then click on the fallback queuing link, and it should kick you into the download page.

Note that the User Agent option in Safari’s Debug menu (at least in v2.0.x) will not work straight up. Believe me – as primarily a Safari user, I’ve tried hard to make it work. FilePlanet pops up a new window when a download is selected, and the Debug menu setting applies only to the active window; since the download window is a pop-up, you never get a chance to change the user agent before FilePlanet denies you access. So for now, Firefox + User Agent Switcher is the solution. If you have a browser (or a future Safari) that lets a faked user agent persist across windows spawned from the initial window, that browser would work too.

UPDATE 3/27/2009:
Feedback in the comments section reports that this is still a problem for many users. Appalling. It’s been 2 years.