php 5.2.5 compile error – macro issue

Like the previously mentioned compile problem with transcode, PHP’s tidy extension appears to have a macro-induced collision problem on OS X Tiger. In particular the compile run blows up with:

In file included from /usr/include/tidy/tidy.h:70,
from /tmp/php-5.2.5/ext/tidy/tidy.c:34:
/usr/include/tidy/platform.h:515: error: duplicate 'unsigned'
/usr/include/tidy/platform.h:515: warning: useless type name in empty declaration

This exhibits a similar red herring, in that the compiler says a system include file (tidy/platform.h) is causing the issue. In actuality, this is most likely due to a preprocessor macro issue in PHP. In platform.h:515, we see:
typedef unsigned long ulong;

Now, in main/php_config.h:121:
#define ulong unsigned long

Since php_config.h is set to be included before platform.h, on this particular build configuration, platform.h:515 now becomes:
typedef unsigned long unsigned long;

Hence the compiler error message and the red-herring about your system include file (platform.h).

Since php_config.h is included by just about every main PHP source file, the easier solution is to switch the include order of tidy.h and the php default includes. As such, in tidy.c, we go from this:
#include "php.h"
#include "php_tidy.h"

#if HAVE_TIDY
...
#include "tidy.h"
#include "buffio.h"

to this:
#include "tidy.h"

#include "php.h"
#include "php_tidy.h"

#if HAVE_TIDY
...
#include "buffio.h"

You may also wish to wrap it with #if HAVE_TIDY and #endif, to preserve the original logic (note that tidy.h was originally within the #if), but in my case it seemed to have gone okay without it.

In my case, this compiled just fine with no further complaints. I don’t like doing this – perhaps the PHP-tidy devs had reasons for putting the includes in this order. But from a pragmatic point of view… hey, it compiles.

Again, the moral of the story: when screwing around with macros, try to avoid naming it something that will collide with system libraries.

Updated Feb 6, 2008
As jhardi notes in the comments section, the folks behind tidy have patched their latest version to work around this issue. Kudos to the tidy devs, and to the others who found this bug way before I even had to care about it.

Minor rant: so…this workaround requires that we upgrade tidy. Since Mac OS X doesn’t regularly update its Unix-y interiors , we’re left with the choice of overwriting an Apple system library, or shadowing it and remembering that we shadowed it. I picked option #1, since it’s unlikely that anything is going to blow up due to a new tidy library, but some people are understandably wary of overwriting system libraries (this is, in fact, one of the reasons why package managers like MacPorts stick copies in alternate directories instead).

This still leaves the current stable PHP not compiling with older tidy versions, on the Mac and any other platform using that typedef. And, as a bonus, there is a possibility that we might be doing this again if PHP ever decides to add another library that uses the word “ulong” somewhere.

That’s just lovely. I hope they address this in their future releases.

building a dynamic library on OS X

[UPDATE: 02/26/2010
As of Xcode 3.x, the -shared argument seems to be working on OS X to create shared objects. the -dynamiclib switch still works for creating dylibs. This post is left for historical curiosity.]


Some free software packages have an optional make target to build shared libraries out of their core application functions. Unfortunately, some of these packages are set up to compile for typical Linux shared objects. For example:

gcc -shared -Wl,-soname,libfoo.so.1 -o libfoo.so.1 $(OBJ)

where $(OBJ) is your set of object files.

On Apple’s GCC (for Tiger and Leopard, at least), there is no -shared switch, and no -soname switch either. Compilation will fail at this step. The incantation to build a shared object would translate to something like:

gcc -dynamiclib -Wl,-headerpad_max_install_names,-undefined,dynamic_lookup,-compatibility_version,1.0,-current_version,1.0,-install_name,/usr/local/lib/libfoo.1.dylib -o libfoo.1.dylib $(OBJ)

headerpad_max_install_names will allow you to change the install path with install_name_tool later, should the install_name change during the life of the compiled object.

You must also ensure that your object files are compiled correctly for shared library usage. Usually this means compiling with -fPIC and -fno-common, depending on your code.

I’ve had to do this infrequently, which means I forget the syntax the next time and have to google for it again. Too many “cache-misses”, so to speak.

A NAT-PMP client library for Python

Something that I’ve put together and might be somewhat useful to others. I wrote a NAT-PMP (Network Address Translation Port Mapping Protocol) library and testing client in Python. The client allows you to set up dynamic port mappings on NAT-PMP compatible routers. Thus this is a means for dynamic NAT traversal with routers that talk NAT-PMP. In practical terms, this is basically limited to the newer Apple AirPort base stations and the AirPort Express, which have support for this protocol. I’m currently unable to support UPnP (the dominant port mapping protocol in non-Apple routers), though I’m sure either someone has written bindings for Python, or I’ll have to do it eventually once my AirPort Express dies.

In any case, this library puts a thin layer of Python abstraction over the NAT-PMP protocol, version 0, as specified by the NAT-PMP draft standard.  The purpose of this is simple.  I needed to establish port forwarding without rebooting my AirPort router.

Normally, when you need to have a public port on your router forwarded to a private port on your box behind it (say, if you write a server that listens on a port, or have a P2P client running on a port) you’d have to fire up AirPort Utility.app -> Advanced -> Port Mappings, set your mappings, and then hit Update…which reboots the router, killing your network connection and anything you might be doing. There has been no good way to do this programmatically with AirPort routers, when your program needs to negotiate a port forwarding with no manual intervention. Further, there isn’t a command-line tool to open up a port when you’re SSH’ed into your machine but don’t have a GUI.

Creating the port mappings dynamically via NAT-PMP allows port forwarding to happen without the reboot. When my server runs, it can make a call to the router to start forwarding a port.  Furthermore, when I’m done with testing my server, I stop renewing the port mapping, and it expires when its lifetime runs out.  Thus, I won’t forget to delete the mapping later (and reboot the router yet again) when I want that port secured behind the NAT.

Files

py-natpmp repository on Github. If you want the unpackaged source code, you can find the latest versions there.

py-natpmp-0.2.1.tar.gz – a proper Python setuptools package for py-natpmp, if you’re rather take a tarball.

(To enable NAT-PMP on the AirPort router, go to AirPort Utility.app -> Internet -> NAT -> Enable NAT Port Mapping Protocol. Requires Mac OS X 10.4 to succeed. Older AirPort utility software may have this option hidden elsewhere.)

The code is BSD licensed, so feel free to take it. I’d love knowing where this code ends up (just out of personal curiosity), if you drop me a line, but that’s quite optional.

Client

To use the client, grab it and the above library. Make sure you have the library in the same directory as the client script or otherwise on your Python instance’s sys.path. Invoke the client on the command-line (Terminal.app) as python natpmp-client.py [-u] [-l lifetime] [-g gateway_addr] public_port private_port.

For example:

python natpmp-client.py -u -l 1800 60009 60009
Create a mapping for the public UDP port 60009 to the private UDP port 60009 for 1,800 seconds (30 minutes)
python natpmp-client.py 60010 60010
Create a mapping for the public TCP port 60010 to the private TCP port 60010
python natpmp-client.py -g 10.0.1.1 60011 60022
Explicitly instruct the gateway router 10.0.1.1 to create the TCP mapping from 60010 to 60022

Remember to turn off your firewall for those ports that you map.

Library

The library provides a set of high-level and low-level functions to interact via the NAT-PMP protocol. The functions map_port and get_public_address provide the two high-level functions offered by NAT-PMP. Responses are stored as Python objects.

The code is fairly well-documented, so consult the NATPMP.py file for usage details.

Disclaimer

This is an incomplete implementation of the specification.  When the router reboots, all dynamic mappings are lost.  The specification provides for notification packets to be sent by the router to each client when this happens.  There is no support in this library and client to monitor for such notifications, nor does it implement a daemon process to do so.  The specification recommends queuing requests – that is, all NAT-PMP interactions should happen serially.  This simple library does not queue requests – if you abuse it with multithreading, it will send those requests in parallel and possibly overwhelm the router.

The library will attempt to auto-detect your NAT gateway. This is done via a popen to netstat on BSDs/Darwin and ip on Linux. This is likely to fail miserably, depending on how standard the output is. In the library, a keyword argument is provided to override the default and specify your own gateway address. In the client, use the -g switch to manually specify your gateway.

Conclusion

This is a relatively simple library that I am using in aiding my development work, and provides for port forwarding and NAT traversal in my personal Python apps.  It may also come in handy for you in your Python code — if nothing else, as a reference to see how to interact with the router via the protocol. It is not elegant/well tested, and not meant for daily use by a normal user.  If you use the library in your code and it does not work for your configuration, I would be happy to take bug reports (or even better, patches!).

The client is intended for demonstration purposes only, though I personally use it as a command-line sysadmin tool for when I’m away from my machine but need to open up a port.

If you are a typical user looking for port forwarding tools, this is not for you. For an actual, well-designed, user-friendly program to perform dynamic port mappings on the Mac, try the great shareware package Lighthouse, which provides a user-friendly menu bar extra, persistent port mapping profiles, and UPnP support for your non-Apple routers.

Updated Feb 10, 2010
– Repackaged with setuptools. Fixed major bugs on systems that fail to detect the gateway automatically. Experimental Windows 7 support.
Updated Feb 05, 2008
– Removed broken mutexes, pending work to implement a decent request queue. Allow me to reiterate that this library is currently non-compliant with the specification’s recommendation to queue NAT-PMP requests. Try not to abuse your router when using this library in multithreaded Python code.
– Added some gateway autodetection on NT systems via netstat. Thanks to roee shlomo for the regex.

Again, comments and patches (or better thought-out implementations) are welcomed and encouraged.

download servers and the Web Developer extension


One nifty thing that the Web Developer extension for Firefox can do is live HTML editing, on the currently loaded page. The feature is activated via the toolbar, under the Miscellaneous button, via the item “Edit HTML”. It pops up a text box containing the current page’s HTML. Edit to your heart’s content, and hit the Apply button (the blue-with-green-arrow button beside the search box – not exactly the most obvious icon for “Apply”, but that’s a UI critique for another time). The current loaded web page will reflect your changes.

Obviously it will stick around only until you load some other page, since you are not actually editing the web page on the remote server itself. So how is this useful?

So MegaShares is one of those sketchy file hosting and download sites, akin to Rapidshare, MegaUpload, etc. I had a problem here where some some files are served from storage machine #21, which was apparently overloaded or just not configured right – it would start the download fine, but the download gradually stalls before completion. Wacky. There appears to be some redundancy, however, and I wondered if I can grab the file from another server by changing the machine number in the URL.

Unfortunately, as most of these places do, they prohibit direct access to a file without going through their UI, so I can’t just take the download URL, change the machine number, and pop it in the browser. I assumed they were checking referrers, so I spoofed the REFERER field. No luck.

You can see where I’m going with this. Enter the Web Developer extension. Used the Edit HTML feature to change the URL on the page directly, and clicked through the changed link. Success! Their script accepts this action, and the download starts from machine #3. Whatever referrer check or scripting magic that they use to enforce their no-direct-access policy is still intact, since the rest of the page has not changed.

Obviously this is a specific example – if there were no storage redundancy at MegaShare, this trick would have been useless. Nevertheless, it demonstrates the power of live-editing a loaded page, in your browser. Extensions like Greasemonkey is the pinnacle of this kind of editing, but for a once-off adjustment, one doesn’t really need the power of a full scripting environment like that.

Not quite a real Read/Write Web, but an interesting trick to keep in mind.