macOS Safari displays blank page when accessing flask development server on localhost:5000

There is a very strange behavior in macOS Ventura 13.1 (and possibly earlier macOS versions as well), where Safari and Google Chrome cannot access a Python Flask web development server on port 5000. The URL http://localhost:5000 either renders as a completely blank page, or responds with a 403 Forbidden.

Context

In short, Flask is a lightweight web application framework for Python. A typical Flask webapp is run in development mode as

flask --debug run

By default, this will start a Flask development server on port 5000:

* Debug mode: on
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on http://127.0.0.1:5000
Press CTRL+C to quit

Subsequently, curl can be used from the command line to reach the server:

curl "http://localhost:5000/"

In my case, it responds with

Hello world

Note that localhost and 127.0.0.1 are, in typical cases, equivalent: both point to the local machine.

Problem

Hitting http://localhost:5000 from Safari on the same machine, however, displays a completely blank page. The development server itself shows no incoming GET request, which suggests the request never reached the Flask application at all.

Hitting http://localhost:5000 from Chrome displays a 403 Forbidden error instead of a blank page:

Access to localhost was denied
You don't have authorization to view this page.
HTTP ERROR 403

And again, the Flask dev server itself does not show a successful GET request being made.

As a final symptom, if http://127.0.0.1:5000 is used instead, the GET request is successfully received by the Flask dev server, which responds to it normally.

Solution

  1. On macOS Ventura, go to Settings -> General -> AirDrop & Handoff and turn off AirPlay Receiver.
  2. Alternatively, instruct Flask to run its dev server on a port other than 5000 and access that port instead (a quick verification sketch follows this list):
    flask --debug run -p 8000
  3. Alternatively, always interact with Flask on the IP literal 127.0.0.1:5000 rather than the localhost DNS name. This is what the Flask instructions say to do anyway.
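To confirm the conflict and verify the port workaround in item 2, something like the following works (a sketch; it assumes the lsof and flask commands are on your PATH):

# See which processes are listening on TCP port 5000.
# With AirPlay Receiver enabled, expect ControlCenter entries here.
lsof -nP -iTCP:5000 -sTCP:LISTEN

# Run the dev server on an unoccupied port instead...
flask --debug run -p 8000

# ...and, from another shell, confirm the request now reaches Flask, not AirTunes.
curl -i "http://localhost:8000/"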

Discussion

It turns out there is a port conflict, of sorts. On macOS Ventura (and possibly Monterey as well), the OS’s built-in AirPlay Receiver is also listening on TCP port 5000 for some purpose. Running

lsof -i -P

...
ControlCe  423 yliu    7u  IPv4       0t0  TCP *:5000 (LISTEN)
ControlCe  423 yliu    8u  IPv6       0t0  TCP *:5000 (LISTEN)
...
Python    1374 yliu    5u  IPv4       0t0  TCP localhost:5000 (LISTEN)

we can see that ControlCenter (really AirPlay Receiver) is listening on TCP *:5000 and the Python Flask app is listening on localhost:5000.

In fact, if the Flask server is turned off, and a curl -i command (to show response headers) is issued to localhost:5000, this is the response:
curl -i "http://localhost:5000/"

HTTP/1.1 404 Not Found
Content-Length: 0
Server: AirTunes/670.6.2

I’m not sure why this situation doesn’t cause a port conflict and throw a socket error for “address already in use”, which would block the Python app from starting at all. Similarly, I don’t understand why it doesn’t affect curl when run on the command line, which still successfully reaches the Flask server. My guess, going by the lsof output above: the wildcard *:5000 listener and the specific 127.0.0.1:5000 listener are apparently allowed to coexist, and the browsers are probably resolving localhost to the IPv6 address ::1 (where only the AirPlay receiver is listening), while curl happens to connect over IPv4.

Nevertheless, changing Flask’s socket to an unoccupied port such as 8000, or turning off the offending macOS component (AirPlay Receiver) that is using the port, addresses the conflict. Now both Safari and Chrome will be able to send requests to the actual Flask server, rather than the AirTunes server also residing on port 5000.

Setting up OpenSSH Server on Windows 10 Anniversary Update

UPDATE 2017-12-22

Microsoft now ships a real SSH server (and client!) with Windows Fall Creators Update, no Cygwin or Linux subsystem required. Finally.

This post remains for historical context only.

Microsoft has finally landed its anniversary update for Windows 10. Among all the random useless features is an actual Ubuntu Linux subsystem within Windows, with the ridiculously silly name “Bash on Ubuntu on Windows”. Goodbye Cygwin?

One of the first things I wanted to try was to set up the SSH server, so I can remote-login from my real box. Getting one up was actually fairly easy, if you can get past a few problems with weird red-herring error messages.

The problem

To recap, installing the OpenSSH server is as easy as popping open a bash shell (I assume you’ve already figured out how to install the Linux subsystem and have popped open Ubuntu):

sudo apt-get install openssh-server

If you try to now start the server with

sudo service ssh start

It’ll respond with

initctl: Unable to connect to Upstart: Failed to connect to socket /com/ubuntu/upstart: Connection refused

but subsequently

Starting OpenBSD Secure Shell server sshd [OK]

When you now try to connect in from a remote host, there are two outcomes:

  1. There is no SSH server on port 22.
  2. There is an SSH server on port 22, but it responds only to your Windows password and not your Ubuntu Linux user password. If login is successful, it drops you into a DOS prompt instead of a bash prompt. Trying to run bash within the prompt generates a response:
    "Error: 0x80070005".

Diagnosis

There are multiple underlying problems here.

  1. First, it’s important to note that there is now a separate Windows SSH server on port 22, distinct from the OpenSSH server installed under the Ubuntu subsystem. If you telnet (yes, telnet) into port 22 and the host greeting is SSH-2.0-MS_1.100, you’ve run into the Microsoft SSH implementation (hence the MS in the greeting). A quick banner check is sketched after this list.

    I’m not sure what this server is supposed to do (and, in fact, starting SSH servers on standard ports without explicitly telling the user seems like a potential security problem to me, in addition to being kind of jerk-ish). I do know it is launched when you reboot with Developer Mode on. You know, Developer Mode, the mode you have to turn on to run Bash on Ubuntu on Windows. This server is occupying port 22, so you cannot launch another SSH server to listen on that port. This is what causes the silent failure of sshd to start up, not a broken Upstart or any of the nonsense in the displayed error message.

  2. At the time of this post, the Ubuntu compatibility layer on Windows apparently does not implement chroot or one of the related system calls needed for OpenSSH privilege separation.
  3. If you launched bash from DOS cmd.exe, instead of the Bash on Ubuntu on Windows shortcut, /mnt/c/Users/[Windows Username] (the directory you start with in DOS) is not your Linux home directory. .ssh config files left in this directory will not configure your OpenSSH server.
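As an aside, a quick way to check which server is answering on port 22 is to read the SSH identification banner it sends on connect (a sketch using nc; plain telnet shows the same thing, and 10.0.1.208 is just the example host address used later in this post):

# Print the first line the server sends: its SSH version banner.
#   SSH-2.0-MS_1.100     -> Microsoft's SSH implementation (Developer Mode)
#   SSH-2.0-OpenSSH_...  -> the Ubuntu subsystem's OpenSSH
nc -w 3 10.0.1.208 22 </dev/null | head -n 1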

Symptoms & Solutions

  1. To resolve the problem of conflicting SSH servers, move your OpenSSH server to a port other than 22. This can be done by editing /etc/ssh/sshd_config and changing the Port configuration (see the sketch after this list). Be sure to open this new port in your Windows Firewall for inbound connections, as this firewall configuration will not be done for you automatically. I set mine to 60022.
  2. To enable login, also change UsePrivilegeSeparation in the same config file to no. Failure to do so will cause the server to respond with Connection reset by nnn.nnn.nnn.nnn
  3. To enable public key auth and set other configurations, the Linux subsystem home directory is, as in normal Ubuntu, /home/<yourname>.
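Putting items 1 and 2 together, the config edits look something like this (a sketch; 60022 is just the port I picked, the firewall rule name is arbitrary, and the sed patterns assume the stock Ubuntu 14.04 sshd_config):

# In the bash shell: move sshd off port 22 and disable privilege separation.
sudo sed -i 's/^Port 22$/Port 60022/' /etc/ssh/sshd_config
sudo sed -i 's/^UsePrivilegeSeparation yes$/UsePrivilegeSeparation no/' /etc/ssh/sshd_config

# From an elevated cmd.exe, open the new port for inbound connections:
#   netsh advfirewall firewall add rule name="Ubuntu sshd" dir=in action=allow protocol=TCP localport=60022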

Having done this, you can now start or restart the server from the bash shell

sudo service ssh restart

And connect from a remote host with your Ubuntu user credentials and/or public key:
$ ssh -p 60022 10.0.1.208
...
Welcome to Ubuntu 14.04.5 LTS (GNU/Linux 3.4.0+ x86_64)

(And if you telnet, the host message should be something like
SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.7
)

restoring Safari preferences from backup files in OS X Mavericks

Recently I had the misfortune of having to restore some Safari settings from backup, on OS X 10.9 Mavericks. I have done this many times before on older OS X versions, without incident — simply pull the various preference files such as com.apple.Safari.plist from backup and replace the damaged/unwanted ones. It takes all of 2 minutes, and years ago I had already written a shell script to do exactly that.

It turns out that as of Mavericks, Safari is incredibly resistant to conventional methods of preference file backup and restoration. Considering that preferences in OS X have always been stored in XML-based preference lists, you would think (as in previous operating systems from 10.0 to 10.7) that taking the relevant preference files and replacing the unwanted new ones in ~/Library/Preferences would be enough. But no, an incredible amount of effort is now required for a simple task:

  1. unhiding the Library directory, because clearly we’re all Windows babies who cannot be trusted to see where application preferences are stored
  2. replacing the actual preference files, scattered around the system in ~/Library/Safari, ~/Library/Preferences, and ~/Library/Caches, hopefully remembering to have turned off iCloud first, or else watching changes get clobbered by iCloud sync (or, worse yet, having experimental changes propagate across a network of Macs and iDevices)
  3. resetting the preference cache daemon, cfprefsd, so that preference changes are reflected in a running system. This is where a lot of people get stuck in general, judging by Google results; when they replace preference files and find that their changes aren’t being reflected, it leads them on a wild goose chase for “hidden preference files” for Safari, when the answer lies in a simple yet utterly non-obvious background daemon.
  4. restoring the list of installed extensions — which, incredibly, is NOT stored in the deceptively named com.apple.Safari.Extensions.plist, but in the login.keychain.

Context

Had some issues where certain websites behaved differently in private browsing mode than in normal browsing mode. I deduced there was some kind of corrupted stored state, whether a cookie or localStorage issue. Had the brilliant idea of setting Safari preferences aside, thus resetting Safari to factory state, and then dividing and conquering by restoring parts of the settings until the problem recurred. I’ve done this many times before.

First I turned off iCloud sync, having been bitten by sync propagation of experimental changes in the past. This is pretty important if you don’t want to blow up Safari bookmarks (at the very least) across all Apple-manufactured, iCloud-compatible devices. I then removed ~/Library/Preferences/com.apple.Safari.plist, ~/Library/Preferences/com.apple.Safari.Extensions.plist, ~/Library/Safari, ~/Library/Caches/Safari, and ~/Library/Cookies. After resetting to confirm that some issues had disappeared, I moved some files from backup back to their original locations. Imagine my surprise when nothing was restored, and all my Safari extensions (installed from the extension store or custom-built by me) disappeared.

Process

Increasingly desperate, I started to trace filesystem accesses using fs_usage. It showed nothing out of the ordinary. 30 mins of reviewing useless forum posts later, I pieced together a multi-stage solution. It turns out there were two separate obstacles.

Preference caching

Presumably to save energy, OS X Mavericks caches application preferences (in RAM?) using a daemon called cfprefsd. Instead of pulling their preferences from XML files on disk at launch, applications request them from the daemon. The defaults command has been modified to operate through this daemon, so if you had been working with preferences from the command line (as I have been), the change has been transparent.

However, if the preference files are changed or edited directly, the change is not propagated to the preference cache daemon. When the app is opened again, the cached version takes precedence and is re-written out to disk, clobbering the restored versions.

This does not mean there are hidden Safari preferences somewhere that you haven’t found, though you might think this at first. When Safari is reset manually from the filesystem, or if the plist files are edited, cfprefsd must be reset as well.

There exists a cfprefsd daemon for every logged-in user, running under that user’s privileges, as well as a root-owned one. Safari preferences are stored under the user domain, so the user-specific daemon is the one that needs a reset when files change. You can quit the process from Activity Monitor, or use killall cfprefsd. A logout-login cycle would also reset the user-specific daemon.
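Put together, a manual restore looks roughly like this (a sketch; the backup path is hypothetical, and Safari must not be running):

# Put the backed-up preference file in place first (backup path is illustrative).
cp ~/backup/com.apple.Safari.plist ~/Library/Preferences/

# Flush the user's preference cache daemon. launchd respawns it on demand,
# and the respawned daemon reads the restored files from disk.
killall cfprefsd

# Sanity-check that the restored domain is readable again.
defaults read com.apple.Safari > /dev/null && echo "domain readable"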

Extension list caching

Having done this reset with the backup files in place, most preferences will be restored on next launch, *except* the list of extensions you had installed previously. That list will remain empty, even though all the extensions and their settings have been restored to ~/Library/Safari.

For a long time I traced ~/Library/Preferences/com.apple.Safari.Extensions.plist, and wondered why it wasn’t being read.

An Apple discussion forum post (shockingly enough) gave a vital clue. There exists an “application password” in the login keychain titled “Safari Extensions List”. Whether it is merely a cryptographic key or the actual list of extensions is unknown, but it is the critical piece for restoring extensions. When extensions are reset by moving their files away, this keychain item is apparently emptied out. The entire login keychain, being an encrypted document, has to be restored to a corresponding previous version to restore access to the previous extensions. Without this, all extensions would have to be reinstalled manually (and a new copy of each extension file gets stored into ~/Library/Safari/Extensions, instead of the previous version being reused).
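If you want to confirm the item exists before restoring an entire keychain backup, the security command-line tool can look it up by label (a sketch; the item’s exact label and class may vary by Safari version):

# Look for the "Safari Extensions List" application password in the login keychain.
security find-generic-password -l "Safari Extensions List" ~/Library/Keychains/login.keychain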

Discussion

Given recent focus on energy consumption, I can understand preference caching. However, it’s not that hard to track filesystem changes (the Time Machine/Spotlight APIs explicitly do this!) and reload appropriate preferences when they are changed on disk. It would show respect for power users and developers who might need to interact with the preferences system in a more convenient way.

But stuffing extension lists in an obscure corner of a password keychain? What sense does that make? Is my list of extensions (not the actual extension data or settings, mind you — those are in plaintext on the disk for anyone to copy and look at) such privileged information that it has to reside alongside my login password? Why can’t you just read the list of extensions, oh I don’t know, from the list of extension files installed into the Extensions directory? Wouldn’t that be a lot more reasonable?

Upgrading the Seagate Barracuda 7200.11 to firmware SD1A

TL;DR: If you’re applying firmware upgrade SD1A to Seagate drives, double-check that the firmware actually applied. If the Seagate patcher doesn’t work, set SATA to Legacy mode in the BIOS instead of the more modern AHCI mode.

So perhaps you have heard of Seagate’s little manufacturing issue with its internal 3.5-inch Barracuda 7200.11 1TB drives a while back — namely, that some drives shipping with SD15 firmware are dying horribly. I had the unfortunate experience of buying such a hard drive — the ST31000340AS — as a scratch disk for my main machine, a MacBook Pro with a mere 240 GB internal drive (a pre-unibody revision, where the HD is insanely difficult to replace).

Seagate did in fact issue a firmware update — SD1A — that supposedly addressed this issue, but of course, there’s one catch: you can’t install the firmware through an external drive enclosure. In communication with Seagate support, a representative confirmed that for those of us without a desktop tower that has a SATA bay, we’re hosed:

Unfortunately, due to the nature of firmware updates and the way external drives work, the firmware update program cannot directly communicate with the drive in the manner it needs to in order to be able to upload the new firmware to the drive. It must be plugged into an internal SATA controller in order to update the drive.

Fair enough. That makes technical sense — but of course, it doesn’t work for me. I asked whether they would handle a mail-in repair, given that I have no easy access to such a desktop. The answer, of course, is No.

I have to find a desktop, open it up, jam this baby in (possibly in place of the existing drive if there’s only one bay), update the firmware, and put everything back together. Sadly, most of my friends who still own desktops would not trust me that far.

Half a year passes, and I finally find a sucker (er, good friend) who’s gullible (er, awesome) enough for me to try this procedure on his machine. The fellow owns a nice if aging Dell Precision T5400, which comes with two SATA bays (so I don’t have to inflict undue harm onto the existing system). Since this thing can run two drives at once, I can use the first method (a Windows-based firmware updater), though I burned a boot CD for the second method just in case. I popped in the drive, fired up Windows XP, downloaded the Windows-based Firmware Update Utility, double-clicked, and thought it was the (triumphant) end. In fact, it took 3 hours of my life to find out just how deep this rabbit hole goes.

Problem 1: The lying updater

The firmware updater will give a bunch of scary warnings and then reboot the machine. It will automatically reboot to a Seagate Loader screen, which attempts to apply the patch to all eligible SATA drives. To its credit, it’ll skip the non-qualifying (i.e. non-Seagate, non-Barracuda, etc.) drives, but it’ll still try them out first. At the end of the process, it will report “firmware downloaded” and “SUCCESSFUL” or some variant thereof, and automatically reboot back into Windows.

At this point, I advise you to use the SeaTools utility to verify that the firmware update actually applied. Despite its claims, if you were on a stock-setup Dell T5400 (or perhaps other models as well), this will prove that the updater is a lying scumbag. And in fact, this particular drive still reported firmware SD15, the broken one.
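An alternative check, which is my addition and not part of Seagate’s procedure: on any system with smartmontools installed, the drive’s identify data reports the firmware revision directly:

# The firmware revision appears in the identify block (device path will vary).
smartctl -i /dev/sda | grep -i firmware
# "Firmware Version: SD15" means the update did not take; SD1A means it did.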

Problem 2: The broken Boot CD

To save both me and my gracious host (who’s starting to suspect my computer-fixin’ skills now) some time, I decided to try the boot CD method, rather than pounding my head trying to see why the updater was lying. I downloaded the boot CD from the same Seagate Support site above, burned it to disk, and tried it out.

The result is a new SelfSolved posting: SelfSolved #59: getFatBlock error when upgrading Seagate Barracuda 7200.11 firmware. In essence:

The FreeDOS boot CD reports a number of “error reading partition table drive 01 sector 0” errors. This is followed by “get Fatblock failed:0x000000e8” or some variant of “getFatBlock failed:”. The FreeDOS boot process appears to stall at this stage, and does not continue to the firmware flasher program.

That was lovely.

The Solution

I chased some red herrings. I came across postings about failures in various FreeDOS-based Seagate tools. One such post mentioned that it took a long time for the boot disc to get past the “error reading partition table” errors, but I waited forever (well, 15 minutes) and the boot process did appear to be frozen/stalled. I reformatted the drive via diskpart clean, thinking that the getFatBlock and partition table errors were related to a non-MBR partition table (I had it set to GPT). I should have realized, of course, that the errors were completely unrelated to filesystems, despite the “fat block” in the message.

The actual solution is deceptively simple — the boot disc & flasher appear to handle AHCI SATA mode badly. The Dell I was using was set to AHCI mode, out of the three possible Legacy, AHCI, and RAID options for SATA. Apparently the boot disc simply doesn’t handle this mode correctly on the Dell machine (and this may also be related to why the Windows-based updater lies). When the machine switches on, use F12 to enter the boot menu, and select Setup to enter the BIOS. Then, in the list of drive options, skip past the SATA drives and down to the SATA options. Pick the Legacy option to use ATA mode, instead of AHCI. Once this is done, the boot disc will function correctly, and the updated firmware will be applied without incident. Remember to switch the mode back to AHCI afterwards — it’s the default for a reason, no doubt.

The “error reading partition” messages were complete red herrings. They appear whether you are in the right SATA mode or not, and do not appear to affect the operation of the firmware updater or the boot process. It should not take very long to get to the flasher on this particular setup, so don’t wait on that message too long — waiting is a good sign something’s not quite right.

In the end, I did recover my $100 hard drive, and the confidence of my peer in my mad hardware skillz (actually, quite non-existent).

Discussion

In the end, I’m quite appalled at Seagate. This sort of failure shouldn’t have happened, of course. Once it did, Seagate should have offered to take back and replace broken drives — the data I had on there was non-critical. I would have been perfectly willing to pay shipping costs to get a fixed replacement through mail-in service. I should not have been forced to search my social network for a person willing to let me tear his desktop computer apart, for a dubious and unsure firmware update procedure that fails mysteriously. I spent an additional 3 hours tracing mystery failures, for which the error messages were rather useless. Without my trusty iPhone and access to the Internet, I would not have been able to solve this problem. How should I have known what “getFatBlock failed” means?

This little episode has convinced me to never buy a Seagate drive again — I simply cannot afford the time and energy for these sorts of firmware upgrade adventures. While I was looking for a desktop to tear apart, I bought a Western Digital Caviar Black 1TB drive instead. Another $100, but at least I had a scratch drive for my work.

The moral of the story: Seagate, you are the worst storage vendor I’ve had to work with so far. I hope this record is not broken in the future.

Subversion 1.6.2 runtime error on network access on OS X 10.5

A new SelfSolved solution is up for perusal. The problem I tried to solve:

After compiling Subversion 1.6.2 from source on OS X 10.5 Leopard, the compilation is apparently successful, but svn dies when it tries to connect to the network for the first time. Crash log reports that symbols are missing from libneon.dylib.

Crash report from shell:

dyld: lazy symbol binding failed: Symbol not found: _ne_set_connect_timeout
Referenced from: /usr/local/lib/libsvn_ra_neon-1.0.dylib
Expected in: dynamic lookup

dyld: Symbol not found: _ne_set_connect_timeout
Referenced from: /usr/local/lib/libsvn_ra_neon-1.0.dylib
Expected in: dynamic lookup

Check out the places that I googled and my final solution writeup … at SelfSolved #49: Subversion 1.6.2 explodes on first network access.

The problem is very similar to a previous compilation issue I solved for PHP. In essence, the -L library search path passed to GCC at compilation time has /usr/lib in front of everything else. This means that whatever library path you might have given to it at configure time, it’ll always look for the library in /usr/lib first, picking up the old system libneon in the process. Since the bad libneon is dynamically linked, the problem doesn’t manifest itself until runtime — and only at runtime with network access involved.

As with the PHP issue, change the very first -L/usr/lib to -L/usr/local/lib (or wherever your newer libneon is located), and it’ll link correctly.
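To verify the relink actually worked, you can inspect what the rebuilt module records and whether the new libneon exports the missing symbol (a sketch; paths assume the default /usr/local prefix):

# Which libneon does the RA module reference now?
otool -L /usr/local/lib/libsvn_ra_neon-1.0.dylib | grep -i neon

# Does that libneon export the symbol the crash log complained about?
nm -g /usr/local/lib/libneon.dylib | grep ne_set_connect_timeout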

Out of curiosity, I checked MacPorts first. The MacPorts solution of disabling libneon version checking is odd — it also works, but I dunno if it’s linking to the right thing or not.

Fixing undefined library symbols for compiling PHP 5.2.8

So while compiling PHP 5.2.8 on OS X 10.5, you might run into something like:

Undefined symbols for architecture i386:
  "_xmlTextReaderSchemaValidate", referenced from:
      _zim_xmlreader_setSchema in php_xmlreader.o
  "_xmlTextReaderSetup", referenced from:
      _zim_xmlreader_XML in php_xmlreader.o
ld: symbol(s) not found for architecture i386
collect2: ld returned 1 exit status
Undefined symbols for architecture x86_64:
  "_xmlTextReaderSchemaValidate", referenced from:
      _zim_xmlreader_setSchema in php_xmlreader.o
  "_xmlTextReaderSetup", referenced from:
      _zim_xmlreader_XML in php_xmlreader.o
ld: symbol(s) not found for architecture x86_64

This doesn’t only happen with libxml. If you’ve installed any extra updated libraries, like iconv or tidy or any library that has significant symbol changes between versions, it’ll die in similar ways. The MacPorts folks have encountered similar issues in ticket 15891, but WONTFIX’ed the issue. Apparently the PHP devs are also punting on the problem.

The immediate cause is that you have multiple versions of some shared libraries. For example, in the case above, I have two libxml versions — one in /usr/lib, and another in /usr/local/lib. This is because I do not want to overwrite the Apple-provided libxml version, but still needed new features provided in later libxml versions. The arrangement works fine in every other software compile except this one, so I investigated further.

The root of the problem

Despite the developers’ airy dismissal of the issue, the underlying problem is indeed that the Makefile generated by PHP at configure time is slightly broken. In Makefile and Makefile.global, you’re going to see this line:

libs/libphp$(PHP_MAJOR_VERSION).bundle: $(PHP_GLOBAL_OBJS) $(PHP_SAPI_OBJS)
        $(CC) $(MH_BUNDLE_FLAGS) $(CFLAGS_CLEAN) $(EXTRA_CFLAGS) $(LDFLAGS) $(EXTRA_LDFLAGS) $(PHP_GLOBAL_OBJS:.lo=.o) $(PHP_SAPI_OBJS:.lo=.o) $(PHP_FRAMEWORKS) $(EXTRA_LIBS) $(ZEND_EXTRA_LIBS) -o $@ && cp $@ libs/libphp$(PHP_MAJOR_VERSION).so

where $MH_BUNDLE_FLAGS is usually defined as something like

MH_BUNDLE_FLAGS = -bundle -bundle_loader /usr/sbin/httpd -L/usr/lib \
 -L/usr/lib -laprutil-1 -lsqlite3 -lexpat -liconv -L/usr/lib -lapr-1 -lpthread

The problem is that this hardcodes the search paths for linking shared libraries. GCC searches for shared libraries to link in the order of the provided -L paths. In this case, MH_BUNDLE_FLAGS is expanded immediately after $CC — so the load order is:

  1. /usr/lib
  2. /usr/lib (these are redundant, and so will probably be collapsed into one path)
  3. …every other custom library path you specify

Now you see the issue. No matter what your library paths are set to, the PHP compilation system will insist that whatever shared libraries in /usr/lib take precedence. Therefore, even if you specified that another version (say, libxml.dylib in /usr/local/lib) should be used instead, the invocation to link against -lxml2 will search in /usr/lib first. And since it finds the old version, which may be missing a number of symbols, the compilation blows up right there.

Evidence

And indeed, if you look at the (rather long and massive) compilation/link command right before it fails, you’ll see:

gcc -bundle -bundle_loader /usr/sbin/httpd -L/usr/lib -L/usr/lib \
-laprutil-1 -lsqlite3 -lexpat  -liconv -L/usr/lib -lapr-1 -lpthread -O2 -I/usr/include -DZTS   \
-arch i386 -arch x86_64 -L/usr/local/lib ... 

The key is the -L/usr/local/lib at the very end, after all the -L/usr/lib entries; /usr/local/lib might be /opt/lib or whatever custom path you provided to configure.

Solutions

The trivial solution is to manually invoke that last line of compilation, but swapping the -L load paths.

gcc -bundle -bundle_loader /usr/sbin/httpd -L/usr/local/lib -L/usr/lib \
-L/usr/lib -laprutil-1 -lsqlite3 -lexpat  -liconv -L/usr/lib -lapr-1 -lpthread -O2 -I/usr/include -DZTS   \
-arch i386 -arch x86_64  ... 

This is easy to do and takes just a second.

Another possible solution is to patch the Makefile, such that MH_BUNDLE_FLAGS comes later in the compilation line:

libs/libphp$(PHP_MAJOR_VERSION).bundle: $(PHP_GLOBAL_OBJS) $(PHP_SAPI_OBJS)
        $(CC) $(CFLAGS_CLEAN) $(EXTRA_CFLAGS) $(LDFLAGS) $(EXTRA_LDFLAGS) $(PHP_GLOBAL_OBJS:.lo=.o) $(PHP_SAPI_OBJS:.lo=.o) $(PHP_FRAMEWORKS) $(EXTRA_LIBS) $(ZEND_EXTRA_LIBS) $(MH_BUNDLE_FLAGS) -o $@ && cp $@ libs/libphp$(PHP_MAJOR_VERSION).so

This will force your library paths to be searched before /usr/lib, thus resolving the link problem.

update 7/18/09
An anonymous reader mentions that you could also specify the right libxml by full path, instead of letting it use -lxml2. Basically, in the last compilation line, you would remove any mention of -lxml2 and replace it with the full path to your library, e.g. /usr/local/lib/libxml2.dylib. In fact, this is probably the way that has the fewest possible side effects, since you aren’t changing the search order for any other libraries.
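A minimal demonstration of the idea outside of PHP’s build system (a sketch; it assumes the newer libxml2 lives under /usr/local):

# Link against an explicit dylib path instead of relying on -L search order for -lxml2.
cat > /tmp/xmlver.c <<'EOF'
#include <libxml/xmlversion.h>
#include <stdio.h>
int main(void) { printf("%s\n", LIBXML_DOTTED_VERSION); return 0; }
EOF
gcc /tmp/xmlver.c -I/usr/local/include/libxml2 /usr/local/lib/libxml2.dylib -o /tmp/xmlver && /tmp/xmlver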

Discussion

This is not the first time that PHP core developers have refused to fix a compilation issue that is arguably preventable through actual testing under different installation scenarios. This is an “edgier” edge case than the tidy.h issue, but still should be fairly noticeable for a substantial number of people.

The “You should only have one library installed” argument is, to be honest, unnecessarily arrogant (sadly, not as rare a problem as you’d like in some open source development projects). I understand that it’s an open source project, and no self-respecting software engineer likes to spend time on project plumbing / build systems rather than work on the product. However, on OS X, due to the lack of an official Apple package management system, no one should be overwriting system default libraries — down that way lies insanity, especially at every system or security update. PHP’s build system is obviously broken any time there is a substantial difference between user-installed libraries and system libraries. This bad behavior is especially egregious because the configure command allows you to specify your own library path — misleading users into thinking that the path they specified would be obeyed at compile time. If you only intend for the system library to be used and no other, perhaps the configure script should auto-detect this on OS X and disable that command-line option. Basic user interface design should apply even to command-line interfaces.

Note that changing link ordering may have some unforeseen consequences, since the devs obviously never tested this path. For example, you should make sure the dynamic libraries are loaded in the right order at runtime. On OS X, the load path is typically hard-coded into the dylib, so usually there won’t be a problem — but there may be edge cases. Test your build (and any PHP extensions you built) before using it in production!

using ffmpeg for cutting media files – and the gotchas involved

So here I was, trying to complete some ordinary transcoding / media file cutting tasks with ffmpeg. Turns out there are some weird gotchas when using some versions of ffmpeg (in my case, SVN-r10571) on the command line.

Problem 1: transcoding to mp3 insists on using 64kbits/s or 128kbits/s bitrate

So a simple task is to transcode an audio file of some arbitrary format to MP3. In my case, I only wanted a 25-second piece of the original file (starting at the 30-second mark), converted to MP3. The original was 256kbits/s, and I decided to use the same bitrate for the output, just for kicks. Reading the man page, -b is for video bitrate, and -ab is for audio bitrate. So I executed:

wrong:  ffmpeg -ss 00:00:30.00 -t 25 -i foo.mp3 -ab 256 foo-new.mp3

The output, of course, was this:

Input #0, mp3, from 'foo.mp3':
...
mdb:109, lastbuf:0 skipping granule 0
size= 393kB time=25.2 bitrate= 128.0kbits/s
video:0kB audio:393kB global headers:0kB muxing overhead 0.007950%

128 kbits/s. That was…not what I wanted.

Turns out, as the ffmpeg man page hints but does not specify clearly, a “k” is required to label the units of the new bitrate. As in:

right:  ffmpeg -ss 00:00:30.00 -t 25 -i foo.mp3 -ab 256k foo-new.mp3

This time, the output was:
Input #0, mp3, from 'foo.mp3':
...
mdb:109, lastbuf:0 skipping granule 0
size= 782kB time=25.0 bitrate= 256.0kbits/s
video:0kB audio:782kB global headers:0kB muxing overhead 0.003996%

Much better. It’s very strange that if I don’t give it the ending “k” but do give it a value higher than 64, it always bumps the bitrate up to 128kbits/s, from the default of 64k (but not to the number I actually wanted). Presumably the unitless 256 is read as 256 bits/s and then bumped to some supported rate, though that is just my guess.


Problem 2: cutting media file without re-encoding

So ffmpeg can be used to cut a media file without re-encoding the media stream. You simply pass the special “copy” codec to -acodec (or -vcodec for video) at encode time. For my MP3, I thought it was a pretty trivial problem, so I issued:

wrong: ffmpeg -i bar.mp3 -ss 00:00:30.00 -t 25 -acodec copy bar-new.mp3

This seemed to process correctly, except it created a 555-byte file with no audio content in it. What’s even weirder, if you issued:

wrong: ffmpeg -i bar.mp3 -ss 00:00:10.00 -t 25 -acodec copy bar-new.mp3

That is, -ss 00:00:10.00 to seek to the 10th second, and -t 25 to record 25 seconds’ worth of audio. Strangely enough, the output file had 15 seconds of audio: 25 minus 10. Curiouser and curiouser. Now,

wrong: ffmpeg -i bar.mp3 -ss 00:00:10.00 -t 25 -acodec mp3 bar-new.mp3

does in fact create a 25-second MP3 file. Only, of course, a re-encoded one at 64kbits/s. Something that we are not looking for. But strange.

A post on ffmpeg-devel finally provided enough hints to clue me into the problem. Unlike many command line apps, the order of arguments passed to ffmpeg seems to be silently significant. The correct incantation is:

right: ffmpeg -ss 00:00:30.00 -t 25 -i bar.mp3 -acodec copy bar-new.mp3

Note the switched order of the arguments: -i must come after -ss and -t. Now the desired 25-second file, cut from the original, is correctly produced.
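Since ffmpeg won’t warn you when it silently does the wrong thing, it’s worth sanity-checking the result (ffmpeg prints input info, including duration, to stderr):

# The Duration line should read about 25 seconds.
ffmpeg -i bar-new.mp3 2>&1 | grep Duration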

When arguments are missing a unit suffix or come in the wrong order, ffmpeg doesn’t sanity-check or warn you about any of this…it just silently proceeds and does some very strange things. Things that make you scratch your head and wonder, “wtf? did I mistype an argument somewhere?”

Ah, and of course, you make me have to track down all these little idiosyncrasies, wading through blog posts and mailing lists and forums. Agh.

Cannot populate a select element under IE 6 using the innerHTML property

On the other hand, here is an obvious bug in IE 6 that I managed to run into. In IE 6, if you attempt to dynamically change the contents of a <select> element in Javascript, by assigning a new string containing <option> elements to its innerHTML property, the select element will not be populated. In fact, if you output the string after the assignment, it will be truncated and malformed.

This is documented via Microsoft KB article 276228. Microsoft’s proposed fix, in abstract: “don’t do that”.

So I was trying my hand at some fancy AJAX to dynamically populate a select dropdown menu based on a previous select menu. An xmlhttprequest is triggered onChange of the first menu. We hit up an API, parse and transform the response into an HTML fragment consisting of <option> tags and values, and assign it to the innerHTML property. Simple and quick. Works on Safari, Firefox…and of course, not on IE 6.

Some time wasted later, an alert() on the innerHTML property showed that the string there was, in fact, malformed: the first <option> start tag had been truncated from the innerHTML. No wonder it doesn’t work. And it wasn’t malformed to start with, when my transformation finished and delivered the final string.

The apparent solution is to follow Microsoft’s advice on that page and do something else:
– assign to the options collection
– workaround using outerHTML

I’m sure there’s a way to jury-rig it so that IE is fooled into concat’ing an “<option>” back onto the innerHTML, with the workaround disabled for other user agents. But that’s rather inelegant, no? I opted to use the options collection, which seems a reasonable (if slightly more complicated) method than just a simple assignment to innerHTML.

The KB article lists this as a problem with IE 5, but it recurs in IE 6. Does it persist in IE 7?

IE 6 renders a blank page on XHTML-style script end tag

10-01-2011: And the world slides backwards. I believe all major browsers, including the latest Firefox and Safari, now have this behavior. If you see a blank XHTML-served-as-HTML page in Safari or Firefox, check the script tags and make sure they are not self-closing: always use <script> ... </script>

On IE 6, a well-formed and validated web page may be rendered as a blank page if you close <script> tags in XHTML style. As in, <script type="text/javascript" ... src="foo.js" />, rather than the HTML style <script type="text/javascript" ... src="foo.js"></script>

So one of my web pages renders great in Safari and Firefox, but in IE 6, it is a completely blank page, devoid of content. Puzzled, I ran it through the W3C validator – no problem at all. Selected a View Source in IE, and noted that the entire HTML output looked OK.

Eventually I narrowed the problem down to a <script> tag in the markup. Namely, a <script type="text/javascript" ... src="foo.js" /> kind of tag. IE rendered the page when I removed the tag, and went blank when I put it back. Curiously enough, I hadn’t actually invoked any functions from that .js file, so it was definitely not any code I was executing. Replacing the .js file with a dummy .js file also triggered the blank page. Changing or omitting the other attributes did not help.

The problem is fairly obvious now. When I close the tag in HTML style, with an actual </script> tag, IE proceeds to render just fine.

The obvious conclusion is that IE is buggy, but that may not necessarily be true (well, in this one instance anyway). Despite most pages’ “compliance” with XHTML, DOCTYPE’ed and all, most web servers still serve these “XHTML” files as mimetype text/html instead of the recommended application/xhtml+xml. This is pragmatic, since IE 6 doesn’t even bother to render application/xhtml+xml, and user agents are required to stop rendering upon encountering non-valid markup (imagine the chaos that would cause).

However, it seems this might introduce a cause for the gotcha. Interpreted in actual text/html mode, one might imagine that to an HTML parser, <script .... /> doesn’t really appear to close the <script> tag at all – in fact, it might merely look like a rather malformed start script tag with no end tag. If I were a dumbly compliant parser+renderer, I might just start walking down the response string looking for the mythical end to this start tag, and end up rendering nothing. Of course, if I were a slightly smarter parser, I would look for a DOCTYPE, but then I’d contradict the server’s mimetype, and down that road lies even more madness.

Nevertheless, the solution, when staring at a blank page in IE when the markup seems fine, is to check your script tags, if any.

I’m no expert at this soup of SGML/HTML/XHTML/XML standards, so the above is just my random opinion plus some observations. Still, it seems that MS should patch this particular problem, since it’s fairly non-obvious (many people, I’d surmise, would use this kind of shorthand close tag in an XHTML file, especially since it validates fine) and upsets the status quo compromise of incremental Web standards compliance through browser compliance modes, content negotiation, and occasionally bad mimetype service. But of course, that’s never going to happen.

Update: I’ve been made aware in the comments section that the same issue occurs in IE 7. Just great.

php 5.2.5 compile error – macro issue

Like the previously mentioned compile problem with transcode, PHP’s tidy extension appears to have a macro-induced collision problem on OS X Tiger. In particular the compile run blows up with:

In file included from /usr/include/tidy/tidy.h:70,
from /tmp/php-5.2.5/ext/tidy/tidy.c:34:
/usr/include/tidy/platform.h:515: error: duplicate 'unsigned'
/usr/include/tidy/platform.h:515: warning: useless type name in empty declaration

This exhibits a similar red herring, in that the compiler says a system include file (tidy/platform.h) is causing the issue. In actuality, this is most likely due to a preprocessor macro issue in PHP. In platform.h:515, we see:
typedef unsigned long ulong;

Now, in main/php_config.h:121:
#define ulong unsigned long

Since php_config.h is set to be included before platform.h, on this particular build configuration, platform.h:515 now becomes:
typedef unsigned long unsigned long;

Hence the compiler error message and the red herring about your system include file (platform.h).
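You can reproduce the clash in isolation by running just the preprocessor (a sketch; any gcc will do):

# The macro rewrites the typedef before the compiler proper ever sees it.
printf '#define ulong unsigned long\ntypedef unsigned long ulong;\n' | gcc -E -x c -
# The output ends with: typedef unsigned long unsigned long;
# which is exactly the "duplicate 'unsigned'" the compiler rejects.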

Since php_config.h is included by just about every main PHP source file, the easier solution is to switch the include order of tidy.h and the php default includes. As such, in tidy.c, we go from this:
#include "php.h"
#include "php_tidy.h"

#if HAVE_TIDY
...
#include "tidy.h"
#include "buffio.h"

to this:
#include "tidy.h"

#include "php.h"
#include "php_tidy.h"

#if HAVE_TIDY
...
#include "buffio.h"

You may also wish to wrap it with #if HAVE_TIDY and #endif, to preserve the original logic (note that tidy.h was originally within the #if), but in my case it seemed to have gone okay without it.

In my case, this compiled just fine with no further complaints. I don’t like doing this – perhaps the PHP-tidy devs had reasons for putting the includes in this order. But from a pragmatic point of view… hey, it compiles.

Again, the moral of the story: when screwing around with macros, try to avoid names that will collide with system libraries.

Updated Feb 6, 2008
As jhardi notes in the comments section, the folks behind tidy have patched their latest version to work around this issue. Kudos to the tidy devs, and to the others who found this bug way before I even had to care about it.

Minor rant: so…this workaround requires that we upgrade tidy. Since Mac OS X doesn’t regularly update its Unix-y interiors, we’re left with the choice of overwriting an Apple system library, or shadowing it and remembering that we shadowed it. I picked option #1, since it’s unlikely that anything is going to blow up due to a new tidy library, but some people are understandably wary of overwriting system libraries (this is, in fact, one of the reasons why package managers like MacPorts stick copies in alternate directories instead).

This still leaves the current stable PHP not compiling with older tidy versions, on the Mac and any other platform using that typedef. And, as a bonus, there is a possibility that we might be doing this again if PHP ever decides to add another library that uses the word “ulong” somewhere.

That’s just lovely. I hope they address this in their future releases.