finding a fault-tolerant HTML parser for iPhone SDK

A new SelfSolved problem is ready for perusal:

A couple of my iPhone projects require a decent HTML/XHTML parser. On OS X, Cocoa ships with NSXMLDocument, which includes dirty HTML parsing functionality from libtidy. Unfortunately, NSXMLDocument is not part of the actual iPhone 2.2 SDK (though it is part of the 2.2 Simulator — so it’ll compile just fine at dev time but break when deploying — a big gotcha if you never tested against a real iPhone).

NSXMLParser is a part of the iPhone SDK…This is not a reasonable alternative.

Check out my writeup at SelfSolved #42: HTML or XHTML Parser for iPhone SDK 2.x

Finally, all out of all the potential alternatives I found (all referenced at the SelfSolved writeup — including one that requires a license fee to use), this one seems to be the most promising and requires the least amount of pain (read: interaction with the libxml C API — god knows I’ve done enough of that while building prototypes at Yahoo! Research Berkeley)

Extracting Address Book information

If you’ve left information in the Address Book in OS X, any other program on the machine can use Cocoa APIs to extract information automatically.

#import <Foundation/Foundation.h>
#import <AddressBook/AddressBook.h>
#import <AppKit/NSWorkspace.h>

void sayHello(void)
{
   ABPerson *currUser = [[ABAddressBook sharedAddressBook] me];
   NSString *firstName = [currUser valueForProperty:kABFirstNameProperty];
   NSString *lastName = [currUser valueForProperty:kABLastNameProperty];
   [[NSWorkspace sharedWorkspace] 
                           openURL:[NSURL URLWithString:
                                                     [NSString stringWithFormat:
                                                     @"http://www.google.com/search?hl=en&q=%@+%@", 
                                                          firstName,
                                                          lastName]]];
}

int main (int argc, const char * argv[])
{
   NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
   sayHello();
   [pool release];
   return 0;
} 

This will take the name labelled as the currently logged in user and hit up Google, placing the first and last name in the search box and run a search with it. Other pieces of information are also accessible via the AddressBook API.

As you can well see, this can be used for many purposes: nifty automagical hacks in the OS’s built-in suite of personal information and communcation apps, for example. But I can think of less beneficial uses for this API. You should be able to trust the programs you run.