Some versions of Big Fish Games’ Settlers of Catan (a faithful reproduction of the board game) have a strange issue: under certain conditions, the game will not save. The error message is a generic and not-at-all useful “an error has occurred while saving”. I suspected that the game was failing to create a savegame directory, and indeed, a bit of sleuthing showed that on Windows XP, the directory C:\Documents and Settings\All Users\Application Data\Microsoft\MSN Games\Catan was missing (on Vista, this would be somewhere else, probably under C:\Users\…). Instead of creating this directory, Catan simply fails to save the game. The program runs fine otherwise.
Of course, it was not obvious where Catan was trying to save its games, and finding out that a missing directory was the culprit took a bit of investigation. My first wild stab was to create a “save” directory inside the game's own Program Files directory. No such luck. Time to bring out the big guns.
There are a number of ways to track this down, but one is the awesome Sysinternals tool Process Monitor (Procmon.exe, or ProcMon for short). It records the events and calls a process makes, such as filesystem accesses, and has advanced filtering to show only the events of interest to a debugging human.
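The same kind of filtering can be done offline, since ProcMon can save a capture as a CSV file. Here is a minimal Python sketch that lists file-creation calls that did not succeed for one process. It assumes the export uses the default "Process Name", "Operation", "Path" and "Result" column headers; the function name is made up for illustration.

```python
import csv

def failed_creates(lines, process_name):
    """Return (path, result) pairs for CreateFile events that did not succeed.

    `lines` is any iterable of CSV text lines, e.g. an open file object.
    Column names are assumed to match ProcMon's default CSV export.
    """
    hits = []
    for row in csv.DictReader(lines):
        # Keep only events from the process we care about.
        if row["Process Name"].lower() != process_name.lower():
            continue
        # A CreateFile whose result is anything but SUCCESS is a lead.
        if row["Operation"] == "CreateFile" and row["Result"] != "SUCCESS":
            hits.append((row["Path"], row["Result"]))
    return hits
```

To use it, open the exported file with `open(path, newline="", encoding="utf-8-sig")` and pass the file object in along with the executable name of the process you filtered on.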
With ProcMon, I simply filtered on the Catan process and tried to save a game as foo. Viewing the event log afterwards (screenshot 1), it was obvious that the CreateFile calls to create foo.sav had failed, and each event showed the exact target path. A quick Windows Explorer excursion confirmed that the path did not exist. Creating that directory, of course, solved the savegame problem.
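I created the directory by hand in Explorer, but the fix is easy to script if you need to repeat it on several machines. A minimal Python sketch (the helper name is hypothetical, and the path in the comment is the Windows XP one from above; adjust it for other systems):

```python
from pathlib import Path

def ensure_save_dir(save_dir):
    """Create save_dir and any missing parents; do nothing if it already exists."""
    path = Path(save_dir)
    path.mkdir(parents=True, exist_ok=True)
    return path

# Example call on the XP machine described in this post:
# ensure_save_dir(r"C:\Documents and Settings\All Users\Application Data\Microsoft\MSN Games\Catan")
```

Because the directory lives under All Users, the script may need to be run from an account that is allowed to write there.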
The moral of the story is that ProcMon is a fine tool for tracking down mysterious interactions between an application and the system. Whether it is a failure to write a saved game (in this narrow gaming context) or system-related errors in general (especially when you lack the source code to debug in depth), it sometimes pays to examine the exact sequence of calls and events that led up to the failure. The solution can be trivial once you know what failed and where.