
4.0 - Rethinking XML Handling

The IPS Social Suite stores skin and language data (and some other stuff) in XML files, which are imported into the database at installation and upgrade. We do it this way so that you can export skins and languages and install them on other sites, or distribute them via the IPS Marketplace.

I'm not the biggest fan of PHP's XML handling at the best of times (it would seem whoever wrote the SimpleXML class and I would disagree on the definition of "simple") and had already changed some of the places we use XML to use JSON instead, but these skins and languages are particularly difficult because they can get huge. Handling these large files, especially on sub-par servers, can lead to memory exhaustion or maximum execution time timeouts. Our current method was using too much memory and taking too much time in a single HTTP request.

So we had to come up with something better. At first we considered splitting the XML file into several, but that would mean either requiring people installing a new skin/language to upload multiple files (which we deemed not acceptable) or compressing them in some way (which is another of PHP's weaknesses, and would have just created another problem). We also researched different markup languages, but found similar problems with all of them. On top of that, we had to make sure that any solution didn't step on the toes of any of our other 4.0 goals. In particular, we want to completely eliminate writing cache files to disk so as to better support cluster environments.

The solution was two-fold. To resolve the memory problem, we decided to use the XMLReader and XMLWriter classes - these are non-cached, forward-only classes for reading and writing XML. Rather than holding the whole document in memory, they read or write one node at a time, and can only move forward, not backwards. Even with this approach though, we needed to account for the fact that importing is an intensive operation anyway, and one that needs to be split across several requests. To accommodate this, we wrote a simple AJAX-based "redirector", which continuously fires HTTP requests at a PHP controller until the controller reports the import is finished (it displays a fancy progress bar to the user while this is happening).
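To make the forward-only idea concrete, here is a minimal sketch using PHP's built-in XMLWriter and XMLReader classes. The `language` and `word` element names and the `key` attribute are made up for the example and are not the actual IPS file format; the document is also built in memory purely to keep the demo self-contained.

```php
<?php
// Write a tiny sample document, one node at a time, with XMLWriter.
$writer = new XMLWriter();
$writer->openMemory();
$writer->startDocument( '1.0', 'UTF-8' );
$writer->startElement( 'language' );        // hypothetical root element
foreach ( array( 'hello' => 'Hello', 'bye' => 'Goodbye' ) as $key => $value )
{
	$writer->startElement( 'word' );        // hypothetical element name
	$writer->writeAttribute( 'key', $key );
	$writer->text( $value );
	$writer->endElement();
}
$writer->endElement();
$writer->endDocument();
$xml = $writer->outputMemory();

// Read it back with XMLReader: the cursor only ever moves forward,
// so only the current node needs to be held in memory.
$reader = new XMLReader();
$reader->XML( $xml );

$words = array();
while ( $reader->read() )
{
	if ( $reader->nodeType === XMLReader::ELEMENT && $reader->name === 'word' )
	{
		$words[ $reader->getAttribute( 'key' ) ] = $reader->readString();
	}
}
$reader->close();
```

For a real skin or language file you would more likely point the reader at the file on disk with `open()` rather than load a string, so that the whole file never has to sit in memory at once.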

We decided to make the AJAX redirector a helper class so that it can be used elsewhere and by third-party developers (rather than our current approach in some areas of making the user sit through loads of physical redirects). The code is really simple:

\IPS\Output::i()->output = new \IPS\Helpers\MultipleRedirect(
	/* Query string which will take us back to this controller */
	$queryString, // Placeholder variable: the actual query string is not shown here
	/* Code to run for each HTTP request */
	function( $data )
	{
		$finished = FALSE; // Placeholder: work out from $data whether the job is complete
		// If we're not done yet...
		if ( !$finished )
		{
			return array(
				$data,				// Data to get passed back to this function
				"Processing...",	// Message to show (via AJAX) to the user
				50,					// [Optional] Number between 1 and 100 to indicate progress for progress bar
			);
		}
		// If we're done
		return NULL;
	},
	/* Code to run when we're done */
	function()
	{
		// Code that runs when we're done
	}
);
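As a rough illustration of how a per-request callback might stagger work, the sketch below processes a list in small batches and returns either the `array( $data, $message, $percent )` triple or `NULL`, following the contract described above. Everything else is an assumption for the sake of the example: the `$items` list, the `$perRequest` batch size, the `offset` key, the first call receiving no data, and the plain `while` loop standing in for the AJAX redirector. It does not use the real helper class.

```php
<?php
// Hypothetical work list; a real importer would be stepping through XML nodes instead.
$items      = range( 1, 10 );
$perRequest = 4;            // assumed number of items handled per HTTP request
$processed  = array();

// Per-request callback: returns array( $data, $message, $percent ), or NULL when finished.
$step = function( $data ) use ( $items, $perRequest, &$processed )
{
	// Assumption: the first call receives no data, so start from offset 0.
	$offset = is_array( $data ) ? $data['offset'] : 0;

	// If we're done
	if ( $offset >= count( $items ) )
	{
		return NULL;
	}

	// Handle one batch, then record how far we got
	foreach ( array_slice( $items, $offset, $perRequest ) as $item )
	{
		$processed[] = $item;
	}
	$offset = min( $offset + $perRequest, count( $items ) );

	return array(
		array( 'offset' => $offset ),                       // Passed back in on the next request
		"Processed {$offset} of " . count( $items ),       // Message for the user
		(int) round( 100 * $offset / count( $items ) ),    // Progress bar percentage
	);
};

// Stand-in for the redirector: keep calling until the callback reports it is finished.
$data     = NULL;
$requests = 0;
$percent  = 0;
while ( ( $result = $step( $data ) ) !== NULL )
{
	list( $data, $message, $percent ) = $result;
	$requests++;
}
```

The point of passing the offset back through `$data` is that each HTTP request only needs to know where the previous one stopped, so no single request has to do the whole import.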
