5.1. Creating a tool in the PHP language

From the early days of XML, it has been reasonably easy to write programs that read XML documents. To begin with, these programs were mainly written in Java, but now it is easy to write such programs in a number of programming languages (including PHP).

The code of a PHP file is normally triggered into action by a webserver, but it is now also possible to run standalone PHP programs, i.e., to run PHP programs from a Unix prompt or from a Windows command shell window. Support for this is experimental in PHP 4.2.3, and it is fully supported in PHP 4.3.0 (which is currently only in pre-release). We now look at how to write a PHP program that processes an XML document.
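As a minimal sketch of a standalone program, the following picks up the name of the XML file from the command line. The function name inputFileFromArgs and the script name tool.php are made up for illustration; php_sapi_name and $argv are standard PHP.

```php
<?php
// Hypothetical helper: choose the XML file named on the command line,
// falling back to consumables.xml when no argument is given.
function inputFileFromArgs($args) {
   // $args[0] is the script name; $args[1] is the first real argument
   return isset($args[1]) ? $args[1] : "consumables.xml";
}

// Only act when run from a command line, not under a webserver.
if (php_sapi_name() == "cli" && isset($argv)) {
   print "Would process: " . inputFileFromArgs($argv) . "\n";
}
```

If this were saved as tool.php, it could be run from a prompt with a command such as: php tool.php consumables.xml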

The XML parser that comes with PHP allows you to write PHP code that walks through an XML document. It generates an event for each part of the document: one for each start tag, one for each end tag, and one for each occurrence of characters appearing between a start tag and an end tag. In a PHP program, you can nominate functions to be called when these events occur.

Let's look at this in more detail. In a PHP program, you first need to create an XML parser:

$xml_parser = xml_parser_create();

You then register the functions that you wish to be called. For example, given:

xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");

the startElement function will be called when the parser meets a start tag; endElement will be called when it meets an end tag; and characterData will be called for the characters in between.
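To see the order in which the three kinds of event arrive, here is a small sketch that records each event in an array instead of producing output. The handler names (traceStart, traceEnd, traceChars) and the one-product document, including its name element, are invented for this illustration; the whole document is handed to xml_parse in a single call.

```php
<?php
// Each handler appends a description of the event to $events.
$events = array();

function traceStart($parser, $name, $attrs) {
   global $events;
   $events[] = "start " . $name;
}

function traceEnd($parser, $name) {
   global $events;
   $events[] = "end " . $name;
}

function traceChars($parser, $data) {
   global $events;
   $events[] = "chars " . $data;
}

$parser = xml_parser_create();
// keep element names as written instead of folding them to upper case
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($parser, "traceStart", "traceEnd");
xml_set_character_data_handler($parser, "traceChars");
// a made-up document, passed as one final chunk
xml_parse($parser, "<consumables><product><name>Toner</name></product></consumables>", true);
xml_parser_free($parser);

print implode("\n", $events) . "\n";
```

The trace begins with the start of consumables and finishes with its end, with the text Toner reported between the start and the end of name.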

Having done all that, you then set the XML parser loose on your document:

$fp = fopen("consumables.xml", "r");
while ($data = fread($fp, 4096)) {
   xml_parse($xml_parser, $data, feof($fp));
}
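The third argument of xml_parse tells the parser whether the chunk it is being given is the last one. The following sketch imitates the fread loop by cutting a string into small pieces, so that it can be tried without a file. The helper parseInChunks, the handler names and the sample document are assumptions made for this example.

```php
<?php
// Hypothetical helper: feed $xml to the parser $size bytes at a time,
// mimicking the fread loop. Returns true if every chunk parsed cleanly.
function parseInChunks($parser, $xml, $size) {
   $length = strlen($xml);
   for ($pos = 0; $pos < $length; $pos += $size) {
      $chunk = substr($xml, $pos, $size);
      // the third argument is true only for the last chunk
      $isFinal = ($pos + $size >= $length);
      if (!xml_parse($parser, $chunk, $isFinal)) {
         return false;
      }
   }
   return true;
}

$names = array();

// records the name of every start tag that the parser reports
function collectName($parser, $name, $attrs) {
   global $names;
   $names[] = $name;
}

function ignoreEnd($parser, $name) {
}

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($parser, "collectName", "ignoreEnd");
// 7-byte chunks deliberately cut some tags in half
$ok = parseInChunks($parser, "<consumables><product><name>Toner</name></product></consumables>", 7);
xml_parser_free($parser);

print ($ok ? "parsed: " : "failed: ") . implode(" ", $names) . "\n";
```

Even though some tags are split across two chunks, the parser keeps its state between calls, so the handlers still see whole tags.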

Here is a complete program that reads consumables.xml and outputs an HTML table:

<?php
   // called for each start tag: open the matching HTML element
   function startElement($parser, $name, $attrs) {
      if ($name == "consumables") {
         print "<table>";
      }
      else if ($name == "product") {
         print "<tr>";
      }
      else {
         print "<td>";
      }
   }

   // called for each end tag: close the matching HTML element
   function endElement($parser, $name) {
      if ($name == "consumables") {
         print "</table>";
      }
      else if ($name == "product") {
         print "</tr>";
      }
      else {
         print "</td>";
      }
   }

   // called for the characters between tags
   function characterData($parser, $data) {
      // the parser hands over decoded text, so re-escape &, < and > for HTML
      print htmlspecialchars($data);
   }

   function startDocument() {
      print "<html>\n";
      print "<body>\n";
   }

   function endDocument() {
      print "</body>\n";
      print "</html>\n";
   }

   $file = "consumables.xml";
   $xml_parser = xml_parser_create();
   // insist that the tags have the right case
   xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, false);
   xml_set_element_handler($xml_parser, "startElement", "endElement");
   xml_set_character_data_handler($xml_parser, "characterData");
   if (!($fp = fopen($file, "r"))) {
      die("could not open XML input");
   }
   startDocument();
   // feed the file to the parser in 4096-byte chunks
   while ($data = fread($fp, 4096)) {
      if (!xml_parse($xml_parser, $data, feof($fp))) {
         die(sprintf("XML error: %s at line %d",
                     xml_error_string(xml_get_error_code($xml_parser)),
                     xml_get_current_line_number($xml_parser)));
      }
   }
   fclose($fp);
   xml_parser_free($xml_parser);
   endDocument();
?>

The xml_parse function walks through the XML document, calling your three functions as it reaches the corresponding points of the document.
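When xml_parse meets a document that is not well formed, it returns false, and the error functions used in the program above describe what went wrong. Here is a sketch that wraps this in a function so that it can be tried on a string; the name xmlErrorFor and the broken sample document are invented for the example, and the exact wording of the message depends on the underlying XML library.

```php
<?php
// Hypothetical helper: parse a whole string and return an error message,
// or an empty string when the document is well formed.
function xmlErrorFor($xml) {
   $parser = xml_parser_create();
   $message = "";
   if (!xml_parse($parser, $xml, true)) {
      $message = sprintf("XML error: %s at line %d",
                         xml_error_string(xml_get_error_code($parser)),
                         xml_get_current_line_number($parser));
   }
   xml_parser_free($parser);
   return $message;
}

// the product element is never closed, so an error is reported
print xmlErrorFor("<consumables>\n<product>\n</consumables>") . "\n";
```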

Note that each part of the XML document is visited only once. If instead you want the ability to wander around an XML document, it is better to use the xml_parse_into_struct function: this will put a representation of the XML document into an array. PHP also has some DOM XML functions, but these are currently experimental.
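As a rough illustration of xml_parse_into_struct, the following sketch parses a made-up one-product document (the name element is an assumption) and then looks up the text of each name element in the resulting arrays:

```php
<?php
$xml = "<consumables><product><name>Toner</name></product></consumables>";

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
// $values receives one entry per tag; $index maps each tag name
// to the positions in $values where that tag appears
xml_parse_into_struct($parser, $xml, $values, $index);
xml_parser_free($parser);

// Unlike the event handlers, the arrays can be revisited as often as needed.
foreach ($index["name"] as $position) {
   print $values[$position]["value"] . "\n";
}
```

Each entry of $values is itself an array with keys such as tag, type and level, plus value when the element contains text.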