php xml xpath

How can I use XPath to perform a case-insensitive search and support non-English characters?


I am performing a search in an XML file, using the following code:

$result = $xml->xpath("//StopPoint[contains(StopName, '$query')]");

Here, $query is the search query and StopName is the name of a bus stop. The problem is that the search is case-sensitive.

Beyond that, I would also like to be able to search with non-English characters such as ÆØÅæøå, so that Norwegian names are returned.

How is this possible?


Solution

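  • In XPath 1.0 (which is, I believe, the best you can get with PHP SimpleXML, since it is built on libxml2), there is no lower-case() function. Instead, you have to use translate() to map every uppercase character to its lowercase counterpart, applying it to both the node value and the search term before comparing them.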

    For convenience, I would wrap it in a function like this:

    function findStopPointByName($xml, $query) {
      $upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZÆØÅ"; // add any characters...
      $lower = "abcdefghijklmnopqrstuvwxyzæøå"; // ...that are missing
    
      $arg_stopname = "translate(StopName, '$upper', '$lower')";
      $arg_query    = "translate('$query', '$upper', '$lower')";
    
      return $xml->xpath("//StopPoint[contains($arg_stopname, $arg_query)]");
    }
    

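    As a sanitizing measure, I would either forbid single quotes in $query entirely or work around them, because an unhandled single quote ends the string literal early and breaks the XPath expression. Note that XPath 1.0 has no escape character inside string literals, so a value containing a quote has to be wrapped in the other quote type or assembled with concat(). Also make sure the PHP source file is saved as UTF-8, otherwise the ÆØÅ mapping will not line up.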
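
If you would rather accept quotes than reject them, something along these lines might work. This is only a sketch: xpathLiteral() and findStopPointByNameSafe() are hypothetical names invented here, not part of SimpleXML, and the sample XML is made up for illustration. It assumes the script and the XML are both UTF-8.

```php
<?php
// Hypothetical helper: turn any PHP string into a valid XPath 1.0 string literal.
function xpathLiteral(string $value): string
{
    if (strpos($value, "'") === false) {
        return "'" . $value . "'";          // no single quote: wrap in single quotes
    }
    if (strpos($value, '"') === false) {
        return '"' . $value . '"';          // no double quote: wrap in double quotes
    }
    // Both quote types present: split on ' and glue the pieces back with concat().
    $parts = array_map(function ($part) {
        return "'" . $part . "'";
    }, explode("'", $value));
    return 'concat(' . implode(', "\'", ', $parts) . ')';
}

// Hypothetical variant of findStopPointByName() that uses the helper above.
function findStopPointByNameSafe(SimpleXMLElement $xml, string $query)
{
    $upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZÆØÅ";
    $lower = "abcdefghijklmnopqrstuvwxyzæøå";

    $stopname = "translate(StopName, '$upper', '$lower')";
    $needle   = "translate(" . xpathLiteral($query) . ", '$upper', '$lower')";

    return $xml->xpath("//StopPoint[contains($stopname, $needle)]");
}

// Made-up sample data, for illustration only.
$xml = simplexml_load_string(
    '<Stops>'
    . '<StopPoint><StopName>Ålesund rutebilstasjon</StopName></StopPoint>'
    . '<StopPoint><StopName>Oslo S</StopName></StopPoint>'
    . '</Stops>'
);

foreach (findStopPointByNameSafe($xml, "ÅLESUND") as $stop) {
    echo $stop->StopName, "\n";
}
```

A query such as o' then simply matches nothing instead of producing an invalid expression. Lowercasing the query in PHP (for example with mb_strtolower()) and running translate() only on StopName would be another option.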