php xml phpquery

phpQuery is removing capitalization from markup tag names when it shouldn't


I am doing some work with the Amazon API(s) and have run into an issue while creating new documents in phpQuery. I am using version 0.9.5. This code:

    var_dump($innerHTML);
    $php_query=\phpQuery::newDocument("<root>".$innerHTML."</root>");
    var_dump($php_query->html());

Gives the readout:

    string '<TotalOffers>2</TotalOffers><TotalOfferPages>1</TotalOfferPages>   <MoreOffersUrl>https://www.amazon.co.uk/gp/offer-listing/B01KI13K0W%3FSubscriptionId%3DAKIAICCWYPWR3A76YQDQ%26tag%3D8496-5230-2708%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12734%26creativeASIN%3DB01KI13K0W</MoreOffersUrl><Offer>
  <Merchant>
    <Name>Amazon.co.uk</Name>
  </Merchant>
  <OfferAttributes>
    <Condition>New</Condition>
  </OfferAttributes>
  <OfferListing>
    <OfferListingId>ZUHqmK4tgaBMBcOsWWSBV%2B%2FwXbnd%2BSYg58EFdjcH93Ayh%2FnSp741DyflKxdwL7w7k4O7nJ9zTBkxRNymmCq%2BNtgwHMXrFlFZRtfv0VL%2BBm8%3D</OfferListingId>
    <Price>
      <Amount>21999</Amount>
      <CurrencyCode>GBP</CurrencyCode>
      <FormattedPrice>£219.99</FormattedPrice>
    </Price>
    <Availability>Usually dispatched within 24 hours</Availability>
    <AvailabilityAttributes>
      <AvailabilityType>now</AvailabilityType>
      <MinimumHours>0</MinimumHours>
      <MaximumHours>0</MaximumHours>
    </AvailabilityAttributes>
    <IsEligibleForSuperSaverShipp'... (length=2055)

And...

string '<root><totaloffers>2</totaloffers><totalofferpages>1</totalofferpages><moreoffersurl>https://www.amazon.co.uk/gp/offer-listing/B01KI13K0W%3FSubscriptionId%3DAKIAICCWYPWR3A76YQDQ%26tag%3D8496-5230-2708%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12734%26creativeASIN%3DB01KI13K0W</moreoffersurl><offer><merchant><name>Amazon.co.uk</name></merchant><offerattributes><condition>New</condition></offerattributes><offerlisting><offerlistingid>ZUHqmK4tgaBMBcOsWWSBV%2B%2FwXbnd%2BSYg58EFdjcH93Ayh%2FnSp741DyflKxdwL7w7k4O7nJ9zTBkxRNymmCq%2BNtgwHMXrFlFZRtfv0VL%2BBm8%3D</offerlistingid><price><amount>21999</amount><currencycode>GBP</currencycode><formattedprice>£219.99</formattedprice></price><availability>Usually dispatched within 24 hours</availability><availabilityattributes><availabilitytype>now</availabilitytype><minimumhours>0</minimumhours><maximumhours>0</maximumhours></availabilityattributes><iseligibleforsupersavershipping>1</iseligibleforsupersavershipping><iseligibleforprime>1</iseligibleforprime></offerlisting>'... (length=1846)

As you can see, tag capitalization is lost. Is this the default behavior for phpQuery and, if so, is there any way to maintain tag capitalization?

\phpQuery::newDocument still does this even without the wrapping <root> tags.

(Please note that this question is purely about tags losing their capitalization when using \phpQuery::newDocument, not about the efficacy of using phpQuery in this context.)

Help with this issue would be much appreciated :) Thanks.

EDIT: Answered (thanks for the help!). \phpQuery::newDocument() is case insensitive; \phpQuery::newDocumentXML() should be used instead to maintain the case of tag names.


Solution

  • Change:

    var_dump($innerHTML);
    $php_query=\phpQuery::newDocument("<root>".$innerHTML."</root>");
    var_dump($php_query->html());
    

    to:

    var_dump($innerHTML);
    $php_query=\phpQuery::newDocumentXML("<root>".$innerHTML."</root>");
    var_dump($php_query->html());
    

    HTML tag names are case insensitive, whereas XML tag names are case sensitive. \phpQuery::newDocument() treats the markup as HTML and therefore does not maintain the capitalization of tag names; \phpQuery::newDocumentXML() treats it as XML and should be used when the case of tag names must be preserved.
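
The same HTML-versus-XML difference can be seen without phpQuery, using PHP's built-in DOM extension. This is only a sketch: it assumes phpQuery hands the markup to DOMDocument's HTML and XML loaders internally (I believe it does, but have not checked version 0.9.5), and the `$fragment` string is a shortened stand-in for the Amazon response above.

```php
<?php
// Shortened stand-in for the Amazon fragment in the question (illustrative only).
$fragment = '<TotalOffers>2</TotalOffers><TotalOfferPages>1</TotalOfferPages>';

// HTML mode: the parser treats tag names as case insensitive and lowercases them.
// Unknown tags trigger libxml warnings, so collect them internally instead of printing.
libxml_use_internal_errors(true);
$html = new DOMDocument();
$html->loadHTML('<root>' . $fragment . '</root>');
libxml_clear_errors();
echo $html->saveHTML(), "\n";

// XML mode: tag names are case sensitive, so they are kept exactly as written.
$xml = new DOMDocument();
$xml->loadXML('<root>' . $fragment . '</root>');
echo $xml->saveXML($xml->documentElement), "\n";
```

The first dump should show lowercased names such as `<totaloffers>`, while the second keeps `<TotalOffers>`. That matches the behavior described above for \phpQuery::newDocument() and \phpQuery::newDocumentXML().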