Tags: php, javascript, xml, xml-parsing, fxg

How to parse an XML file and grab text values


The following is just a snippet from a large FXG file, which is basically just an XML file:

<RichText x="14.1655" y="46.5674" columnGap="18" columnCount="1" textAlign="left" fontFamily="Bootstrap" color="#53836A" whiteSpaceCollapse="preserve" width="202.712" height="13.334" s7:caps="none" s7:colorName="" s7:colorValue="#B24FA41C" s7:colorspace="cmyk" s7:elementID="line1" s7:fill="true" s7:fillOverprint="false" s7:firstBaselineOffset="ascent" s7:joints="miter" s7:maxFontSize="12" s7:miterLimit="10" s7:referencePoint="inherit" s7:rowCount="1" s7:rowGap="18" s7:rowMajorOrder="true" s7:stroke="false" s7:strokeOverprint="false" s7:warpBend="0.5" s7:warpDirection="horizontal" s7:warpHorizontalDistortion="0" s7:warpStyle="none" s7:warpVerticalDistortion="0" s7:weight="1" ai:aa="2" ATE:C_charRotation="0" ATE:C_horizontalScale="1" ATE:C_kerning="metric" ATE:C_verticalScale="1" ATE:P_autoHyphenate="true" ATE:P_consecutiveHyphenLimit="0" ATE:P_hyphenateCapitalized="true" ATE:P_hyphenatedWordSize="6" ATE:P_hyphenationPreference="0.5" ATE:P_hyphenationZone="36" ATE:P_postHyphenSize="2" ATE:P_preHyphenSize="2" d:userLabel="id:line1">
   <content><p><span>Address Line 1</span></p></content>
</RichText>

There are many nodes in the XML file that have a similar structure. But each RichText node has a unique element id, s7:elementID="line1" in this case.

Using PHP or JavaScript, if I specify the elementID I want the content from, how can I grab either:

  1. the text "Address Line 1"
  2. the whole line, including the content, p, and span tags

I'm not very familiar with XML, so I'm not sure whether this is even possible.


Solution

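  • Load the XML into an object with simplexml_load_string().

    Then use ->xpath('//RichText') on that object to get the RichText elements. If the file declares a default namespace, as FXG files usually do, register it first with ->registerXPathNamespace() and use the prefixed name in the query; otherwise the query matches nothing.

    If you call ->asXML() on the content child of one of those elements ($RichText->content->asXML()),

    you get "<content><p><span>Address Line 1</span></p></content>". Calling ->asXML() on the RichText element itself returns the whole <RichText ...> element, attributes included.

    Is the structure always "<content><p><span>"?

    If so, you can read the text with (string) $RichText->content[0]->p[0]->span[0]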
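
The steps above do not show how to pick a RichText node by its elementID, so here is a minimal sketch of one way to do it. It makes several assumptions: the sample document is a trimmed-down stand-in for the real file, the namespace URIs in it are placeholders, and findRichTextContent() is a hypothetical helper name. The XPath query uses local-name() so that it matches regardless of which namespace URIs the real file declares, and the text is read through the DOM so that it does not depend on the markup always being content/p/span.

```php
<?php
// Trimmed-down stand-in for the FXG file. The namespace URIs are placeholders
// (assumption); the real file declares its own.
$fxg = <<<'XML'
<Graphic xmlns="http://example.com/fxg" xmlns:s7="http://example.com/s7">
  <Group>
    <RichText x="14.1655" y="46.5674" s7:elementID="line1">
      <content><p><span>Address Line 1</span></p></content>
    </RichText>
    <RichText x="14.1655" y="62.5674" s7:elementID="line2">
      <content><p><span>Address Line 2</span></p></content>
    </RichText>
  </Group>
</Graphic>
XML;

/**
 * Hypothetical helper: return the <content> element of the RichText node
 * whose elementID attribute equals $elementId, or null if there is none.
 */
function findRichTextContent(SimpleXMLElement $doc, string $elementId): ?SimpleXMLElement
{
    // Only accept simple ids so the value can be embedded in the XPath string safely.
    if (!preg_match('/^[A-Za-z0-9_.-]+$/', $elementId)) {
        return null;
    }

    // local-name() ignores namespace prefixes and URIs on both the element and the attribute.
    $query = sprintf(
        "//*[local-name()='RichText'][@*[local-name()='elementID']='%s']/*[local-name()='content']",
        $elementId
    );

    $matches = $doc->xpath($query);

    return $matches ? $matches[0] : null;
}

$doc = simplexml_load_string($fxg);
if ($doc === false) {
    throw new RuntimeException('Could not parse the XML');
}

$content = findRichTextContent($doc, 'line1');
if ($content === null) {
    throw new RuntimeException('No RichText node with that elementID');
}

// 1. The plain text, whatever tags are nested inside <content>.
$text = trim(dom_import_simplexml($content)->textContent);

// 2. The markup, including the content, p, and span tags.
$markup = $content->asXML();

echo $text, PHP_EOL;
echo $markup, PHP_EOL;
```

For the real file, replace the heredoc with simplexml_load_file() on the FXG path. If the ids can contain characters outside the pattern checked in the helper, it is probably safer to loop over the RichText elements and compare the attribute value in PHP than to build the XPath string from the id.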