
XML-to-XML conversion to the Solr standard format using XSLT


This is my sample XML file, which I want to convert into the standard Solr XML format so that I can upload it. I tried to do the conversion with XSLT, but it only works for the first section; I need it to work for every section element. How can I get my desired output? If you know of any relevant articles, please share them.

<?xml version="1.0"?>
<article>
<section xml:id="s495f">
    <title xml:id="h4cd0"> ID</title>
    <para xml:id="p75998"> User_name</para>
</section>
<section xml:id="s495f">
    <title xml:id="h4cd0"> ID</title>
    <para xml:id="p75998"> User_name</para>
    <para xml:id="pfa"> abbccddefg</para>
</section>
<section xml:id="s495f">
    <title xml:id="h4cd0"> ID</title>
    <para xml:id="p75998"> User_name</para>
    <para xml:id="pfa"> Test</para>
</section>
</article>

I tried to convert it to the standard Solr format using XSLT. Here is my XSLT file:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" 
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

 
  <xsl:template match="/article">
    <add>
      <doc>
        <xsl:apply-templates select="section"/>
      </doc>
    </add>
  </xsl:template>

  <xsl:template match="para">
    <field name="para {@xml:id}">
      <xsl:value-of select="." />
    </field>
  </xsl:template>

  <xsl:template match="title">
    <field name="title {@xml:id}">
      <xsl:value-of select="." />
    </field>
  </xsl:template>

</xsl:stylesheet>

My output looks like this. It only covers a single "section" element, whereas I am trying to convert every "section" element.

My output:

<?xml version="1.0" encoding="UTF-8"?>
<add>
    <doc>
    <field name ="title h4cd0"> ID</field>
    <field name = "para p75998"> User_name</field>
    <field name = "para pfa"> xyxzzc</field>
    <field name = "para  p90f4b1"> location: details</field>
    <field name = "para p43cecf4"> Job profile</field>
    <field name = "para p75d4cc799"> refrence Id</field>
    <field name = "para p628c34"> True</field>
    </doc>
</add>

My desired output:

<add>
    <doc>
    <field name ="title h4cd0"> ID</field>
    <field name = "para p75998"> User_name</field>
    </doc>
    <doc>
    <field name ="title h4cd0"> ID</field>
    <field name = "para p75998"> User_name</field>
    <field name = "para pfa"> abbccddefg</field>
    </doc>
    <doc>
        <field name ="title h4cd0"> ID</field>
        <field name = "para p75998"> User_name</field>
        <field name = "para pfa"> Test</field>
    </doc>
</add>

Solution

  • You are creating the Solr doc element in the template that matches article, but in fact you want to create a doc for every section. You need to create a template that matches section and move the doc element into it (the article template should create only the Solr add element).

    NB: Solr field names should not contain spaces. Try an underscore instead.

    <xsl:stylesheet version="1.0" 
                      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" indent="yes"/>
     
      <xsl:template match="/article">
        <add>
          <xsl:apply-templates select="section"/>
        </add>
      </xsl:template>
    
      <xsl:template match="section">
        <doc>
          <xsl:apply-templates/>
        </doc>
      </xsl:template>
    
      <xsl:template match="para">
        <field name="para_{@xml:id}">
         <xsl:value-of select="." />
        </field>
      </xsl:template>
    
      <xsl:template match="title">
        <field name="title_{@xml:id}">
         <xsl:value-of select="." />
        </field>
      </xsl:template>
    
    </xsl:stylesheet>
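
A possible refinement of the stylesheet above, offered as a sketch rather than a required change. The bare `<xsl:apply-templates/>` in the `section` template also visits the whitespace-only text nodes between `title` and `para`, and the sample values carry a leading space (for example " ID"). The variant below assumes you want those trimmed: it adds `xsl:strip-space`, selects only `title` and `para`, and merges the two field templates into one using `local-name()`. Merging the templates and calling `normalize-space()` are assumptions about what is wanted, not something the question asks for.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Drop whitespace-only text nodes from the input so that stray
       line breaks between elements are never copied to the output. -->
  <xsl:strip-space elements="*"/>

  <!-- The root template creates only the Solr add wrapper. -->
  <xsl:template match="/article">
    <add>
      <xsl:apply-templates select="section"/>
    </add>
  </xsl:template>

  <!-- One Solr doc per section; only title and para are processed. -->
  <xsl:template match="section">
    <doc>
      <xsl:apply-templates select="title | para"/>
    </doc>
  </xsl:template>

  <!-- title and para share one rule: the field name is the element
       name, an underscore, and the xml:id value. normalize-space()
       is an assumption: it trims the leading space in the values. -->
  <xsl:template match="title | para">
    <field name="{local-name()}_{@xml:id}">
      <xsl:value-of select="normalize-space(.)"/>
    </field>
  </xsl:template>

</xsl:stylesheet>
```

For the sample input this should still yield one `doc` per `section`, with field names such as `title_h4cd0` and `para_p75998`. Note that `normalize-space()` also collapses internal runs of whitespace into single spaces, so leave it out if the original spacing matters.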