java xml xpath jtidy

Get a NodeList from the parent of a node that contains a given text


I want to get all the child nodes of a parent node when one of those children contains a certain text. In other words, I start by searching for a child node that I know contains the string I need. Once I have found it, instead of getting every other string from nodes that match the same XPath expression, I need the other nodes on the same level (its siblings). I'm using Java. For example:

     <table width="575" border="0" cellspacing="1" cellpadding="0">
                <tr> 
                  <td width="39" class="back1"><b class="texto4">CRN</b></td>
                  <td width="60" class="back1"><b class="texto4">Materia</b></td>
                  <td width="53" class="back1"><b class="texto4">Secci&oacute;n</b></td>
                  <td width="55" class="back1"><b class="texto4">Cr&eacute;ditos</b></td>
                  <td width="156" class="back1"><b class="texto4">T&iacute;tulo</b></td>
                  <td width="69" class="back1"><b class="texto4">Cupo</b></td>
                  <td width="57" class="back1"><b class="texto4">Inscritos</b></td>
                  <td width="77" class="back1"><b class="texto4">Disponible</b></td>
                </tr>
                <tr> 
                  <td width="39"><font class="texto4"> 
                    10110                        </font></td>
                  <td width="60"><font class="texto4"> 
                    IIND1000                        </font></td>
                  <td width="53"><font class="texto4"> 
                  <div align="center">
                    1                        </div></font></td>
                  <td width="55"><font class="texto4"> 
                    <div align="center">
                    3                       </div>
                    </font></td>
                  <td width="156"><font class="texto4"> 
                    INTROD. INGEN. INDUSTRIAL                        </font></td>
                  <td width="69"><font class="texto4"> 
                    100                        </font></td>
                  <td width="57"><font class="texto4"> 
                    100                        </font></td>
                  <td width="77"><font class="texto4"> 
                    0                        </font></td>
                </tr>
              </table>

If I look for IIND1000, I want to get every td element within that tr tag (10110, IIND1000, 1, 3, INTROD. INGEN. INDUSTRIAL, 100, 100, 0). Is this possible with JTidy? Any tips or recommendations? Thanks.


Solution

  • You probably want this:

    XPathExpression expr = 
         xpath.compile("//tr[td[normalize-space(font) = 'IIND1000']]/td/font/text()"); 
    

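    The predicate in brackets keeps only a <tr> that has a <td> child whose <font> child has the normalized string value 'IIND1000'. The rest of the path then returns the text nodes directly under the <font> of every <td> in that row. Note that in the cells where the value is wrapped in a <div> (1 and 3 in the sample), font/text() yields only whitespace; selecting .../td and reading each cell's text content covers those cells too.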
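
A minimal sketch of how the row lookup could be wired up with the JDK's own `javax.xml.xpath` API. This is an illustration, not the answerer's code: the class `RowLookup`, the helpers `parse` and `cellsOfRow`, and the shortened `SAMPLE` markup are all made up for the example. It also parses a small well-formed snippet with `DocumentBuilder` rather than JTidy; the assumption is that a DOM `Document` produced by JTidy would be queried the same way. Unlike the expression above, it selects the `<td>` elements themselves and collapses whitespace in each cell's text, so values wrapped in a `<div>` are included.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RowLookup {

    // Shortened, well-formed stand-in for the table in the question (hypothetical data layout).
    static final String SAMPLE =
          "<table>"
        + "<tr><td><b>CRN</b></td><td><b>Materia</b></td><td><b>Seccion</b></td></tr>"
        + "<tr>"
        + "<td><font> 10110 </font></td>"
        + "<td><font> IIND1000 </font></td>"
        + "<td><font> <div align=\"center\"> 1 </div></font></td>"
        + "</tr>"
        + "</table>";

    // Parses a well-formed XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the whitespace-normalized text of every <td> in the row that has
    // a cell equal to `needle`. Assumes `needle` contains no single quote,
    // because it is spliced directly into the XPath string.
    static List<String> cellsOfRow(Document doc, String needle) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        String expr = "//tr[td[normalize-space(.) = '" + needle + "']]/td";
        NodeList cells = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);

        List<String> values = new ArrayList<>();
        for (int i = 0; i < cells.getLength(); i++) {
            // getTextContent() concatenates all descendant text, including text inside <div>.
            values.add(cells.item(i).getTextContent().replaceAll("\\s+", " ").trim());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(cellsOfRow(parse(SAMPLE), "IIND1000")); // prints [10110, IIND1000, 1]
    }
}
```

Selecting elements and reading their text in Java keeps the XPath short and avoids depending on whether a cell wraps its value in `<font>`, `<div>`, or both. If no row matches, the returned list is simply empty.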