java xpath java-11 htmlcleaner

How to parse HTML with XPath?


I came across a few questions on StackOverflow about parsing HTML with XPath in Java.

The best answer I have found so far is here.

But it looks like DomSerializer is no longer available in Java 11.

How can I use DomSerializer in Java 11?


Solution

  • Add the following dependency to your pom.xml:

    <dependency>
        <groupId>net.sourceforge.htmlcleaner</groupId>
        <artifactId>htmlcleaner</artifactId>
        <version>2.6.1</version>
    </dependency>
    

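    Alternatively, download htmlcleaner-2.6.1.jar from here and add it to your classpath.

    Either way, that artifact provides the DomSerializer class (org.htmlcleaner.DomSerializer). The class belongs to the HtmlCleaner library, not to the JDK, so it has to be added as a dependency regardless of the Java version.

    Documentation: http://htmlcleaner.sourceforge.net/doc/org/htmlcleaner/DomSerializer.html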
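
Once DomSerializer has produced an org.w3c.dom.Document, the XPath part needs only the JDK (javax.xml.xpath), which is still available in Java 11. Below is a minimal sketch of that step. To keep it runnable without any extra jar, it builds the Document from well-formed markup with the JDK's DocumentBuilder; that is a stand-in I am assuming for illustration, not what HtmlCleaner does internally. The class name XPathOnDom and the helper names parseWellFormed, firstText and count are hypothetical. With HtmlCleaner on the classpath, you would swap the stand-in for the call shown in the comment.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class XPathOnDom {

    // Stand-in for the HtmlCleaner step (assumption for this sketch).
    // It only accepts well-formed markup. For real-world HTML, with the
    // htmlcleaner dependency added, the Document would instead come from:
    //   new DomSerializer(new CleanerProperties())
    //       .createDOM(new HtmlCleaner().clean(html));
    static Document parseWellFormed(String markup) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(markup.getBytes(StandardCharsets.UTF_8)));
    }

    // Evaluates the expression and returns the string value of the first match.
    static String firstText(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (String) xpath.evaluate(expression, doc, XPathConstants.STRING);
    }

    // Evaluates the expression as a node set and returns how many nodes matched.
    static int count(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        String markup =
                "<html><body><table><tr><td>td1</td><td>td2</td></tr></table></body></html>";
        Document doc = parseWellFormed(markup);

        System.out.println(firstText(doc, "//td[contains(., 'td1')]/text()"));
        System.out.println(count(doc, "//td"));
    }
}
```

The XPath calls are the same whichever way the Document was created, so only parseWellFormed needs to change when you move to HtmlCleaner for HTML that is not well-formed.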