java html swing parsing htmleditorkit

Get DIV contents with Java Swing


I'm trying to get the contents of a DIV from a previously fetched HTML document. I'm using Java Swing.

final java.io.Reader stringReader = new StringReader(html);
final HTMLEditorKit htmlKit = new HTMLEditorKit();
final HTMLDocument htmlDoc = (HTMLDocument) htmlKit.createDefaultDocument();
final HTMLEditorKit.Parser parser = new ParserDelegator();
parser.parse(stringReader, htmlDoc.getReader(0), true);
final javax.swing.text.Element el = htmlDoc.getElement("id");

This code should get the DIV with the ID "id" that I have inside the HTML. But what next? How do I get the contents of the DIV? I've been searching all around, but the only thing I found is how to get an attribute value, not the Element contents.

Should I move to jsoup? I would rather stay with the standard Java libraries, but so far I'm stuck.

Thanks!


Solution

  • not the Element contents.

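    Try reading the document text between the element's start and end offsets, something like:

    // Offsets of the element within the document's text
    int start = el.getStartOffset();
    int end = el.getEndOffset();
    // getText(offset, length) can throw BadLocationException
    String text = htmlDoc.getText(start, end - start);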
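For completeness, here is a minimal, self-contained sketch of the same idea. The class name `DivText`, the helper `textOfElementById`, and the sample HTML are made up for illustration and are not from the question. It loads the markup through `HTMLEditorKit.read` instead of calling `ParserDelegator` directly, on the assumption that this is the safer route because the kit flushes the document reader after parsing. It also sets the `IgnoreCharsetDirective` property so a `<meta>` charset tag does not abort the parse. As far as I can tell, the document only stores text, so this returns the DIV's text content without any markup.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.Element;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper class, for illustration only.
final class DivText {

    // Returns the trimmed text inside the element with the given id,
    // or null if no such element exists in the parsed document.
    static String textOfElementById(String html, String id)
            throws IOException, BadLocationException {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        // Ignore any <meta> charset declaration; the input is already a String.
        doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
        // Parse the markup into the document, starting at offset 0.
        kit.read(new StringReader(html), doc, 0);

        Element el = doc.getElement(id);
        if (el == null) {
            return null;
        }
        int start = el.getStartOffset();
        int end = el.getEndOffset();
        // Text only: tags are not part of the document content.
        return doc.getText(start, end - start).trim();
    }

    public static void main(String[] args) throws Exception {
        // Sample input, invented for this example.
        String html = "<html><body>"
                + "<div id=\"id\">Hello <b>world</b></div>"
                + "<p>other</p>"
                + "</body></html>";
        System.out.println(textOfElementById(html, "id"));
    }
}
```

The `null` check matters because `getElement` returns `null` when no element carries the requested id, and calling `getStartOffset()` on it would otherwise fail with a `NullPointerException`.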