java xml rss rome

Parsing a feed with rss version="2"


I am trying to parse an RSS feed with ROME (the Java library), but the feed declares this incorrect version:

<rss version="2">

When I change it to "2.0" it parses correctly. How can I work around this using Java ROME?

I could subclass RSS20Parser and override isMyType, but where and how do I register this new parser?


Solution

  • I solved this by subclassing RSS20Parser and overriding isMyType. I then copied rome.properties, added the subclass to the list of parsers under WireFeedParser.classes, and placed the file on the classpath. I happened to be working in Clojure, so here is the code:

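    (ns feeds.rss20-parser
      (:import (com.rometools.rome.io.impl RSS20Parser)
               (org.jdom2 Document))
      ;; :gen-class only takes effect when this namespace is AOT-compiled.
      (:gen-class
       :name feeds.RSS20Parser
       :extends com.rometools.rome.io.impl.RSS20Parser
       ;; Expose the superclass isMyType under another name so it can be called below.
       :exposes-methods {isMyType parentIsMyType}))
    
    ;; Returns the trimmed "version" attribute of the root element,
    ;; or nil if the document, root element, or attribute is missing.
    (defn version [^Document doc]
      (some-> doc
              .getRootElement
              (.getAttribute "version")
              .getValue
              .trim))
    
    ;; Accept everything the stock parser accepts, plus feeds declaring version="2".
    (defn -isMyType [^feeds.RSS20Parser this ^Document doc]
      (or (.parentIsMyType this doc)
          (= "2" (version doc))))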
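
    As a rough sketch of the registration step described above: assuming ROME combines any extra rome.properties found at the root of the classpath with its built-in one, a minimal file might list only the new parser. The class name feeds.RSS20Parser comes from the :gen-class declaration above. That a single-entry file is enough is an assumption and was not verified here, so copying the full parser list as described remains the safe route.

    ```properties
    # rome.properties, placed at the root of the classpath
    # Assumption: this entry is merged with ROME's built-in parser list.
    WireFeedParser.classes=feeds.RSS20Parser
    ```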