java rss rome

Java ROME RSS library and HTML code in RSS description field


I need to include HTML code in my RSS feed. I use the Java ROME RSS library:

SyndFeed feed = new SyndFeedImpl();
feed.setFeedType("rss_2.0");

feed.setTitle("Title");
feed.setLink("example.com");
feed.setDescription("Description");

List<SyndEntry> entries = new ArrayList<>();

SyndEntryImpl entry = new SyndEntryImpl();
entry.setTitle("Name");

SyndContent syndContent = new SyndContentImpl();
syndContent.setType("text/html");
syndContent.setValue("<p>Hello, World !</p>");

entry.setDescription(syndContent);

entries.add(entry);

feed.setEntries(entries);

// try-with-resources closes the writer even if output() throws
try (Writer writer = new FileWriter("rss.xml")) {
    SyndFeedOutput output = new SyndFeedOutput();
    output.output(feed, writer);
}

But the output XML contains an encoded description:

<description>&lt;p&gt;Hello, World !&lt;/p&gt;</description>

How can I properly include unencoded HTML code with ROME?


Solution

  • Analysis

    According to the RSS Best Practices Profile, section 4.1.1.20.4 (description):

    The description must be suitable for presentation as HTML. HTML markup must be encoded as character data either by employing the HTML entities &lt; ("<") and &gt; (">") or a CDATA section.

    Therefore, the current output is correct.
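
    To illustrate why the entity-encoded output is acceptable, here is a minimal sketch that parses both encodings with the JDK's built-in XML parser and compares the resulting text. It does not use ROME; the class name DescriptionEncodingDemo and the helper textOf are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of ROME.
public class DescriptionEncodingDemo {

    // Parses a small XML fragment and returns the text content of its root element.
    public static String textOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // The same HTML, once entity-encoded and once wrapped in a CDATA section.
        String entityEncoded = "<description>&lt;p&gt;Hello, World !&lt;/p&gt;</description>";
        String cdataEncoded = "<description><![CDATA[<p>Hello, World !</p>]]></description>";

        // A conforming XML parser hands the reader identical character data.
        System.out.println(textOf(entityEncoded));
        System.out.println(textOf(cdataEncoded));
        System.out.println(textOf(entityEncoded).equals(textOf(cdataEncoded)));
    }
}
```

    Both forms decode to the same character data, so a feed reader that parses the XML properly should see the same HTML either way.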

    CDATA encoding

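    If a CDATA section (CDATA encoding) is desired, the following code may be used. It adds the HTML through ROME's content module, which is written as a content:encoded element, so the description element itself is unchanged: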

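    // Content module classes from rome-modules.
    import com.rometools.modules.content.ContentModule;
    import com.rometools.modules.content.ContentModuleImpl;

    // HTML to emit through the content module.
    final List<String> contents = new ArrayList<>();
    contents.add("<p>HTML content is here!</p>");
    
    final ContentModule module = new ContentModuleImpl();
    module.setEncodeds(contents);
    
    // Attach the module to the entry created earlier.
    entry.getModules().add(module);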
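
    For comparison, the difference between a plain text node and a CDATA section at serialization time can be sketched with the JDK's own DOM and Transformer APIs. This is not ROME's implementation, only an illustration of how an XML writer typically treats the two node types; the class name CdataVersusTextDemo and the method serialize are made up for this example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class, not part of ROME.
public class CdataVersusTextDemo {

    // Builds <item><description>...</description></item> and serializes it,
    // storing the HTML either as a text node or as a CDATA section.
    public static String serialize(String html, boolean asCdata) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element item = doc.createElement("item");
            doc.appendChild(item);
            Element description = doc.createElement("description");
            item.appendChild(description);
            description.appendChild(asCdata
                    ? doc.createCDATASection(html)
                    : doc.createTextNode(html));

            // Identity transform: DOM tree in, XML text out.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize XML", e);
        }
    }

    public static void main(String[] args) {
        // Text node: markup characters are escaped by the serializer.
        System.out.println(serialize("<p>Hello, World !</p>", false));
        // CDATA section: the HTML is written verbatim inside <![CDATA[ ... ]]>.
        System.out.println(serialize("<p>Hello, World !</p>", true));
    }
}
```

    The text-node variant is escaped by the serializer, which mirrors the encoded description in the question, while the CDATA variant keeps the markup readable in the raw XML.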
    

    Additional references

    1. RSS Best Practices Profile.
    2. Putting content:encoded in RSS feed using ROME - Stack Overflow.
    3. Re: CDATA Support - Mark Woodman - net.java.dev.rome.dev - MarkMail.
    4. rome-modules/ContentModuleImplTest.java at master · rometools/rome-modules · GitHub.

    description versus content:encoded

    Should I use both the description and content:encoded nodes, or only one of them, in my RSS feed item?

    And how about the following?

    An item may also be complete in itself, if so, the description contains the text (entity-encoded HTML is allowed; see examples), <…>

    According to the RSS 2.0 specification, the description element alone is enough, exactly as you have quoted. For examples, see: Encoding & item-level descriptions (RSS 2.0 at Harvard Law).

    For additional details, please refer to the question: Difference between description and content:encoded tags in RSS2 - Stack Overflow.