javarome

How can I parse the Stack Overflow Jobs RSS feed with Rome so that the author and updated fields are filled?


I read the Stack Overflow Jobs RSS feed and get its items. This is a sample item:

<item>
    <guid isPermaLink="false">205774</guid>
    <link>https://stackoverflow.com/jobs/205774/java-developer-creative-dock-sro?a=170Dh8tfEzVm</link>
    <a10:author>
        <a10:name>Creative Dock s.r.o.</a10:name>
    </a10:author>
    <category>java</category>
    <category>spring</category>
    <category>spring-boot</category>
    <category>docker</category>
    <category>kubernetes</category>
    <title>JAVA DEVELOPER at Creative Dock s.r.o. (Praha 5, Czechia)</title>
    <description>&lt;p&gt;Are you a&amp;nbsp;&lt;strong&gt;battlehardened Java guy,&lt;/strong&gt; looking forward to&amp;nbsp;learning new technologies? Are you interested in&amp;nbsp;being in&amp;nbsp;a&amp;nbsp;place, where awesome startups are born? Startups that directly impact everyday lives? If&amp;nbsp;your answers are &amp;ldquo;yes&amp;rdquo;, read on...&lt;/p&gt;&lt;br /&gt;&lt;p&gt;&lt;strong&gt;Who are we&amp;nbsp;looking for?&lt;/strong&gt;&lt;/p&gt;&lt;br /&gt;&lt;p&gt;&lt;strong&gt;A&amp;nbsp;backend engineer for our new R&amp;amp;D team.&lt;/strong&gt; There&amp;rsquo;s plenty of&amp;nbsp;validated &amp;ldquo;smart&amp;rdquo; projects on&amp;nbsp;our table. Take ownership of&amp;nbsp;one, join the team, and build a&amp;nbsp;working solution (microservices) from scratch&amp;nbsp;-&amp;nbsp;we&amp;nbsp;know you always wanted to&amp;nbsp;do this.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;We&amp;nbsp;are currently looking at&amp;nbsp;&lt;strong&gt;Spring Boot, Docker, Kubernetes&lt;/strong&gt; for the technologies (but are &lt;strong&gt;open to&amp;nbsp;let you change our minds&lt;/strong&gt;).&lt;/p&gt;&lt;br /&gt;&lt;p&gt;Let&amp;rsquo;s be&amp;nbsp;honest: we&amp;nbsp;don&amp;rsquo;t look for total newbies. On&amp;nbsp;the other hand, we&amp;rsquo;ll always be&amp;nbsp;there to&amp;nbsp;help you out with everything.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;Have we caught&amp;nbsp;your attention? Great! Send us&amp;nbsp;a&amp;nbsp;link to&amp;nbsp;something you&amp;rsquo;re &lt;strong&gt;proud of.&lt;/strong&gt; May it&amp;nbsp;be your Website or&amp;nbsp;Github. Linkedin will also do&amp;nbsp;the trick&amp;nbsp; And we`ll get in touch!&lt;/p&gt;</description>
    <pubDate>Mon, 18 Mar 2019 10:18:53 Z</pubDate>
    <a10:updated>2019-03-18T10:18:53Z</a10:updated>
    <location xmlns="http://stackoverflow.com/jobs/">Praha 5, Czechia</location>
</item>

I tried to parse it with this code:

URL feedUrl = new URL("https://stackoverflow.com/jobs/feed");
SyndFeedInput input = new SyndFeedInput();
SyndFeed feed = input.build(new XmlReader(feedUrl));

but it returned:

Title: JAVA DEVELOPER at Creative Dock s.r.o. (Praha 5, Czechia)
Unique Identifier: 205774
Author: null
Updated Date: null
Category: java
Category: spring
Category: spring-boot
Category: docker
Category: kubernetes

I guess I need to tell Rome about the namespace ( <rss xmlns:a10="http://www.w3.org/2005/Atom" version="2.0"> ) to fix this, but I have no idea how to do that.


Solution

  • I could be doing this the wrong way, but it works (!):

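    // Imports assume Rome 1.x (com.rometools) with JDOM 2;
    // older Rome releases use different package names.
    import java.net.URL;
    import java.util.List;
    import org.jdom2.Element;
    import com.rometools.rome.feed.synd.SyndFeed;
    import com.rometools.rome.io.SyndFeedInput;
    import com.rometools.rome.io.XmlReader;

    public static void main(String[] args) throws Exception {
        URL feedUrl = new URL("https://stackoverflow.com/jobs/feed");
        SyndFeedInput input = new SyndFeedInput();
        SyndFeed feed = input.build(new XmlReader(feedUrl));

        // Print the a10:author and a10:updated values of every entry.
        feed.getEntries()
            .forEach(entry -> {
                System.out.println(get("author", entry.getForeignMarkup()));
                System.out.println(get("updated", entry.getForeignMarkup()));
            });
    }

    // Returns the text of the first foreign element with the given local name,
    // or null if the entry has no such element.
    private static String get(String name, List<Element> foreignMarkup) {
        return foreignMarkup.stream()
                            .filter(e -> name.equals(e.getName()))
                            // getValue() includes the text of child elements
                            // (e.g. a10:name) plus surrounding whitespace.
                            .map(e -> e.getValue().trim())
                            .findFirst()
                            .orElse(null);
    }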
    

    Essentially: call getForeignMarkup() on each SyndEntry, then look in the returned list for the Element whose name matches "author", "updated", and so on.
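
    The lookup above matches on the local name only. As a rough illustration of what the a10: prefix means, here is a minimal sketch that does not use Rome at all: it swaps in the JDK's built-in DOM parser and selects the same elements by namespace URI plus local name. The class and method names (ForeignMarkupSketch, firstText, parseFirstItem) are made up for this example, and the inline XML is a trimmed copy of the item from the question, wrapped in the <rss> root the question quotes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of Rome.
class ForeignMarkupSketch {
    // Namespace URI that the feed's <rss> root binds to the a10 prefix.
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    // Returns the trimmed text of the first descendant of parent with the given
    // namespace URI and local name, or null when no such element exists.
    static String firstText(Element parent, String namespaceUri, String localName) {
        NodeList nodes = parent.getElementsByTagNameNS(namespaceUri, localName);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    // Parses the XML string and returns its first <item> element.
    static Element parseFirstItem(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for namespace-based lookups
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));
            return (Element) doc.getElementsByTagName("item").item(0);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse feed XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<rss xmlns:a10=\"http://www.w3.org/2005/Atom\" version=\"2.0\"><channel>"
            + "<item><guid isPermaLink=\"false\">205774</guid>"
            + "<a10:author><a10:name>Creative Dock s.r.o.</a10:name></a10:author>"
            + "<a10:updated>2019-03-18T10:18:53Z</a10:updated></item>"
            + "</channel></rss>";
        Element item = parseFirstItem(xml);
        System.out.println(firstText(item, ATOM_NS, "name"));    // Creative Dock s.r.o.
        System.out.println(firstText(item, ATOM_NS, "updated")); // 2019-03-18T10:18:53Z
    }
}
```

    The same idea should carry over to the Rome version: a JDOM Element also exposes its namespace URI, so the filter could compare that in addition to the name if a feed ever carries an "author" or "updated" element from some other namespace.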