solr solr5 banana

Solr: Retrieve non-stored fields from external data source


I'm currently working on a project in which I would like to index several data sources (Oracle and HBase) into Solr for full-text search. Additionally, I want to be able to visualize the data I index into Solr. I'm still evaluating whether to use Banana or Hue for this.

Here is the problem: as far as I understand the Solr docs, fields that are indexed but not stored can be searched, but their original contents cannot be retrieved. I suppose this will make it quite difficult for the visualizers to produce nicely labeled graphs for me ;)

I would really like to avoid storing the fields, as the actual data could eventually grow quite big and it is already stored in another database. Is there a plugin (another SearchHandler, maybe?) that can retrieve the matching data fields from an external data source and deliver them together with the search results? If not, where would be the best place to implement such functionality? A Solr SearchHandler? Banana/Hue?

Thank you very much in advance for any suggestions! :)


Solution

  • IMHO, the best way to implement such functionality is as a SearchHandler that returns a Banana-compatible response. Index the fields that need to be searchable without storing them in Solr. The search handler should then retrieve the corresponding rows from HBase for the search results, which gives Banana the labeled data it needs. In a separate process, you also have to update the index periodically as HBase data is added, updated, etc. The first use case here is very similar to yours.
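
To make the lookup-and-merge step concrete, here is a minimal sketch in plain Python. It is not a Solr SearchHandler (that would be written in Java against Solr's plugin API); it only illustrates the logic such a handler, or a thin layer in front of Banana, would need. It assumes the one field kept retrievable in Solr is a key (`id`) that matches the HBase row key. The names `enrich_hits`, `fetch_row` and `EXTERNAL_ROWS` are hypothetical, and a dictionary stands in for the real HBase lookup.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional

Row = Dict[str, Any]


def enrich_hits(
    hits: Iterable[Row],
    fetch_row: Callable[[str], Optional[Row]],
    id_field: str = "id",
) -> List[Row]:
    """Merge each search hit with its row from the external store.

    `hits` are the documents of a search response that carry only the key;
    `fetch_row` is a hypothetical lookup into the external store (HBase,
    Oracle, ...) that returns None when the row no longer exists.
    """
    enriched = []
    for hit in hits:
        row = fetch_row(hit[id_field]) or {}
        # On a name clash, the value from the search hit wins over the external one.
        enriched.append({**row, **hit})
    return enriched


# Assumption: an in-memory stand-in for the external store, keyed by row key.
EXTERNAL_ROWS: Dict[str, Row] = {
    "row-1": {"title": "First document", "category": "reports"},
    "row-2": {"title": "Second document", "category": "logs"},
}

# Assumed shape of the hits when only the key is retrievable from the index.
solr_docs = [{"id": "row-2"}, {"id": "row-1"}, {"id": "row-3"}]

# Result order follows the search ranking, not the order in the external store.
result = enrich_hits(solr_docs, EXTERNAL_ROWS.get)
```

A hit whose row is missing from the external store (`row-3` here) is passed through with only its key, which is what happens when the index lags behind deletions in HBase. That is one reason the periodic index maintenance mentioned above matters.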