
Indexing PDF - Faceted Search with Apache Solr and Apache Tika


For the past two weeks I have been searching the Internet for a solution to my problem without success. I need to integrate a web application with Apache Solr and Apache Tika so that I can run faceted search over the PDFs stored in the system's database. Solr and Tika are configured correctly on my server, but since I am new to both tools, I am not sure how to integrate them with each other or with the application.


Solution

  • Solr 6.2 ships with a files example in example/files that is configured specifically to index and browse rich-content files (such as PDFs).

    Start by using that and try to understand how it is put together.
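    Once a core built from that example is running, the web application can talk to Solr over plain HTTP. The following is only a minimal sketch of the two calls involved: sending a PDF to Solr's extracting request handler (/update/extract, which runs Tika on the upload) and asking for facet counts on a field. The base URL, the core name "files", the facet field "content_type", and all function names are assumptions for illustration; adjust them to your own setup and schema.

    ```python
    import json
    from urllib.parse import urlencode
    from urllib.request import Request, urlopen

    SOLR = "http://localhost:8983/solr"  # assumption: default local Solr address
    CORE = "files"                       # assumption: core created from example/files


    def extract_url(doc_id, base=SOLR, core=CORE):
        """Build the URL for the extracting request handler (Tika runs inside Solr)."""
        params = urlencode({"literal.id": doc_id, "commit": "true"})
        return f"{base}/{core}/update/extract?{params}"


    def facet_url(field, query="*:*", base=SOLR, core=CORE):
        """Build a select URL that returns only facet counts for one field."""
        params = urlencode({
            "q": query,
            "rows": 0,            # no documents, just the facet counts
            "facet": "true",
            "facet.field": field,
            "wt": "json",
        })
        return f"{base}/{core}/select?{params}"


    def index_pdf(path, doc_id):
        """POST the raw PDF bytes to Solr; returns the HTTP status code."""
        with open(path, "rb") as fh:
            req = Request(
                extract_url(doc_id),
                data=fh.read(),
                headers={"Content-Type": "application/pdf"},
            )
        with urlopen(req) as resp:
            return resp.status


    def facet_counts(field, query="*:*"):
        """Return {facet value: count} for the given field."""
        with urlopen(facet_url(field, query)) as resp:
            body = json.load(resp)
        # Solr returns facets as a flat list: [value1, count1, value2, count2, ...]
        flat = body["facet_counts"]["facet_fields"][field]
        return dict(zip(flat[::2], flat[1::2]))
    ```

    With Solr running, something like index_pdf("report.pdf", "doc-1") followed by facet_counts("content_type") would index one file and list the facet values. If the PDFs are stored as BLOBs in the database, the application would read the bytes from there instead of from a file path.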