
Full-text search on Microsoft documents using Apache Solr


Does Apache Solr support full-text search on Microsoft documents such as Word or PowerPoint files? If so, where can I find a tutorial?


Solution

  • Yes. Solr uses Apache Tika for content extraction, and Tika supports the majority of file types, including Word and PowerPoint.

    You'll need to configure the ExtractingRequestHandler (Solr Cell) in your solrconfig.xml.

    The Solr Reference Guide is a good starting point, with examples: https://lucene.apache.org/solr/guide/6_6/uploading-data-with-solr-cell-using-apache-tika.html
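
Once the handler is configured, indexing a document comes down to POSTing the file to it. Here is a minimal Python sketch of that request, assuming the handler is mapped to /update/extract (the path used in the reference guide examples). The host, the core name "mycore", the document id, and the file name are placeholders, and the helper names are made up for illustration:

```python
import urllib.parse
import urllib.request


def build_extract_url(base_url, core, doc_id, commit=True):
    """Build the URL for the extracting request handler (Solr Cell).

    Assumes the handler is registered at /update/extract in solrconfig.xml.
    literal.id sets the unique key of the indexed document.
    """
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    query = urllib.parse.urlencode(params)
    return f"{base_url.rstrip('/')}/{core}/update/extract?{query}"


def index_file(path, url, content_type="application/octet-stream"):
    """POST the raw file bytes to Solr; Tika extracts the text server-side."""
    with open(path, "rb") as fh:
        body = fh.read()
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Example (needs a running Solr instance, so it is left commented out):
# url = build_extract_url("http://localhost:8983/solr", "mycore", "doc1")
# index_file("report.docx", url)
```

After indexing, the extracted text can be searched like any other field. Which field it ends up in depends on how the handler and schema are set up, so check your own configuration.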