indexing lucene jcr jackrabbit-oak

Apache Jackrabbit Oak 1.8 Indexing - Lucene does not index binary properties in aggregated node


I have the following index definition:

oak:index
      jcr:primaryType = nt:unstructured
      dms-lucene-fulltext-index
         compatVersion = 2
         async = async
         jcr:primaryType = oak:QueryIndexDefinition
         evaluatePathRestrictions = true
         type = lucene
         tags = fulltext
         aggregates
            jcr:primaryType = nt:unstructured
            nt:file
               jcr:primaryType = nt:unstructured
               include0
                  path = jcr:content
                  jcr:primaryType = nt:unstructured

And I have the following file node in a folder:

folder
   jcr:created = 2018-02-24T14:32:09.550+01:00
   jcr:createdBy = 
   jcr:primaryType = nt:folder
   jcr:uuid = 5c3e4689-84e9-4e34-8b14-029f62172812
   test.txt
      jcr:created = 2018-02-24T14:32:09.674+01:00
      jcr:createdBy = 14
      jcr:primaryType = nt:file
      jcr:content
         jcr:encoding = utf-8
         jcr:lastModifiedBy = 14
         jcr:mimeType = text/plain; charset=utf-8
         jcr:data = the quick brown fox
         jcr:lastModified = 2018-02-24T14:32:09.673+01:00
         jcr:primaryType = nt:resource
         jcr:uuid = 52f224e8-db57-4879-9d6a-94862f65fb8d

If I execute the following query, I get that file as a result:

SELECT * FROM [nt:file] WHERE ISDESCENDANTNODE('/folder') AND CONTAINS(*,'plain')

So the mimeType is in the index, but the binary content is not, because the following query returns no results:

SELECT * FROM [nt:file] WHERE ISDESCENDANTNODE('/folder') AND CONTAINS(*,'fox')
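
The two queries differ only in the search term. A minimal sketch of building them from one helper is shown here, so the same check can be repeated for other terms. The class name `FullTextQuery` and its methods are hypothetical, and the escaping only covers the JCR-SQL2 string literal (a single quote is doubled); the full-text expression passed to CONTAINS has its own syntax, which this sketch does not handle.

```java
// Hypothetical helper, not part of Oak or JCR: builds the JCR-SQL2 statements used above.
class FullTextQuery {

    // Wraps a value in single quotes and doubles embedded single quotes,
    // which is how a quote is escaped inside a JCR-SQL2 string literal.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Full-text search over all nt:file nodes below the given folder path.
    static String forFiles(String folderPath, String searchTerm) {
        return "SELECT * FROM [nt:file] WHERE ISDESCENDANTNODE(" + quote(folderPath)
                + ") AND CONTAINS(*," + quote(searchTerm) + ")";
    }

    public static void main(String[] args) {
        System.out.println(forFiles("/folder", "plain")); // matches via jcr:mimeType
        System.out.println(forFiles("/folder", "fox"));   // needs text extracted from jcr:data
    }
}
```

The resulting string would then be passed to the JCR query manager with the JCR-SQL2 language.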

I hope someone can tell me what I'm doing wrong here. Thank you!


Solution

  • After a long investigation, I finally found a solution to the problem.

    I added the following dependency to my pom.xml:

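    <dependency>
        <groupId>org.apache.tika</groupId>
        <artifactId>tika-parsers</artifactId>
        <!-- RELEASE resolves to the newest released version; pinning an explicit version gives reproducible builds -->
        <version>RELEASE</version>
    </dependency>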
    

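    With this dependency in place, my custom Tika configuration is loaded and binary properties are indexed! Presumably the Tika parser implementations were missing from the classpath before, so no text could be extracted from jcr:data.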
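
To verify the same cause in another setup, one option is to check whether the Tika classes can be loaded at runtime. The following is a minimal diagnostic sketch, not something taken from Oak itself. The class name `TikaClasspathCheck` is hypothetical, and the two Tika class names are assumptions that may need adjusting to the Tika version in use (the plain-text parser is assumed to ship in tika-parsers for Tika 1.x).

```java
// Hypothetical diagnostic: reports whether selected classes can be loaded.
class TikaClasspathCheck {

    // Returns true if the named class is visible to this class's class loader.
    // The class is not initialized, so no static initializers run.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, TikaClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed class names; adjust them to the Tika version on your classpath.
        String[] candidates = {
            "org.apache.tika.Tika",                 // facade class from tika-core
            "org.apache.tika.parser.txt.TXTParser"  // plain-text parser, assumed to be in tika-parsers
        };
        for (String name : candidates) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

If the parser class is reported as missing while the core class is found, that would be consistent with the situation described above, where adding tika-parsers made binary indexing work.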