Tags: search, solr, lucene, full-text-search, zend-lucene

Lucene: how to search EAV or 1:m?


I'm using Zend Lucene, but I don't think the question is specific to that library.

Say I want to provide full-text search for a database of books. Assume the following models:

Model 1:

TABLE: book
- book_id
- name

TABLE: book_author
- book_author_id
- book_id
- author_id

TABLE: author
- author_id
- name

(a book can have 0 or more authors)

Model 2:

TABLE: book
- book_id
- name

TABLE: book_eav
- book_eav_id
- book_id
- attribute (e.g. "author")
- value (e.g. "Tom Clancy")

(a book can have 0 or more authors, plus information such as publisher, number of pages, etc.)

What do I need to do to insert all the authors associated with a particular book into the document to be indexed? Do I put all the authors in one field of the document? Would I use some sort of delimiter to group author information? I'm looking for general strategies for this kind of data.


Solution

  • Put all the authors in a single field of the document, separated by a delimiter. The document schema then looks like this:

    book_id
    name
    author: |author 1|author 2|...|author n|
    other_attribute_1: |val 1|val 2|
    other_attribute_2: |val 1|val 2|
    

    With this schema, you can search by author and boost each kind of match differently, with a query like:

    (author:"|Tom Clancy|")^10 OR 
    (author:"Tom Clancy")^5 OR 
    (author:(Tom Clancy))^1
    

    This query ranks exact matches first, then phrase matches, and finally any other matches.
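
As a rough illustration of the approach above, the following Python sketch flattens 1:m or EAV rows into one delimited field per attribute and builds the three-tier boosted query string. It is only a sketch under stated assumptions: the function names `build_documents` and `author_query`, the tuple layouts, and the sample rows are hypothetical and not part of Zend Lucene or any other library. The last clause is written as `author:(...)` so that every term is searched in the `author` field. Whether the `|` markers survive indexing depends on the analyzer in use, which the answer does not specify, and the sketch does not escape special query characters in the author name.

```python
from collections import defaultdict

DELIM = "|"  # delimiter taken from the schema in the answer


def build_documents(books, attribute_rows):
    """Collapse 1:m rows into one delimited field per attribute, per book.

    books: iterable of (book_id, name) tuples (hypothetical layout).
    attribute_rows: iterable of (book_id, attribute, value) tuples, e.g. rows
    from book_eav, or from book_author joined to author with attribute="author".
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for book_id, attribute, value in attribute_rows:
        grouped[book_id][attribute].append(value)

    docs = []
    for book_id, name in books:
        doc = {"book_id": book_id, "name": name}
        # Books with no attribute rows simply get no extra fields.
        for attribute, values in grouped.get(book_id, {}).items():
            doc[attribute] = DELIM + DELIM.join(values) + DELIM
        docs.append(doc)
    return docs


def author_query(author):
    """Build the boosted query: exact match, then phrase, then any term."""
    return (
        f'(author:"{DELIM}{author}{DELIM}")^10 OR '
        f'(author:"{author}")^5 OR '
        f'(author:({author}))^1'
    )


# Made-up sample data, for illustration only.
books = [(1, "Sample Book"), (2, "Book Without Authors")]
rows = [
    (1, "author", "Tom Clancy"),
    (1, "author", "Second Author"),
    (1, "publisher", "Example Press"),
]
docs = build_documents(books, rows)
query = author_query("Tom Clancy")
```

Each resulting dictionary would then be mapped onto the fields of an index document by whatever indexing API is in use; that step is library-specific and is left out here.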