Tags: php, zend-framework, lucene, indexing, zend-lucene

Zend Lucene - how to build an index for a Calendar application?


I am building an app that has a Calendar in which users can annotate events.

The Calendar is private to each user, that is, user A cannot see the events on user B's calendar.

I would like to index the calendar events using Zend Lucene, but I'm unsure how to do this.

I could have Lucene index all events together, regardless of user, but then a search would show one user's events to another, and that's not what I want.

I don't think it would be a good idea to create a separate index for each user, but I am out of ideas on how else to approach this.

Any ideas/suggestions/pointers on how to do this?


Solution

  • Here's how I solved this issue:

    First, make sure you include a user_id field when building the index.

    It is wise to use Keyword() for user_id: a Keyword field is indexed and stored but not tokenized, so Lucene can match the id exactly and also return it in the results.

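        // $index is an index opened with Zend_Search_Lucene::open() or created with ::create()
        $doc = new Zend_Search_Lucene_Document();
    
        // Keyword: indexed and stored, not tokenized (exact match on the owner)
        $doc->addField(Zend_Search_Lucene_Field::Keyword('user_id', $row->user_id));
        // UnIndexed: stored for display only, not searchable
        $doc->addField(Zend_Search_Lucene_Field::UnIndexed('date_1', $row->date_1));
        // Text: tokenized, indexed and stored (full-text searchable)
        $doc->addField(Zend_Search_Lucene_Field::Text('title', $row->title));
    
        $index->addDocument($doc);
    
        // etc.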
    

    Next, build a boolean query on the backend (programmatically) that requires every result to match both the query string (the user's search input) AND this user's user_id.

        $index = Zend_Search_Lucene::open($this->search_index);
    
        // parse the user's input into a query object
        $query      = Zend_Search_Lucene_Search_QueryParser::parse($query_string);
    
        // container for the combined boolean query
        $query_bool = new Zend_Search_Lucene_Search_Query_Boolean();
    
        // build a term query for the user id:
        // the value $user_id must be found in the field 'user_id'
        $user_id    = get_user_id(); // or use your own 'get user id' function
        $term       = new Zend_Search_Lucene_Index_Term($user_id, 'user_id');
        $subquery1  = new Zend_Search_Lucene_Search_Query_Term($term);
    
        // construct boolean requiring both user id and string
        $query_bool->addSubquery($query, true);     // required
        $query_bool->addSubquery($subquery1, true); // required
    
        $query_result = $index->find($query_bool);
    

    And there you have it.

    Now if user 123 searches for 'appointment', Lucene effectively runs the search as appointment AND user_id:123, so only that user's events can match.
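
    As a rough sketch, the two steps can be wrapped in one helper so that every search is scoped to the current user. The function names searchUserEvents() and hitsToRows() are made up for illustration, and the code assumes Zend Framework 1 is loaded and $index comes from Zend_Search_Lucene::open(). The user_id term is built through the API rather than typed into the query string; as far as I know, terms built this way are not passed through the analyzer, which matters because the default analyzer only keeps letters and would discard a numeric id.

    ```php
    <?php
    // Sketch only: assumes Zend Framework 1 is on the include path.
    // searchUserEvents() and hitsToRows() are hypothetical helper names.

    /**
     * Run a user's search, restricted to documents owned by $userId.
     */
    function searchUserEvents($index, $queryString, $userId)
    {
        // Parse the free-text input typed by the user.
        $userInput = Zend_Search_Lucene_Search_QueryParser::parse($queryString);

        // Exact-match term on the Keyword field 'user_id'.
        $ownerTerm  = new Zend_Search_Lucene_Index_Term((string) $userId, 'user_id');
        $ownerQuery = new Zend_Search_Lucene_Search_Query_Term($ownerTerm);

        // Both subqueries are required, i.e. input AND owner.
        $query = new Zend_Search_Lucene_Search_Query_Boolean();
        $query->addSubquery($userInput, true);
        $query->addSubquery($ownerQuery, true);

        return hitsToRows($index->find($query));
    }

    /**
     * Copy the stored fields of each hit into a plain array.
     * Works on any objects exposing title, date_1 and score properties.
     */
    function hitsToRows(array $hits)
    {
        $rows = array();
        foreach ($hits as $hit) {
            $rows[] = array(
                'title' => $hit->title,
                'date'  => $hit->date_1,
                'score' => $hit->score,
            );
        }
        return $rows;
    }
    ```

    Calling searchUserEvents($index, 'appointment', 123) would then return only user 123's matching events. date_1 is available in the rows because UnIndexed fields are stored, even though they cannot be searched.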

    Let me know if there's any way to improve this - glad to discuss.