I'm using Zend Lucene to index my documents and use it to search documents by keywords.
But now I need to retrieve all the contents of a specific file by using its id.
Here is how I index a doc in Zend:
$doc = new \Zend_Search_Lucene_Document();
$doc->addField(\Zend_Search_Lucene_Field::UnIndexed('Id', (integer)$args['resId']));
$doc->addField(\Zend_Search_Lucene_Field::UnStored('contents', $fileContent, 'utf-8'));
$index->addDocument($doc);
$index->commit();
Is there a way to retrieve the contents using an Id?
I tried the following, without success:
$term = new \Zend_Search_Lucene_Index_Term(762, 'Id');
$docIds = $index->termDocs($term);
$term = new \Zend_Search_Lucene_Index_Term(762, 'Id');
$query = new \Zend_Search_Lucene_Search_Query_Term($term);
$hits = $index->find($query);
Thanks
It's been a long time since I worked with Lucene (this code is from an app I built in 2013).
The short answer to your question is no: with your current setup, where "contents" is indexed as UnStored, you will not be able to display the entire contents using its id, because an UnStored field is searchable but its text is not kept in the index. I would also not suggest storing it if the documents are of any real size. The index is optimized for searching, and storing large amounts of text just for display will really slow the queries down.
Only certain types of fields can be used to render anything meaningful back to the user (see below).
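To make the field-type point concrete for the question's case, here is a minimal sketch, assuming Zend Framework 1 is autoloadable and the index is already open. The function names `indexFile` and `contentsById` are hypothetical, not part of Zend. Note that the question's `Id` field is UnIndexed, which is stored but not searchable, so a term query on it finds nothing; Keyword is stored and indexed as an exact term. Storing `contents` as Text makes it retrievable, at the index-size cost mentioned above.

```php
<?php
// Sketch only: assumes Zend Framework 1 (Zend_Search_Lucene) is on the include path.
// indexFile() and contentsById() are hypothetical helper names.

function indexFile(Zend_Search_Lucene_Interface $index, $resId, $fileContent)
{
    $doc = new Zend_Search_Lucene_Document();
    // Keyword: stored and indexed, not tokenized, so it can be matched as an exact term.
    $doc->addField(Zend_Search_Lucene_Field::Keyword('Id', (string) $resId));
    // Text: stored, indexed and tokenized, so it is searchable and can be read back.
    $doc->addField(Zend_Search_Lucene_Field::Text('contents', $fileContent, 'utf-8'));
    $index->addDocument($doc);
    $index->commit();
}

function contentsById(Zend_Search_Lucene_Interface $index, $resId)
{
    $term  = new Zend_Search_Lucene_Index_Term((string) $resId, 'Id');
    $query = new Zend_Search_Lucene_Search_Query_Term($term);
    $hits  = $index->find($query);

    // Return the stored contents of the first match, or null when nothing matches.
    return count($hits) > 0
        ? $hits[0]->getDocument()->getFieldValue('contents')
        : null;
}
```

If the files are large, an alternative is to keep `contents` as UnStored, make only `Id` a Keyword, and load the full text from wherever the original file lives once the id is known.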
Of course, always make sure you are not storing and rendering data that has not been sanitized or escaped. If you store it, you need to make sure it's safe for rendering. This is just the code that should get you running. Sorry it's a little old.
Given a custom Document definition (used to create a searchable index of events):
<?php
class Search_Lucene_Document extends Zend_Search_Lucene_Document
{
    public function __construct($fields)
    {
        foreach($fields as $key => $value) {
            switch($key) {
                case 'docRef' : // DO NOT MODIFY
                    $this->addField(Zend_Search_Lucene_Field::keyword($key, $value));
                    break;
                case 'url' : // DO NOT MODIFY
                    $this->addField(Zend_Search_Lucene_Field::keyword($key, $value));
                    break;
                case 'class' :
                    $this->addField(Zend_Search_Lucene_Field::unIndexed($key, $value));
                    break;
                case 'key' :
                    $this->addField(Zend_Search_Lucene_Field::unIndexed($key, $value));
                    break;
                case 'summary' :
                    $this->addField(Zend_Search_Lucene_Field::unIndexed($key, $value));
                    break;
                case 'title' :
                    $this->addField(Zend_Search_Lucene_Field::text($key, $value));
                    break;
                case 'content' :
                    $this->addField(Zend_Search_Lucene_Field::unStored($key, $value));
                    break;
                case 'contents' :
                    $this->addField(Zend_Search_Lucene_Field::unStored($key, $value));
                    break;
                case 'eventContent' :
                    $this->addField(Zend_Search_Lucene_Field::text($key, $value));
                    break;
                case 'categories' :
                    $this->addField(Zend_Search_Lucene_Field::text($key, $value));
                    break;
                case 'addrState' :
                    $this->addField(Zend_Search_Lucene_Field::text($key, $value));
                    break;
                case 'startDate' :
                    $this->addField(Zend_Search_Lucene_Field::keyword($key, $value));
                    break;
                case 'endDate' :
                    $this->addField(Zend_Search_Lucene_Field::keyword($key, $value));
                    break;
                default :
                    $this->addField(Zend_Search_Lucene_Field::unStored($key, $value));
                    break;
            }
        }
    }
}
Important points from the above, as best as I can recollect: only fields you add as ::text, ::keyword, and the other stored types (::unIndexed, ::binary) can be read back as-is from a hit; ::unStored fields are searchable but not retrievable. So, now to query what we saved.
This was the event search controller.
public function searchAction()
{
    $form = new Events_Form_Search();
    if($this->_request->isPost())
    {
        if($form->isValid($this->_request->getPost()))
        {
            $post = $this->_request->getPost();
            $start = $post['startDate'];
            $end = $post['endDate'];
            $from = new Zend_Search_Lucene_Index_Term($start, 'startDate');
            $to = new Zend_Search_Lucene_Index_Term($end, 'startDate');
            $query = new Zend_Search_Lucene_Search_Query_Range($from, $to, true);
            $index = Search_Service_Lucene::open(APPLICATION_PATH . DIRECTORY_SEPARATOR . 'data' . DIRECTORY_SEPARATOR . 'search');
            $hits = $index->find($query);
            //Zend_Debug::dump($hits);
            $this->view->totalHits = count($hits);
            $filteredHits = array();
            $resultsArray = array();
            foreach($hits as $i => $hit) {
                $resultsArray[$i] = new stdClass();
                $doc = $hit->getDocument();
                foreach($doc->getFieldNames() as $field) {
                    $resultsArray[$i]->{$field} = $hit->{$field};
                }
            }
            $paginator = new Zend_Paginator(new Zend_Paginator_Adapter_Array($resultsArray));
            //TODO: Add a app setting to control the number of results per page
            $paginator->setItemCountPerPage(10);
            $paginator->setCurrentPageNumber($this->_request->page);
            $this->view->paginator = $paginator;
        }
    }
    else {
    }
}
And now to display it using the Zend_View partial loop helper and a template file. You should note that eventContent below matches the field name in the code above and is saved in the index as ::text. Also, notice that I use the id to link back to the actual event, because I wanted to keep the index as small as possible to improve performance. Only store what you really need to identify the document to the user, as readable text or keyword types; then run your query and send users to the full document.
<div class="event-listing">
    <h4>
        <?php echo $this->title($this->model->title); ?>
    </h4>
    <p>
        <?php echo !empty($this->model->addrCity) ? $this->model->addrCity . ',' : '';?> <?php echo !empty($this->model->addrState) ? $this->model->addrState : ''; ?>
    </p>
    <p>
        <span>
            <?php echo date('F j, Y', strtotime($this->model->startDate)); ?> - <?php echo date('F j, Y', strtotime($this->model->endDate)); ?>
            <br />
        </span>
    </p>
    <p>
        <?php
        echo $this->model->eventContent;
        ?>
    </p>
    <p class="moreinfo"><a href="/events/display/<?php echo $this->model->id; ?>">More info <span>»</span></a></p>
    <hr />
</div>
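On the earlier point about escaping stored values before rendering, here is a minimal sketch in plain PHP. The helper name `e` is hypothetical. It only applies to fields that hold plain text: if a field such as eventContent intentionally holds HTML, it needs a proper HTML sanitizer instead, which this sketch does not cover.

```php
<?php
// Hypothetical helper: escape a stored field value for safe output in HTML.
function e($value)
{
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// Markup in the stored value is rendered as text, not interpreted by the browser.
echo e('<script>alert(1)</script>');
// prints: &lt;script&gt;alert(1)&lt;/script&gt;
```

Inside a Zend_View template, the built-in `$this->escape($value)` does the same job.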