algorithm search nlp full-text-search search-engine

Increasing relevancy of search results


I have a problem making search output more useful in practice for end users. The problem is more about the algorithm and approach than about the exact technology or framework to use.

At the moment we have a database of products that can be described with the following schema:

http://goo.gl/391qj

On the search side we've done the fairly standard things: a third-party text search engine with a token analyzer, plus handling of typos and synonyms (this is not the full list, but as I said, that is rather out of scope). But we still need to do extra work to bring the search results closer to real-life user needs, probably in a way somewhat similar to how Google ranks indexed pages by relevance. Ideas that we've already considered as potentially applicable to the problem:

I'd appreciate any help or a pointer on where to dig.


Solution

  • You may try pLSA (probabilistic latent semantic analysis); there are many references on the web, and there should be libraries and source code available.

    EDIT:

    Well, I took a closer look at Lucene recently, and it seems to be a much better answer to what the question actually asked (it does not use pLSA). As for integration with the database, you may use Hibernate Search (although it does not seem to be as powerful as using Lucene directly).
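
    To give a feel for the kind of relevance scoring a full-text engine does under the hood, here is a minimal sketch of BM25 ranking in plain Python. BM25 is the default similarity in recent Lucene versions, but this is not Lucene's code or API: the function names, the whitespace tokenizer, and the sample product descriptions are all made up for illustration, and a real analyzer (stemming, synonyms, typo handling) does far more.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return text.lower().split()


def bm25_rank(query, docs, k1=1.2, b=0.75):
    """Return (doc_index, score) pairs sorted by descending BM25 score."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for i, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rare terms get a higher weight than common ones.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append((i, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical product descriptions, invented for this example.
products = [
    "red leather wallet",
    "leather belt brown leather buckle",
    "cotton t-shirt red",
]
ranking = bm25_rank("leather wallet", products)
for index, score in ranking:
    print(f"{score:.3f}  {products[index]}")
```

    The wallet ranks first because it matches both query terms, and the rarer term ("wallet") carries more weight than the common one ("leather"). In practice you would let Lucene compute this and, if needed, combine it with your own signals such as field boosts.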