oracleindexingoracle-text

text search in Oracle


I have created a table

CREATE TABLE "SCOTT"."TEXTSEARCH" 
   (    "NAME" VARCHAR2(20 BYTE), 
    "RESUME" BLOB, 
    "TEXT" CLOB
   )

I have inserted values which include a Word 2013 document that represents the resume. The text of the resume is

Hi This Is My Resume  
Testing  
HTML  
999999

I then created a Text index

CREATE INDEX "SCOTT"."RESUMETEXTINDEX" ON "SCOTT"."TEXTSEARCH" ("RESUME") 
   INDEXTYPE IS "CTXSYS"."CONTEXT"  PARAMETERS ('LEXER ctxsys.basic_lexer');

When I run the query

select * from textsearch WHERE contains(resume, 'testing')> 0;

I am getting result but when I search for the word "my", I do not get any results

select * from textsearch WHERE contains(resume, 'my')> 0;

nor

select * from textsearch WHERE contains(resume, 'My')> 0;

return results


Solution

  • An Oracle Text index does not index common words that users are not expected to search on. These are called stopwords. I don't know every word in the default stopword list that your index is using but I would expect that "my" is on that list.

    You can modify the default stoplist or create your own but unless you really want users to be able to query just on the word "my", which seems unlikely, I wouldn't. In addition to making the index smaller (and faster to search), stopwords make searching phrases much friendlier. The phrase "my resume" would match "John's resume" or "the resume" if "my" is a stopword because a stopword matches any word. For humans, that is generally a good thing-- most people most of the time don't want to miss a result because a common verb happened to have a different tense in the document ("is" vs "was" for example).

    See section 4.1.5 Querying Stopwords in the Text Application Developer's Guide for more details.