javaoracle-databaseplsqloracleformsoracle-text

Oracle Text optimize query for a Google like search bar


First I can't provide the implementation code since I believe it is useless, so I will try to explain in detail what I've done so far.

Basically I need to develop a Google or Amazon like search bar which will suggest products to end users by name.

The search bar and the suggestion panel are written in plain Java (Swing). While the data is stored in a Oracle (10g) DB table. Since I need to search over a VARCHAR2 column I looked for a full-text search package and found Oracle Text. The searching is superfast when using the SQL developer or SQL*Plus but it is much slower when the searching is done via de search bar.

The table looks as follows and it has around 220k rows, on the text column I created a basic CONTEXT index for Oracle Text.

CREATE TABLE inventory (
   text VARCHAR2(200)
);

CREATE INDEX idx_inventory ON inventory(text) INDEXTYPE IS CTXSYS.CONTEXT;

The searching is implemented in an DB package with a cursor, also in the same package I do the query parsing manually and it also works fast. Since it works very fast alone I don't think that the problem is in the package but I will provide a short explanation.

CREATE OR REPLACE PACKAGE BODY pkg_inventory 
AS
  FUNCTION f_parse_query(p_query IN VARCHAR2) RETURN VARCHAR2 
  IS
    v_query VARCHAR2(32767) := '';
  BEGIN
    -- it always returns a parsed string with the following format
    -- {foo} & {bar} & baz%
    -- all the words are escaped unless for the last one, at the end of which the "%" is added
    RETURN v_query;
  END f_parse_query;

  FUNCTION f_search(p_query IN VARCHAR2) RETURN VARCHAR2 
  IS
    CURSOR c_inventory IS 
           SELECT  text
               FROM inventory
           WHERE CONTAINS(text, v_query, 1) > 0
           ORDER BY score(1) DESC;

    v_query VARCHAR(32767);
    v_res   VARCHAR(32767);

    TYPE t_result IS TABLE OF VARCHAR(32767) INDEX BY BINARY_INTEGER;
    tab_res t_result;
  BEGIN
    v_query := f_parse_query(p_query);
    OPEN c_inventory;
    FETCH c_inventory BULK COLLECT INTO tab_res LIMIT 10;
    CLOSE c_inventory;

    -- concatenate the result in a string and return
    RETURN v_res;
  END f_search;
END pkg_inventory;

To make the searching experience responsive I redo the whole process after every typed char and the real bottle neck where I need to wait for few seconds to have the result is when I start typing a word so the parsed query looks like i.e. "b%" or "{bar} & f%".

How can I make the searching faster and more responsive?

Maybe I can wait and search after the user typed at least three letters. Or I can use the compiler hint /*+ FIRST_ROWS(10) */ but it is worth since I use the BULK COLLECT and should I use statistics in that case?


Solution

  • The problem was in the wildcard % after the last word which created the performance issues that I described.

    I solved it by waiting for the user to input at least two chars and then instead of the wildcard I used a better performing fuzzy operator.

    Another possible solution which to be honest I didn't try is to create a prefix index to face problems with the wildcard %, since it is highly optimized for left-truncated search (i.e. bar%) as the documentations says.