postgresql full-text-search

Escaping hyphens in tsquery (postgresql)


I need to create a tsquery from a string in the following format:

something-smth-somthing-etc-etc

Calling to_tsquery('something-smth-somthing-etc-etc') returns:

'something-smth-somthing-etc-etc' & 'someth' & 'smth' & 'somth' & 'etc' & 'etc'

Clearly, the string undergoes tokenization, stemming, etc. But in our case, the column we run the full-text search on already contains a tsvector that consists of a single lexeme: 'something-smth-somthing-etc-etc'.

The query select * from sometable where searchee @@ to_tsquery('something-smth-somthing-etc-etc') returns no results.

How can I call to_tsquery so that it does not analyze the provided string and instead creates a single-lexeme query?

Or am I missing something more fundamental here?


Solution

  • If your tsvector holds exactly that value, the string was not processed by to_tsvector but was cast directly to the tsvector type:

    t=# select to_tsvector('something-smth-somthing-etc-etc'), 'something-smth-somthing-etc-etc'::tsvector;
                                        to_tsvector                                    |             tsvector
    -----------------------------------------------------------------------------------+-----------------------------------
     'etc':5,6 'smth':3 'something':2 'something-smth-somthing-etc-etc':1 'somthing':4 | 'something-smth-somthing-etc-etc'
    (1 row)
    

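    In that case the condition does indeed return false, because to_tsquery normalizes the query into several lexemes that the single-lexeme tsvector does not contain: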

    t=# select 'something-smth-somthing-etc-etc'::tsvector @@ to_tsquery('something-smth-somthing-etc-etc');
     ?column?
    ----------
     f
    

    As a workaround, you can skip processing on the tsquery side as well by casting the string directly:

    t=# select 'something-smth-somthing-etc-etc'::tsvector @@ 'something-smth-somthing-etc-etc'::tsquery;
     ?column?
    ----------
     t
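
    Applied to the query from the question, the cast might look like the sketch below. It assumes that searchee is a tsvector column of sometable, as described in the question, and that the search string contains nothing with special meaning in tsquery syntax (such as &, |, !, parentheses, or spaces), since a direct cast still parses that syntax and only skips normalization.

    ```sql
    -- Cast the literal straight to tsquery: no tokenizing or stemming is applied,
    -- so the query consists of the single lexeme exactly as written.
    select *
    from sometable
    where searchee @@ 'something-smth-somthing-etc-etc'::tsquery;
    ```

    If the column can instead be populated with to_tsvector, using the same text search configuration on both sides, the original to_tsquery call should also match, because the document and the query are then normalized in the same way.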