
Spanish stemming in Oracle Text


I'm trying to create an Oracle Text index to run full-text search queries on some Spanish text columns in the database. According to the Oracle docs, I need to create a LEXER and a WORDLIST preference to enable stem and fuzzy queries:

exec ctxsys.ctx_ddl.create_preference ('cust_lexer','BASIC_LEXER');
exec ctxsys.ctx_ddl.set_attribute ('cust_lexer','base_letter','YES');
exec ctxsys.ctx_ddl.set_attribute ('cust_lexer','index_stems','SPANISH');
exec ctxsys.ctx_ddl.create_preference('cust_wordlist','BASIC_WORDLIST');
exec ctxsys.ctx_ddl.set_attribute('cust_wordlist','stemmer','AUTO');
exec ctxsys.ctx_ddl.set_attribute('cust_wordlist','fuzzy_match','AUTO');

Then I create the index using those preferences:

CREATE INDEX NOMBREACCION_CTX ON ACCION(NOMBRE_ACCION) INDEXTYPE IS CTXSYS.CONTEXT parameters ('LEXER cust_lexer WORDLIST cust_wordlist');
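To confirm which lexer and wordlist attributes the index actually picked up, the Oracle Text data dictionary can be queried. This is only a sketch: it assumes the index is owned by the current user and that the `CTX_USER_INDEX_VALUES` view is accessible in the installation.

```sql
-- Sketch: list the lexer and wordlist attribute values stored for the index.
-- Assumes NOMBREACCION_CTX belongs to the current schema.
SELECT ixv_class, ixv_object, ixv_attribute, ixv_value
  FROM ctx_user_index_values
 WHERE ixv_index_name = 'NOMBREACCION_CTX'
   AND ixv_class IN ('LEXER', 'WORDLIST')
 ORDER BY ixv_class, ixv_attribute;
```

If the preferences were applied, the rows should include the INDEX_STEMS and STEMMER attributes set above.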

When I run a query using the stem operator ($), I get the following error:

ORA-20000: Oracle Text error:
DRG-00100: internal error, arguments : [50935],[drpn.c],[1113],[],[]
DRG-00100: internal error, arguments : [50935],[drpnw.c],[651],[],[]
DRG-00100: internal error, arguments : [51002],[drwa.c],[597],[],[]
DRG-00100: internal error, arguments : [51029],[drwas.c],[498],[ACCION],[]
DRG-51023: stemmer file cannot be opened
20000. 00000 -  "%s"
*Cause:    The stored procedure 'raise_application_error'
           was called which causes this error to be generated.  
*Action:   Correct the problem as described in the error message or contact
           the application administrator or DBA for more information.
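The failing query itself is not shown here. A minimal sketch of a stem query against the index above would look something like this, where the search term `cantar` is a hypothetical placeholder and not taken from the real data:

```sql
-- Sketch: stem query using the $ operator on the indexed column.
-- 'cantar' is a made-up search term used only for illustration.
SELECT nombre_accion
  FROM accion
 WHERE CONTAINS(nombre_accion, '$cantar') > 0;
```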

According to the Oracle docs, the stem feature should work for Spanish: http://docs.oracle.com/cd/B28359_01/text.111/b28304/amultlng.htm#CCREF2294

Also, this doesn't appear in the list of features missing from Oracle XE: http://docs.oracle.com/cd/E17781_01/doc.112/e21743/toc.htm#XERDM105

If I change 'SPANISH' to 'ENGLISH', it works fine. Has anyone managed to set up Spanish stemming in Oracle Text?


Solution

  • After some research, I found that Spanish stemming works fine in full Oracle installations. For stemming, Oracle Text requires a per-language dictionary, and the Spanish one is not available in Oracle XE installations, which is consistent with the "DRG-51023: stemmer file cannot be opened" error. Only the English and Japanese dictionaries are installed with Oracle XE.