Why Is Context Index Creation Slower With Russian INDEX_STEMS Than With English INDEX_STEMS? (Doc ID 1372409.1)

Last updated on FEBRUARY 08, 2017

Applies to:

Oracle Text - Version 11.2.0.1 to 11.2.0.2 [Release 11.2]
Information in this document applies to any platform.

Goal

Why is context index creation with RUSSIAN INDEX_STEMS 10 to 30 times slower than with ENGLISH INDEX_STEMS, as shown in the test case below?

SQL> select count(*) from my_test_table;

  COUNT(*)
----------
     56146

SQL> begin
 ctx_ddl.create_preference('RUSSIAN_LEXER', 'BASIC_LEXER');
 ctx_ddl.set_attribute('RUSSIAN_LEXER', 'INDEX_STEMS','RUSSIAN');

 ctx_ddl.create_preference('ENGLISH_LEXER', 'BASIC_LEXER');
 ctx_ddl.set_attribute('ENGLISH_LEXER', 'INDEX_STEMS','ENGLISH');
 end;
/

PL/SQL procedure successfully completed.

SQL> set timing on
SQL> select to_char(sysdate,'DD-MM-YYYY HH24:MI:SS') from dual;

TO_CHAR(SYSDATE,'DD
-------------------
21-10-2011 13:12:55

Elapsed: 00:00:00.01

SQL> CREATE INDEX ctx_my_text ON my_test_table(mytext) INDEXTYPE IS ctxsys.context PARAMETERS ('LEXER ENGLISH_LEXER');

Index created.

Elapsed: 00:00:07.64

SQL> select to_char(sysdate,'DD-MM-YYYY HH24:MI:SS') from dual;

TO_CHAR(SYSDATE,'DD
-------------------
21-10-2011 13:13:14

Elapsed: 00:00:00.00

SQL> drop index ctx_my_text;

Index dropped.

Elapsed: 00:00:01.26

SQL>
SQL> select to_char(sysdate,'DD-MM-YYYY HH24:MI:SS') from dual;

TO_CHAR(SYSDATE,'DD
-------------------
21-10-2011 13:13:49

Elapsed: 00:00:00.00

SQL> CREATE INDEX ctx_my_text ON my_test_table(mytext) INDEXTYPE IS ctxsys.context PARAMETERS ('LEXER RUSSIAN_LEXER');

Index created.

Elapsed: 00:05:46.20

SQL> select to_char(sysdate,'DD-MM-YYYY HH24:MI:SS') from dual;

TO_CHAR(SYSDATE,'DD
-------------------
21-10-2011 13:19:5

Solution
