My Oracle Support Banner

Single Numeric Characters Are Not Stored In The Oracle Text (dr$<indexName>$i) table with startjoins or endjoins when they are defined in the lexer (Doc ID 2665739.1)

Last updated on APRIL 17, 2023

Applies to:

Oracle Text - Version 12.1.0.2 and later
Information in this document applies to any platform.

Symptoms

Single numeric characters are not stored in the Oracle Text token table (dr$<indexName>$i) with startjoins or endjoins when they are defined in the lexer.  
This causes results similar to prepending/appending a wildcard ‘%’ character to the search term.

For example
create table idxTest
(
 id   number not null,
 term varchar2(100) not null,
 idxTerm varchar2(100)
);

insert into idxTest (id, term) values (1,'BUILDING ENGINEERING 1');
insert into idxTest (id, term) values (3,'BUILDING ENGINEERING 1A');
insert into idxTest (id, term) values (2,'BUILDING ENGINEERING 1 (08-119)');
insert into idxTest (id, term) values (4,'BUILDING ENGINEERING 1 TEST');
insert into idxTest (id, term) values (5,'BUILDING ENGINEERING 1@');
insert into idxTest (id, term) values (6,'BUILDING ENGINEERING 1TEST');
insert into idxTest (id, term) values (7,'BUILDING_ENGINEERING_1_TEST');
insert into idxTest (id, term) values (8,'BUILDING_ENGINEERING1TEST');
insert into idxTest (id, term) values (9,'BUILDING ENGINEERING 1TEST 1');
insert into idxTest (id, term) values (10,'BUILDING ENGINEERING TESTING');
insert into idxTest (id, term) values (11,'BUILDING ENGINEERING 1-');
insert into idxTest (id, term) values (12,'BUILDING ENGINEERING 10');
insert into idxTest (id, term) values (13,'(BUILDING ENGINEERING 1)');
update idxTest set idxTerm = '{'||term||'}';
commit;  

EXEC CTX_DDL.DROP_PREFERENCE ('TLEXER');
exec ctx_ddl.create_preference('TLEXER','BASIC_LEXER');
exec ctx_ddl.set_attribute('TLEXER','startjoins','{');
exec ctx_ddl.set_attribute('TLEXER','endjoins','}');
exec ctx_ddl.set_attribute('TLEXER','printjoins','-&()@#');
exec ctx_ddl.set_attribute('TLEXER','punctuations',',.?!');
exec ctx_ddl.set_attribute('TLEXER','index_themes','NO');
exec ctx_ddl.set_attribute('TLEXER','index_text','YES');
exec ctx_ddl.set_attribute ('TLEXER', 'index_stems', 'NONE');

create index idxTest_idx on idxTest(idxTerm) indextype is ctxsys.context parameters ('lexer TLEXER');


Searching this table for 'BUILDING ENGINEERING 1’  will return ID = 1 but also ID = 2 , 4,
and 7 where ID = 1 is the only row expected.

Executing the query:

SELECT SCORE(1), id, term, idxTerm from idxtest WHERE CONTAINS(idxTerm,
'(\{BUILDING ENGINEERING 1\})', 1) > 0 ORDER BY SCORE(1) DESC;
Returns:

 SCORE(1)         ID TERM                                                                             IDXTERM
---------- ---------- -------------------------------------------------------------------------------- --------------------------------------------------------------------------------
        4          1 BUILDING ENGINEERING 1                                                          {BUILDING ENGINEERING 1}
        4          7 BUILDING_ENGINEERING_1_TEST                                                 {BUILDING_ENGINEERING_1_TEST}
        4          4 BUILDING ENGINEERING 1 TEST                                                  {BUILDING ENGINEERING 1 TEST}
        4          2 BUILDING ENGINEERING 1 (08-119)                                             {BUILDING ENGINEERING 1 (08-119)}

Expected:

 SCORE(1)         ID TERM                                                                             IDXTERM
---------- ---------- -------------------------------------------------------------------------------- --------------------------------------------------------------------------------
        5          1 BUILDING ENGINEERING 1                                                           {BUILDING ENGINEERING 1}

Querying dr$idxTest_idx$i returns:

TOKEN_TEXT                                                                        TOKEN_TYPE TOKEN_FIRST  TOKEN_LAST
TOKEN_COUNT TOKEN_INFO
-------------------------------------------------------------------------------- ----------- ----------- ----------- ----------- ----------
(08-119)}                                                                                  0           3           3           1 <BLOB>
1                                                                                          0           1           9           5 <BLOB>
1)}                                                                                        0          13          13           1 <BLOB>
1-}                                                                                        0          11          11           1 <BLOB>
10                                                                                         0          12          12           1 <BLOB>
1@}                                                                                        0           5           5           1 <BLOB>
1A}                                                                                        0           2           2           1 <BLOB>
1TEST                                                                                      0           9           9           1 <BLOB>
1TEST}                                                                                     0           6           6           1 <BLOB>
ENGINEERING                                                                                0           1          13          12 <BLOB>
ENGINEERING1TEST}                                                                          0           8           8           1 <BLOB>
TESTING}                                                                                   0          10          10           1 <BLOB>
TEST}                                                                                      0           4           7           2 <BLOB>
{(BUILDING                                                                                 0          13          13           1 <BLOB>
{BUILDING                                                                                  0           1          12          12 <BLOB>

Expected in the list is “1}” which isn’t present.


Changes

 

Cause

To view full details, sign in with your My Oracle Support account.

Don't have a My Oracle Support account? Click to get started!


In this Document
Symptoms
Changes
Cause
Solution
References


My Oracle Support provides customers with access to over a million knowledge articles and a vibrant support community of peers and Oracle experts.