Unlike Basis (used for all languages in InQuira v8.4.x and earlier, and for most locales in OK v8.5.x), the new OLT used in OK v8.6.x no longer splits compound words into multiple tokens/stems
(Doc ID 2151974.1)
Last updated on JULY 19, 2016
Applies to: Oracle Knowledge - Version 8.6 and later
Information in this document applies to any platform.
- In older InQuira/OK releases, which used the third-party BASIS technology, compound words (very common in languages such as Dutch and German) were tokenized into multiple tokens/stems by the RegexTokenizer feature. As a result, users got a larger set of search results to review and prioritize.
- In the new OK v8.6.0 release, which uses the new OLT (Oracle Language Technologies), compound words are tokenized as a single token/stem rather than multiple tokens/stems, because the RegexTokenizer feature is disabled by default for performance reasons.
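The effect of the two behaviors on search recall can be illustrated with a minimal Python sketch. It is not the BASIS, OLT, or RegexTokenizer implementation; the component lexicon, the greedy splitting strategy, and all function names below are assumptions made purely for illustration.

```python
import re

# Hypothetical component lexicon (assumption); a real decompounder
# would rely on a full language dictionary.
COMPONENTS = {"kranken", "versicherung"}


def tokenize_whole(text):
    """Emit one lowercase token per word; compounds stay intact."""
    return [w.lower() for w in re.findall(r"\w+", text)]


def split_compound(word, lexicon=COMPONENTS):
    """Greedy longest-match split; returns [word] if no full split exists."""
    parts, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in lexicon:
                parts.append(word[i:j])
                i = j
                break
        else:
            # No lexicon entry starts at position i: keep the word whole.
            return [word]
    return parts


def tokenize_decompound(text):
    """Emit compound components as separate tokens."""
    tokens = []
    for w in tokenize_whole(text):
        tokens.extend(split_compound(w))
    return tokens


def matches(query, doc, tokenizer):
    """True if every query token also occurs among the document tokens."""
    return set(tokenizer(query)) <= set(tokenizer(doc))


doc = "Krankenversicherung"   # German compound: "health insurance"
query = "Versicherung"        # "insurance"

hit_when_split = matches(query, doc, tokenize_decompound)
hit_when_whole = matches(query, doc, tokenize_whole)
```

In this sketch the query matches the document only when the compound is split, which mirrors why the older behavior tended to return more results than single-token handling does.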