
Oracle Text and UCM - Stop Word Management (Doc ID 870122.1)

Last updated on MARCH 05, 2024

Applies to:

Oracle WebCenter Content - Version 10.0 to 12.2.1.3.0 [Release 10gR3 to 12c]
Information in this document applies to any platform.




Goal

Stop words are used in Oracle database indexes built with Oracle Text to keep common words that appear in nearly every full-text document from overpopulating the index.

Stop words are words that Oracle Text does not index when processing text extracted from UCM documents. The default stoplist installed with Oracle Text defines more than 100 words. These are common words and common parts of speech, such as articles, conjunctions, prepositions, linking verbs, and adverbs. They are not indexed as valid tokens because they are nearly always present in full text. If they were indexed, the index could contain a reference to nearly every row in the base table, which would bloat the index and ultimately slow queries.

For example, the word “a” is likely to appear in nearly every document checked into UCM. If this word were indexed, a search string that included “a” would return a long list of documents. In terms of a book index, the word “a” would need a reference to nearly every page and paragraph, and it would not help anyone find valuable content. Likewise, in Oracle Text, stoplists are designed to make the “tokens” that get indexed more useful and meaningful.
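The effect described above can be illustrated with a small sketch. This is not how Oracle Text is implemented; it is a toy inverted index in Python, with a hypothetical three-word stoplist and made-up sample documents, showing that a word like "a" ends up referencing every document unless it is filtered out.

```python
from collections import defaultdict

# Hypothetical stoplist for illustration only; the default Oracle Text
# stoplist contains far more words.
STOPLIST = {"a", "the", "of"}


def build_index(docs, stoplist=frozenset()):
    """Map each token to the set of document IDs containing it, skipping stop words."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            if token not in stoplist:
                index[token].add(doc_id)
    return dict(index)


# Made-up sample documents standing in for text extracted from UCM content.
docs = {
    1: "a guide to the index",
    2: "a report of a meeting",
    3: "the history of a product",
}

unfiltered = build_index(docs)           # no stoplist applied
filtered = build_index(docs, STOPLIST)   # stop words are never indexed

print(sorted(unfiltered["a"]))    # [1, 2, 3]: "a" points at every document
print("a" in filtered)            # False: the stop word has no index entry
print(sorted(filtered["index"]))  # [1]: meaningful tokens are still indexed
```

In the unfiltered index, the entry for "a" grows with every document added, while the filtered index keeps only tokens that help narrow a search.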

 

The examples in this note were run on a UCM environment using OracleTextSearch as the search indexer engine (SearchIndexerEngineName=OracleTextSearch). However, much of this also applies when using DATABASE.FULLTEXT, since this option also creates CONTEXT indexes in the database that are affected by the same stoplists and stop words.

Solution



In this Document
Goal
Solution
 Reading the default stoplist and stopwords from the database
 Add a stopword
 Remove a stopword
 Create a custom stoplist for use with UCM indexing
 Create the custom stoplist
 Using the custom stoplist as the database default for Oracle Text indexes
 Using the custom stoplist only for UCM index creation
 
