Not all Expected Content is Crawled due to SES SAXParseException (Doc ID 2054231.1)

Last updated on September 21, 2023

Applies to:

Oracle WebCenter Content - Version 11.1.1.8.0 and later
Information in this document applies to any platform.

Symptoms

Oracle Secure Enterprise Search (SES) crawls do not discover and index all of the expected content in the WebCenter Content (WCC) data feed.

For example, the WCC data feed snapshot added 498 items, but the crawl reports only 273 as discovered and processed.

The SES crawl log shows the following error:

[2015-08-13T13:59:11.791-04:00] [NOTIFICATION] [] [tid: crawler_2] [ecid: 0000KwcCYKk8Tsb6TJf9EO1LnD_z00000A,0] submitting doc...idcplg?IdcService=GET_FILE&dDocName=UCM_CLUSTER1000256&allowInterrupt=1&Rendition=primaryFile&RevisionSelectionMethod=latestReleased
[2015-08-13T13:59:11.791-04:00] [TRACE:16] [EQG-30309] [tid: crawler_2] [ecid: 0000KwcCYKk8Tsb6TJf9EO1LnD_z00000A,0] [SRC_CLASS: oracle.search.crawler.WebCrawler] [SRC_METHOD: processingPage]  Processing idcplg?IdcService=GET_FILE&dDocName=UCM_CLUSTER1000256&allowInterrupt=1&Rendition=primaryFile&RevisionSelectionMethod=latestReleased
[2015-08-13T13:59:11.792-04:00] [TRACE:16] [] [tid: crawler_2] [ecid: 0000KwcCYKk8Tsb6TJf9EO1LnD_z00000A,0] [SRC_CLASS: oracle.search.crawler.URLAccess] [SRC_METHOD: processUrlEntry] doc owner (guid) =null
[2015-08-13T13:59:11.841-04:00] [TRACE:16] [EQG-40500] [tid: crawler_2] [ecid: 0000KwcCYKk8Tsb6TJf9EO1LnD_z00000A,0] [SRC_CLASS: oracle.search.crawler.URLAccess] [SRC_METHOD: processDocBody]  Filtering document "idcplg?IdcService=GET_FILE&dDocName=UCM_CLUSTER1000256&allowInterrupt=1&Rendition=primaryFile&RevisionSelectionMethod=latestReleased"(URL ID = 198202)
[2015-08-13T13:59:12.149-04:00] [ERROR] [] [tid: Thread-19] [ecid: 0000KwcCXUV8Tsb6TJf9EO1LnD_z000007,0] EQP-60303: Exiting saxthread due to errors
[2015-08-13T13:59:12.150-04:00] [ERROR] [] [tid: Thread-19] [ecid: 0000KwcCXUV8Tsb6TJf9EO1LnD_z000007,0] [[
org.xml.sax.SAXParseException: <Line 17292, Column 17>: XML-20201: (Fatal Error) Expected name instead of <.
at oracle.xml.parser.v2.XMLError.flushErrorHandler(XMLError.java:422)
at oracle.xml.parser.v2.XMLError.flushErrors1(XMLError.java:287)
at oracle.xml.parser.v2.XMLReader.scanNameChars(XMLReader.java:1240)
at oracle.xml.parser.v2.XMLReader.scanQName(XMLReader.java:2069)
at oracle.xml.parser.v2.NonValidatingParser.parseAttr(NonValidatingParser.java:1733)
at oracle.xml.parser.v2.NonValidatingParser.parseAttributes(NonValidatingParser.java:1682)
at oracle.xml.parser.v2.NonValidatingParser.parseElement(NonValidatingParser.java:1523)
at oracle.xml.parser.v2.NonValidatingParser.parseRootElement(NonValidatingParser.java:409)
at oracle.xml.parser.v2.NonValidatingParser.parseDocument(NonValidatingParser.java:355)
at oracle.xml.parser.v2.XMLParser.parse(XMLParser.java:226)
at oracle.xml.jaxp.JXSAXParser.parse(JXSAXParser.java:292)
at oracle.search.plugin.rss.SAXThread.run(SAXThread.java:183)
at java.lang.Thread.run(Thread.java:682)
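
The stack trace above ends in an XML parser call made from oracle.search.plugin.rss.SAXThread, which suggests the crawler stopped while reading an XML document, most likely a data feed file, at the reported line and column. The following is a minimal sketch for locating such a position in a feed file, not Oracle's documented solution. It assumes you have a local copy of the feed XML that the crawler was reading. The class name FeedWellFormednessCheck and its method are hypothetical. It uses the JDK's built-in SAX parser rather than oracle.xml.parser.v2, so the error wording will differ from XML-20201, and the reported position may differ slightly.

```java
import java.io.File;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: checks whether an XML feed file is well-formed
// and reports where the first fatal parse error occurs.
public class FeedWellFormednessCheck {

    // Returns null when the document is well-formed; otherwise returns a
    // description containing the line and column of the first fatal error.
    static String firstFatalError(InputSource source) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        try {
            // DefaultHandler ignores all content events and rethrows fatal errors.
            factory.newSAXParser().parse(source, new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java FeedWellFormednessCheck <feed-file.xml>");
            System.exit(2);
        }
        // Assumption: args[0] is the path to a local copy of the feed file.
        InputSource source = new InputSource(new File(args[0]).toURI().toString());
        String error = firstFatalError(source);
        System.out.println(error == null ? "well-formed" : "not well-formed at " + error);
    }
}
```

Running it against each feed file in the snapshot would show which file, if any, contains markup the parser rejects, and roughly where to look in it.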

Cause




