Not all Expected Content is Crawled due to SES SAXParseException
(Doc ID 2054231.1)
Last updated on SEPTEMBER 21, 2023
Applies to:
Oracle WebCenter Content - Version 11.1.1.8.0 and later
Information in this document applies to any platform.
Symptoms
SES crawls do not discover and index all of the expected content in the WebCenter Content datafeed.
For example, the WCC datafeed snapshot added 498 items, but the crawl reports only 273 as discovered/processed.
The SES crawl log shows the following error:
[2015-08-13T13:59:11.791-04:00] [NOTIFICATION] [] [tid: crawler_2] [ecid: 0000KwcCYKk8Tsb6TJf9EO1LnD_z00000A,0] submitting doc...idcplg?IdcService=GET_FILE&dDocName=UCM_CLUSTER1000256&allowInterrupt=1&Rendition=primaryFile&RevisionSelectionMethod=latestReleased
[2015-08-13T13:59:11.791-04:00] [TRACE:16] [EQG-30309] [tid: crawler_2] [ecid: 0000KwcCYKk8Tsb6TJf9EO1LnD_z00000A,0] [SRC_CLASS: oracle.search.crawler.WebCrawler] [SRC_METHOD: processingPage] Processing idcplg?IdcService=GET_FILE&dDocName=UCM_CLUSTER1000256&allowInterrupt=1&Rendition=primaryFile&RevisionSelectionMethod=latestReleased
[2015-08-13T13:59:11.792-04:00] [TRACE:16] [] [tid: crawler_2] [ecid: 0000KwcCYKk8Tsb6TJf9EO1LnD_z00000A,0] [SRC_CLASS: oracle.search.crawler.URLAccess] [SRC_METHOD: processUrlEntry] doc owner (guid) =null
[2015-08-13T13:59:11.841-04:00] [TRACE:16] [EQG-40500] [tid: crawler_2] [ecid: 0000KwcCYKk8Tsb6TJf9EO1LnD_z00000A,0] [SRC_CLASS: oracle.search.crawler.URLAccess] [SRC_METHOD: processDocBody] Filtering document "idcplg?IdcService=GET_FILE&dDocName=UCM_CLUSTER1000256&allowInterrupt=1&Rendition=primaryFile&RevisionSelectionMethod=latestReleased"(URL ID = 198202)
[2015-08-13T13:59:12.149-04:00] [ERROR] [] [tid: Thread-19] [ecid: 0000KwcCXUV8Tsb6TJf9EO1LnD_z000007,0] EQP-60303: Exiting saxthread due to errors
[2015-08-13T13:59:12.150-04:00] [ERROR] [] [tid: Thread-19] [ecid: 0000KwcCXUV8Tsb6TJf9EO1LnD_z000007,0] [[
org.xml.sax.SAXParseException: <Line 17292, Column 17>: XML-20201: (Fatal Error) Expected name instead of <.
at oracle.xml.parser.v2.XMLError.flushErrorHandler(XMLError.java:422)
at oracle.xml.parser.v2.XMLError.flushErrors1(XMLError.java:287)
at oracle.xml.parser.v2.XMLReader.scanNameChars(XMLReader.java:1240)
at oracle.xml.parser.v2.XMLReader.scanQName(XMLReader.java:2069)
at oracle.xml.parser.v2.NonValidatingParser.parseAttr(NonValidatingParser.java:1733)
at oracle.xml.parser.v2.NonValidatingParser.parseAttributes(NonValidatingParser.java:1682)
at oracle.xml.parser.v2.NonValidatingParser.parseElement(NonValidatingParser.java:1523)
at oracle.xml.parser.v2.NonValidatingParser.parseRootElement(NonValidatingParser.java:409)
at oracle.xml.parser.v2.NonValidatingParser.parseDocument(NonValidatingParser.java:355)
at oracle.xml.parser.v2.XMLParser.parse(XMLParser.java:226)
at oracle.xml.jaxp.JXSAXParser.parse(JXSAXParser.java:292)
at oracle.search.plugin.rss.SAXThread.run(SAXThread.java:183)
at java.lang.Thread.run(Thread.java:682)
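The SAXParseException above reports a line and column inside the feed XML, so one way to investigate is to run a copy of the datafeed file through an independent XML parser and see where it stops. The following is a minimal sketch under stated assumptions, not an Oracle-supplied tool: it uses Python's standard-library parser rather than the Oracle XML parser that SES uses, the function name first_xml_error is hypothetical, and the feed file path is whatever copy of the datafeed you pass on the command line. Line and column numbering conventions can differ between parsers, so treat the reported position as approximate.

```python
import io
import sys
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


def first_xml_error(source):
    """Parse `source` (a file path or a file-like object) and return
    (line, column, message) for the first fatal well-formedness error,
    or None if the whole document parses cleanly."""
    parser = xml.sax.make_parser()
    parser.setContentHandler(ContentHandler())  # no-op handler: only check syntax
    # Do not fetch external entities while checking the feed.
    parser.setFeature(feature_external_ges, False)
    try:
        parser.parse(source)
    except xml.sax.SAXParseException as exc:
        return exc.getLineNumber(), exc.getColumnNumber(), exc.getMessage()
    return None


if __name__ == "__main__":
    # Usage (path is an assumption): python check_feed.py /path/to/datafeed.xml
    if len(sys.argv) != 2:
        sys.exit("usage: check_feed.py FEED_FILE")
    result = first_xml_error(sys.argv[1])
    if result is None:
        print("No well-formedness errors found")
    else:
        line, column, message = result
        print(f"First error at line {line}, column {column}: {message}")
```

If the sketch reports an error near the line given in the crawl log, inspecting that part of the feed should show the malformed attribute (a "<" where an attribute name was expected) and which content item it belongs to.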
Cause
In this Document
Symptoms
Cause
Solution
References