Crawler Ignoring Robots.txt File (Doc ID 1667408.1)

Last updated on July 01, 2016

Applies to:

Oracle Commerce Guided Search / Oracle Commerce Experience Manager - Version 3.0.2 and later
Information in this document applies to any platform.

Symptoms

With a setting similar to the following in a site's robots.txt, the CAS crawler ignores the Disallow rule and crawls the listed URL. Consequently, content related to the crawled URL is included in the index.
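To confirm independently that a given robots.txt really does disallow a URL the crawler fetched, the rule can be evaluated outside the crawler. The following is a minimal sketch using Python's standard urllib.robotparser module. The rules, the user agent name "example-crawler", and the example.com URLs are hypothetical placeholders for illustration, not the sample from this document, and the check says nothing about how CAS itself evaluates robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs: one under the disallowed path, one outside it.
private_ok = is_allowed(ROBOTS_TXT, "example-crawler",
                        "http://www.example.com/private/page.html")
public_ok = is_allowed(ROBOTS_TXT, "example-crawler",
                       "http://www.example.com/public/page.html")

print("private allowed:", private_ok)
print("public allowed:", public_ok)
```

If a URL reported as disallowed by a check like this still appears in the index, the rule itself is well-formed and the problem lies with how the crawler is honoring it.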

 

Cause
