Understanding Disk Space Usage With Incremental Crawls / Partial Updates

(Doc ID 2349807.1)

Last updated on JANUARY 26, 2018

Applies to:

Oracle Commerce Guided Search / Oracle Commerce Experience Manager - Version 11.2 and later
Information in this document applies to any platform.

Goal

We have a CAS-only ETL process. When we run baselines, the total storage used by all Endeca components on our ITL server is about 70 GB. Most of this space is taken up by these directories:

CAS/state/CAS-CRAWL-MyCompany (our data crawl name)
apps/MyApp/data (our Endeca app folder)

When we run partials for a while without running a baseline, we notice that disk utilization goes up dramatically. It climbs up to 100 GB.

We want to understand why this happens and whether there is anything we can do to mitigate it. We have searched the CAS documentation, the support KB, and the discussion forums for an explanation of this increase, but have not found one.
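
To see which of the two directories accounts for the growth, it can help to record their sizes right after a baseline and again after a series of partials. The following is a minimal sketch in Python, not an Oracle-provided tool. The root path /opt/endeca and the function names are assumptions for illustration; only the two relative directory names come from the description above.

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of regular files under root (symlinks are skipped).

    This adds up apparent file sizes, so the result may differ
    slightly from the allocated-block figure a tool like du reports.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # A file may be removed while an update is running.
                pass
    return total


def report_gb(paths):
    """Return a mapping of path -> size in GB (1 GB = 1024**3 bytes)."""
    return {p: dir_size_bytes(p) / 1024 ** 3 for p in paths}


if __name__ == "__main__":
    # Hypothetical install root; adjust to the actual server layout.
    endeca_root = "/opt/endeca"
    watched = [
        os.path.join(endeca_root, "CAS/state/CAS-CRAWL-MyCompany"),
        os.path.join(endeca_root, "apps/MyApp/data"),
    ]
    for path, size in report_gb(watched).items():
        print(f"{size:8.2f} GB  {path}")
```

Running this once after a baseline and once after several partials, then comparing the two outputs, shows whether the CAS state directory or the application data directory is responsible for most of the increase.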
 

Solution
