Load not evenly distributed among PX slaves in 11.2 (Doc ID 1224903.1)

Last updated on MARCH 27, 2014

Applies to:

Oracle Server - Enterprise Edition - Version: 11.2.0.1 and later   [Release: 11.2 and later ]
Information in this document applies to any platform.

Symptoms

Load distribution among slaves is fine in 10.2, gets a bit worse in 11.1 and is quite bad in 11.2, causing one slave gets most of the load while the other slaves get just few rows.

That causes performance issues, as one slave takes longer to finish (as per bigger load) and causes overall performance on the query is bigger. As we can check from v$pq_tqstat output.

* 10.2 load balance is quite good
DFO_NUMBER TQ_ID SERVER_TYP NUM_ROWS BYTES OPEN_TIME AVG_LATENCY
---------- ---------- ---------- ---------- ---------- ---------- -----------
WAITS TIMEOUTS PROCESS INSTANCE
---------- ---------- ---------- ----------
1 0 Producer 155 47720 0 0
154 1 P003 1

1 0 Producer 135 41568 0 0
134 2 P000 1

1 0 Producer 95 29264 0 0
94 1 P002 1

1 0 Producer 152 46808 0 0
151 1 P001 1

1 0 Consumer 537 165360 0 0
449 204 QC 1

* 11.1 is not as good, but still balanced somehow:
DFO_NUMBER TQ_ID SERVER_TYP NUM_ROWS BYTES OPEN_TIME AVG_LATENCY
---------- ---------- ---------- ---------- ---------- ---------- -----------
WAITS TIMEOUTS PROCESS INSTANCE
---------- ---------- ---------- ----------
1 0 Producer 145 44656 0 0
140 4 P002 1

1 0 Producer 100 30808 0 0
99 3 P000 1

1 0 Producer 225 69288 0 0
221 3 P003 1

1 0 Producer 67 20632 0 0
68 2 P001 1

1 0 Consumer 537 165384 0 0
598 170 QC 1


But on 11.2, the load is really skewed:
DFO_NUMBER TQ_ID SERVER_TYP NUM_ROWS BYTES OPEN_TIME AVG_LATENCY
---------- ---------- ---------- ---------- ---------- ---------- -----------
WAITS TIMEOUTS PROCESS INSTANCE
---------- ---------- ---------- ----------
1 0 Producer 11 3368 0 0
2 0 P002 1

1 0 Producer 51 15528 0 0
6 0 P000 1

1 0 Producer 464 141272 0 0
25 2 P001 1

1 0 Producer 11 3368 0 0
2 0 P003 1

1 0 Consumer 537 163536 0 0
159 89 QC 1


 

Changes

The primary change between 11.2 and 11.1 is that in 11.2 we do granule batching. That is when the slaves request granules, the QC provides them a set of granules instead of a single granule to work on. 
The parameter controlling this feature is _px_granule_batch_size with default value of 10 i.e. 10 granules are given to slaves on one request.

Cause

Sign In with your My Oracle Support account

Don't have a My Oracle Support account? Click to get started

My Oracle Support provides customers with access to over a
Million Knowledge Articles and hundreds of Community platforms