Master Note for Oracle XML Database (XDB) Performance
(Doc ID 1407946.1)
Last updated on JUNE 03, 2019
Applies to:
Oracle Database Cloud Schema Service - Version N/A and laterOracle Database Exadata Cloud Machine - Version N/A and later
Oracle Database Cloud Exadata Service - Version N/A and later
Oracle Database Cloud Service - Version N/A and later
Oracle Database - Enterprise Edition - Version 9.2.0.3 and later
Information in this document applies to any platform.
***** Checked for Relevance on 23-Jan-2016 *****
Details
You can create indexes on your XML data, to focus on particular parts of it that you query often and thus improve performance. This master note provides information for DBA's on tuning queries with XML Database (XDB).
Actions
BEST PRACTICES
This note does examples with both XQuery and XMLType functions ExistsNode, Extract and ExtractValue It is recommended from 10.2.0.1 onwards to always use XQuery for the following reasons:
- XQuery can be used from any data source, wheras SQL/XML Functions cannot be used to query external data sources.
- In addition, to querying XML data, XQuery can construct/generate XML data.
- XQuery supports a native XQuery engine that can compile expressions into the same internal algebra used for SQL statements. Therefore, this improves the performance of XQuery expressions contained in XMLQuery,XMLTable, XMLExists and XMLCast operators.
- It should also be noted that ExistsNode, Extract and ExtractValue from 11.2.0.1, can and should be replaced with newly introduced XQuery standard functions. Although still supported there will be no enhancements in the future for the old and replaced functions. (<Document 1072039.1>) <Replacement of Oracle Proprietary XML Functions with XQuery Standard Functions> discusses the recommendation to use XQuery instead of ExistsNode, Extract and ExtractValue with detailed examples.
- When coding with 'Order by' when using XMLAgg (), its recommended to write the Order by into the regular relational query instead of putting 'Order by with XMLAgg()'. Therefore, the query would need have all 'Order by' commands placed into outer query block instead of using 'XMLAgg Order by' directly. This will lead to improved performance. For example, as shown in the query below:
-
- Query with 'Order by' using XMLAgg():
XMLElement("Element_Name",
( SELECT XMLAgg(XMLContent
ORDER BY XML_Column
)--XMLAgg
FROM XML_Table_Name WHERE Column_B = 'Y'
-
- Query with 'Order by' pushed to outside/relational query:
XMLElement("Element_Name"
( SELECT XMLAgg(XMLContent
-- ORDER BY XML_Column <== remove the clause from XMLAgg()
)--XMLAgg
FROM XML_Table_Name WHERE Column_B = 'Y''
ORDER BY XML_Column -- <== push the order into the regular query
-
- If a query is written with 'XMLAgg() order by' then it will require lots of disk space to hold temporary workspace which could ultimately lead to the following issues:
-
- ORA-4030 and out of memory errors
- For deeply nested XMLAgg() with 'Order by' its reported to have caused memory corruption issues like ORA-7445 ([kghtshrt], [kghprmalo], [kghalf],[kgh_subtract_extent_chunks]) and ORA-600 ([17114], [kghGetHpSz1]) errors.
-
- If a query is written with 'XMLAgg() order by' then it will require lots of disk space to hold temporary workspace which could ultimately lead to the following issues:
TYPES OF STORAGE:
It is recommended to store XML content in the XMLType datatype. Otherwise, if storing an XML document in a CLOB or VARCHAR2 column then the database has no idea that its managing XML content. The descriptions of storage listed below, explain the various storage options in the XMLType datatype, which is XML aware so this means that it can process XML data efficiently. There are benefits and disadvantages for each storage type, so the decision which is the most effective for your company will depend on your needs and data content/structure. Typically, if you are using Oracle 11g or higher, binary XML storage is the closest thing to a universal storage model for XML data, that can be used effectively for a very wide range of use cases, from document-centric to data-centric.
The following link provides detailed description of each storage type, the advantages and disadvantages for each storage option and also most importantly, guides you how to choose the best storage option, given your use case:
http://www.oracle.com/technetwork/database/features/xmldb/xmlchoosestorage-v1-132078.pdf
CLOB Based Storage
- When XML is stored as a CLOB (also known as Unstructured storage and text-based persistence), there are limited performance benefits. Prior to 11.2.0.2, the default storage type for an XMLType table or XMLType column is CLOB. As we know, CLOB is always non-schema based. From 11.2.0.2, onwards, binary XML became the default. Whenever CLOB based storage is used with xml content, performance is not optimized because whenever we do any XML processing with a CLOB, we have to parse the entire XML into the in-memory DOM and then extract the needed information, which is very CPU and memory intensive. This is known as the functional evaluation model which involves building a tree like representation of the XML called DOM and then traversing through the tree to evaluate the XPath. Oracle XML DB must parse the entire XML document and load it into an in-memory DOM structure before any validation, XSL Transformation, or XPath operations can be performed on it.When working with CLOB, there are a lot of type and encoding conversions as its not XML content aware. The larger the document the more CPU/Memory usage has to take place for an XMLType stored as a CLOB.
- In additional, CLOB based storage keeps all whitespace which results in more xml data to be parsed which can be significant with a large amount of XML content been used, whereas the other storage options remove unneeded whitespace so this is not an issue. Only when we need to process a very small amount of data out of an XML document and infrequently, then CLOB based storage can be used.
- When documents are stored using unstructured storage (in a CLOB), functional evaluation is necessary any time the Extract(), ExtractValue(), ExistsNode() and UpdateXML() operators are used unless an XMLIndex, Oracle Text index or function-based index can be used to resolve the query. CLOB based storage can attempt to use XMLIndex, Oracle Text or function based indexes on certain XPath's to improve performance.
Object Relational Storage
- Prior to 11g, Object Relational Storage (O/R) (also known as Structured storage or Object-based persistence), is recommended for storing XML content rather than CLOB storage. As we know, O/R is always schema based. XML documents are broken down into object relational tables, types and views. With this method, xml content is stored as a Schema Based XMLType and internally performance of most types of queries are optimized.Since O/R involves reformatting XML content into object relational rows and columns which allows the optimizer a chance to rewrite SQL commands (known as query rewrite which is enabled by default from 10g onwards) therefore resulting in better performance on returning XML fragments. In other words, since XDB translates XPath expressions into convention SQL statements, the optimizer can process the re-written SQL statement like any other SQL statement. There are significant performance improvements as there is little overhead with query-rewrites and therefore its not uncommon for the timing to be similar to regular relational operations. Providing that an XPath expression can be referenced to an underlying SQL object, then query rewrite can take place and would improve performance. If there is an index on the column and the optimizer selects query re-write then the XPath will be evaluated against the XML document without having to ever construct the XML document in memory.
- Certain XPath expressions are not rewritten. For example, storage of complex types using CLOBs, recursive type definitions, union operations and XPath variable references are not rewritten. To create a trace file to determine why query rewrite didn't take place, set the following trace and review from the user_dump_dest:
SQL> alter session set events '19027 trace name context
forever, level 8192'
- Even when an structured data cannot be re-written or an index cannot be used, it doesn't load the entire DOM into memory like CLOB for any XPath operation. There is an internal in-memory structure called XML Object (XOB), which doesn't load duplicate tags and only loads nodes that are needed to be processed resulting in lower overhead and better performance.
- When structured storage is selected, collections (elements which have maxOccurs 1, allowing them to appear multiple times) are mapped into SQL VARRAY values. This would result in poor performance as the entire contents on the VARRAY is serialized uting a single LOB column. It is much more efficient to stored the members of the collection as a nested table and this would be done by adding an explicit VARRAY STORE AS clause for each collection in the create table command. If you wanted to force all collections to be stored as nested tables (Ordered Collections in Tables known as OCT), then you can specify in the root element of the XML Schema for storeVarrayAsTable to be set to true (was set by default to false prior to 11g). Then its possible to index each nested table for faster retrieval. To determine the name of the nested tables, you can use the query below and then create a b*tree index on these nested tables:
SQL> SELECT TABLE_NAME, PARENT_TABLE_COLUMN FROM
USER_NESTED_TABLES
WHERE PARENT_TABLE_NAME = '<YOUR_BASE_TABLE_NAME>';
- The optimizer is XML schema aware and the re-writing of XMLType operations (known as query rewrite) into object-relational SQL statements results in significant performance improvements compared with performing the same operations against XML documents stored using unstructured storage.
- When documents are stored using structured storage (Object-Relational), it is possible to use Oracle Text indexes, function-based indexes or conventional B*tree/bitmap indexes on any underlying SQL types.
Binary XML Storage
- Binary XML (also known as post-parse persistence) is stored in a binary format so doesn't need to be parsed which removes an internal layer that results in better performance on querying over the other storage methods. Also when loading XML, binary XML is also efficient when Securefile LOB's are used even though it does full validation of XML. It can be schema-based on non-schema based showing its flexibility.
- When documents are stored using binary XML, it is possible to use XMLIndex, Oracle Text indexes and function-based indexes. The optimizer also can use single-pass streaming also when deciding on the execution plan which does a single scan of the binary XML data and evaluates its cost versus other methods. Whenever using Binary XML it it typically recommended to have both Securefile setup and XMLIndex created.
TYPES OF INDEXES:
Function Based Indexes
- Can be used on all different XDB storage types. On structured data where XPath re-write cannot process the XPath expression supplied as part of the create index statement, then the statement will result in a function based index being created. If it can wherever possible, it will rewrite the function based index into a b*tree index directly on the columns that manage the attributes of the underlying SQL objects.
- You can speed up queries by building function-based indexes on SQL functions XMLQuery, XMLExists, XMLCast, Extract, ExtractValue and ExistsNode.
- When using function-based indexes it must not target a collection, the result must only occur once in the XML document. In other words, the XPath expression must only target scalar data or singleton nodes. For example, if an FBI on an extract() that returns multiple nodes, it will not be very useful if the nodes are concatenated rather than individual nodes (which in the case of a collection would be on nested tables whereby each node is stored as a separate row).
- One thing to bear in mind when creating and using function-based indexes is that the optimizer will only consider using the index when the function included in the WHERE clause is identical to the function used to create the index. In other words, a function based index on ExtractValue(), will not use the index for an ExtractValue() used in the where clause.
- For example speeding up ExtractValue():
SQL> CREATE TABLE po_xml_tab (
2 poid number(10),
3 poDoc XMLType NOT NULL);
Table created.
SQL> INSERT INTO po_xml_tab
2 VALUES(100, XMLType('<?xml version="1.0"?>
3 <PO pono="1">
4 <PNAME>Po_1</PNAME>
5 <CUSTNAME>John</CUSTNAME>
6 <SHIPADDR>
7 <STREET>1033, Main Street</STREET>
8 <CITY>Sunnyvalue</CITY>
9 <STATE>CA</STATE>
10 </SHIPADDR>
11 </PO>'));
1 row created.
SQL> commit;
Commit complete.
SQL> DESC po_xml_tab
Name Null? Type
----------------------------------------- -------- ----------------------------
POID NUMBER(10)
PODOC NOT NULL PUBLIC.XMLTYPE STORAGE BINARY
SQL> set timing on
SQL> set autotrace traceonly explain
SQL> SELECT * from po_xml_tab
2 where ExtractValue (podoc, '//CITY/text()') = 'Sunnyvalue'
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2015 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| PO_XML_TAB | 1 | 2015 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------
SQL> CREATE INDEX city_FBI_index on po_xml_tab
2 (ExtractValue(podoc, '//CITY/text()'))
Index created.
SQL> SELECT * from po_xml_tab
2 where ExtractValue(podoc, '//CITY/text()') = 'Sunnyvalue'
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2015 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| PO_XML_TAB | 1 | 2015 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | CITY_FBI_INDEX | 1 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------
For example speeding up existsnode():
SQL> SELECT * FROM po_xml_tab e
2 WHERE e.poDoc.ExistsNode('//SHIPADDR') = 1;
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2015 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| PO_XML_TAB | 1 | 2015 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------
SQL> CREATE INDEX po_index ON po_xml_tab
2 (poDoc.existsNode('//SHIPADDR'));
Index created.
SQL> SELECT * FROM po_xml_tab e
2 WHERE e.poDoc.ExistsNode('//SHIPADDR') = 1;
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2015 | 2 (0)|
00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| PO_XML_TAB | 1 | 2015 | 2 (0)|
00:00:01 |
|* 2 | INDEX RANGE SCAN | PO_INDEX | 1 | | 1 (0)|
00:00:01 |
--------------------------------------------------------------------------------
For example speeding up XMLEXISTS():
SQL> SELECT XMLQuery('//CITY' passing podoc returning content)
2 from po_xml_tab
3 where XMLExists('/PO/SHIPADDR[CITY="Sunnyvalue"]' passing podoc);
XMLQUERY('//CITY'PASSINGPODOCRETURNINGCONTENT)
-----------------------------------------------------------------------------
<CITY>Sunnyvalue</CITY>
SQL> CREATE INDEX my_po_city_index ON purchaseorder x (x.purchase_col_xml.XMLDATA."SHIPADDR"."CITY");
- Can either create a b*tree index on City text node for function based index on XMLExists with structured data, or XMLIndex on binary or BLOB based storage to improve index.
B*Tree/Bitmap Indexes
- Can be used on Object Relational and Unstructured XML data. They are used most effectively when they are created with structured XML on the underlying objects directly and they can be created on any node in the document.
- If Purchaseorder is XMLType table:
SQL> CREATE INDEX my_po_index ON purchaseorder x
(x.XMLDATA."Reference");
- If Purchaseorder is XMLType column named purchase_col_xml can be written as:
SQL> CREATE INDEX my_po_index
ON purchaseorder x
(x.purchase_col_xml.XMLDATA."Reference");
Oracle Text Indexes
- Can be used on all the different XDB storage types.
- Oracle Text indexes can only used to provide fast access to XML data text nodes. In other words, they index full-text strings only. It can be made more powerful by restricting paths of an XML document by focusing on structural XML Searching rather than searching on all text nodes.
- For example:
SQL> CREATE INDEX po_text_index ON
2 po_xml_tab(poDoc) indextype is ctxsys.context;
Index created.
SQL> SELECT * FROM po_xml_tab w
2 WHERE CONTAINS(w.poDoc, John) > 0;
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2027 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| PO_XML_TAB | 1 | 2027 | 4 (0)| 00:00:01 |
|* 2 | DOMAIN INDEX | PO_TEXT_INDEX | | | 4 (0)| 00:00:01 |
--------------------------------------------------------------------------------
XMLIndex
- XMLIndex provides a general, XML-specific index that indexes the internal structure of XML data. You cannot use XMLIndex on structured XML storage. It can be used on binary XML and unstructured XML storage. It indexes the XML tags of the document and identifies document fragments basd on XPath expressions that target them. It can look up individual node values and also records the hierachy relationships for each node. When creating a default XMLIndex it will index all possible XPath expressions for all your XML data. There is also an option to only index certain paths that will be queried by using XMLIndex path-subsetting.
- Unlike other indexes, it can be used in any part of the query and is not limited to use in a where clause. It is not necessary ot understand the document structure of the XML data in order to use XMLIndex.
- XMLIndex is a domain index and therefore it also has additional components such as path, order and value indexes. It also has secondary indexes which are used to maintain the index and they can be manipulated in many cases to meet your needs.
- To know whether a particular XMLIndex index has been used in resolving a query, you can examine an explain plan of the query. The explain plan will not directly indicate that a domain index was used; it will not refer to the XMLIndex index by name. For example, below, you can see that the XMLIndex is used, as you can see references to the path table (SYS122252_PO_XML_TA_PATH_TABLE), value index (SYS122252_PO_XML_TA_VALUE_IX ), order key, or path id will be referenced in the explain plan.
SQL> CREATE INDEX po_xml_tab_xmlindex on po_xml_tab(podoc)
2 indextype is xdb.xmlindex;
Index created.
SQL> SELECT XMLQuery('/PO/SHIPADDR/STREET' passing
2 podoc returning content)
3 FROM po_xml_tab
4 WHERE xmlexists('/PO/SHIPADDR[STATE="CA"]' passing podoc)
5 /
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3835 | 4 (25)| 00:00:01 |
| 1 | SORT GROUP BY | | 1 | 3524 | | |
|* 2 | TABLE ACCESS BY INDEX ROWID | SYS122252_PO_XML_TA_PATH_TABLE | 1 | 3524 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | SYS122252_PO_XML_TA_PIKEY_IX | 1 | | 1 (0)| 00:00:01 |
| 4 | NESTED LOOPS | | 1 | 3835 | 4 (25)| 00:00:01 |
| 5 | SORT UNIQUE | | 1 | 3823 | 2 (0)| 00:00:01 |
|* 6 | TABLE ACCESS BY INDEX ROWID| SYS122252_PO_XML_TA_PATH_TABLE | 1 | 3823 | 2 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN | SYS122252_PO_XML_TA_VALUE_IX | 1 | | 1 (0)| 00:00:01 |
| 8 | TABLE ACCESS BY USER ROWID | PO_XML_TAB | 1 |
12 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(SYS_XMLI_LOC_ISNODE("SYS_P0"."LOCATOR")=1)
3 - access("SYS_P0"."RID"=:B1 AND "SYS_P0"."PATHID"=HEXTORAW('2767') )
6 - filter("SYS_P2"."PATHID"=HEXTORAW('3F73') AND "SYS_P2"."VALUE"='CA' AND
SYS_XMLI_LOC_ISNODE("SYS_P2"."LOCATOR")=1)
7 - access(SUBSTRB("VALUE",1,1599)='CA')
- If you see in the predicate information, the filter based on function SYS_CHECKACL function ensures that only documents the user has permission to access are included in the result set. In this case the minimum permission required is read-contents and this is an optimal plan for this query given that ACL based security is being enforced.
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("PURCHASEORDER"."SYS_NC00022$"='SBELL' AND
SYS_CHECKACL("ACLOID","OWNERID",xmltype('<privilege
xmlns="http://xmlns.oracle.com/xdb/acl.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://xmlns.oracle.com/xdb/acl.xsd
http://xmlns.oracle.com/xdb/acl.xsd DAV:http://xmlns.oracle.comb/dav.xsd"
- If you wish to disable ACL based security, this will improve the execution plan, however, be aware that this should only be done when the default tables generated by schema registration will not be used in conjunction with the Oracle XML DB repository. It is typically recommend to keep ACL based security enabled.
SQL> CALL dbms_xdbz.disable_hierarchy('<PUT_YOUR_SCHEMA_NAME_HERE>',
'PURCHASEORDER') /
UPDATEXML()
- When updating an XMLType that is based on the CLOB storage model, UpdateXML operations are implemented using the interfaces defined by the W3C DOM API. When the update is complete the contents of DOM are serialized as text and written back to the underlying CLOB.
- Object-relational storage allows certain UpdateXML operations to be re-written as SQL that operates directly on the underlying tables.
- The Binary XML storage model uses XML Index and the sliding insert feature Secure Files to optimize updateXML operations. Sliding inserts enable partial updates of the XML content.
OTHER GOTCHAS TO IMPROVE XML PERFORMANCE
- DML on CLOB Based Storage is not optimal since the entire documents needs to be parsed in memory whereas O/R allows for piece wise updates and therfore performance is excellent.
- Cost-based optimization determines the index or indexes to use, so that performance is maximized. Even if a index is defined, it doesn't mean it will be used unless the optimizer determines the index path to be more cost-effective. Therefore, always ensure the statistics of the table and index are kept up to-date so the statistics are kept current:
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(
tabname => '<PUT_THE_TABLE_NAME>'
ownname => '<PUT_OWNER_OF_TABLE>',
cascade => 'TRUE');
OR at schema level:
SQL> EXEC dbms_stats.gather_schema_stats('<PUT_SCHEMA_HERE>');
- Don't use Pretty Printing unless this formatting option is needed/required.
- Typically in most cases, binary XML was designed with performance in mind.
- The XDB annotation xdb:MaintainDOM=false reduces overhead of parsing XML in memory using tree structure. However, when this parameter is set to false, it can result in DOM fidelity been lost.
- XMLExists will help narrow down data returned.
- Use Indexes whenever necessary to avoid full table scans (FTS).
- Functional evaluation is more expensive as it involves the entire XML content to be written into memory and processed whereas query rewrite is more efficient as it works directly against the object-relation structures associated with the XML.
- In Oracle 11gR2 11.2.0.2 release or later, we recommend that you use the SQL> SET XMLOPTIMIZATIONCHECK ON command to determine if parts of your query were not optimized. When it is ON, it will ensure that only XML queries or XML operations that were fully optimized will be executed. A suboptimal XML query or DML operation will be aborted with the following error message: "ORA 19022 - Unoptimized XML construct detected". In addition, the reason for the query or DML being suboptimal will be printed to the trace file in the user_dump_dest.
- In order to construct XML from relational data, there are several methods available. The package DBMS_XMLGEN will
tend to give poorer performance as it performs all operations in memory. It would be more preferable to use: - Using SQL/XML functions such as XMLElement, XMLForest, and XMLAgg.
- XMLQuery using the return clause to indicate the desired XML output.
- It is recommended to XMLSerialize since its optimized by the XDB and thus performs better rather than using getclobval(). In addition from 11.2.x onwards xmltype.getclobval function is deprecated anyway. For example:
Instead of using
l_clob:=resXml.getclobval();
Consider using:
SQL> select xmlserialize(content resXml as clob indent) into l_clob from dual;
ANALYZING THE EXECUTION PLAN
- When attempting to tune a XML query, use common query tuning procedures which includes getting an execution plan. Firstly look for full table scans in the operation section and if there are a large number of documents in the table/view, then either create an index or attempt to get the optimizer to use the index. Secondly, evaluate the "Predicate Information" section of the execution plan. Attempt to ensure that for each predicate it uses "access" paths (indicating its using an index). If you see the "filter" option is used this indicates that no query rewrite is happening. Instead functional evaluation is chosen meaning the DOM tree is used which is very expensive and usually results in poor performance in most cases.
For example, the example below means that no query rewrite is taking place:
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(CAST("PURCHASEORDER"."SYS_NC00021$" AS VARCHAR2(128))='Sarah
J. Bell' AND SYS_CHECKACL("ACLOID","OWNERID",xmltype('<privilege
xmlns="http://xmlns.oracle.com/xdb/acl.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://xmlns.oracle.com/xdb/acl.xsd
http://xmlns.oracle.com/xdb/acl.xsd DAV:http://xmlns.oracle.com/xdb/dav.xsd
"><read-properties/><read-contents/></privilege>'))=1)
- Therefore creating an index on the column function would allow direct access to the information rather than having to use functional evaluation. There are four different ways to create a b*tree index on the predicate-targeted column, which all effectively do the same thing:
SQL> CREATE INDEX requestor_index ON purchaseorder ("SYS_NC00021$");
/** Internal relation column name for requestor **/
OR
SQL> CREATE INDEX requestor_index ON purchaseorder
("XMLDATA"."REQUESTOR");
/** SQLName **/
OR
SQL> CREATE INDEX requestor_index ON purchaseorder
(cast("XMLDATA"."REQUESTOR" AS VARCHAR2(128)));
/** Function Based Index on cast () that matches the query **/
OR
SQL> CREATE INDEX requestor_index ON purchaseorder po
(XMLCast(XMLQuery('$p/PurchaseOrder/Requestor'
PASSING po.OBJECT_VALUE AS "p"
RETURNING CONTENT) AS VARCHAR2(128)));
- Another possible scenario is when its necessary to index heap organized/nested tables. It is not possible to index nodes thar are members of a collection that is not managed by a nested table. In order to make sure nested tables are used with O/R storage, is to ensure that storeVarrayAsTable is set to true in XSD prior to 11.2.0.2. From 11.2.0.2 onwards, the default changed to true from false.
SQL> SELECT XMLCast( XMLQuery ('$p/PurchaseOrder/Reference/text()'
passing object_value as "p" returning content)
as VARCHAR2(32)) "Reference"
from PURCHASEORDER
where XMLExists
('$p/PurchaseOrder/LineItems/LineItem/Part[@Id="717951002372"]'
passing object_value as "p")
/
Predicate Information (identified by operation id):
---------------------------------------------------
|* 4 | TABLE ACCESS FULL | LINEITEM_TABLE | 1 | 50 | 13(0)| 00:00:01 |
In order to remove the FTS, create an index on the nested table:
SQL> CREATE INDEX lineitem_part_index
2 ON lineitem_table l(l.part.part_number, l.NESTED_TABLE_ID);
OR
SQL> CREATE INDEX iPartNumber on LINEITEM_TABLE
(PART.PART_NUMBER, NESTED_TABLE_ID) compute statistics;
Then rerunning the query:
SQL> SELECT XMLCast( XMLQuery ('$p/PurchaseOrder/Reference/text()'
passing object_value as "p" returning content)
as VARCHAR2(32)) "Reference"
from PURCHASEORDER
where XMLExists
('$p/PurchaseOrder/LineItems/LineItem/Part[@Id="717951002372"]'
passing object_value as "p")
/
Predicate Information (identified by operation id):
---------------------------------------------------
4 | TABLE ACCESS BY INDEX ROWID| LINEITEM_TABLE | 1 | 50 |1 (0)| 00:00:01 |
- In general to ensure optimal path access and best performance, analyze query plans carefully, create indexes to optimize predicate evaluation, monitor index usage and drop indexes that are not contributing to query performance.
COMMON PERFORMANCE ISSUES AND POSSIBLE SOLUTIONS:
Collection iterator pickler fetch
Collection Iterator Pickler Fetch means that the XML document or fragment is handled in memory. This should be avoided because it is serialized and dealt via a Pickler Fetch routine, which cannot be optimized by Oracle. Oracle doesn't have enough information to re-write the query in a more optimal form. If a high amount of data is being processed, this will result in expensive CPU and memory usage, and may exhaust PGA resources.
XDB has two ways of evaluating an XPath or XQuery: the SQL execution path and the functional execution path. COLLECTION ITERATOR PICKLER FETCH XMLSEQUENCEFROMXMLTYPE in the explain plan for a query implies that the query is being resolved using the functional execution path rather than the SQL execution path. Therefore, the XQuery or XPath expression could not be re-written into an equivalent SQL operation.
The functional evaluation execution path has the advantage that it can handle a wider range of Query and XPath expressions than the SQL execution path. However, it is far less efficient than the SQL execution path. Functional execution can be acceptable if only a small number of documents are being processed. But for operations on large documents or a large number of documents, functional evaluation is unlikely to yield satisfactory results from a performance perspective. There are a few things that can cause re-write to fail and force a functional evaluation (query is resolved by building a DOM):
The use of wildcards in an XPath expression. Avoid using wildcards.
Using the double forward slash (//element3) notation when searching for an element with XPath. Write the full XPath expression (/element1/element2/element3).
There are repeating elements or nodes that occur more than once (collections) which are not stored in nested tables or are stored inefficiently in lobs:
- If using CLOB or Binary XML (11g onwards), creating an XMLType index could help improve performance.
- If using Object relational storage, use nested tables and index them to speed up performance.
XDB Demo Scripts
You can go through a variety of Live XDB demos from the browser that provide a simple introduction to storing, indexing, updating and querying XML document with Oracle Database:
Contacts
To view full details, sign in with your My Oracle Support account. |
|
Don't have a My Oracle Support account? Click to get started! |
In this Document
Details |
Actions |
BEST PRACTICES |
TYPES OF STORAGE: |
CLOB Based Storage |
Object Relational Storage |
Binary XML Storage |
TYPES OF INDEXES: |
Function Based Indexes |
B*Tree/Bitmap Indexes |
Oracle Text Indexes |
XMLIndex |
UPDATEXML() |
OTHER GOTCHAS TO IMPROVE XML PERFORMANCE |
ANALYZING THE EXECUTION PLAN |
COMMON PERFORMANCE ISSUES AND POSSIBLE SOLUTIONS: |
Collection iterator pickler fetch |
XDB Demo Scripts |
Contacts |
References |