With the growing dimensions of data requirements, the nature of Information or ‘Data’ has taken multiple folds. The ‘Organizational data’ today cannot be expected in conventional and structural form, but it mostly exists in unstructured format. The database platform also must grow parallel to facilitate the able storage and sustain the growing developments.
Besides the database storage option, data libraries and file servers serve the unstructured data handling methods. Database management philosophies have worked well to store unstructured form of data like images, documents, or media content. Storing unstructured data in database ensures robust administration, scalable architecture, and enhanced security of data.
XML (W3C XML standard 1.0) is the acronym of eXtensible Markup Language. It is one of the fastest growing document platform which fits into variety of scenarios.
Topics
Oracle extended its strategic exploration to integrate the XML support across its infrastructure and help in building up XML-powered enterprise applications. Let us take an overview of XML DB and its capabilities.
Oracle XML DB: Overview
Oracle 9i extended its support to XML storage to meet the requirements of growing reliability over XML based documentation in various industrious sectors. Oracle XML DB is based on a hybrid model which integrates strengths of SQL and capabilities of XML. Oracle XML DB offers high performance database repository of XML documents. The features of XML DB is not limited to storage and retrieval but also it extends full support to XML parsing, access, XQuery, XSLT and XML node element based searches. The list below summarizes the capabilities of Oracle 11g XMML DB.
• In-place evolution of XML Schemas
• Oracle Partitioning of XML Schema optimized storage
• Intelligent defaults for XML Schema-optimized for an optimal storage model
• XQuery operations on Schema-Optimized storage improvements
• Support for replication of text-based XMLType storage via Oracle Streams
XML storage in database can be data centric or document centric. Data centric storage corresponds to the storage of XML data as a character type data, while document centric storage refers to the storage of XML document in the database. Oracle offers different storage models for XMLType data. In this article, we shall explore and understand the available models of XML storage in Oracle database along with the latest enhancements.
XMLType Storage Models
Selection of an appropriate storage model for XML is a crucial activity of Oracle XML DB set up. The storage architecture of the XML DB sets the tone of flexibility, scalability, and data access performance.
During its Oracle XML treaty in Oracle 9i, Oracle inducted a new native server datatype known as XMLType to store XML data in the database. The datatype works on Object Relational paradigm by storing the XML content as an LOB. A column in a database table can be declared of XMLType to store XML content in it. The datatype can also be employed for use in PL/SQL blocks, as argument datatype across subprograms and other places where a native server datatype can be. In addition, there are certain set of built in member utility functions to manipulate the XML content. The best feature of the datatype XMLType is the integration of strengths of SQL and XML. SQL functions can be operated upon the XML data and vice versa.
We shall understand the available storage models for XMLs as below.
Unstructured – The ‘Unstructured’ way of storing XML data implies to the Relational approach. It is best suited for the scenarios where XML data is document centric. As we discussed in the last section, document centric XML data is stored or fetched as a complete document i.e. as a large object. In addition, data manipulation in the document centric storage cannot be done, thus preserving the formatting and ensuring the security of the XML content. The document is stored as a CLOB in the database, thus the model is also known by the name ‘CLOB storage’. Unstructured storage model is supported by stream based replication.
This is the default storage model of XMLType. The disadvantage associated with the structured model is the performance of XML selection activities. Nevertheless, insertion of XML document would be faster, and even the retrieval of complete document. But, to search a small fragment of data in whole XML document requires explicit verification at each node and element. Index may exist using XPath but still, the data selection performance could reveal hazardous results.
Example Code [1]
The code snippet below shows the creation of a table of XMLType.
CREATE TABLE xml_demo_unstructured OF XMLTYPE xmltype STORE AS clob / TABLE created.
Now, we shall query the new LOB created in the dictionary view.
Example Code [2]
SELECT column_name, securefile FROM user_lobs WHERE table_name='XML_DEMO_UNSTRUCTURED' / COLUMN_NAME SEC --------------- --- XMLDATA NO
The query above shows that the conventional or unstructured model of XML storage offers BasicFile scheme to the incoming data..
Structured – Unlike the unstructured model, the ‘Structured’ storage model for XML implies to the object relational approach which is best suited for data centric strengths of XML. Applying the OR model, the XML structure is mapped to object relational tables. The data centric model of XML is advantageous when an excerpt from the XML has to be accessed or manipulated. In this model, XML content is bifurcated into object relational rows and columns. Now, database has more control over the XML defined nodes and elements. Thus, the queries on the XML granules yield better performance. In addition, B-tree indexes and function based index are supported in this model.
The data centric storage requires registration of an XML schema. Thereafter, all XML documents can be saved and accessed from the registered schema.
Of course, the advantage comes arrives at the cost of a disadvantage. Rebuilding the data into XML is one of the disadvantages of the ‘Structured’ model. The registered XML schema is not stored in the same format and order as the original document, thus losing its whitespace formatting.
Let us practically understand the schema generation or XML depository.
Firstly, we would register an XML schema using a sample XSD file. Note that the Schema URL provided here is not any actual existing URL. It is just a namespace which defines an element and set of attributes in the XML and ensures the incoming XML content to have same attributes. The screenshot below shows the sample XSD file used for the demonstration. Note that http://xmlns.oracle.com/xdb is an Oracle supplied XML DB namespace.
Example Code [3]
The PL/SQL block below registers a XML schema along with a namespace.
BEGIN DBMS_XMLSCHEMA.REGISTERSCHEMA ( SCHEMAURL => 'http://localhost/Oraxdb/OracleXMLSchema.xsd', SCHEMADOC => BFILENAME('XML_DIR','OracleXMLSchema.xsd'), GENTABLES => FALSE, csid => NLS_CHARSET_ID('AL32UTF8')); END; / PL/SQL PROCEDURE successfully completed.
Query the XML Schema URL from the dictionary view USER_XML_SCHEMAS.
Example Code [4]
SELECT schema_url FROM user_xml_schemas / SCHEMA_URL ------------------------------------------------ http://localhost/Oraxdb/OracleXMLSchema.xsd
Schema registration is over. Now we shall create the schema based XML Type table.
Example Code [5]
CREATE TABLE xml_demo_structured OF XMLTYPE xmlschema "http://xmlns.oracle.com/xdb/schemas/ORADEV/localhost/Oraxdb/OracleXMLSchema.xsd" ELEMENT "EmployeeDetails" / TABLE created.
Example Code [6]
DESC xml_demo_structured; Name NULL? TYPE ----------------------- -------- ---------------- TABLE OF SYS.XMLTYPE( XMLSchema "http://localhost/Oraxdb/OracleXMLSchema.xsd" Element "EmployeeDetails") STORAGE Object-relational TYPE "EmployeeDetails570_T"
Insert a sample XML Type data in the table. The sample text file Allen.txt (shown below) contains the Employee information in XML format.
Example Code [7]
INSERT INTO xml_demo_structured VALUES (xmltype(BFILENAME('XML_DIR', 'Allen.txt'), NLS_CHARSET_ID('AL32UTF8'))) / 1 ROW inserted.
During insertion, Oracle validates the provided attributes against the one defined in the namespace element during schema registration. In case of mismatch, Oracle raises exception ORA-30937.
Example Code [8]
SELECT object_value FROM xml_demo_structured / OBJECT_VALUE -------------------------------------------------------- Allen Smith 10 Salesman 12000
We shall now see working with certain XML search and query operations based on the XML fragments.
Example Code [9]
Extract all the attributes contained in a single element
SELECT EXTRACT (object_value,'/EmployeeDetails/MailAddressTo') FROM xml_demo_structured / EXTRACT(OBJECT_VALUE,'/EMPLOYEEDETAILS/MAILADDRESSTO') ------------------------------------------------------------- Allen Smith 10 Salesman 12000
Example Code [10]
Extract XML fragment of a single attribute
SELECT EXTRACT (object_value,'.//EmployeeName') FROM xml_demo_structured / EXTRACT(OBJECT_VALUE,'.//EMPLOYEENAME') -------------------------------------------------------------- Allen Smith Example Code [11]: EXTRACT VALUE OF an attribute IN an element SELECT EXTRACTVALUE (object_value,'/EmployeeDetails/MailAddressTo/EmployeeName') FROM xml_demo_structured / EXTRACTVALUE(OBJECT_VALUE,'/EMPLOYEEDETAILS/MAILADDRESSTO/EMPLOYE ------------------------------------------------------------------------ Allen Smith
Hybrid – As the name suggest, hybrid model works on ‘Best of both worlds’ principle. The actual XML storage model is not fixed for a database, but it can be randomly selected based on the XML usage. If an XML file has to be used as a document wherein no search is required on the XML data, unstructured model is preferred over structured model. If XML node based operations are required, one should go for registration of XML schema and structurally store the XML files.
Binary – We overviewed the existing XML storage models in Oracle. Nevertheless, the models are stable, but both models have their own bandwidths and limitations. It becomes difficult in a production stage to toggle over the models based on the usage of XML in the application. This thought motivated Oracle to bring in a stable storage scheme to counter the growing use case of semi structured XML in the industry.
Oracle 11g added an additional storage model to Oracle XML DB. Binary model promises a schema aware, space efficient and performance oriented storage scheme for storing XML in database. The key point which differentiates Binary model from others lies in the fact that the underlying storage for a binary XML is a BLOB and not CLOB as we saw in earlier models. Binary model also facilitates binary XML documents to utilize the strengths of SecureFile LOBs of Oracle 11g too i.e. advanced security (deduplication and encryption) and advanced compression.
The fact is deducible that the new storage model is designed to focus on the compactness of the data and ability to withstand various XML operations like node manipulation, XML extraction, and path evaluation.
The table below differentiates the three storage models. The differentiating parameters are selection of XML fragments, transactions, flexibility during schema evolution and document strengths.
Example Code [12]
Let us check out how the binary XML is stored and accessed in Oracle database.
CREATE TABLE xml_demo_binary OF xmltype XMLTYPE store AS BINARY XML / TABLE created.
The table XML_DEMO_BINARY would be able to store XML documents in binary format but not as SecureFile. To add SecureFile feature to the binary XML, table must be created with SecureFile option as below
Example Code [13]
CREATE TABLE xml_demo_secure_binary OF xmltype XMLTYPE store AS SECUREFILE BINARY XML / TABLE created.
Description of the table shows the BINARY model specification for XMLType.
Example Code [14]
DESC xml_demo_binary Name NULL? TYPE ----------------------- -------- ---------------- TABLE OF SYS.XMLTYPE STORAGE BINARY
Conclusion
I hope my effort to familiarize the readers with the XML storage models shall be fair. It must generate an idea amongst the readers as how Oracle tends to explore on storage of unstructured files in database, keeping hawk eye on performance and user operations.
The details covered in this article are just a familiarization with the XML storage philosophies in Oracle database. However, detailed studies can be carried out at Oracle documentation ( x//oracle.com/technetwork/database/features/xmldb/index.html )