I have some data that I want to store in the DB2 9. Should I store this data as XML schema or relational schema?

This depends on the data you want to store in the database. The main difference between relational schema and XML schema is that relational schema describes data as strongly structured and typed. XML schema describes data as loosely structured and typed. XML schema describes data order, but relational schema does not.
In general, data that has the following properties should be stored in XML:
The data is better described in hierarchal format.
The schema is constantly changing and evolving.
Many attributes of the data are empty or unknown.
You can check http://www.ibm.com/developerworks/xml/library/x-think35.html?S_TACT=105AGX06&S_CMP=EDU for advice on creating XML.

The data which I want to store in the database contains small amount of data, but have a highly complex structure. Should I store this in XML data type?

You should store such data in XML. If you store such data in relational tables, you will have complicated relational schemas, which means you require many tables. Managing these tables can have overhead. The SQL query to access such data requires joining many tables. If you have to process this data together with other data, the SQL query will be even more complicated.

Is there any way to validate the XML data against a XML schema before inserting into the database?

DB2 9 supports validating XML documents with XML schema at insert or import time. The XML schemas have to be registered in XML Schema Repository (XSR) before it can be used for validation. The XML document in the same XML type column can be validated by a different XML schema of your choice in the insert/import time. You can also choose not to validate the XML. In general, there are three choices:
Validate on the server.
Validate within an application.
Do not validate.
XML validation enables you to create complex rules that define acceptable data. You can check http://www.ibm.com/developerworks/xml/library/x-tipxsslt to use the XML Schema Standard Type Library to simplify the process of requiring formatted data such as e-mail addresses and telephone numbers.

DB2 9 supports validation on the server. You should validate incoming XML documents on the server if the incoming XML documents must be valid, but the XML documents are from an un-trusted source. For instance, suppose that a mortgage company has received an XML document of an application from a mortgage broker. All brokers develop their own applications to fill out the application forms, and to generate XML documents. Because the mortgage company has no control over these applications, and the applications might or might not follow the industrial standard XML schemas, the incoming XML documents are considered as being from an un-trusted source. The mortgage company must ensure that all XML documents are valid, and can validate all XML documents on the server side at insert and import time. The XMLVALIDATE function is used for validating XML documents with an insert statement in an application. The XMLVALIDATE function checks XML documents against the specified XML schema and makes sure that the XML document satisfies the constraints in the XML schema. When validating XML documents, the schema information passed to the XML validate function can be either explicitly passed (explicit validation) or implicitly inferred from the XML document (implicit validation). You can check http://www.ibm.com/developerworks/xml/library/x-style.html for defining data validation rules in XML schema.

What is difference between explicit validation and implicit validation?

The main difference between explicit validation and implicit validation is where the schema information (hints) are provided. For explicit validation, schema information is provided by the ACCORDING TO XMLSCHEMA clause embedded in the insert statement. For implicit validation, schema hints are provided by the inserting XML document.

Should I choose explicit validation or implicit validation to validate the data?

Well. . .It depends on the source of the XML documents, the nature of your applications, and your business rules. If the XML documents are from a trusted source and the XML document knows its schema, implicit validation can be a good choice. If the XML document’s content is unknown, explicit validation is a good choice. For explicit validation, only one exact schema can be used to validate. For implicit validation, one or more schemas can be used to validate.

Which of Explicit validation and implicit validation is better in performance?

In general, explicit validation might have better performance than implicit validation. This is not because the actual explicit validation takes less time than the actual implicit validation. It is because the implicit validation requires DB2 to search the catalog tables to find the correct schemas that match the pair value. Explicit validation does not require searching the catalog tables.

In some situations, you might want to do validation within the application. For example, suppose that you have an on-line loan application program. Customers can log on and fill out their loan application on-line forms. You have a name field, birthday field, phone number field, and other fields in the on-line form. If the customer enters information that does not satisfy the schema constraints, such as entering character data into a numeric field, you want to validate the XML document on the client/application side. If the XML document is not valid, the application can interactively ask the customer to correct the entered data until the XML document is valid.

Like any computer algorithm, validation takes resources such as CPU and memory. Imagine that there are thousands of clients that insert massive XML documents to a server and the validation is done on the one server. Sometimes, a server does not have enough resources, and the clients have the necessary bandwidth. In this case, it can be a good idea to validate on the client. You can check http://www.ibm.com/developerworks/web/library/wa-dataflow/ for performance tricks for data-intensive applications

I know that we can validate the XML data against registered XML schema. But how does DB 2 manage XML schemas in general?

I know that we can validate the XML data against registered XML schema. But how does DB 2 manage XML schemas in general?

Applications can choose to validate XML documents against an XML schema to ensure valid data is stored in the database. To validate documents, DB2 needs access to the appropriate schema information. For this reason, DB2 has an XML schema repository (XSR) that maintains a copy of the XML schemas that might be used during validation. The XSR consists of a set of new catalog tables together with commands, stored procedures and APIs to register and manage XML schemas. You can check http://www.ibm.com/developerworks/xml/library/x-style.html for defining data validation rules in XML schema.

Yes, A XML schema has to be registered in XSR before it can be used for validation.

XML schema registration involves three steps. First you have to register the primary schema document. It can be done as given below:
Step1: REGISTER XMLSCHEMA http://test FROM /db2/samples/schema/test.xsd AS test.test1;

It will register the primary schema document test.xsd with the XSR. The fully-qualified name of the test.xsd schema is /db2/samples/schema/test.xsd. /db2/samples/schema directory is the local directory where the test.xsd schema document is stored on the system. The schema location is specified using the URL http://test. This XML schema is identified as test.test1.

If XML schema involves multiple schema documents, it can be added to the registered primary document as given below:
Step2: ADD XMLSCHEMA DOCUMENT TO test.test1 ADD header.xsd FROM /db2/samples/schema/header.xsd;

Finally registration process can be completed as given below:
Step3: COMPLETE XMLSCHEMA test.test1;
You can check http://www-128.ibm.com/developerworks/wikis/download/attachments/1824/XMLSchema+Registration+and+Validation.pdf for more information on this.

An XML schema is a collection of one or more XML schema documents. Of this collection, one document must be designated as the primary schema document. The primary schema document is at the top of the hierarchy with respect to the import and includes dependencies. It includes and/or imports other documents in the schema, which might further include and/or import other schema documents.

Please give more details about query given in Step 1 for registering primary schema document? I seem to have some problems there.

In order to register the document, the fully-qualified file name for the schema document is needed. Fully-qualified name means the file system path of where the document is stored. An SQL identifier also needs to be provided so that the XML schema can later be identified and used for validation. The user can choose any valid SQL two-part name to identify the XML schema. Using meaningful names that help in identifying the XML schema is recommended. Additional information like the schema location of the XML schema can also be provided during registration. Schema location is a URL that identifies and indicates where the schema document is located. You can check http://www.ibm.com/developerworks/db2/library/techarticle/dm-0603saracco/ for more information on this.

Does DB2 validate the schema documents before registering it with XSR?

The schema documents are analyzed and any mismatches between the schema locations provided in the 2nd step and schemaLocation values provided in the import and/or include tags in the schema documents are discovered. If the schema documents are not well-formed or there are schema location mismatches, an error is issued to the user and the schema does not get registered with the XSR.

This article has been dead for over six months. Start a new discussion instead.