Sunday, 5 August 2012

Validating XML file against XSD with SSIS - part I

Case
I have an XML file which I want to validate before processing it. How can I validate an XML file against an XSD file with SSIS?

Solution
There are two solutions in SSIS for validating an XML file.
A) XML Task, but the big downside is that it can't validate against XSD files with import or include tags. See(/vote for) this Microsoft Connect feedback.
B) Script Task, but that requires some .Net knowledge.

In this post I will elaborate the XML Task solution and in the next post I will workout the Script Task solution.

XML Task
For this example I use two files: movies.xml and movies.xsd
<?xml version="1.0"?>
<movies>
 <movie id="123">
  <title>Pulp Fiction</title>
  <year>1994</year>
  <rating>8.9</rating>
 </movie>
 <movie id="124">
  <title>Memento</title>
  <year>2000</year>
  <rating>8.6</rating>
 </movie>
 <movie id="124">
  <title>The Usual Suspects</title>
  <year>2000</year>
  <rating>8.6</rating>
 </movie>
</movies>

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:simpleType name="LimitedString"> 
    <xs:restriction base="xs:string"> 
      <xs:maxLength value="50" /> 
    </xs:restriction> 
  </xs:simpleType> 

  <xs:element name="movies">
    <xs:complexType>
      <xs:sequence>
        <xs:element minOccurs="0" maxOccurs="unbounded" name="movie">
          <xs:complexType>
            <xs:sequence>
              <xs:element minOccurs="0" name="title" type="LimitedString" />
              <xs:element minOccurs="0" name="year" type="xs:unsignedShort" />
              <xs:element minOccurs="0" name="rating" type="xs:decimal" />
            </xs:sequence>
            <xs:attribute name="id" type="xs:unsignedByte" use="optional" />
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

1) New File Connection
I will use Connection Managers in this example to make things easier to configure. Feel free to use something else such as variables. Right click in the Connection Managers pane and select "New File Connection...".
New File Connection for XML and XSD file






















Select the xml file in the File Connection Manager Editor and repeat these steps for the XSD file so that you have two Connection Managers named movies.xml and movies.xsd
File Connection Manager Editor













2) XML Task
Add an XML Task to your Control Flow and give it a suitable name. You can connect it to your Data Flow Task that processes the XML file.
XML Task






















3) XML Task editor
Edit the XML Task and select the XML Connection Manager as input and the XSD Connection Manager as Second Operand. OperationType should be Validate and ValidationType should be XSD.
Set XML Task editor for validation






















4) Testing
Run the package. Change something in the XML file to make it fail (a movie title longer than 50 chars).
The result

















Note: If you try to validate against an XSD with an include or import, you will get an error that certain parts are not declared. The next post will be about validating an XML file against an XSD file with a Script Task.

No comments:

Post a Comment

Please use the SSIS MSDN forum for general SSIS questions that are not about this post. I'm a regular reader of that forum and will gladly answer those questions over there.

All comments are moderated manually to prevent spam.