October 2007
Posted on the 23rd at 9:09 PM CST
Validate XML Stream Against a Schema in VB.NET
Filed under VB.NET

I have been doing several different things with XML at work lately, and one of the projects required me to generate a big chunk of XML based on another company's pre-defined standards. These standards are unbelievably extensive; the documentation is a 2,000+ page PDF! All of the data used is dynamic and not in my control, so I was kinda worried about errors in the XML. I needed a quick and easy function to validate a string of XML against any number of schemas and return a list of all errors that occurred during validation. This way I will not waste valuable resources by streaming out an XML file that is not even valid. I thought this would be really simple, but of course I ran into some roadblocks.

The majority of the examples I found on the web used the XmlValidatingReader class, which at first glance appeared to be what I was looking for. However, when I tried it, Visual Studio notified me that the class is obsolete. Back to the drawing board I went. I was able to come up with something that worked, but not exactly the way I needed it to. It was bombing out at the first validation error it detected, and I needed a list of all the validation errors. I wasn't able to completely figure this out on my own, so I went to the MSDN forums in pursuit of some expert advice. Thankfully, Martin Honnen came to my rescue and helped me get to a perfect solution. Enough yip-yap, here is the class I came up with...

Imports System.Collections.Generic
Imports System.IO
Imports System.Xml
Imports System.Xml.Schema

Public Class XmlValidator

    Private ErrorList As List(Of String)
    Private SchemaList As List(Of String)
    Private SchemaReaderSettings As XmlReaderSettings
    Private SchemaValidation As ValidationEventHandler

    ''' <summary>
    ''' Instantiates the XmlValidator class
    ''' </summary>
    Public Sub New()
        SchemaList = New List(Of String)
        SchemaReaderSettings = New XmlReaderSettings()
        SchemaValidation = New ValidationEventHandler(AddressOf ValidationHandler)
    End Sub

    ''' <summary>
    ''' Settings applied to the schema(s) when validating
    ''' </summary>
    Public ReadOnly Property SchemaSettings() As XmlReaderSettings
        Get
            Return SchemaReaderSettings
        End Get
    End Property

    ''' <summary>
    ''' List of absolute paths for each schema to validate against
    ''' </summary>
    Public ReadOnly Property Schemas() As List(Of String)
        Get
            Return SchemaList
        End Get
    End Property

    ''' <summary>
    ''' Validates the given XML string against the schema(s)
    ''' </summary>
    ''' <param name="RawXml">The raw XML data to validate</param>
    ''' <returns>A generic list of error messages</returns>
    Public Function ValidateXml(ByVal RawXml As String) As List(Of String)
        ErrorList = New List(Of String)

        If Me.Schemas.Count > 0 Then
            Dim ReaderSettings As New XmlReaderSettings()

            With ReaderSettings
                .ValidationType = ValidationType.Schema
                .ValidationFlags = XmlSchemaValidationFlags.ProcessSchemaLocation Or XmlSchemaValidationFlags.ReportValidationWarnings Or XmlSchemaValidationFlags.AllowXmlAttributes

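                For Each SchemaPath As String In Me.Schemas
                    ' Dispose each schema reader once its schema has been loaded
                    Using SchemaReader As XmlReader = XmlReader.Create(SchemaPath, Me.SchemaSettings)
                        .Schemas.Add(Nothing, SchemaReader)
                    End Using
                Next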

                AddHandler .ValidationEventHandler, SchemaValidation
            End With

            Using Reader As XmlReader = XmlReader.Create(New StringReader(RawXml), ReaderSettings)
                While Reader.Read()
                    ' Reads the whole file and will call the validation
                    ' handler subroutine if an error is detected.  Doing
                    ' it this way allows us to pick out ALL of the errors 
                    ' from the XML, rather than bombing out on the first
                    ' one.
                End While
            End Using
        End If

        Return ErrorList
    End Function

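    ' Collects validation errors only. Warnings are also raised (because of
    ' ReportValidationWarnings) but are not added to the list.
    Private Sub ValidationHandler(ByVal sender As Object, ByVal e As System.Xml.Schema.ValidationEventArgs)
        If e.Severity = XmlSeverityType.Error Then
            ErrorList.Add(e.Message)
        End If
    End Sub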
End Class
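
For contrast, here is a minimal sketch of the "bombing out" behavior mentioned earlier. When no ValidationEventHandler is attached to the XmlReaderSettings, the reader throws on the first validation error instead of reporting it and moving on. The schema, the XML and the module name below are hypothetical, made up purely for illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Schema

Module FirstErrorDemo

    ' Hypothetical schema and document, used only for this illustration
    Private Const SampleSchema As String = _
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" & _
        "<xs:element name='order'><xs:complexType><xs:sequence>" & _
        "<xs:element name='id' type='xs:int'/>" & _
        "</xs:sequence></xs:complexType></xs:element></xs:schema>"

    Private Const SampleXml As String = "<order><id>not-a-number</id></order>"

    Sub Main()
        Dim Settings As New XmlReaderSettings()
        Settings.ValidationType = ValidationType.Schema

        Using SchemaReader As XmlReader = XmlReader.Create(New StringReader(SampleSchema))
            Settings.Schemas.Add(Nothing, SchemaReader)
        End Using

        ' No ValidationEventHandler is attached here, so the reader
        ' throws on the first validation error instead of reporting it.
        Try
            Using Reader As XmlReader = XmlReader.Create(New StringReader(SampleXml), Settings)
                While Reader.Read()
                End While
            End Using
        Catch ex As XmlSchemaException
            Console.WriteLine("Stopped at first error: " & ex.Message)
        End Try
    End Sub

End Module
```

The XmlValidator class above avoids the exception by attaching a handler with AddHandler, which lets the Read() loop keep going after an error has been reported.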
 

The code itself is fairly self-explanatory.  But I am gonna post a quick and dirty example of how to use it anyway...

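' Requires Imports System.Collections.Generic, System.IO and System.Text
Dim Validator As New XmlValidator()
Dim Errors As List(Of String)
Dim RawXml As String

' Read the whole XML file into a string
Using Reader As New StreamReader(Server.MapPath("~/App_Data/test.xml"), Encoding.UTF8)
    RawXml = Reader.ReadToEnd()
End Using

With Validator
    ' Allow DTDs in the schema files. ProhibitDtd is marked obsolete in
    ' .NET 4.0 and later, where DtdProcessing = DtdProcessing.Parse replaces it.
    .SchemaSettings.ProhibitDtd = False
    .Schemas.Add(Server.MapPath("~/App_Data/schema1.xsd"))
    .Schemas.Add(Server.MapPath("~/App_Data/schema2.xsd"))

    Errors = .ValidateXml(RawXml)
End With

' An empty list means no validation errors were reported
For Each ErrorMessage As String In Errors
    Console.WriteLine(ErrorMessage)
Next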


Simple as that!  I know it took me too much time to reach this solution, so hopefully this will save a struggling programmer out there some grief.
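
If you also want to know where in the document each problem sits, the event arguments carry an XmlSchemaException with LineNumber and LinePosition properties. Below is a rough sketch of an alternative handler that records the position and also keeps warnings. The class name PositionalErrorCollector and its members are hypothetical, not part of the XmlValidator class above, and keeping warnings is a design choice of this sketch rather than what the original handler does.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Xml.Schema

' Hypothetical helper: collects validation messages with their position
Public Class PositionalErrorCollector

    Private ReadOnly MessageList As New List(Of String)

    ''' <summary>
    ''' Messages collected so far, errors and warnings alike
    ''' </summary>
    Public ReadOnly Property Messages() As List(Of String)
        Get
            Return MessageList
        End Get
    End Property

    ' Same signature as ValidationHandler in XmlValidator, so it can be
    ' attached with AddHandler in the same way.
    Public Sub Handle(ByVal sender As Object, ByVal e As ValidationEventArgs)
        ' e.Severity formats as "Error" or "Warning"
        MessageList.Add(String.Format("{0} (line {1}, position {2}): {3}", _
            e.Severity, e.Exception.LineNumber, e.Exception.LinePosition, e.Message))
    End Sub

End Class
```

It would be wired up with something like AddHandler ReaderSettings.ValidationEventHandler, AddressOf Collector.Handle, where Collector is an instance of the class. Keeping warnings can be worthwhile because, as far as I know, XML whose root element matches none of the loaded schemas is reported only as a warning, so an error-only list can come back empty for a document that was never really checked.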

Best regards...

Comments (5)
Comment from Mark on March 28th, 2008 at 9:17 AM
Josh,

Are you really getting ALL errors? I find that the XmlReader fires the ValidationEventHandler but then skips all siblings of the error element and continues at the next parent. So if your XML contains two invalid elements at the same level, only the first will be caught.

I'm looking for a solution to continue processing the siblings of an error element but have had no luck so far.

Thanks,
Mark
Comment from Josh Stodola on March 28th, 2008 at 9:42 AM
Hi Mark! I guess I did not run into that scenario. I will do some testing on it later when I get some free time. I am swamped for the time being. If you happen to reach a solution, please let me know.
Comment from Matt on April 8th, 2008 at 4:34 AM
Hi Josh,
Many thanks for posting the code, and the sample provided is also appreciated. Samples of how to call code are often omitted because things are arguably self-explanatory, but they show how the class was intended to be used, which at times can be invaluable.
kind regards
Matt
Comment from ruthness on May 30th, 2008 at 9:58 AM
Good one Josh.
thanks a lot
Comment from Jason on September 4th, 2008 at 9:24 AM
Just stumbled upon this and I have the same issue as Mark. If I have two invalid elements within a single parent, only the first invalid element is caught. Were you ever able to get it to truly catch all errors?
