I have been doing several different things with XML at work lately, and one of the projects required me to generate a big chunk of XML based on another company's pre-defined standards. These standards are unbelievably extensive; the documentation is a 2,000+ page PDF! All of the data used is dynamic and not in my control, so I was kinda worried about errors in the XML. I needed a quick and easy function to validate a string of XML against any number of schemas and have it return a list of all errors that occurred during the validation. This way I will not waste valuable resources by streaming out an XML file that is not even valid. I thought this would be really simple, but of course I ran into some roadblocks.
The majority of the examples I found on the web used the XmlValidatingReader class, which at first glance appeared to be what I was looking for. However, when I tried it, Visual Studio warned me that this class is obsolete. Back to the drawing board I went. I was able to come up with something that worked, but not exactly the way I needed it to. It was bombing out at the first validation error it detected, and I needed a list of all the validation errors. I wasn't able to completely figure this out on my own, so I went to the MSDN forums in pursuit of some expert advice. Thankfully Martin Honnen came to my rescue and helped me get to a perfect solution. Enough yip-yap, here is the class I came up with...
    Imports System.Collections.Generic
    Imports System.IO
    Imports System.Xml
    Imports System.Xml.Schema

    Public Class XmlValidator

        Private ErrorList As List(Of String)
        Private SchemaList As List(Of String)
        Private SchemaReaderSettings As XmlReaderSettings
        Private SchemaValidation As ValidationEventHandler

        ''' <summary>
        ''' Instantiates the XmlValidator class
        ''' </summary>
        Public Sub New()
            SchemaList = New List(Of String)
            SchemaReaderSettings = New XmlReaderSettings()
            SchemaValidation = New ValidationEventHandler(AddressOf ValidationHandler)
        End Sub

        ''' <summary>
        ''' Settings applied to the schema(s) when validating
        ''' </summary>
        Public ReadOnly Property SchemaSettings() As XmlReaderSettings
            Get
                Return SchemaReaderSettings
            End Get
        End Property

        ''' <summary>
        ''' List of absolute paths for each schema to validate against
        ''' </summary>
        Public ReadOnly Property Schemas() As List(Of String)
            Get
                Return SchemaList
            End Get
        End Property

        ''' <summary>
        ''' Validates the given XML string against the schema(s)
        ''' </summary>
        ''' <param name="RawXml">The raw XML data to validate</param>
        ''' <returns>A generic list of error messages</returns>
        Public Function ValidateXml(ByVal RawXml As String) As List(Of String)
            ErrorList = New List(Of String)

            If Me.Schemas.Count > 0 Then
                Dim ReaderSettings As New XmlReaderSettings()

                With ReaderSettings
                    .ValidationType = ValidationType.Schema
                    .ValidationFlags = XmlSchemaValidationFlags.ProcessSchemaLocation Or _
                                       XmlSchemaValidationFlags.ReportValidationWarnings Or _
                                       XmlSchemaValidationFlags.AllowXmlAttributes

                    For Each SchemaPath As String In Me.Schemas
                        .Schemas.Add(Nothing, XmlReader.Create(SchemaPath, Me.SchemaSettings))
                    Next

                    AddHandler .ValidationEventHandler, SchemaValidation
                End With

                Using Reader As XmlReader = XmlReader.Create(New StringReader(RawXml), ReaderSettings)
                    While Reader.Read()
                        ' Reads the whole file and will call the validation
                        ' handler subroutine if an error is detected. Doing
                        ' it this way allows us to pick out ALL of the errors
                        ' from the XML, rather than bombing out on the first
                        ' one.
                    End While
                End Using
            End If

            Return ErrorList
        End Function

        Private Sub ValidationHandler(ByVal sender As Object, ByVal e As System.Xml.Schema.ValidationEventArgs)
            If e.Severity = XmlSeverityType.Error Then
                ErrorList.Add(e.Message)
            End If
        End Sub

    End Class
The code itself is fairly self-explanatory. But I am gonna post a quick and dirty example of how to use it anyway...
    Dim Validator As New XmlValidator()
    Dim Errors As List(Of String)
    Dim RawXml As String

    Using Reader As New StreamReader(Server.MapPath("~/App_Data/test.xml"), Encoding.UTF8)
        RawXml = Reader.ReadToEnd()
    End Using

    With Validator
        .SchemaSettings.ProhibitDtd = False
        .Schemas.Add(Server.MapPath("~/App_Data/schema1.xsd"))
        .Schemas.Add(Server.MapPath("~/App_Data/schema2.xsd"))
        Errors = .ValidateXml(RawXml)
    End With

    For Each Err As String In Errors
        Console.WriteLine(Err)
    Next
Simple as that! I know it took me too much time to reach this solution, so hopefully this will save a struggling programmer out there some grief.
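One variation that may be handy with a large document: each error message can be prefixed with the position where it was detected. The sketch below is only an illustration, not part of the class above. The name LineAwareErrorCollector and its members are made up for this example. It relies on ValidationEventArgs.Exception being an XmlSchemaException, which exposes LineNumber and LinePosition, and it assumes the underlying reader supplies line information (a reader created over a StringReader, as in ValidateXml, normally does).

```vbnet
Imports System.Collections.Generic
Imports System.Xml.Schema

' Hypothetical helper (not part of the original class): collects
' validation errors together with the position where each occurred.
Public Class LineAwareErrorCollector

    Private ReadOnly ErrorList As New List(Of String)

    ''' <summary>
    ''' Errors collected so far, each prefixed with line and position
    ''' </summary>
    Public ReadOnly Property Errors() As List(Of String)
        Get
            Return ErrorList
        End Get
    End Property

    ' Same signature as ValidationHandler in XmlValidator, so it can be
    ' attached to XmlReaderSettings.ValidationEventHandler with AddHandler.
    Public Sub Handle(ByVal sender As Object, ByVal e As ValidationEventArgs)
        If e.Severity = XmlSeverityType.Error Then
            ' e.Exception is an XmlSchemaException carrying position info;
            ' both values are 0 when the reader has no line information.
            ErrorList.Add(String.Format("Line {0}, position {1}: {2}", _
                                        e.Exception.LineNumber, _
                                        e.Exception.LinePosition, _
                                        e.Message))
        End If
    End Sub

End Class
```

To try it, create a collector and attach it to the reader settings with AddHandler ReaderSettings.ValidationEventHandler, AddressOf Collector.Handle before calling XmlReader.Create, then read Collector.Errors once the reader has finished.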
Best regards...
Are you really getting ALL errors? I find that the XmlReader fires the ValidationEventHandler but then skips all siblings of the error element and continues at the next parent. So if your XML contains two invalid elements at the same level, only the first will be caught.
I'm looking for a solution to continue processing the siblings of an error element but have had no luck so far.
Thanks,
Mark
Many thanks for posting the code; the sample provided is also appreciated. Often samples on how to call code are omitted because things are arguably self-explanatory, but samples provide insight into how the class was intended to be used, which at times can be invaluable.
kind regards
Matt
thanks a lot
