I was playing around with some XML lately, and had to load a file that looked like this::
I ran a simple query, one that used the OPENROWSET and a CONVERT to load the data.
WITH XmlFile (Contents) AS ( SELECT CONVERT (XML, BulkColumn) FROM OPENROWSET (BULK 'C:\mathis.xml', SINGLE_BLOB) AS XmlData ) SELECT * FROM XmlFile GO
However, that didn’t work. I received this message in SSMS.
Msg 6359, Level 16, State 1, Line 1
Parsing XML with internal subset DTDs not allowed. Use CONVERT with style option 2 to enable limited internal subset DTD support.
Hmmm. That seems to make sense. Let’s add an option to CONVERT of 2. I’ve assumed the last parameter is the one mentioned, as with date conversions, and added that. I can hover with SQL Prompt and see that.
Let’s change the code:
WITH XmlFile (Contents) AS ( SELECT CONVERT (XML, BulkColumn,2) FROM OPENROWSET (BULK 'C:\mathis.xml', SINGLE_BLOB) AS XmlData ) SELECT * FROM XmlFile GO
If you go to the BOL page for CAST and CONVERT, you will find an XML styles section. There are possible values of 0 (default), 1, 2, and 3. In this case, the 2 enables an internal DTD processing, which basically uses a default document of sorts for parsing the XML. No external DTD is needed and this is treated as a standalone document.
I am not an XML expert, but I’m guessing here that I’ve included a document that doesn’t conform to some specification and the additional style parameter allows SQL Server to ignore some of what’s there.
If anyone knows more, I would like to better understand how this works.