Compare two XML variables

  • Hello,

    I'm rewriting a huge FOR XML EXPLICIT procedure to use FOR XML PATH, and need to compare previous output to the refactored one, so i didn't mess up XML structure.

    The thing is, i'm not sure that SQL Server will always generate exactly same xml **string**, so i'd rather not compare by:

    WHERE CAST(@xml_old AS NVARCHAR(MAX)) = CAST(@xml_new AS NVARCHAR(MAX))

    nor do i want to manually validate every node, since the generated xml-structure is quite complex.

    Does anyone have a way to compare xmls by their "semantic value" ?

  • Quick thought, use either HASHBYTES or CHECKSUM

    😎

    WHERE CHECKSUM(@xml_old) = CHECKSUM(@xml_new)

  • Use a compare xml tool, for example http://www.altova.com/diffdog/xml-diff.html

  • This is part of SQL Unit-tests, so need a TSQL solution :/

    HASHBYTES will give different results for following two xmls:

    <xml>

    <attribute1>1/<attribute1>

    <attribute2>2/<attribute2>

    </xml>

    <xml>

    <attribute2>2/<attribute2>

    <attribute1>1/<attribute1>

    </xml>

  • siggemannen (3/19/2015)


    This is part of SQL Unit-tests, so need a TSQL solution :/

    HASHBYTES will give different results for following two xmls:

    <xml>

    <attribute1>1/<attribute1>

    <attribute2>2/<attribute2>

    </xml>

    <xml>

    <attribute2>2/<attribute2>

    <attribute1>1/<attribute1>

    </xml>

    First, it's malformed xml 🙂 And it's really different in a sence.

    declare @t table

    (col xml);

    insert @t(col) values ('<xml>

    <attribute1>1</attribute1>

    <attribute2>2</attribute2>

    </xml>')

    ,( '<xml>

    <attribute2>2</attribute2>

    <attribute1>1</attribute1>

    </xml>');

    select col.value('(/xml/*)[1]','varchar(100)') from @t;

    You need to specify what is considered equal for the task.

    P.S. May by this function may help http://beyondrelational.com/modules/2/blogs/28/posts/10495/xquery-lab-58-select-from-xml.aspx Never used it to compare xmls, why not.

    declare @1 xml ='<xml>

    <attribute2>2</attribute2>

    <attribute1>1</attribute1>

    </xml>'

    declare @2 xml ='<xml>

    <attribute1>1</attribute1>

    <attribute2>2</attribute2>

    </xml>'

    (SELECT ParentName,ParentPosition,Depth,NodeName,Position,NodeType,FullPath,XPath,TreeView,Value

    FROM XMLTable(@1)

    EXCEPT

    SELECT ParentName,ParentPosition,Depth,NodeName,Position,NodeType,FullPath,XPath,TreeView,Value

    FROM XMLTable(@2))

    UNION ALL

    (SELECT ParentName,ParentPosition,Depth,NodeName,Position,NodeType,FullPath,XPath,TreeView,Value

    FROM XMLTable(@2)

    EXCEPT

    SELECT ParentName,ParentPosition,Depth,NodeName,Position,NodeType,FullPath,XPath,TreeView,Value

    FROM XMLTable(@1))

    Choose which columns produced by the function to compare.

  • siggemannen (3/19/2015)


    This is part of SQL Unit-tests, so need a TSQL solution :/

    HASHBYTES will give different results for following two xmls:

    <xml>

    <attribute1>1/<attribute1>

    <attribute2>2/<attribute2>

    </xml>

    <xml>

    <attribute2>2/<attribute2>

    <attribute1>1/<attribute1>

    </xml>

    Keep in mind that in the definition of XML, those are NOT equal. In "strict" XML (per W3C), XML is consider to be a document standard which has a physicial order and that order is significant. So if you're converting SQL table content into XML, you're converting from a format that has no natural physical order (tables) to one that expects one. This unfortunately means you'd be expected to enforce a predictable order if you want there to be consistency.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Thanks for the replies everyone! That XML-function is really really great.

    Didn't know that XML was actually order-"specific", I remember ridiculing some systems myself that required to send XML tags in specific order, guess the joke was on me after all 🙂

  • Matt Miller (#4) (3/19/2015)


    Keep in mind that in the definition of XML, those are NOT equal. In "strict" XML (per W3C), XML is consider to be a document standard which has a physicial order and that order is significant.

    I know this has been argued a lot, but the XML specification found here http://www.w3.org/TR/xml11/[/url] only states that order of elements is important for an element that has an element type declaration which defines the element as having element content and contains a sequence definition.

    It is not a requirement for xml to contain such declarations.

    See sections 3.2 and 3.2.1 for the relevant definitions.

    3.2 Element Type Declarations

    The element structure of an XML document [highlight="#ffff11"]may[/highlight], for validation purposes, be constrained using element type and attribute-list declarations. An element type declaration constrains the element's content.

    Element type declarations often constrain which element types can appear as children of the element. At user option, an XML processor may issue a warning when a declaration mentions an element type for which no declaration is provided, but this is not an error.

    [Definition: An element type declaration takes the form:]

    Element Type Declaration

    [45] elementdecl ::= '<!ELEMENT' S Name S contentspec S? '>'[VC: Unique Element Type Declaration]

    [46] contentspec ::= 'EMPTY' | 'ANY' | Mixed | children

    where the Name gives the element type being declared.

    Validity constraint: Unique Element Type Declaration

    An element type must not be declared more than once.

    Examples of element type declarations:

    <!ELEMENT br EMPTY>

    <!ELEMENT p (#PCDATA|emph)* >

    <!ELEMENT %name.para; %content.para; >

    <!ELEMENT container ANY>

    3.2.1 Element Content

    [Definition: An element type has element content when elements of that type must contain only child elements (no character data), optionally separated by white space (characters matching the nonterminal S).] [Definition: In this case, the constraint includes a content model, a simple grammar governing the allowed types of the child elements and the order in which they are allowed to appear.] The grammar is built on content particles (cps), which consist of names, choice lists of content particles, or sequence lists of content particles:

    Element-content Models

    [47] children ::= (choice | seq) ('?' | '*' | '+')?

    [48] cp ::= (Name | choice | seq) ('?' | '*' | '+')?

    [49] choice ::= '(' S? cp ( S? '|' S? cp )+ S? ')'[VC: Proper Group/PE Nesting]

    [50] seq ::= '(' S? cp ( S? ',' S? cp )* S? ')'[VC: Proper Group/PE Nesting]

    where each Name is the type of an element which may appear as a child. Any content particle in a choice list may appear in the element content at the location where the choice list appears in the grammar; [highlight="#ffff11"]content particles occurring in a sequence list must each appear in the element content in the order given in the list[/highlight].

    Obviously, some consumers are written to expect ordered elements, but this is not part of the standard of an XML document.

    Usually these statements of required order come from the XML-infoset standards, which are designed to enable specifications about the structure of individual XML documents to be defined, they are not a definition of the XML document standards.

    Abstract

    This specification provides a set of definitions for use in other specifications that need to refer to the information in an XML document.

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • mister.magoo (3/19/2015)


    Matt Miller (#4) (3/19/2015)


    Keep in mind that in the definition of XML, those are NOT equal. In "strict" XML (per W3C), XML is consider to be a document standard which has a physicial order and that order is significant.

    I know this has been argued a lot, but the XML specification found here http://www.w3.org/TR/xml11/[/url] only states that order of elements is important for an element that has an element type declaration which defines the element as having element content and contains a sequence definition.

    It is not a requirement for xml to contain such declarations.

    This specification provides a set of definitions for use in other specifications that need to refer to the information in an XML document.

    Ok - that's fair. I mostly deal with typed XML (everything we do is based on a well defined schema), so I didn't appropriately describe un-typed XML. It does however apply in a fair amount of common usages.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Viewing 9 posts - 1 through 8 (of 8 total)

    You must be logged in to reply to this topic. Login to reply