XML Workshop XVII - Writing a LOOP to process XML elements in TSQL

  • The light finally went off...

    for each "section" I need to cross apply it to do what I want.

    Example below:

    SELECT

    x.value('UnparsedName[1]', 'VARCHAR(20)') as Name

    ,l.value('@BusinessType', 'varchar(20)') as BusinessType

    ,CAST(CONVERT(varchar(20),l.query('HighBalanceAmount/text()')) as money) as HighBalanceAmount

    FROM ScreeningXML myXML

    CROSS APPLY xmlNode.nodes('/MORTGAGEDATA/BORROWER') E(x)

    CROSS APPLY xmlNode.nodes('/MORTGAGEDATA/CREDITREPORT/MERGEDLIABILITY') M(l)

    Christopher Ford

  • Christopher Ford (3/26/2008)


    The light finally went off...

    for each "section" I need to cross apply it to do what I want.

    Example below:

    SELECT

    x.value('UnparsedName[1]', 'VARCHAR(20)') as Name

    ,l.value('@BusinessType', 'varchar(20)') as BusinessType

    ,CAST(CONVERT(varchar(20),l.query('HighBalanceAmount/text()')) as money) as HighBalanceAmount

    FROM ScreeningXML myXML

    CROSS APPLY xmlNode.nodes('/MORTGAGEDATA/BORROWER') E(x)

    CROSS APPLY xmlNode.nodes('/MORTGAGEDATA/CREDITREPORT/MERGEDLIABILITY') M(l)

    Sorry I just saw your previous question. I think I misunderstood your question though. Is borrower related to CREDITREPORT in any way? If so, it's not readily apparent (to me anyway) from your XQuery. I thought you were asking how to query multiple MERGEDLIABILITY nodes under a single CREDITRPORT node, like this:

    DECLARE @x xml;

    SET @x= '

    <CREDITREPORT CREDITREPORTID="CRRept0001" CreditReportType="Merge" MergeType="ListAndStack" EquifaxIncludedIndicator="N"

    ExperianIncludedIndicator="Y" TransUnionIncludedIndicator="N">

    <CreditReportIdentifier>xxxxx_xxx938</CreditReportIdentifier>

    <LastUpdatedDate>2006-04-10</LastUpdatedDate>

    <MERGEDLIABILITY ScoreTypeCode="Non" IndustryCode="FB" ICText="Mortgage Brokers" CollectionCode="" CollectionText=""

    MERGEDLIABILITYID="xxxxx_1_12" AccountOwnershipType="Individual" AccountStatusType="Transferred" AccountType="Mortgage"

    BusinessType="Finance" CreditLoanType="ConventionalRealEstateMortgage" CurrentDelinquencyRatingType="AsAgreed"

    ConsumerDisputeIndicator="N" DerogatoryDataIndicator="Y" BORROWERIDREFS="">

    <AccountIdentifier>******5563</AccountIdentifier>

    <AccountOpenedDate>2002-01</AccountOpenedDate>

    <AccountStatusDate>2002-12</AccountStatusDate>

    <CreditorName>OLYMPUS</CreditorName>

    <HighBalanceAmount>107200</HighBalanceAmount>

    <HighCreditAmount>107200</HighCreditAmount>

    <Late30DaysCount>0</Late30DaysCount>

    <Late60DaysCount>0</Late60DaysCount>

    <Late90DaysCount>0</Late90DaysCount>

    <MonthsReviewedCount>13</MonthsReviewedCount>

    <PaymentPatternData>NCCCCCCCCCCCC</PaymentPatternData>

    <PaymentPatternStartDate>2002-12</PaymentPatternStartDate>

    <ReportedDate>2002-12</ReportedDate>

    <TermMonths>360</TermMonths>

    <CREDITCOMMENT CommentSource="RepositoryBureau">

    <Comment>Account transferred to another lender</Comment>

    </CREDITCOMMENT>

    <CREDITCOMMENT CommentSource="RepositoryBureau">

    <Comment>BaseStatus code `05` - TRANSFER</Comment>

    </CREDITCOMMENT>

    </MERGEDLIABILITY>

    <MERGEDLIABILITY ScoreTypeCode="Non" IndustryCode="FB" ICText="Mortgage Brokers" CollectionCode="" CollectionText=""

    MERGEDLIABILITYID="yyyyy_1_13" AccountOwnershipType="Individual" AccountStatusType="Transferred" AccountType="Mortgage"

    BusinessType="Finance" CreditLoanType="ConventionalRealEstateMortgage" CurrentDelinquencyRatingType="AsAgreed"

    ConsumerDisputeIndicator="N" DerogatoryDataIndicator="Y" BORROWERIDREFS="">

    <AccountIdentifier>******5563</AccountIdentifier>

    <AccountOpenedDate>2003-01</AccountOpenedDate>

    <AccountStatusDate>2004-09</AccountStatusDate>

    <CreditorName>OLYMPUS</CreditorName>

    <HighBalanceAmount>999999</HighBalanceAmount>

    <HighCreditAmount>999999</HighCreditAmount>

    <Late30DaysCount>0</Late30DaysCount>

    <Late60DaysCount>0</Late60DaysCount>

    <Late90DaysCount>0</Late90DaysCount>

    <MonthsReviewedCount>13</MonthsReviewedCount>

    <PaymentPatternData>NCCCCCCCCCCCC</PaymentPatternData>

    <PaymentPatternStartDate>2002-12</PaymentPatternStartDate>

    <ReportedDate>2002-12</ReportedDate>

    <TermMonths>360</TermMonths>

    <CREDITCOMMENT CommentSource="RepositoryBureau">

    <Comment>Account transferred to another lender</Comment>

    </CREDITCOMMENT>

    <CREDITCOMMENT CommentSource="RepositoryBureau">

    <Comment>BaseStatus code `05` - TRANSFER</Comment>

    </CREDITCOMMENT>

    </MERGEDLIABILITY>

    </CREDITREPORT> ';

    SELECT creditreport.value('@CREDITREPORTID[1]', 'varchar(100)') AS creditreportid,

    creditreport.value('LastUpdatedDate[1]', 'datetime') AS lastupdated,

    mergedliability.value('AccountOpenedDate[1]', 'varchar(100)') AS opendate,

    mergedliability.value('AccountStatusDate[1]', 'varchar(100)') AS statusdate,

    mergedliability.value('HighBalanceAmount[1]', 'varchar(100)') AS highbalance

    FROM @x.nodes('/CREDITREPORT') AS cr(creditreport)

    CROSS APPLY creditreport.nodes('MERGEDLIABILITY') AS ml(mergedliability);

    Note that you can cross apply the results of the .nodes() method and apply another .nodes() method to them like in this example. Also on this one I spelled out the column aliases to make them a little more readable.

  • Andrew Pfeiffer (3/26/2008)


    Mike C (3/26/2008)


    For readability purposes I'd probably go with something like this:

    SELECT x.value('Name[1]', 'VARCHAR(20)')

    FROM @x.nodes('/Employees/Employee') e(x);

    I like this method. It is actually faster and uses less resources as it applies the XPathFilter.

    Thanks for the tip =)

    No problem 🙂 I was at a presentation that Michael Rys gave a while back and here are some more of his tips for making your XQuery as efficient as possible:

    - Don't use the ".." axis if you can avoid it.

    - Don't use "//" with the wildcard "*" in the middle of a path expression.

    - Use [i]n[/i] predicate filter to give the optimizer cardinality hints whenever possible (e.g., "[1]"). This forces the optimizer to make better decisions about which operators to use.

    He actually covered about 10 more things, but I can't remember them all off the top of my head. Some were for very specific situations, and probably fairly uncommon.

  • That is much simplier than the looping Xquery I used -- Thanks

  • Thank you, thank you, thank you. Written like a book. Great job.

    -Mike

  • Christopher Ford (3/26/2008)


    The light finally went off...

    for each "section" I need to cross apply it to do what I want.

    Example below:

    SELECT

    x.value('UnparsedName[1]', 'VARCHAR(20)') as Name

    ,l.value('@BusinessType', 'varchar(20)') as BusinessType

    ,CAST(CONVERT(varchar(20),l.query('HighBalanceAmount/text()')) as money) as HighBalanceAmount

    FROM ScreeningXML myXML

    CROSS APPLY xmlNode.nodes('/MORTGAGEDATA/BORROWER') E(x)

    CROSS APPLY xmlNode.nodes('/MORTGAGEDATA/CREDITREPORT/MERGEDLIABILITY') M(l)

    Right on...It does take a while for that light bulb to go off... You're essentially creating derived tables with each CROSS APPLY statement....

    It's the same argument I have with pretty much anything XQUERY/XPATH, let alone the SQL Server applications thereof: why the *&^%^$%#$ does it have to be express in such a god-awfully obtuse way? I think we need a few less Math PhD's and a few more plumbers on those committees, IMO...

    Maybe it's just my bad attitude towards it, I just can't help but think there's a better way to express what you want than this crazy syntax... Never mind that the examples just seem to SUCK.

    hehe...Must be cranky pants day...:hehe::):D:w00t::cool:;):P:sick:

    All right, Rant Off!

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (3/26/2008)

    Right on...It does take a while for that light bulb to go off... You're essentially creating derived tables with each CROSS APPLY statement....

    It's the same argument I have with pretty much anything XQUERY/XPATH, let alone the SQL Server applications thereof: why the *&^%^$%#$ does it have to be express in such a god-awfully obtuse way? I think we need a few less Math PhD's and a few more plumbers on those committees, IMO...

    Maybe it's just my bad attitude towards it, I just can't help but think there's a better way to express what you want than this crazy syntax... Never mind that the examples just seem to SUCK.

    hehe...Must be cranky pants day...:hehe::):D:w00t::cool:;):P:sick:

    All right, Rant Off!

    You're right, XQuery does have a slightly higher learning curve than many other newer features in SQL Server 🙂 The CROSS APPLY and .nodes() method combo makes more sense when you look at what's happening. Consider this .nodes() method query:

    DECLARE @x xml;

    SET @x = N'<top>

    <middle id="1">

    <low>John</low>

    </middle>

    <middle id="2">

    <low>Larry</low>

    </middle>

    </top>';

    SELECT MyNode.query('.')

    FROM @x.nodes('/top/middle') MyTable(MyNode);

    The .nodes() method in the example just returns all of the <middle> nodes as an XML data type instance, each on a row of its own (the XML data type instances that .nodes() returns have some limitations--basically you can't SELECT them directly, you need to use the XML data type methods on each instance). So the result you get back from the above is like this:

    <middle id="1"><low>John</low></middle>

    <middle id="2"><low>Larry</low></middle>

    Since .nodes() returns a tabular result set where each row is an XML data type instance you can use CROSS APPLY to apply another function or XML method to each row, like this:

    DECLARE @x xml;

    SET @x = N'<top>

    <middle id="1">

    <low>John</low>

    </middle>

    <middle id="2">

    <low>Larry</low>

    </middle>

    </top>';

    SELECT MyMiddleNode.value('@id[1]', 'int') AS Id,

    MyNameNode.value('.[1]', 'varchar(100)') AS Name

    FROM @x.nodes('/top/middle') MyMiddleTable(MyMiddleNode)

    CROSS APPLY MyMiddleNode.nodes('low') MyLowTable(MyNameNode);

    I changed the names to make it a little easier to follow, but the CROSS APPLY with the second .nodes() method just performs an XQuery on each <middle> node, retrieving the <low> node nested inside.

    Notice that since I used the CROSS APPLY, I can grab the @id attribute from MyMiddleNode without using the '..' axis.

    CROSS APPLY is actually pretty handy - try executing a function on every row of a table-valued function result set in one statement on SQL 2000. You'll end up using cursors or while loops 🙁

  • Mike C (3/26/2008)


    Matt Miller (3/26/2008)

    You're right, XQuery does have a slightly higher learning curve than many other newer features in SQL Server 🙂 The CROSS APPLY and .nodes() method combo makes more sense when you look at what's happening. Consider this .nodes() method query:

    hehe.. Now THAT's what I call an understatement. Kind of like gunshot wound being referred to as a "scratch"...

    Between XQuery itself being really nutty to work with, actually finding a decent description of what SQL 2005 did NOT implement is, well - I haven't found a comprehensive one yet. So, you end up going to W3 (since BOL recommends you to), reading what they tell you you should use, go plowing through BOL to see if this rings a bell anywhere.

    2 hours later after having given up on finding something in BOL, you're hammering on test code on your dev box...

    2 hours later - either you've chanced onto the syntax that works, or you've managed to prove to yourself that the particular XQuery method (which W3 states clearly would have done EXACTLY what you wanted) doesn't exist/doesn't work in 2005's implementation.

    CROSS APPLY is actually pretty handy - try executing a function on every row of a table-valued function result set in one statement on SQL 2000. You'll end up using cursors or while loops 🙁

    It is all in all, once you finally get some working examples of your own going.

    I was a bit disappointed at the perf issues CROSS APPLY has, and not just in regards to XQUERY. It acts and feels like the CSQ behavior in 2000, which is sad because from what I can tell, some of those performance penalties have been removed from other kinds of CSQ syntaxes in 2005.

    Still - it does beat the pants off of what would have been needed in SQL 2000.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (3/26/2008)


    Mike C (3/26/2008)


    Matt Miller (3/26/2008)

    You're right, XQuery does have a slightly higher learning curve than many other newer features in SQL Server 🙂 The CROSS APPLY and .nodes() method combo makes more sense when you look at what's happening. Consider this .nodes() method query:

    hehe.. Now THAT's what I call an understatement. Kind of like gunshot wound being referred to as a "scratch"...

    I'm known for understatement 🙂

    Between XQuery itself being really nutty to work with, actually finding a decent description of what SQL 2005 did NOT implement is, well - I haven't found a comprehensive one yet. So, you end up going to W3 (since BOL recommends you to), reading what they tell you you should use, go plowing through BOL to see if this rings a bell anywhere.

    2 hours later after having given up on finding something in BOL, you're hammering on test code on your dev box...

    2 hours later - either you've chanced onto the syntax that works, or you've managed to prove to yourself that the particular XQuery method (which W3 states clearly would have done EXACTLY what you wanted) doesn't exist/doesn't work in 2005's implementation.

    I just did that myself for about two months straight 🙂 I was researching a SQL 2008 XML book, so I had to test every single XML feature against the W3C standards - including XQuery, XDM, XML Schema, all that good stuff. A lot of the SQL 2008 XML functionality is the same as it was in SQL 2005 (with the exception of the new "let" statement, some new XDM data types, XML DML support for the xml data type, and a few other odds and ends).

    CROSS APPLY is actually pretty handy - try executing a function on every row of a table-valued function result set in one statement on SQL 2000. You'll end up using cursors or while loops 🙁

    It is all in all, once you finally get some working examples of your own going.

    I was a bit disappointed at the perf issues CROSS APPLY has, and not just in regards to XQUERY. It acts and feels like the CSQ behavior in 2000, which is sad because from what I can tell, some of those performance penalties have been removed from other kinds of CSQ syntaxes in 2005.

    Yes, performance enhancements would be welcome, but OTOH I can just imagine the complexity involved in doing what CROSS APPLY does. Hopefully they'll begin finding opportunities for optimization with CROSS APPLY in the near future. While we're filling out our wish list, I'd like to see PARTITION BY clause added to the OVER clause for aggregate functions 🙂 That would be huge 🙂

  • A very cool article. One of the best written article in this forum. Cheers!!!!!! :w00t::P:):hehe:

  • Mike C (3/26/2008)


    Yes, performance enhancements would be welcome, but OTOH I can just imagine the complexity involved in doing what CROSS APPLY does. Hopefully they'll begin finding opportunities for optimization with CROSS APPLY in the near future. While we're filling out our wish list, I'd like to see PARTITION BY clause added to the OVER clause for aggregate functions 🙂 That would be huge 🙂

    You mean THIS notation?

    sum(column) OVER (Partition by groupCol)

    Well - it mush be christmas....

    http://msdn2.microsoft.com/en-us/library/ms189461.aspx

    I'm hoping for the "running aggregates" notation to work....

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (3/27/2008)


    Mike C (3/26/2008)


    Yes, performance enhancements would be welcome, but OTOH I can just imagine the complexity involved in doing what CROSS APPLY does. Hopefully they'll begin finding opportunities for optimization with CROSS APPLY in the near future. While we're filling out our wish list, I'd like to see PARTITION BY clause added to the OVER clause for aggregate functions 🙂 That would be huge 🙂

    You mean THIS notation?

    sum(column) OVER (Partition by groupCol)

    Well - it mush be christmas....

    http://msdn2.microsoft.com/en-us/library/ms189461.aspx

    I'm hoping for the "running aggregates" notation to work....

    Oops I meant ORDER BY to give you the running aggregates 🙂

    Happy holidays!

  • Mike C (3/27/2008)


    Matt Miller (3/27/2008)


    Mike C (3/26/2008)


    Yes, performance enhancements would be welcome, but OTOH I can just imagine the complexity involved in doing what CROSS APPLY does. Hopefully they'll begin finding opportunities for optimization with CROSS APPLY in the near future. While we're filling out our wish list, I'd like to see PARTITION BY clause added to the OVER clause for aggregate functions 🙂 That would be huge 🙂

    You mean THIS notation?

    sum(column) OVER (Partition by groupCol)

    Well - it mush be christmas....

    http://msdn2.microsoft.com/en-us/library/ms189461.aspx

    I'm hoping for the "running aggregates" notation to work....

    Oops I meant ORDER BY to give you the running aggregates 🙂

    Happy holidays!

    No...Matt means what he said, PARTITION BY added to the OVER Clause. Not the ORDER BY clause. 🙂

    Christopher Ford

  • EDIT:Sorry for continuing on Matt's off topic response...Last one I promise.

    ---------------------------------------------------------------------------

    EH? Wait a minute...

    Sorry for posting before thinking...sometimes my fingers are faster than my brain.

    Maybe Matt is confused. 🙂

    According to the link:

    SUM(OrderQty) OVER(PARTITION BY SalesOrderID)

    That works fine...

    Why can't I remember what it was about that, that would be really nice to have...Several people have talked about it, especially when it comes to ordered updates...

    What did you mean Matt! you're not making sense today! =)

    Christopher Ford

  • Christopher Ford (3/27/2008)


    According to the link:

    SUM(OrderQty) OVER(PARTITION BY SalesOrderID)

    That works fine...

    Why can't I remember what it was about that, that would be really nice to have...Several people have talked about it, especially when it comes to ordered updates...

    What did you mean Matt! you're not making sense today! =)

    Hehe...Put the caffeine down Chris...:) Wow!

    We're agreeing. PARTITION BY works fine...Mike C meant he wanted the ORDER BY to work in the OVER clause as well (for running totals) as part of his "wish list" of new features to have. He typoed it and said PARTITION BY (so I pointed it out to him, and he corrected himself), and thus this convoluted thread.

    As of now the SUM() OVER(PARTITION BY) is a way to get grouped totals without a GROUP BY clause. The SQL (92?) standard mentioned using this OVER notation to implement "running" aggregates as well, but that hasn't materialized yet (though there are active rumors that this might make it into 2008).

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 15 posts - 16 through 30 (of 49 total)

You must be logged in to reply to this topic. Login to reply