How to remove infinite loops in hierarchical data without using cursor?

  • lxz20 (6/4/2009)


    Yep, NodeIDs are repeated here as examples to show infinite loops in 0, 1, or 2 steps. These loops are actually easier to catch and removed. There are other instances where a node is NOT looped back to itself until like 7 or 10 steps later. I would like a non-cursor way to list / remove all those n step loops, if possible, for performance reason. Currently I have to use a cursor, going through row by row following the following pseudo-code. The performance is NOT good because I have to use 2 loops:

    Put all data into a cursor

    Begin cursor loop

    Fetch a row into a temp table

    Inner While Loop (while there is a new parent)

    Follow the chain to get a new parent row

    If parentNodeID = nodeID in temp table

    Infinite Loop -- do something here

    Break Inner While Loop

    Else

    Insert parent row into the same temp table

    End Inner While Loop

    End corsor loop

    Try this

    WITH UniqueNodeItem(NodeID, ParentNodeID) AS

    (

    SELECT NodeID,MAX(ParentNodeID)

    FROM NodeItem

    GROUP BY NodeID

    HAVING COUNT(*)=1),

    HierarchyTree (NodeID, ParentNodeID, Step) AS

    (

    SELECT NodeID, ParentNodeID, 1

    FROM UniqueNodeItem

    WHERE ParentNodeID NOT IN (SELECT NodeID FROM UniqueNodeItem) OR ParentNodeID IS NULL

    UNION ALL

    SELECT r.NodeID, r.ParentNodeID, t.Step + 1

    FROM UniqueNodeItem r

    inner join HierarchyTree t on r.ParentNodeID = t.NodeID

    )

    SELECT NodeID

    FROM NodeItem

    WHERE NodeID NOT IN (SELECT NodeID FROM HierarchyTree)

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Mark,

    Excellent solution. It works!

  • One thing I always recommend on this kind of hierarchy, is have a "TopNodeID" type column. All levels in the same hierarchy have the same entry in that column. Speeds them up immensely. Makes it easy to prevent/fix these type of corruption issues too, since you can join on the TopNodeID column and find the loops right there. But since you don't have that, it doesn't help in this situation.

    What we did in our one and only hierarchichal-type of table (a stock exchange sector/index hierarchy with lots of roots per stock exchange) is that we have a separate table that identifies the root nodes. On top of that the INSERT/UPDATE Triggers prevent this kind of situation ever occurring. Yeah, triggers are a pain in the butt, they are however bloody helpful in this case of problem.

    Won't help with your existing data mess though...

    --------------------------------------------------------------------------
    A little knowledge is a dangerous thing (Alexander Pope)
    In order for us to help you as efficiently as possible, please read this before posting (courtesy of Jeff Moden)[/url]

  • GSquared (6/4/2009)


    You could potentially avoid a cursor in this. But it'll take more work and be less efficient.

    Cursors do have their uses. Limited, yes, but uses nonetheless.

    Especially in a case like this. A once-off clean-up where you actually really have to think procedurally in order to wrap your head around the solution for solving a problem isn't something that needs to concern itself with performance issues. Unless of course the hierarchies regularly get stuffed up. But then the issue is elsewhere, i.e. you must find the culprit and fix that one. Not meaning that you shouldn't put safeguards in place to protect your data.

    --------------------------------------------------------------------------
    A little knowledge is a dangerous thing (Alexander Pope)
    In order for us to help you as efficiently as possible, please read this before posting (courtesy of Jeff Moden)[/url]

Viewing 4 posts - 16 through 18 (of 18 total)

You must be logged in to reply to this topic. Login to reply