Perceived bug in HierarchyId SQL Server 2008

  • I am using the HierarchyId datatype in order to build somewhat complex hierarchies. I have both a depth first and breadth first index on the table with the clustered index on just the HierarchyId field. However, I'm seeing behavior that seems to be incorrect.

    Based on my understanding, a range query against the dataType in the form of WHERE path.IsDescendantOf(parent.path) = 1 should be able to do a seek against the datatype pretty easily with the clustered index by essentially performing the query in the manner WHERE path >= parent.path AND path <= parent.path.MaxDescendant ( or path < parent.path.NextSibling). This should be possible the same way any search can use an index with a BETWEEN or a "less than/but greater than" type query.

    This actually does work as expected when I pass in a variable of type HierarchyId as the parent. The query plan shows some analysis of the parent and a clustered index seek to find all of the children. However, if I use a self join in the form of

    SELECT child.path

    FROM hierarchy parent

    INNER JOIN hierarchy child on chld.IsDescendantOf(parent.path) = 1

    WHERE parent.id = 1 -- (a non-clustered index on integer id field)

    ... a full table scan is performed against the child table (100,000s of rows).

    A work around I have discovered is to execute a query in the form of:

    SELECT child.path

    FROM hierarchy parent

    INNER JOIN hierarchy child ON child.path >= parent.path

    AND child.path < parent.path.GetAncestor(1).GetDescendant(parent.PATH, NULL)

    ... which performs an index seek exactly the way I would expect it to. The syntax leaves a lot to be desired though. I am not even sure if it is 100% guaranteed to be accurate because I don't know that the GetDescendant() method with a NULL second parameter will guarantee me the next possible sibling for parent which is necessary in order for this to be accurate (else I might return children of a different sibling).

    It seems to me though, that the IsDescendantOf(parent.path) = 1 syntax is the Microsoft endorsed way of performing this query but it performs horribly in the above scenario. Also, I did notice that there is a private method DescendentLimit() on the HiearchyId column, which if I had access to would make my syntax a lot more readable. But since it is private I would hope I could get by without it.

    Has anyone else run into this problem with the type? I haven't seen any example of anybody trying to do the self-join (which is necessary for my requirements). Is there another work around? Does this seem like a bug that Microsoft should look at or am I just doing something horribly wrong?

    Any help would be appreciated.

    - tim

    Sample code:

    CREATE TABLE hierarchyTest(

    [id] [int] IDENTITY(1,1) NOT NULL,

    [idParent] [int] NULL,

    [path] [hierarchyid] NOT NULL

    CONSTRAINT [hierarchyTest_pk] PRIMARY KEY CLUSTERED (

    [path]

    )

    )

    GO

    CREATE UNIQUE NONCLUSTERED INDEX [hierarchyTest_id] ON hierarchyTest

    (

    id ASC

    )

    GO

    SET IDENTITY_INSERT hierarchyTest ON

    GO

    INSERT INTO hierarchyTest (id, idParent, path)

    VALUES(1,0, HierarchyId::Parse('/1/')),

    (44,1, HierarchyId::Parse('/1/44/')),

    (82,1, HierarchyId::Parse('/1/82/')),

    (91,44, HierarchyId::Parse('/1/44/91/')),

    (92,44, HierarchyId::Parse('/1/44/92/')),

    (93,44, HierarchyId::Parse('/1/44/93/')),

    (94,92, HierarchyId::Parse('/1/44/92/94/')),

    (95,91, HierarchyId::Parse('/1/44/91/95/')),

    (96,91, HierarchyId::Parse('/1/44/91/96/')),

    (98,93, HierarchyId::Parse('/1/44/93/98/')),

    (142,44, HierarchyId::Parse('/1/44/142/')),

    (143,74076, HierarchyId::Parse('/1/44/142/74076/143/')),

    (144,44, HierarchyId::Parse('/1/44/144/')),

    (145,144, HierarchyId::Parse('/1/44/144/145/')),

    (147,44, HierarchyId::Parse('/1/44/147/')),

    (148,44, HierarchyId::Parse('/1/44/148/')),

    (149,147, HierarchyId::Parse('/1/44/147/149/')),

    (151,148, HierarchyId::Parse('/1/44/148/151/')),

    (152,44, HierarchyId::Parse('/1/44/152/')),

    (153,152, HierarchyId::Parse('/1/44/152/153/')),

    (1140,152, HierarchyId::Parse('/1/44/152/1140/')),

    (3825,152, HierarchyId::Parse('/1/44/152/3825/')),

    (4950,44, HierarchyId::Parse('/1/44/4950/')),

    (4951,4950, HierarchyId::Parse('/1/44/4950/4951/')),

    (4952,4950, HierarchyId::Parse('/1/44/4950/4952/')),

    (5599,91, HierarchyId::Parse('/1/44/91/5599/')),

    (6257,44, HierarchyId::Parse('/1/44/6257/')),

    (6258,44, HierarchyId::Parse('/1/44/6258/')),

    (6259,6258, HierarchyId::Parse('/1/44/6258/6259/')),

    (6260,6257, HierarchyId::Parse('/1/44/6257/6260/')),

    (7595,44, HierarchyId::Parse('/1/44/7595/')),

    (7596,44, HierarchyId::Parse('/1/44/7596/')),

    (7597,7596, HierarchyId::Parse('/1/44/7596/7597/')),

    (7598,7595, HierarchyId::Parse('/1/44/7595/7598/')),

    (10454,44, HierarchyId::Parse('/1/44/10454/')),

    (13624,44, HierarchyId::Parse('/1/44/13624/')),

    (13625,13624, HierarchyId::Parse('/1/44/13624/13625/')),

    (13626,13625, HierarchyId::Parse('/1/44/13624/13625/13626/')),

    (14930,44, HierarchyId::Parse('/1/44/14930/')),

    (14931,14930, HierarchyId::Parse('/1/44/14930/14931/')),

    (18934,4950, HierarchyId::Parse('/1/44/4950/18934/')),

    (26223,44, HierarchyId::Parse('/1/44/26223/')),

    (26224,26223, HierarchyId::Parse('/1/44/26223/26224/')),

    (26569,44, HierarchyId::Parse('/1/44/26569/')),

    (26570,26569, HierarchyId::Parse('/1/44/26569/26570/')),

    (29426,44, HierarchyId::Parse('/1/44/29426/')),

    (29427,29426, HierarchyId::Parse('/1/44/29426/29427/')),

    (36499,44, HierarchyId::Parse('/1/44/36499/')),

    (36500,36499, HierarchyId::Parse('/1/44/36499/36500/')),

    (37085,52787, HierarchyId::Parse('/1/44/144/52787/37085/')),

    (40488,44, HierarchyId::Parse('/1/44/40488/')),

    (40489,40488, HierarchyId::Parse('/1/44/40488/40489/')),

    (40959,29426, HierarchyId::Parse('/1/44/29426/40959/')),

    (40961,40959, HierarchyId::Parse('/1/44/29426/40959/40961/')),

    (41230,52787, HierarchyId::Parse('/1/44/144/52787/41230/')),

    (42312,44, HierarchyId::Parse('/1/44/42312/')),

    (42426,42312, HierarchyId::Parse('/1/44/42312/42426/')),

    (43264,52787, HierarchyId::Parse('/1/44/144/52787/43264/')),

    (43266,52787, HierarchyId::Parse('/1/44/144/52787/43266/')),

    (43290,52787, HierarchyId::Parse('/1/44/144/52787/43290/')),

    (43309,52787, HierarchyId::Parse('/1/44/144/52787/43309/')),

    (43996,52787, HierarchyId::Parse('/1/44/144/52787/43996/')),

    (44603,52787, HierarchyId::Parse('/1/44/144/52787/44603/')),

    (45240,52787, HierarchyId::Parse('/1/44/144/52787/45240/')),

    (45334,52787, HierarchyId::Parse('/1/44/144/52787/45334/')),

    (45364,52787, HierarchyId::Parse('/1/44/144/52787/45364/')),

    (45695,52787, HierarchyId::Parse('/1/44/144/52787/45695/')),

    (46209,52787, HierarchyId::Parse('/1/44/144/52787/46209/')),

    (46230,52787, HierarchyId::Parse('/1/44/144/52787/46230/')),

    (46535,52787, HierarchyId::Parse('/1/44/144/52787/46535/')),

    (46566,52787, HierarchyId::Parse('/1/44/144/52787/46566/')),

    (46879,52787, HierarchyId::Parse('/1/44/144/52787/46879/')),

    (47043,52787, HierarchyId::Parse('/1/44/144/52787/47043/')),

    (47537,52787, HierarchyId::Parse('/1/44/144/52787/47537/')),

    (47671,52787, HierarchyId::Parse('/1/44/144/52787/47671/')),

    (47904,52787, HierarchyId::Parse('/1/44/144/52787/47904/')),

    (48432,52787, HierarchyId::Parse('/1/44/144/52787/48432/')),

    (49171,52787, HierarchyId::Parse('/1/44/144/52787/49171/')),

    (49231,52787, HierarchyId::Parse('/1/44/144/52787/49231/')),

    (49359,52787, HierarchyId::Parse('/1/44/144/52787/49359/')),

    (49597,13624, HierarchyId::Parse('/1/44/13624/49597/')),

    (49598,49597, HierarchyId::Parse('/1/44/13624/49597/49598/')),

    (49599,13624, HierarchyId::Parse('/1/44/13624/49599/')),

    (52241,52787, HierarchyId::Parse('/1/44/144/52787/52241/')),

    (52567,52787, HierarchyId::Parse('/1/44/144/52787/52567/')),

    (52787,144, HierarchyId::Parse('/1/44/144/52787/')),

    (52788,53136, HierarchyId::Parse('/1/44/144/53136/52788/')),

    (53136,144, HierarchyId::Parse('/1/44/144/53136/')),

    (55954,52787, HierarchyId::Parse('/1/44/144/52787/55954/')),

    (56705,53136, HierarchyId::Parse('/1/44/144/53136/56705/')),

    (56706,53136, HierarchyId::Parse('/1/44/144/53136/56706/')),

    (56707,53136, HierarchyId::Parse('/1/44/144/53136/56707/')),

    (56709,53136, HierarchyId::Parse('/1/44/144/53136/56709/')),

    (56710,53136, HierarchyId::Parse('/1/44/144/53136/56710/')),

    (57755,53136, HierarchyId::Parse('/1/44/144/53136/57755/')),

    (59642,53136, HierarchyId::Parse('/1/44/144/53136/59642/'))--,

    -- Many more removed for brevity

    GO

    SET IDENTITY_INSERT hierarchyTest OFF

    GO

    SELECT child.PATH

    FROM hierarchyTest parent

    INNER JOIN hierarchyTest child ON child.PATH.IsDescendantOf(parent.PATH) = 1

    WHERE parent.id = 59642 -- returns a single row but uses table scan

    Tim Januario

  • Seriously? No replies? I even have sample code...

    -tim

    Tim Januario

  • Hmmm....when I run your sample DDL into my SQL 2008 R2 test DB and run your select query I see index seeks (see attached sqlplan):

    SELECT child.PATH

    FROM hierarchyTest parent

    INNER JOIN hierarchyTest child ON child.PATH.IsDescendantOf(parent.PATH) = 1

    WHERE parent.id = 59642 -- returns a single row but uses table scan

    Am I missing something?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • You are getting a very different query plan than I. I noticed that your build is R2 (10.50.1765.0), we are running 10.0.4000.0 which could potentially account for the difference. I don't think it is specific to my environment though because I get the same results on every box that I try this against.

    I am attaching what I get from this.

    Tim Januario

  • @opc.three

    Actually, you helped me find the answer. The big difference, right at the top, was the use of parameters in my execution plan (which my query doesn't use). Turns out that this behavior is caused by having

    ALTER DATABASE [db] PARAMETERIZATION FORCED

    option set. I turned it off locally, and had received the better plan. Thanks for replying you've kept me from another round of head-banging.

    Tim Januario

  • I see your concern now...and I was able to recreate the plan on my instance after looking at your plan using this code:

    DECLARE @0 BIT = 1,

    @1 INT = 59642 ;

    SELECT child.PATH

    FROM hierarchyTest parent

    INNER JOIN hierarchyTest child ON child.PATH.IsDescendantOf(parent.PATH) = @0

    WHERE parent.id = @1

    GO

    SELECT child.PATH

    FROM hierarchyTest parent

    INNER JOIN hierarchyTest child ON child.PATH.IsDescendantOf(parent.PATH) = 1

    WHERE parent.id = 59642 -- returns a single row but uses table scan

    GO

    See the attached plan, notice that when using variables I get the plan with the scan you got but when using constants I get the plan with the seek.

    This one must point to an Internals difference between using variables and using constants. I have seen differences when using stored procedures versus using ad hoc SQL but this one is out of my knowledge domain so I'll be looking forward to seeing if anyone else can chime in with additional info.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply