Indexing Joined Columns

  • In the following query, does it make sense to create an Index covering all three of the joined columns (PersonNum, ComplianceDate,ShiftIntervalID), or a separate non-clustered index for each of these columns? The tables have a many to many relationship and no Indexes on them currently.

    SELECT

    s

    .Personnum,s.ComplianceDate,'5' AS WorkedScheduled, '0' AS WorkNotScheduled, '0' AS ScheduledNotWorked

    FROM #tmpSchedules s

    INNER JOIN #tmpTimePunch t

    ON (s.Personnum=t.personnum) AND (s.ComplianceDate=t.ComplianceDate) AND (s.ShiftIntervalID=t.ShiftIntervalID)

  • I don't think creating indexes on the temp tables will buy you anything. The whole idea of indexes is to shorten your retrieval time by spending some extra time organizing your data during inserts. I think the server will create some kind of index in memory in order to link your data for the join.

    The only reason for an index on a temp table might be if you use this table multiple times within the same SP. But I am not sure if this is going to help performance as well.

     

    ---------------------------------------------
    [font="Verdana"]Nothing is impossible.
    It is just a matter of time and money.[/font]

  • Well, I do use it as follows:

    CREATE

    TABLE #ScheduleVariance (PersonNum varchar(15),ComplianceDate DATETIME,WorkScheduled int,WorkNotScheduled int,ScheduledNotWorked int)

    --INSERT Recs with a match on PersonNum, ComplianceDate, and Interval as WorkScheduled

    INSERT INTO #ScheduleVariance (PersonNum,ComplianceDate,WorkScheduled, WorkNotScheduled, ScheduledNotWorked)

    (

    SELECT

    s

    .Personnum,s.ComplianceDate,'5' AS WorkedScheduled, '0' AS WorkNotScheduled, '0' AS ScheduledNotWorked

    FROM #tmpSchedules s

    INNER JOIN #tmpTimePunch t

    ON (s.Personnum=t.personnum) AND (s.ComplianceDate=t.ComplianceDate) AND (s.ShiftIntervalID=t.ShiftIntervalID)

    )

    --INSERT Recs from #tmpSchedules with no match on PersonNum, ComplianceDate, and ShiftIntervalID from #tmpTimePunch as ScheduledNotWorked

    INSERT INTO #ScheduleVariance (PersonNum,ComplianceDate,WorkScheduled, WorkNotScheduled, ScheduledNotWorked)

    (

    SELECT

    s

    .PersonNum,s.ComplianceDate,'0' AS WorkScheduled, '0' AS WorkNotScheduled, '5' AS ScheduledNotWorked

    FROM #tmpSchedules s

    LEFT OUTER JOIN #tmpTimePunch t

    ON (s.Personnum=t.personnum) AND (s.ComplianceDate=t.ComplianceDate) AND (s.ShiftIntervalID=t.ShiftIntervalID)

    WHERE t.PersonNum IS NULL

    )

    --INSERT Recs from #tmpTimePunch with no match on PersonNum, ComplianceDate, and Interval in #tmpSchedules as WorkNotScheduled

    INSERT INTO #ScheduleVariance (PersonNum,ComplianceDate,WorkScheduled, WorkNotScheduled, ScheduledNotWorked)

    (

    SELECT

    t

    .PersonNum,t.ComplianceDate,'0' AS WorkScheduled, '5' AS WorkNotScheduled, '0' AS ScheduledNotWorked

    FROM #tmpTimePunch t

    LEFT OUTER JOIN #tmpSchedules s

    ON (t.Personnum=s.personnum) AND (t.ComplianceDate=s.ComplianceDate) AND (t.ShiftIntervalID=s.ShiftIntervalID)

    WHERE s.PersonNum IS NULL

    )

     

    So, you're saying that Indexing Temp Tables isn't a good idea in general?

  • I would run some tests because only this will show you how one solution stacks up against another but I would be suprised if creating the indexes is going to change the performance much.

     

    ---------------------------------------------
    [font="Verdana"]Nothing is impossible.
    It is just a matter of time and money.[/font]

  • As per the recent experience i had with one of projects, After loading to production one procedure started taking more than 9 minutes to return result set. when i checked the code, developer was using table variable in a proc and he was joining this table with a permanent table which was having 5 million + rows. then i have replaced table variable with temp table and created an index on a column in temp table. After modification when i checked the same query was taking just 6 seconds to return the result set.

    Amazing difference in performance!!!!

     

  • Indexes in temp tables *DO* benefit performance *IF* the amount of data in them is large, otherwise safe your effor.


    * Noel

  • Part of the problem with table variables is they do not nor can they be made to use statistics... table variables should be kept relatively short... changing to a temp table was the best thing you could have done... adding indexes to that large a table was the second best thing. 

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • When you say the tables have "no indexes on them", I'm hoping that you at least have a primary key   If not, you really should add one.

    So far as whether to add a 3 single column indexes or a single 3 column index goes, I'd go along with the recommendation provided by the Index Tuning Wizard which will be a single 3 column index.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Well, I never used temp tables that would span more then few datapages so I kind of assumed the temp tables we are talking about are small. I guess I can not think of a reason why would anyone want to create a huge temp table in the first place. Wouldn't that indicate some inherent problem with the database design?

    ---------------------------------------------
    [font="Verdana"]Nothing is impossible.
    It is just a matter of time and money.[/font]

  • Well, not really... if you have a hundred million row table and you isolate a million of those rows in a temp table for some SQL prestidigitation, for example...

    Or you use a temp table for a staging table...

    There's a lot of reasons why you might want to make a "large" temp table...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • If the Inserted Data is very huge and If u are going to use the three columns for joining purpose then it will be good idea to go for composite nocluster index.

    Now if the Inserted data is small in size , the indexing on temp table with any combination will not give performance boost to you.

     

  • Pop,

    How many rows are we talking about?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Thanks Jeff! You're the man!

    Well, these temp tables can get to be about 100 million+ records, so maybe I'll just create physical tables in the db, populate these staging tables, index them, and then drop them from the db after i'm finished with them rather than using temp tables. Good idea?

  • Wow,

    From the field names it looks like some kind of work schedule you dealing with. Are you working for the Chinese government of something?

    If you getting 100 million+ records into the temp tables then how big are your source tables?

    ---------------------------------------------
    [font="Verdana"]Nothing is impossible.
    It is just a matter of time and money.[/font]

  • I'm comparing people's scheduling and time punch data, which often do not match. I need to define the amount of time they worked when scheduled, worked when not scheduled, and were scheduled but did not work. The most accurate way to do this seems to be to break down each person's data by 1440 1 minute intervals per day. As you can imagine, this can lead to some data explosion, thus the enormous amount of records involved.

    Given that, I wonder if it's better to at least temporarily create them physically in the db rather than temp db, and what might be the best column ordering in the index given the way they're joined?

    SELECT

    s.Personnum,s.ComplianceDate,'5' AS WorkedScheduled, '0' AS WorkNotScheduled, '0' AS ScheduledNotWorked

    FROM #tmpSchedules s

    INNER JOIN #tmpTimePunch t

    ON (s.Personnum=t.personnum) AND (s.ComplianceDate=t.ComplianceDate) AND (s.ShiftIntervalID=t.ShiftIntervalID)

Viewing 15 posts - 1 through 15 (of 29 total)

You must be logged in to reply to this topic. Login to reply