Need help on query performance

  • ChrisM@Work (5/25/2016)


    Only one clustered index per table, but the index can contain more than one column.

    "Different result" is too vague, please explain.

    Thanks, added.

    Incorrect result is returned, it is not the one I wanted to get.

  • jc85 (5/25/2016)


    ChrisM@Work (5/25/2016)


    Only one clustered index per table, but the index can contain more than one column.

    "Different result" is too vague, please explain.

    Thanks, added.

    Incorrect result is returned, it is not the one I wanted to get.

    Can't you describe how it's different? Row count? Values?

    How long does it take to run with the modified clustered index?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • jc85 (5/25/2016)


    Thanks ChrisM@Work.

    The query took quite some time to complete.

    Additional info provided in case you need it.

    ann_events_Tech_Details

    [start_time] - clustered index

    [id] - Unique, non-clustered index (PK)

    ann_ReportItem

    [source] - non-clustered index

    [call_flow_name] - non-clustered index

    [report_id] - non-clustered index

    [id] - Unique, non-clustered index (PK)

    ann_ReportItem

    [source], [call_flow_name], [report_id] - clustered index

    [id] - Unique, non-clustered index (PK)

    Optimal order of columns in the clustered index may be different, depending on selectivity.

    The most selective column (smallest number of rows with identical values) must go first.

    _____________
    Code for TallyGenerator

  • ChrisM@Work (5/25/2016)


    jc85 (5/25/2016)


    ChrisM@Work (5/25/2016)


    Only one clustered index per table, but the index can contain more than one column.

    "Different result" is too vague, please explain.

    Thanks, added.

    Incorrect result is returned, it is not the one I wanted to get.

    Can't you describe how it's different? Row count? Values?

    How long does it take to run with the modified clustered index?

    The row count reduced a lot and [report_item] values are different.

    It is much faster though.

    New query output

    2016-01-01----PV-a.report_item--------1----1------0

    2016-01-01----CYGAL-a.report_item----502--2529--440

    2016-01-01----PFCC-a.report_item-----6----13-----1

    Old query output

    2016-01-01 00:00:00.000----CYGAL-CKC End Call--------0-----2------0

    2016-01-01 00:00:00.000----CYGAL-Noisy--------------18-----83-----30

    2016-01-01 00:00:00.000----CYGAL-CKC Transfer CSR---17----124----17

    2016-01-01 00:00:00.000----CYGAL-Tel - Speak to CSR--348---1792---230

  • Sergiy (5/25/2016)


    jc85 (5/25/2016)


    Thanks ChrisM@Work.

    The query took quite some time to complete.

    Additional info provided in case you need it.

    ann_events_Tech_Details

    [start_time] - clustered index

    [id] - Unique, non-clustered index (PK)

    ann_ReportItem

    [source] - non-clustered index

    [call_flow_name] - non-clustered index

    [report_id] - non-clustered index

    [id] - Unique, non-clustered index (PK)

    ann_ReportItem

    [source], [call_flow_name], [report_id] - clustered index

    [id] - Unique, non-clustered index (PK)

    Optimal order of columns in the clustered index may be different, depending on selectivity.

    The most selective column (smallest number of rows with identical values) must go first.

    Not necessarily. Filtering on [Source] followed by a date range (and having the data conveniently sorted in this order too, by the clustered index) falls within scope of this article.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • jc85 (5/25/2016)


    ChrisM@Work (5/25/2016)


    jc85 (5/25/2016)


    ChrisM@Work (5/25/2016)


    Only one clustered index per table, but the index can contain more than one column.

    "Different result" is too vague, please explain.

    Thanks, added.

    Incorrect result is returned, it is not the one I wanted to get.

    Can't you describe how it's different? Row count? Values?

    How long does it take to run with the modified clustered index?

    The row count reduced a lot and [report_item] values are different.

    It is much faster though.

    New query output

    2016-01-01----PV-a.report_item--------1----1------0

    2016-01-01----CYGAL-a.report_item----502--2529--440

    2016-01-01----PFCC-a.report_item-----6----13-----1

    Old query output

    2016-01-01 00:00:00.000----CYGAL-CKC End Call--------0-----2------0

    2016-01-01 00:00:00.000----CYGAL-Noisy--------------18-----83-----30

    2016-01-01 00:00:00.000----CYGAL-CKC Transfer CSR---17----124----17

    2016-01-01 00:00:00.000----CYGAL-Tel - Speak to CSR--348---1792---230

    Sorry my bad, missed the correlation:

    SELECT

    [date] = CAST(b.start_time AS DATE),

    [report_item] = (b.[source] + '-' + 'a.report_item'),

    [EN] = SUM(CASE WHEN b.lang = 'EN' THEN 1 ELSE 0 END),

    [BM] = SUM(CASE WHEN b.lang = 'BM' THEN 1 ELSE 0 END),

    [MD] = SUM(CASE WHEN b.lang = 'MD' THEN 1 ELSE 0 END)

    FROM ann_events_Tech_Details b

    WHERE CAST(b.start_time AS DATE) >= @StartDate

    AND CAST(b.start_time AS DATE) <= @EndDate

    AND 1 IN (b.sel_tel_nodialtone, b.sel_tel_noisy, b.sel_tech_ckc_disconnect,

    b.sel_tech_ckc_transfer, b.sel_tel_csr, b.sel_inv_del, b.sel_inv_unifi)

    AND EXISTS (

    SELECT 1 FROM ann_ReportItem a

    WHERE a.[Source] = b.[Source] -- correlation

    AND a.report_id = 8

    AND a.call_flow_name = @call_flow_name)

    GROUP BY

    CAST(b.start_time AS DATE),

    b.[source]

    ORDER BY CAST(b.start_time AS DATE)

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (5/25/2016)

    Sorry my bad, missed the correlation:

    SELECT

    [date] = CAST(b.start_time AS DATE),

    [report_item] = (b.[source] + '-' + 'a.report_item'),

    [EN] = SUM(CASE WHEN b.lang = 'EN' THEN 1 ELSE 0 END),

    [BM] = SUM(CASE WHEN b.lang = 'BM' THEN 1 ELSE 0 END),

    [MD] = SUM(CASE WHEN b.lang = 'MD' THEN 1 ELSE 0 END)

    FROM ann_events_Tech_Details b

    WHERE CAST(b.start_time AS DATE) >= @StartDate

    AND CAST(b.start_time AS DATE) <= @EndDate

    AND 1 IN (b.sel_tel_nodialtone, b.sel_tel_noisy, b.sel_tech_ckc_disconnect,

    b.sel_tech_ckc_transfer, b.sel_tel_csr, b.sel_inv_del, b.sel_inv_unifi)

    AND EXISTS (

    SELECT 1 FROM ann_ReportItem a

    WHERE a.[Source] = b.[Source] -- correlation

    AND a.report_id = 8

    AND a.call_flow_name = @call_flow_name)

    GROUP BY

    CAST(b.start_time AS DATE),

    b.[source]

    ORDER BY CAST(b.start_time AS DATE)

    It returned the same output as previous query.

    2016-01-01----PV-a.report_item--------1----1------0

    2016-01-01 ----CYGAL-a.report_item----502--2529-- 440

    2016-01-01 ----PFCC-a.report_item -----6----13-----1

  • jc85 (5/25/2016)


    ChrisM@Work (5/25/2016)

    Sorry my bad, missed the correlation:

    SELECT

    [date] = CAST(b.start_time AS DATE),

    [report_item] = (b.[source] + '-' + 'a.report_item'),

    [EN] = SUM(CASE WHEN b.lang = 'EN' THEN 1 ELSE 0 END),

    [BM] = SUM(CASE WHEN b.lang = 'BM' THEN 1 ELSE 0 END),

    [MD] = SUM(CASE WHEN b.lang = 'MD' THEN 1 ELSE 0 END)

    FROM ann_events_Tech_Details b

    WHERE CAST(b.start_time AS DATE) >= @StartDate

    AND CAST(b.start_time AS DATE) <= @EndDate

    AND 1 IN (b.sel_tel_nodialtone, b.sel_tel_noisy, b.sel_tech_ckc_disconnect,

    b.sel_tech_ckc_transfer, b.sel_tel_csr, b.sel_inv_del, b.sel_inv_unifi)

    AND EXISTS (

    SELECT 1 FROM ann_ReportItem a

    WHERE a.[Source] = b.[Source] -- correlation

    AND a.report_id = 8

    AND a.call_flow_name = @call_flow_name)

    GROUP BY

    CAST(b.start_time AS DATE),

    b.[source]

    ORDER BY CAST(b.start_time AS DATE)

    It returned the same output as previous query.

    2016-01-01----PV-a.report_item--------1----1------0

    2016-01-01 ----CYGAL-a.report_item----502--2529-- 440

    2016-01-01 ----PFCC-a.report_item -----6----13-----1

    Can you post a plan for this query please?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (5/25/2016)


    jc85 (5/25/2016)


    ChrisM@Work (5/25/2016)

    Sorry my bad, missed the correlation:

    SELECT

    [date] = CAST(b.start_time AS DATE),

    [report_item] = (b.[source] + '-' + 'a.report_item'),

    [EN] = SUM(CASE WHEN b.lang = 'EN' THEN 1 ELSE 0 END),

    [BM] = SUM(CASE WHEN b.lang = 'BM' THEN 1 ELSE 0 END),

    [MD] = SUM(CASE WHEN b.lang = 'MD' THEN 1 ELSE 0 END)

    FROM ann_events_Tech_Details b

    WHERE CAST(b.start_time AS DATE) >= @StartDate

    AND CAST(b.start_time AS DATE) <= @EndDate

    AND 1 IN (b.sel_tel_nodialtone, b.sel_tel_noisy, b.sel_tech_ckc_disconnect,

    b.sel_tech_ckc_transfer, b.sel_tel_csr, b.sel_inv_del, b.sel_inv_unifi)

    AND EXISTS (

    SELECT 1 FROM ann_ReportItem a

    WHERE a.[Source] = b.[Source] -- correlation

    AND a.report_id = 8

    AND a.call_flow_name = @call_flow_name)

    GROUP BY

    CAST(b.start_time AS DATE),

    b.[source]

    ORDER BY CAST(b.start_time AS DATE)

    It returned the same output as previous query.

    2016-01-01----PV-a.report_item--------1----1------0

    2016-01-01 ----CYGAL-a.report_item----502--2529-- 440

    2016-01-01 ----PFCC-a.report_item -----6----13-----1

    Can you post a plan for this query please?

    Sure, thanks for your quick response.

  • jc85 (5/25/2016)


    How should i post the table with data on this forum?

    How many rows do you need for ann_events_Tech_Details table?

    ChrisM@Work (5/25/2016)


    Ideally you would do this by posting a script to create a table then a series of inserts to populate it with data. You don't need to use the same table name, a temp table called say #ann_ReportItem will be fine. Jumping the gun a bit, you might want to do the same with ann_events_Tech_Details.

    The simple answer to "how many rows" is "sufficient to determine correctness of the query", so I'll leave that up to you, but you should include sufficient to give you some results in each of the columns in a representative set of output rows, possibly a subset of three days' worth.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (5/25/2016)


    jc85 (5/25/2016)


    How should i post the table with data on this forum?

    How many rows do you need for ann_events_Tech_Details table?

    ChrisM@Work (5/25/2016)


    Ideally you would do this by posting a script to create a table then a series of inserts to populate it with data. You don't need to use the same table name, a temp table called say #ann_ReportItem will be fine. Jumping the gun a bit, you might want to do the same with ann_events_Tech_Details.

    The simple answer to "how many rows" is "sufficient to determine correctness of the query", so I'll leave that up to you, but you should include sufficient to give you some results in each of the columns in a representative set of output rows, possibly a subset of three days' worth.

    Provided the data in attached excel sheet, thanks!

  • How does this query perform?

    SELECT dateadd(dd,0, datediff(dd,0,b.start_time)) as [date], b.source, b.lang, SUM(b.sel_test) Test, SUM(b.sel_test2) Test2

    FROM TechDetails b

    WHERE b.start_time >= @StartDate

    AND b.start_time < dateadd(dd,0, datediff(dd,0,@EndDate)+1)

    GROUP BY dateadd (dd, 0, datediff(dd, 0, b.start_time)), b.source, b.lang

    _____________
    Code for TallyGenerator

  • If that one goes not so bad then incorporate it into the bigger one:

    SELECT DT.date, (a.source + '-' + a.report_item) as report_item,

    SUM(CASE WHEN b.lang = 'EN' THEN CASE a.report_item WHEN 'Test' THEN Test WHEN 'Test2' THEN Test2 ELSE 0 END ELSE 0 END) EN,

    SUM(CASE WHEN b.lang = 'BM' THEN CASE a.report_item WHEN 'Test' THEN Test WHEN 'Test2' THEN Test2 ELSE 0 END ELSE 0 END) BM

    FROM ReportItem a

    INNER JOIN (

    SELECT dateadd(dd,0, datediff(dd,0,b.start_time)) as [date], b.source, b.lang, SUM(b.sel_test) Test, SUM(b.sel_test2) Test2

    FROM TechDetails b

    WHERE b.start_time >= @StartDate

    AND b.start_time < dateadd(dd,0, datediff(dd,0,@EndDate)+1)

    AND b.lang IN ('EN', 'BM')

    GROUP BY dateadd (dd, 0, datediff(dd, 0, b.start_time)), b.source, b.lang

    ) DT ON a.source = b.source

    WHERE a.report_id =8 AND a.flow_name = @flow_name

    AND a.report_item IN ('Test', 'Test2')

    And from looking at the overall design I can conclude the clustered index on ReportItem must be

    (report_id, flow_name, report_item) INCLUDE (source)

    _____________
    Code for TallyGenerator

  • Sergiy (5/25/2016)


    If that one goes not so bad then incorporate it into the bigger one:

    SELECT dateadd(dd,0, datediff(dd,0,b.start_time)) as [date], b.source, b.lang, SUM(b.sel_test) Test, SUM(b.sel_test2) Test2

    FROM TechDetails b

    WHERE b.start_time >= @StartDate

    AND b.start_time < dateadd(dd,0, datediff(dd,0,@EndDate)+1)

    GROUP BY dateadd (dd, 0, datediff(dd, 0, b.start_time)), b.source, b.lang

    SELECT DT.date, (a.source + '-' + a.report_item) as report_item,

    SUM(CASE WHEN b.lang = 'EN' THEN CASE a.report_item WHEN 'Test' THEN Test WHEN 'Test2' THEN Test2 ELSE 0 END ELSE 0 END) EN,

    SUM(CASE WHEN b.lang = 'BM' THEN CASE a.report_item WHEN 'Test' THEN Test WHEN 'Test2' THEN Test2 ELSE 0 END ELSE 0 END) BM

    FROM ReportItem a

    INNER JOIN (

    SELECT dateadd(dd,0, datediff(dd,0,b.start_time)) as [date], b.source, b.lang, SUM(b.sel_test) Test, SUM(b.sel_test2) Test2

    FROM TechDetails b

    WHERE b.start_time >= @StartDate

    AND b.start_time < dateadd(dd,0, datediff(dd,0,@EndDate)+1)

    AND b.lang IN ('EN', 'BM')

    GROUP BY dateadd (dd, 0, datediff(dd, 0, b.start_time)), b.source, b.lang

    ) DT ON a.source = b.source

    WHERE a.report_id =8 AND a.flow_name = @flow_name

    AND a.report_item IN ('Test', 'Test2')

    And from looking at the overall design I can conclude the clustered index on ReportItem must be

    (report_id, flow_name, report_item) INCLUDE (source)

    The first one performed well but the second one is getting syntax error "The multi-part identifier b.lang and b.source could not be bound).

  • Sergiy (5/25/2016)


    And from looking at the overall design I can conclude the clustered index on ReportItem must be

    (report_id, flow_name, report_item) INCLUDE (source)

    Do you mean (report_id, flow_name, report_item, source)? You can't have an included column in a clustered index.

    John

Viewing 15 posts - 31 through 45 (of 76 total)

You must be logged in to reply to this topic. Login to reply