Slow Query

  • Hi Guys

    Aprreciate your help with this. I am trying to tune the following query: It is part of a stored proc.

    SELECT s.SupplierID, c.Carrier as Supplier FROM (SELECT DISTINCT SupplierID FROM CPT_SUBDC_Analysis UNION ALL SELECT DISTINCT SupplierID FROM CPT.dbo.CPT_SUBDC_Analysis) s
    LEFT OUTER JOIN Carrier c ON c.CarrierID = s.SupplierID
    GROUP BY s.SupplierID, c.Carrier
    ORDER BY c.Carrier
    This is currently running quite frequently. It takes on average 2 minutus to run.

    Most of the cost is with this - SELECT DISTINCT SupplierID FROM CPT_SUBDC_Analysis UNION ALL SELECT DISTINCT SupplierID FROM CPT.dbo.CPT_SUBDC_Analysis

    Is there another way to use this in the query?

    Many thanks in advance.

  • ss-457805 - Thursday, July 6, 2017 9:55 AM

    Hi Guys

    Aprreciate your help with this. I am trying to tune the following query: It is part of a stored proc.

    SELECT s.SupplierID, c.Carrier as Supplier FROM (SELECT DISTINCT SupplierID FROM CPT_SUBDC_Analysis UNION ALL SELECT DISTINCT SupplierID FROM CPT.dbo.CPT_SUBDC_Analysis) s
    LEFT OUTER JOIN Carrier c ON c.CarrierID = s.SupplierID
    GROUP BY s.SupplierID, c.Carrier
    ORDER BY c.Carrier
    This is currently running quite frequently. It takes on average 2 minutus to run.

    Most of the cost is with this - SELECT DISTINCT SupplierID FROM CPT_SUBDC_Analysis UNION ALL SELECT DISTINCT SupplierID FROM CPT.dbo.CPT_SUBDC_Analysis

    Is there another way to use this in the query?

    Many thanks in advance.

    You could try sorting the two concatenated sets only once:
    SELECT s.SupplierID, c.Carrier as Supplier
    FROM (
     SELECT DISTINCT SupplierID
     FROM (
      SELECT SupplierID
      FROM CPT_SUBDC_Analysis
      UNION ALL
      SELECT SupplierID
      FROM CPT.dbo.CPT_SUBDC_Analysis
     ) d
    ) s
    LEFT OUTER JOIN Carrier c
     ON c.CarrierID = s.SupplierID
    GROUP BY s.SupplierID, c.Carrier
    ORDER BY c.Carrier

    Also, see UNION without the ALL operator. I prefer the version above.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Why are you using DISTINCT and UNION ALL? That might be the the reason you need the GROUP BY.

    SELECT s.SupplierID,
      c.Carrier as Supplier
    FROM (SELECT SupplierID FROM CPT_SUBDC_Analysis
      UNION
      SELECT SupplierID FROM CPT.dbo.CPT_SUBDC_Analysis) s
    LEFT OUTER JOIN Carrier c ON c.CarrierID = s.SupplierID
    ORDER BY c.Carrier

    Anyway, to get better help, post DDL and execution plans as shown in the following article: How to Post Performance Problems - SQLServerCentral

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • With the DISTINCT aggregation operation in addition to the GROUP BY operation but no filtering at all, you're going to be somewhat dependent on hardware (mostly memory & disk) to make this run fast. Any chance of some type of WHERE clause? Do you really have to have ALL the data?

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • Thanks Luis and Chris for your suggestions. Both your query returns in same time. It take about 1 min 15 sec. Big improvement from the orginial query.
    Grant -  we cannot use the where clause it needs all the data. No wonder it's doing an index scan on on the whole table. Please find attached the execution plan
  • ss-457805 - Thursday, July 6, 2017 2:16 PM

    Thanks Luis and Chris for your suggestions. Both your query returns in same time. It take about 1 min 15 sec. Big improvement from the orginial query.
    Grant -  we cannot use the where clause it needs all the data. No wonder it's doing an index scan on on the whole table. Please find attached the execution plan

    Thanks for the Execution Plan.
     I know this looks a little complicated, but it's extremely fast because it replaces streaming aggregates with seeks, for each supplierid in each of the two big tables.
    Useful links here.
    Give it a try:
    -- Using Paul White's superfast "DISTINCT"
    -- 1. Collect all SupplierID from dbo.CPT_SUBDC_Analysis
    -- into #temp table #CPT_SUBDC_Analysis_1
    WITH rCTE AS (
     SELECT SupplierID = MIN(T.SupplierID)
     FROM dbo.CPT_SUBDC_Analysis T
     UNION ALL
     SELECT R.SupplierID
     FROM (
      SELECT
       T.SupplierID,
       rn = ROW_NUMBER() OVER (ORDER BY T.SupplierID)
      FROM dbo.CPT_SUBDC_Analysis T
      JOIN rCTE R
       ON  R.SupplierID < T.SupplierID
     ) R
     WHERE R.rn = 1
    )
    SELECT SupplierID
    INTO #CPT_SUBDC_Analysis_1
    FROM rCTE
    OPTION (MAXRECURSION 0);

    -- 2. Collect all SupplierID from CPT.dbo.CPT_SUBDC_Analysis
    -- into #temp table #CPT_SUBDC_Analysis_2
    WITH rCTE AS (
     SELECT SupplierID = MIN(T.SupplierID)
     FROM CPT.dbo.CPT_SUBDC_Analysis T
     UNION ALL
     SELECT R.SupplierID
     FROM (
      SELECT
       T.SupplierID,
       rn = ROW_NUMBER() OVER (ORDER BY T.SupplierID)
      FROM CPT.dbo.CPT_SUBDC_Analysis T
      JOIN rCTE R
       ON  R.SupplierID < T.SupplierID
     ) R
     WHERE R.rn = 1
    )
    SELECT SupplierID
    INTO #CPT_SUBDC_Analysis_2
    FROM rCTE
    OPTION (MAXRECURSION 0);

    -- 3. Use these two temp tables for the query:
    SELECT
     s.SupplierID,
     c.Carrier as Supplier
    FROM (
     SELECT *
     FROM (
      SELECT SupplierID
      FROM #CPT_SUBDC_Analysis_1
      UNION ALL
      SELECT SupplierID
      FROM #CPT_SUBDC_Analysis_2
     ) d GROUP BY SupplierID
    ) s
    LEFT OUTER JOIN Carrier c
     ON c.CarrierID = s.SupplierID
    ORDER BY c.Carrier
    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks Chris. This is awesome. It's pretty instant.

    Thanks for the link as well. Added to my favourites.  🙂

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply