How can I improve this SQL query

  • Hi,

    I have this SQL query that can take too long time, up to 1 minute if table contains over 1 million rows. And if the system is very active while executing this query it can cause more delays I guess.

    select

    distinct 'CONV 1' as Conveyour,

    info as Error,

    (select top 1 substring(timecreated, 0, 7) from log b where a.info = b.info order by timecreated asc) as Date,

    (select count(*) from log b where b.info = a.info) as 'Times occured'

    from log a where loggroup = 'CSCNV' and logtype = 4

    The table name is LOG, and I retrieve 4 columns: Conveyour, Error, Date and Times occured. The point of the subqueries is to count all distinct post and to retrieve the date of the first time the pst was logged. Also, a first and last date could be specified but is left out here.

    Does anyone knows how I can imporove this SQL query

    Best /M

  • Please will you post DDL for your log table so that we can see, for example, what data type your timecreated column is.  It's difficult to see why you're using string functions to manipulate date/time data.  It may also help to post an execution plan for the query, since you may be missing an index or your statistics may be out of date.

    Thanks

    John

  • I've probably missed the point entirely but does this do it:

    select

    distinct 'CONV 1' as Conveyour,

    info as error,

    min (substring(timecreated, 0, 7)) as date,

    count(*) from 'Times Occured'

    from log

    where loggroup = 'CSCNV' and logtype = 4

    group by info

     

    Jez

  • No, he is looking for the definition of the table(s) you are querying, something like this:

    CREATE TABE myTable (

    Col1 varchar(10),

    Col2 datetime not null)

    etc.  If you don't have the table definition handy use Enterprise manager to script the table.

    James.

  • I was replying to the original post in that doesn't a min()/count(*) with a group by give the same results and would be faster than the original query.

     

    Jez

  • Jez: Sorry, I confused you with the original poster

    James.

  • Jes - did you mean to type "from 'Times Occured'" or "as 'Times Occured'"?

    I agree that grouping may buy some improvement, but we really need more information from the poster to determine whether there are problems within the underlying data. As a general rule, converting dates to strings and then back to dates USUALLY are unnecessary and wasteful conversions. If the dates can be maintained purely as dates, that may end up providing the biggest bang for the buck. So could updating statistics if they are very out of date and not maintained properly.... so many possibilities.

  • select

    distinct 'CONV 1' as Conveyour,

    info as Error,

    (select top 1 substring(timecreated, 0, 7) from log b where a.info = b.info order by timecreated asc) as Date,

    (select count(*) from log b where b.info = a.info) as 'Times occured'

    from log a where loggroup = 'CSCNV' and logtype = 4

    Even with out seeing table/index defs and sample data and data distributions (which is critical to optimization, btw), I can say that this will be a poorly performing query.

    1) DISTINCT causes a sort and scan --> inefficient.

    2) timecreated is some form of char --> inefficient sorting (to do the order by)

    3) it is inefficient to inline selects in the output of another select. This is made even worse because you have an order by in one and a count in the other.

    4) the inlined count joins a large table to itself

    5) I will also venture a guess that indexes may not even help much here if your filters will return a large percentage of the rows. Unless you create a covering index (that last was added just for Jeff Moden ). Such index would prolly be pretty fat and costly to maintain though based on your field names and their likely large varchar datatypes.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • If I understand your original query correctly you really add a lot of extra work with the subquerys and distinct to accomplish this query

    SELECT

     'CONV 1' AS Conveyour,

     info AS Error,

     SUBSTRING(MIN(timecreated), 0, 7) AS [Date],

     COUNT(*) AS 'Times occured'

    FROM

     dbo.[log]

    WHERE

     loggroup = 'CSCNV' and

     logtype = 4

    GROUP BY

     info

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply