help with rewriting query (performance optimization)

  • I am working on a project and need to optimize the query below. It's taking 30 or so seconds to run. I am not a sql expert and am looking for any tips some of the pros here might have looking at the design of the query below - perhaps there are obvious better ways of designing to enhance performance. I know it's a bit lengthy and complex - so it may be hard to provide simple guidance .. but I thought I'd give it a shot here and see if someone can at least point me in a direction of how to rewrite or reapproach. THX.

    Danielle

    ---------------

    SELECT  ST.student_id, ST.last_name + ',  ' + ST.first_name + '(' + ST.grade + ')'

    AS Name, R.program_provider_activity_id, PPA.program_activity_name

    + '                                                    Currently Reg : '

    + (CASE WHEN (R.registration_is_active = 1)

    THEN 'Yes' ELSE ' No' END) AS program_activity_name, R.program_id,

    (SELECT     SUM(datediff(hh, s.session_start_time, s.session_end_time))

           FROM          session S

        WHERE      s.session_id IN

        (SELECT     a.[session_id]

        FROM          [Attendance] A

        WHERE      a.[attendence_status] = 1 AND a.[student_id] = ST.student_id AND a.[session_id] IN

      (SELECT     [session_id]

       FROM          [Session]

     WHERE      [program_provider_activity_id] = R.program_provider_activity_id AND [program_id] = R.program_id)))

    AS TotalHours,

    (SELECT     COUNT(DISTINCT (session_start_time))

      FROM          session S

    WHERE      s.session_id IN

     (SELECT     a.[session_id]

    FROM          [Attendance] A

     WHERE      a.[attendence_status] = 1 AND a.[student_id] = ST.student_id AND a.[session_id] IN

     (SELECT     [session_id]

    FROM          [Session]

     WHERE      [program_provider_activity_id] = R.program_provider_activity_id AND [program_id] = R.program_id)))

     AS TotalDays,

     (SELECT     COUNT(DISTINCT (datepart(ww, session_start_time)))

      FROM          session S

      WHERE      s.session_id IN

    (SELECT     a.[session_id]

     FROM          [Attendance] A

     WHERE      a.[attendence_status] = 1 AND a.[student_id] = ST.student_id AND a.[session_id] IN

    (SELECT     [session_id]

      FROM          [Session]

       WHERE      [program_provider_activity_id] = R.program_provider_activity_id AND [program_id] = R.program_id)))

       AS TotalWeeks

    FROM         dbo.ProgramProviderActivity PPA INNER JOIN

     dbo.Registration R INNER JOIN

    dbo.Student ST ON ST.student_id = R.student_id ON PPA.program_provider_activity_id =

    R.program_provider_activity_id

    ORDER BY ST.last_name + ' ' + ST.first_name, ST.grade

    --------------------------

  • For starters the subqueries you're using for TotalHours, TotalDays and TotalWeeks

    (SELECT     COUNT(DISTINCT (datepart(ww, session_start_time)))
      FROM          session S
      WHERE      s.session_id IN
    (SELECT     a.[session_id]
     FROM          [Attendance] A
     WHERE      a.[attendence_status] = 1 AND a.[student_id] = ST.student_id AND a.[session_id] IN
    (SELECT     [session_id]
      FROM          [Session]
       WHERE      [program_provider_activity_id] = R.program_provider_activity_id AND [program_id] = R.program_id)))
    

    are practically the same. This means it's reading through those tables multiple times. I'd see about making that a single query and putting it in the from clause as a derived table instead. You might get a better execution plan that way.

    --------------------
    Colt 45 - the original point and click interface

  • I'm not quite sure about what all is in your tables.. but Phill is correct.. avoid the correlated subqueries.. go for derived tables instead.

    Sometimes (for debugging), it's helpful to just create a #temp table showing the TOTALS you wish to see, grouped by Student, Provider, etc.. then you can JOIN to that #temp table.

    To use a derived table in the JOIN try this:  (you need to fix the rest of the JOIN since I don't know how you're dealing with SessionID, but this should get you on the right track...

    SELECT 

     st.student_id,

     Name = st.last_name + ',  ' + st.first_name + '(' + st.grade + ')' ,

     r.program_provider_activity_id,

     program_activity_name = ppa.program_activity_name + ' Currently Reg : ' + (CASE WHEN r.registration_is_active = 1

                              THEN 'Yes' ELSE ' No' END),

     r.program_id,

     TotalHours = dt.TotalHours,

     TotalDays = dt.TotalDays,

     TotalWeeks = dt.TotalWeeks

    FROM dbo.ProgramProviderActivity ppa

    INNER JOIN dbo.Registration r

     ON r.program_provider_activity_id = ppa.program_provider_activity_id

    INNER JOIN dbo.Student st

     ON st.student_id = r.student_id

    INNER JOIN (SELECT

          a.Session_ID,

          a.Student_ID,

          s.Program_Provider_Activity_ID,

          s.Program_ID,    

          Hrs = SUM(DATEDIFF(hh, s.session_start_time, s.session_end_time)),

          Days = COUNT(DISTINCT(session_start_time)),

          Weeks = COUNT(DISTINCT(DATEPART(ww, session_start_time)))

        FROM Session s

        INNER JOIN Attendance a

         ON a.session_id = s.session_id

        WHERE a.Attendence_Status = 1

        GROUP BY a.Session_ID, a.Student_ID, s.Program_Provider_Activity_ID, s.Program_ID ) dt

    ON dt.StudentID = r.student_id

     AND dt.SessionID = ??

     AND dt.Program_Provider_Activity_ID = ppa.Program_Provider_Activity_ID

     AND ...

     AND ...

    Consider leaving the ORDER BY out.. it only causes delays...

    hope this helps..

    Mark Gelatt

  • Sorry 'bout that.. I changed the name of the fields while I was writing.. you should use

    TotalHours = dt.Hrs,

    etc.. etc.. 

    Just see what I called the columns in the derived table. 

  • Barring me having botched order here since I don't have anything to test against this should be pretty colse to what you are after.

    SELECT

     ST.student_id,

     ST.last_name + ',  ' + ST.first_name + '(' + ST.grade + ')' AS [Name],

     R.program_provider_activity_id, PPA.program_activity_name + '                                                    Currently Reg : ' + (CASE WHEN (R.registration_is_active = 1) THEN 'Yes' ELSE ' No' END) AS program_activity_name,

     R.program_id,

     SUM(datediff(hh, s.session_start_time, s.session_end_time)) TotalHours,

     COUNT(DISTINCT (session_start_time)) TotalDays,

     COUNT(DISTINCT (datepart(ww, session_start_time))) TotalWeeks

    FROM

     dbo.ProgramProviderActivity PPA

    INNER JOIN

     dbo.Registration R

    INNER JOIN

     dbo.Student ST

    ON

     ST.student_id = R.student_id

    ON

     PPA.program_provider_activity_id = R.program_provider_activity_id

    LEFT JOIN

     dbo.Attendance A

     INNER JOIN

      dbo.Session S

     ON

      S.program_provider_activity_id = R.program_provider_activity_id AND

      S.program_id = R.program_id AND

      S.Session_id = A.Session_id

    ON

     A.attendence_status = 1 AND

     A.student_id = ST.student_id

    GROUP BY

     ST.student_id,

     ST.last_name + ',  ' + ST.first_name + '(' + ST.grade + ')',

     R.program_provider_activity_id, PPA.program_activity_name + '                                                    Currently Reg : ' + (CASE WHEN (R.registration_is_active = 1) THEN 'Yes' ELSE ' No' END),

     R.program_id

    -- Noter may have to do some work still to get Order By right because of Group By

    ORDER BY

     ST.last_name + ' ' + ST.first_name,

     ST.grade

  • Thank you all very much - I will try to implement these revisions and report back. Much appreciated.

    Danielle

  • You also should run Profiler and then a trace with this query. After that use Index Wizard to see if it finds any new indexes, etc.

    Quand on parle du loup, on en voit la queue

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply