How to track changes effectively and efficiently

  • I have two questions in this post: one concerns SQL query efficiency, the other concerns data modeling.

    The scenario: I have a sales DB in which each order is uniquely identified by OrderId.  I want to track changes to some order attributes by their change dates (only the last change of any given day is retained).

    In a relational DB I can use two fact tables as follows:

    I can create an order table A (OrderId, OrderDate, FillDate) and an order history table B (ChangeDate, OrderId, Quantity).

    OrderDate is static and FillDate can be modified.  B contains the full change history of Quantity up to the FillDate, such that OrderDate <= ChangeDate <= FillDate.

    A and B have a 1-to-n relationship.
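
    For concreteness, here is a minimal DDL sketch of the two tables; the column types, keys, and nullability are my own assumptions, not part of the design above:

    -- order table: one row per order; FillDate may be updated
    create table A (
        OrderId   int      not null primary key,
        OrderDate datetime not null,
        FillDate  datetime not null  -- assumed to hold a far-future date until the order is filled
    )

    -- order history: one row per order per day (last change of the day wins)
    create table B (
        OrderId    int      not null references A (OrderId),
        ChangeDate datetime not null,
        Quantity   int      not null,
        primary key (OrderId, ChangeDate)
    )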

    Now suppose I need to produce a report: all outstanding orders as of a given date in the past.  I can write a query in one of two ways:

    query #1, using a correlated subquery:

    select *
    from A
    left join B
      on B.OrderId = A.OrderId
     and B.ChangeDate = (select max(ChangeDate)
                         from B
                         where OrderId = A.OrderId
                           and ChangeDate between A.OrderDate and A.FillDate)
    where @reportDate between A.OrderDate and A.FillDate

    query #2, using a derived table:

    select *
    from A
    left join (select B.OrderId, cDate = max(B.ChangeDate)
               from A
               join B
                 on B.OrderId = A.OrderId
                and B.ChangeDate between A.OrderDate and A.FillDate
               group by B.OrderId) c
      on c.OrderId = A.OrderId
    left join B
      on B.OrderId = c.OrderId
     and B.ChangeDate = c.cDate
    where @reportDate between A.OrderDate and A.FillDate

    Both queries return the same results, but I do not have sufficient data to test which performs better.  So my first question is: can someone tell me categorically whether one is better than the other?

    My second question is: if I want to do the same thing in a multi-dimensional DB, how should I model this?  Joining two fact tables does not seem to be possible.  And if I can only work with one fact table, it appears the only solution is to denormalize the changes into daily snapshots, repeating the same value (Quantity) for each order every day until the FillDate (see the sketch below).  But it sounds too stupid to be a viable solution, and this is the question that has been haunting me for a long time.
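
    To make that concrete, the daily-snapshot fact table would be populated by something like the following sketch.  DimDate(CalendarDate) is an assumed date dimension table; the name is illustrative.

    -- one row per order per calendar day between OrderDate and FillDate,
    -- carrying the latest Quantity known as of that day
    -- (assumes B has an initial row at OrderDate for every order)
    select d.CalendarDate, a.OrderId, b.Quantity
    from A a
    join DimDate d
      on d.CalendarDate between a.OrderDate and a.FillDate
    join B b
      on b.OrderId = a.OrderId
     and b.ChangeDate = (select max(ChangeDate)
                         from B
                         where OrderId = a.OrderId
                           and ChangeDate <= d.CalendarDate)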

    Thanks in advance!

  • You can use SQL Server Query Analyzer to perform this comparison. While you can analyze each query one at a time, one trick worth trying is to place all of the queries you want to compare in Query Analyzer, turn on "Show Execution Plan," and then run them all at once. When you do this, the Execution Plan window divides into a separate pane for each query run, which makes comparing their execution plans much easier. In addition, each pane shows a "Query Cost (relative to the batch)" number, which tells you how long each query took as a percentage of the total time it took all of the queries to run. The query with the lowest number is the one that performed the quickest, and so on.
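
    If you also want raw numbers alongside the plans, the same batch can be run with the statistics options turned on. A sketch (paste the real queries in place of the comments):

    declare @reportDate datetime
    set @reportDate = '20030101'  -- any test date; the value here is illustrative

    set statistics io on    -- report logical/physical reads per query
    set statistics time on  -- report parse, compile, and execution times

    -- query #1 (correlated subquery version) here
    -- query #2 (derived-table version) here

    set statistics io off
    set statistics time off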

  • An even better option.

    Michael, I didn't think about this, but our developers use SQL Tuning, which comes from Quest Software embedded in Toad (www.quest.com/sql).

    This tool is amazing: it will rewrite your queries and give you the fastest option available. When our developers started using it, it saved me from having to spend half my time fixing their code.

    You can paste both your queries into the product and it will give you a complete mapping of what is going on.

    Have fun.

    Declan

  • Thank you, Declan, for the tip.

    Any idea or pointers for my second question?

    Michael
