Effective Dating Query

  • I currently have the following query to get historical information :

    select emp.first_name, emp.last_name,

    udf_getHistory(tran.TranDte, emp.employeeid, 1) as Address1,

    udf_getHistory(tran.TranDte, emp.employeeId, 2) as Address2,

    udf_getHistory(tran.TranDte, emp.employeeId, 3) as City,

    udf_getHistory(tran.TranDte, emp.employeeId, 4) as State,

    udf_getHistory(tran.TranDte, emp.employeeId, 5) as Zip,

    From transactions tran

    join employee emp on emp.employeeid = tran.employeeid

    where tran.TranDte between '7/13/2008' and '7/26/2008'

    The query returns the correct address at that specified time or TranDte.

    The udf_getHistory is a scalar that will effectively return the field value on that transaction date for that transaction record from a history table that contains change of addresses on a specific date.

    This of course is hitting the scalar 5 times per row and each time it runs, it takes about 76 milliseconds x 5 = almost 1/2 a second per row.

    Is there another way to effective date data in query with just joins?

    Thanks for any replies.

    Al Longobardi

  • Just join the history table adress directly... Of course, when you use a function to get 5 different columns like you are doing, things get pretty nasty slow.

    Just join the HistoryAdressTable directly using the date of the transaction, and the employeeID, and you are set.

    Cheers,

    J-F

  • Hey thanks for the quick reply,

    I would use a direct join. However, there are not enough records on a 1 to 1 relationship between the 2 tables. The history table contains only the changes to the address by date. So the join would have to be by employeeid and the where clause would have to bring 1 historical record to join to based on the date. So that wouldn't work.

    Example:

    History table

    employeeid updatedate address1 address2 city state zip

    --------------------------------------------------------------

    1 2008-07-13 123 NW.. .. New York NY 11001

    1 2008-08-15 78 E. .. .. New York NY 11002

    Transaction Table

    employeeid TranDte

    --------------------

    1 2008-07-14

    1 2008-08-16

    The query returns

    Employee TranDte Address1 Address2 City State Zip

    ------------------------------------------------------------------

    John Doe 07/12/08 123 NW.. New York NY 11001

    John Doe 08/15/08 78 E. New York NY 11002

    ps. I missed the TranDTE in the original query post from before.

    Is there another way of doing this in a join?

    Hope this helps make it clearer.

    Thanks in advance,

    AL

  • Can you post the query and table definitions?

    It would also help if you could post some accurate sample data, so we can help you build a query.

    It is probably possible to build this query, someone will show you how if you provide a easy to reproduce situation.

    the create tables with insert statements, and your actual query will do perfectly.

    Cheers,

    J-F

  • I will post in a few minutes...

    Thanks,

    AL

  • OK. Here is the code.

    Give it a try..

    Thanks,

    AL

    create table address_history

    (

    employeeid int,

    updatedate datetime,

    address1 varchar(100),

    address2 varchar(100),

    city varchar(50),

    state char(2),

    zip varchar(15)

    )

    insert into address_history

    select 1, '7/13/2008', '123 NW 3rd Ave','','New York', 'NY', '11001'

    union all

    select 1, '8/15/2008', '78 E. 5th Ave','','New York', 'NY', '11002'

    create table tranTable

    (

    employeeid int,

    tranDte datetime

    )

    insert into tranTable

    select 1, '7/14/2008'

    union all

    select 1, '8/16/2008'

    create function udf_getHistory

    (

    @InputDate datetime,

    @employeeid int,

    @Field int

    )

    RETURNS VARCHAR(400)

    AS

    BEGIN

    DECLARE @Result VARCHAR(400)

    DECLARE @AddressHistory TABLE

    (

    EmployeeID int,

    updatedate datetime,

    address1 varchar(100),

    address2 varchar(100),

    city varchar(50),

    state char(2),

    zip varchar(15)

    )

    INSERT INTO @AddressHistory

    SELECTah.* FROM address_history ah with (nolock),

    (SELECTemployeeid, MAX(updatedate) upddt

    FROMaddress_history with (NOLOCK)

    WHERECONVERT(VARCHAR,updatedate,101) <= @InputDate

    GROUP BY employeeid

    ) aa

    WHEREah.employeeid = aa.employeeid

    AND ah.updatedate = (SELECT max(updatedate)

    FROM address_history with (NOLOCK)

    WHERE employeeid = aa.employeeid

    AND CONVERT(VARCHAR,updatedate,101) <= @inputdate)

    AND ah.employeeid = @EmployeeID

    SELECT @Result =

    CASE

    WHEN (@Field = 1) THEN ah.address1

    WHEN (@Field = 2) THEN ah.address2

    WHEN (@Field = 3) THEN ah.city

    WHEN (@Field = 4) THEN ah.state

    WHEN (@Field = 5) THEN ah.zip

    END

    FROM @AddressHistory AS ah

    RETURN @Result

    END

    select t.employeeid, t.TranDte,

    dbo.udf_getHistory(t.TranDte, t.employeeid, 1) as Address1,

    dbo.udf_getHistory(t.TranDte, t.employeeId, 2) as Address2,

    dbo.udf_getHistory(t.TranDte, t.employeeId, 3) as City,

    dbo.udf_getHistory(t.TranDte, t.employeeId, 4) as State,

    dbo.udf_getHistory(t.TranDte, t.employeeId, 5) as Zip

    From tranTable as t

  • Ok, first, the function. You should get rid of it, and add the join directly in your query. The query that is in the function should be added to the select directly.

    Or, you could also just change the function to return a table, and cross join it, as if it was a table, but that will not be as performant. Look into BOL for function returning tables, because the parameter to get the right field out of the function is really not a good idea.

    Start by doing that, and we'll see if it's faster after.

    Cheers,

    J-F

  • How I am going to code the select statement adding the logic from the udf and get the correct history row with the fields?

    Could you rough it up for me? I can take care of the rest (I think).

    AL

  • Based on the sample data you provided, what is the expected output?

  • The output returned is correct:

    select t.employeeid, t.TranDte,

    dbo.udf_getHistory(t.TranDte, t.employeeid, 1) as Address1,

    dbo.udf_getHistory(t.TranDte, t.employeeId, 2) as Address2,

    dbo.udf_getHistory(t.TranDte, t.employeeId, 3) as City,

    dbo.udf_getHistory(t.TranDte, t.employeeId, 4) as State,

    dbo.udf_getHistory(t.TranDte, t.employeeId, 5) as Zip

    From tranTable as t

    12008-07-14 00:00:00.000123 NW 3rd AveNew YorkNY11001

    12008-08-16 00:00:00.00078 E. 5th AveNew YorkNY11002

    Let me know what you think and

    thanks in advance,

    Al

  • FYI, not everyone uses a case insensitive collation. It would be nice if your code was consistant in its use of case.

  • Could you explain? Please.

    I don't follow.

    AL

  • longobardia (1/30/2009)


    The output returned is correct:

    select t.employeeid, t.TranDte,

    dbo.udf_getHistory(t.TranDte, t.employeeid, 1) as Address1,

    dbo.udf_getHistory(t.TranDte, t.employeeId, 2) as Address2,

    dbo.udf_getHistory(t.TranDte, t.employeeId, 3) as City,

    dbo.udf_getHistory(t.TranDte, t.employeeId, 4) as State,

    dbo.udf_getHistory(t.TranDte, t.employeeId, 5) as Zip

    From tranTable as t

    12008-07-14 00:00:00.000123 NW 3rd AveNew YorkNY11001

    12008-08-16 00:00:00.00078 E. 5th AveNew YorkNY11002

    Let me know what you think and

    thanks in advance,

    Al

    Doesn't match your sample data. Also, would it be possible to get more than just two rows of data? That really isn't enough to properly test a query.

  • Never mind part of my comment above. I was looking at something else while writing it. Sorry, I'm a guy and have trouble multi-tasking sometimes. BUT more sample data would be helpful.

  • Here is everything that I have on my query session:

    create table address_history

    (

    employeeid int,

    updatedate datetime,

    address1 varchar(100),

    address2 varchar(100),

    city varchar(50),

    state char(2),

    zip varchar(15)

    )

    insert into address_history

    select 1, '7/13/2008', '123 NW 3rd Ave','','New York', 'NY', '11001'

    union all

    select 1, '8/15/2008', '78 E. 5th Ave','','New York', 'NY', '11002'

    create table tranTable

    (

    employeeid int,

    tranDte datetime

    )

    insert into tranTable

    select 1, '7/14/2008'

    union all

    select 1, '8/16/2008'

    alter function udf_getHistory

    (

    @InputDate datetime,

    @employeeid int,

    @Field int

    )

    RETURNS VARCHAR(400)

    AS

    BEGIN

    DECLARE @Result VARCHAR(400)

    DECLARE @AddressHistory TABLE

    (

    EmployeeID int,

    updatedate datetime,

    address1 varchar(100),

    address2 varchar(100),

    city varchar(50),

    state char(2),

    zip varchar(15)

    )

    INSERT INTO @AddressHistory

    SELECTah.* FROM address_history ah with (nolock),

    (SELECTemployeeid, MAX(updatedate) upddt

    FROMaddress_history with (NOLOCK)

    WHERECONVERT(VARCHAR,updatedate,101) <= @InputDate

    GROUP BY employeeid

    ) aa

    WHEREah.employeeid = aa.employeeid

    AND ah.updatedate = (SELECT max(updatedate)

    FROM address_history with (NOLOCK)

    WHERE employeeid = aa.employeeid

    AND CONVERT(VARCHAR,updatedate,101) <= @inputdate)

    AND ah.employeeid = @EmployeeID

    SELECT @Result =

    CASE

    WHEN (@Field = 1) THEN ah.address1

    WHEN (@Field = 2) THEN ah.address2

    WHEN (@Field = 3) THEN ah.city

    WHEN (@Field = 4) THEN ah.state

    WHEN (@Field = 5) THEN ah.zip

    END

    FROM @AddressHistory AS ah

    RETURN @Result

    END

    select t.employeeid, t.TranDte,

    dbo.udf_getHistory(t.TranDte, t.employeeid, 1) as Address1,

    dbo.udf_getHistory(t.TranDte, t.employeeId, 2) as Address2,

    dbo.udf_getHistory(t.TranDte, t.employeeId, 3) as City,

    dbo.udf_getHistory(t.TranDte, t.employeeId, 4) as State,

    dbo.udf_getHistory(t.TranDte, t.employeeId, 5) as Zip

    From tranTable as t

    Here is the output from the select statement:

    12008-07-14 00:00:00.000123 NW 3rd AveNew YorkNY11001

    12008-08-16 00:00:00.00078 E. 5th AveNew YorkNY11002

    So what is it that doesn't match?

    If you want more data, just add to the insert statement UNION

    This is just a POC for you guys to run and look it over.

    I just need another workaround this performace hog.

    AL

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply