Help Required Pleased: I don't even know how to pharase the question to search for an answer myself :-(...

  • Hi All

    Imagine i have a query returning two columns (EventID, EventType) like this (I've added a 3rd commentary column to try and clarify the meaning of the returned rows. Its not actually returned!)

    EVENTID EVTYPE My Comment

    EVENT-01START X TYPE 1: Has a "Start X" AND a "END X" record

    EVENT-01END X TYPE 1: Has a "Start X" AND a "END X" record

    EVENT-02START X Type 2: Has a "Start X" BUT HAS NO 'END X' record

    EVENT-03START X TYPE 1: Has a "Start X" AND an "END X" record

    EVENT-03END X TYPE 1: Has a "Start X" AND a "END X" record

    EVENT-04START X TYPE 1: Has a "Start X" AND an "END X" record

    EVENT-04END X TYPE 1: Has a "Start X" AND a "END X" record

    What I wnt to do is to return a count from this subset of the records which are TYPE1 (i.e. that have a "STARTX" AND an "ENDX" record for a given "EventID" value

    and

    a count of the records which are TYPE2 (i.e. that have only a "STARTX" OR "ENDX" record for a given EventID

    In the subset there are 4 unique EVENTIDs (01,02,03,04) of which 3 EVENTIDs are of Type1 (01,03,04) and 1 EVENTID which is of TYPE 2 (03). I know this to be true because I can look at the next record and see if it closes the set ("StartX", "ENDX")for the given EventID but I'm damned If I can work out a way to do the aggregates

    I'd be really grateful for any pointers on how to most efficiently (as the actual subset is very large)

    Thanks

    Kinch

  • You are more likely to have someone assist you if you post your table definition, some sample data (which you have but not in a readily useable format).

    May I suggest clicking on the first link in my signature block and follow to learn how to do this quickly. The article also contains the T-SQL code to post typical data quickly and easily.

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • The code for this can get pretty complex, so if you can provide consumable sample data like you'll find in the first link of my signature, that'd help us present you finalized code.

    In general, you're looking at using a ROW_NUMBER() to assign an ordered value to the records, then self-join on that result between that row and the previous one. From there, with both rows now on a single row, you can compare/contrast to find the necessary aggregation.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Hi there

    Excellent - a definite clue!

    I shall attempt to do it myself first (am a masochist) and failing that I'll post up some scripts

    Many thanks

    B

  • Ok I haven't given up yet I've posted the table and insert script below since there are currently people around willing to help right now 🙂

    Sorry for not posting it properly in the first place - I considered it a generic problem but thats's no excuse.

    As I mentioned before, for this simple generic example table I want to write a query that returns the following answer

    TYPE COUNT

    TypeA 3

    TypeB 1

    I will then try and apply the lessons learned from the solution to the many problems I'm going to need this for over the next couple of weeks

    Many Thanks guys - seriously

    B

    -- 01. Create Sample Table START --

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE TABLE [dbo].[tblEVENT](

    [EventID] [nvarchar](10) NOT NULL,

    [EventType] [nvarchar](50) NOT NULL,

    [MyComment] [nvarchar](150) NULL,

    CONSTRAINT [PK_tblEVENT] PRIMARY KEY CLUSTERED

    (

    [EventID] ASC,

    [EventType] DESC

    )WITH (PAD_INDEX = OFF,

    STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,

    ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    -- 01. Create Sample Table END --

    -- 02. Insert Samnple Data START--

    INSERT [dbo].[tblEVENT] ([EventID], [EventType], [MyComment]) VALUES (N'EVENT-01 ', N'START X ', N'This is TYPE01 - has a START and an END record for the EventID value')

    INSERT [dbo].[tblEVENT] ([EventID], [EventType], [MyComment]) VALUES (N'EVENT-01 ', N'END X ', N'This is TYPE01 - has a START and an END record for the EventID value')

    INSERT [dbo].[tblEVENT] ([EventID], [EventType], [MyComment]) VALUES (N'EVENT-02 ', N'START X ', N'** This is TYPE02 - has only a START or an END record for the EventID value')

    INSERT [dbo].[tblEVENT] ([EventID], [EventType], [MyComment]) VALUES (N'EVENT-03 ', N'START X ', N'This is TYPE01 - has a START and an END record for the EventID value')

    INSERT [dbo].[tblEVENT] ([EventID], [EventType], [MyComment]) VALUES (N'EVENT-03 ', N'END X ', N'This is TYPE01 - has a START and an END record for the EventID value')

    INSERT [dbo].[tblEVENT] ([EventID], [EventType], [MyComment]) VALUES (N'EVENT-04', N'START X', N'This is TYPE01 - has a START and an END record for the EventID value')

    INSERT [dbo].[tblEVENT] ([EventID], [EventType], [MyComment]) VALUES (N'EVENT-04', N'END X', N'This is TYPE01 - has a START and an END record for the EventID value')

    -- 02. Insert Samnple Data End--

  • And one other question... what does one call (if anything) such a problem? 🙂

  • Am I on the right track at all here?

    SELECT DISTINCT

    A.EventID,

    A.EventType,

    B.EventType

    FROM

    (select

    * ,

    row_number() OVER (order by EventID, EventTYpe asc) as RowNoA

    from dbo.tblEVENT

    ) as A

    inner join

    (

    select

    * ,

    row_number() OVER (order by EventID, EventTYpe asc) as RowNoB

    from dbo.tblEVENT

    ) as B

    ON

    A.RowNoA=B.RowNoB+1

  • BCLynch (6/21/2011)


    And one other question... what does one call (if anything) such a problem? 🙂

    Annoying. 😉

    Right Track: Kinda, use a single cte and join it twice. Also, EventType needs to be DESC otherwise END comes before START. Working out the code between a couple of hot topics at work.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • I went jamming down the wrong road, sorry. Had another methodology in my head, but it's really overkill.

    Here's the typical way to do it:

    ;WITH LeftMethod AS

    (

    SELECT

    eid.EventID,

    CASE WHEN eSt.EventID IS NULL THEN 'Type 2'

    WHEN eEn.EventID IS NULL THEN 'Type 2'

    ELSE 'Type 1'

    END AS CountType

    FROM

    (SELECT DISTINCT EventID FROM tblEvent) AS eid

    LEFT JOIN

    tblEvent AS eSt

    ONeid.EventID = eSt.EventID

    AND eSt.EventType = 'START X'

    LEFT JOIN

    tblEvent AS eEn

    ONeid.EventID = eEn.EventID

    AND eEn.EventType = 'END X'

    )

    SELECT CountType, Count(*) FROM LeftMethod GROUP BY CountType

    The way I mentioned can work (using Row_number()), but it's a lot more coding.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Craig, excellent stuff - thanks a lot for taking the time. You're a cool guy.

    Cheers

    B

  • BCLynch (6/21/2011)


    Craig, excellent stuff - thanks a lot for taking the time. You're a cool guy.

    Cheers

    B

    Thanks. :blush: Glad to help. Let me know if anything in there seems confusing.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • There's another option to this. If the table is large and you have to scan it anyway, it might be more optimum to just make one pass at it and do it in an aggregate like this:

    ; WITH CTEEvents AS

    (SELECT

    EventID

    , SUM(CASE WHEN EventType = 'START X' THEN 1 ELSE 0 END) AS HasStart

    , SUM(CASE WHEN EventType = 'END X' THEN 1 ELSE 0 END) AS HasEnd

    FROM tblEVENT

    GROUP BY EventID

    )

    SELECT

    SUM(CASE WHEN HasStart > 0 AND HasEnd > 0 THEN 1 ELSE 0 END) AS Type1

    , SUM(CASE WHEN (HasStart + HasEnd) > 0

    AND (HasStart = 0 OR HasEnd = 0) THEN 1 ELSE 0 END) AS Type2

    FROM CTEEvents

    Todd Fifield

  • That's also a really neat solution and, funnily enough, just last night I was pondering how one would handle adding (for example) a "a DO X" in between my "BEGIN X" and "END X" dummy example, that is, testing for completenes of 2,3 or even n records. This way makes that pretty clear I think. I'm sure how to do it the the other way yet . Is it n+1 self-joins for n states I wonder? I'll find out soon enough 🙂

    B

  • BCLynch (6/24/2011)


    That's also a really neat solution and, funnily enough, just last night I was pondering how one would handle adding (for example) a "a DO X" in between my "BEGIN X" and "END X" dummy example, that is, testing for completenes of 2,3 or even n records. This way makes that pretty clear I think. I'm sure how to do it the the other way yet . Is it n+1 self-joins for n states I wonder? I'll find out soon enough 🙂

    B

    I gave you the quick and dirty way to achieve your goal. If you're going to have multiple states, do it in a single pass as Todd recommended. Multiple self-joins can chew on memory (or tempdb, depending on the optimization path) pretty quick.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Craig, I completely appreciate that the solution for 2 states is not the necessarily the optimal solution for n states and that you solved the stated problem.

    If my last post reads as critical or ungrateful then I'm sorry as certainly it wasn't so intended. I was really delighted with the code you posted and addapted it to immediate effect for my real 2 state problem

    Thanks again

    B

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply