Need to fill the Gaps with previous value

Question

Need to fill the Gaps with previous value

Viewing 12 posts - 31 through 41 (of 41 total)

You must be logged in to reply to this topic. Login to reply

Dwain Camps SSC Guru Points: 86913 More actions · Answer 1

ChrisM@Work (10/8/2012)
Jeff Moden (10/8/2012)
dwain.c (10/7/2012)[hrJeff - I don't think this solution works with your test harness.
I did say that I needed to make a change to get it to work and described the change.
dwain.c (10/8/2012)
...
You'll notice it's exactly the same as Nagaram's query - except for the number generator.
I can't find anything wrong with it?
My mistake! I didn't take into account that row ordering of the OP's query was different. Correct number of rows is returned by his, yours and my Quirky Update...
Jeff - Dwain was a bit cranky when he wrote that, his comb-over took off in the wind on the way to work 😀

BWAAHAHAHA! You're assuming I have enough to comb over!

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

cooljagadeesh SSC Eights! Points: 959 More actions · Answer 2

try this also

CREATE TABLE #SAMPLETABLE

(

DATECOL DATETIME,

WEIGHTS float

)

INSERT INTO #SAMPLETABLE

SELECT '08/09/2012',8.2 UNION ALL

SELECT '08/10/2012',9.4 UNION ALL

SELECT '08/14/2012',10 UNION ALL

SELECT '08/15/2012',9.6 UNION ALL

SELECT '08/16/2012',9.3 UNION ALL

SELECT '08/19/2012',9.7

declare @min-2 datetime,@max datetime,@WEIGHTS varchar(100)

select @min-2=min(DATECOL),@max=max(DATECOL) from #SAMPLETABLE

while @min-2<>@max

begin

select @WEIGHTS=WEIGHTS from #SAMPLETABLE where DATECOL=@MIN

if not exists (select 1 from #SAMPLETABLE where DATECOL=@MIN)

insert into #SAMPLETABLE select @min-2,@WEIGHTS

set @min-2=@MIN+1

end

select * from #SAMPLETABLE order by DATECOL

drop table #SAMPLETABLE

cooljagadeesh SSC Eights! Points: 959 More actions · Answer 3

CREATE TABLE #SAMPLETABLE

(

DATECOL DATETIME,

WEIGHTS float

)

INSERT INTO #SAMPLETABLE

SELECT '08/09/2012',8.2 UNION ALL

SELECT '08/10/2012',9.4 UNION ALL

SELECT '08/14/2012',10 UNION ALL

SELECT '08/15/2012',9.6 UNION ALL

SELECT '08/16/2012',9.3 UNION ALL

SELECT '08/19/2012',9.7

declare @min-2 datetime,@max datetime,@WEIGHTS varchar(100)

select @min-2=min(DATECOL),@max=max(DATECOL) from #SAMPLETABLE

while @min-2<>@max

begin

select @WEIGHTS=WEIGHTS from #SAMPLETABLE where DATECOL=@MIN

if not exists (select 1 from #SAMPLETABLE where DATECOL=@MIN)

insert into #SAMPLETABLE select @min-2,@WEIGHTS

set @min-2=@MIN+1

end

select * from #SAMPLETABLE order by DATECOL

drop table #SAMPLETABLE

Leifton SSC Veteran Points: 288 More actions · Answer 4

Hopefully there are still some people that get updates regarding this thread. I have a follow up question as I recently encountered a similar situation to the original problem. I have the exact root problem as this thread's original topic, but my issue is the requested date range generally contains tens if not hundreds of thousands of rows to house the vast amount of readings.

As possibly suspected, this slows down the solution's script considerably. Any thoughts on how to speed things up a bit?

Thanks a bunch,

-Leif

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 5

Leifton (9/24/2013)
Hopefully there are still some people that get updates regarding this thread. I have a follow up question as I recently encountered a similar situation to the original problem. I have the exact root problem as this thread's original topic, but my issue is the requested date range generally contains tens if not hundreds of thousands of rows to house the vast amount of readings.
As possibly suspected, this slows down the solution's script considerably. Any thoughts on how to speed things up a bit?
Thanks a bunch,
-Leif

There are may proposed solutions on this thread. Which one did you use? Also, if you could provide what the basic relevent parts of your table are and 10 or so rows of test data, that would be a big help. PLEASE see the first "Helpful Link" in my signature line below for the right way to do that to get the quickest help.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 6

cooljagadeesh (10/8/2012)
CREATE TABLE #SAMPLETABLE
(
DATECOL DATETIME,
WEIGHTS float
)
INSERT INTO #SAMPLETABLE
SELECT '08/09/2012',8.2 UNION ALL
SELECT '08/10/2012',9.4 UNION ALL
SELECT '08/14/2012',10 UNION ALL
SELECT '08/15/2012',9.6 UNION ALL
SELECT '08/16/2012',9.3 UNION ALL
SELECT '08/19/2012',9.7
declare @min-2 datetime,@max datetime,@WEIGHTS varchar(100)
select @min-2=min(DATECOL),@max=max(DATECOL) from #SAMPLETABLE
while @min-2<>@max
begin
select @WEIGHTS=WEIGHTS from #SAMPLETABLE where DATECOL=@MIN
if not exists (select 1 from #SAMPLETABLE where DATECOL=@MIN)
insert into #SAMPLETABLE select @min-2,@WEIGHTS
set @min-2=@MIN+1
end
select * from #SAMPLETABLE order by DATECOL
drop table #SAMPLETABLE

Sorry I missed this before. You should performance test this against some of the other solutions. You'll find that it doesn't do so well in most cases.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Leifton SSC Veteran Points: 288 More actions · Answer 7

Thanks for the speedy response Jeff,

My apologies for my response being late. Here is some further expounding on my problem...

I have a table that is structured as follows:

Create Table dbo.SensorHistory

(

[time] as DATETIME,

[SensorID] as INT,

[Value] as Float

)

Ihave a percent change system setup so the timestamp-value pair for a particular sensor is only saved if it has changed relative to its previous value by so many percent. As a result, there are times where some sensors have history and others do not. In order to align this data later, we need to fill in some gaps upon retrieval (we fill in gaps with the previously known value before a change. A sample of data contained in the historical table is as follows:

**ACTUAL**

'2013-09-24 08:20:00', 01, 1000

'2013-09-24 08:21:00', 01, 1003

'2013-09-24 08:23:00', 01, NULL

'2013-09-24 08:24:00', 01, 1009

'2013-09-24 08:25:00', 01, NULL

'2013-09-24 08:27:00', 01, NULL

**DESIRED**

'2013-09-24 08:20:00', 01, 1000

'2013-09-24 08:21:00', 01, 1003

'2013-09-24 08:23:00', 01, 1003

'2013-09-24 08:24:00', 01, 1009

'2013-09-24 08:25:00', 01, 1009

'2013-09-24 08:27:00', 01, 1009

Thanks,

-Leif

Dwain Camps SSC Guru Points: 86913 More actions · Answer 8

Leifton - For your case I might just use a correlated sub-query due to its simplicity and the fact that it has widespread understanding.

Create Table #SensorHistory

(

[time] DATETIME,

[SensorID] INT,

[Value] Float

);

INSERT INTO #SensorHistory

SELECT '2013-09-24 08:20:00', 01, 1000

UNION ALL SELECT '2013-09-24 08:21:00', 01, 1003

UNION ALL SELECT '2013-09-24 08:23:00', 01, NULL

UNION ALL SELECT '2013-09-24 08:24:00', 01, 1009

UNION ALL SELECT '2013-09-24 08:25:00', 01, NULL

UNION ALL SELECT '2013-09-24 08:27:00', 01, NULL;

SELECT [time], [SensorID]

,[value]=ISNULL(value,

(

SELECT TOP 1 [value]

FROM #SensorHistory b

WHERE a.[SensorID]=b.[SensorID] AND b.[time] < a.[time] AND

b.value IS NOT NULL

ORDER BY b.[time] DESC

))

FROM #SensorHistory a

GO

DROP TABLE #SensorHistory;

That doesn't mean it would be the fastest. Other solutions proposed could also be adapted, and I suggest you try a few to see which one works best for your data.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 9

dwain.c (9/24/2013)
Leifton - For your case I might just use a correlated sub-query due to its simplicity and the fact that it has widespread understanding.
Create Table #SensorHistory
(
[time] DATETIME,
[SensorID] INT,
[Value] Float
);
INSERT INTO #SensorHistory
SELECT '2013-09-24 08:20:00', 01, 1000
UNION ALL SELECT '2013-09-24 08:21:00', 01, 1003
UNION ALL SELECT '2013-09-24 08:23:00', 01, NULL
UNION ALL SELECT '2013-09-24 08:24:00', 01, 1009
UNION ALL SELECT '2013-09-24 08:25:00', 01, NULL
UNION ALL SELECT '2013-09-24 08:27:00', 01, NULL;
SELECT [time], [SensorID]
,[value]=ISNULL(value,
(
SELECT TOP 1 [value]
FROM #SensorHistory b
WHERE a.[SensorID]=b.[SensorID] AND b.[time] < a.[time] AND
b.value IS NOT NULL
ORDER BY b.[time] DESC
))
FROM #SensorHistory a
GO
DROP TABLE #SensorHistory;
That doesn't mean it would be the fastest. Other solutions proposed could also be adapted, and I suggest you try a few to see which one works best for your data.

That's good code for this task especially since I'd expect the number of NULLs in the [Value] column to be comparatively low.

The only thing that I'd change is to change the following line...

,[value]=ISNULL(value,

... to ...

,[value]=COALESCE(value,

...because ISNULL doesn't provide for the necessary "Short Circuit". The Correlated Sub Query (CSQ for short) is executed (as an Index Seek if one is available and as a Table Scan if one is not) n-1 times regardless of the number of NULLs. Changing it to COALESCE will cause it to "Short Circuit" if [Value] is not NULL just as if you had used a CASE statement. It's about the only thing that I don't like about ISNULL compared to COALESCE.

Guess I'll have to write a "Hidden RBAR" article on the subject.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Dwain Camps SSC Guru Points: 86913 More actions · Answer 10

Jeff - I see now how you manage to keep your post count so high. By working over time on holidays looking for old threads! 😛

Interesting point about ISNULL vs. COALESCE that I did not know.

Where in the world did you come up with such a thing?

Waiting to see that article published.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 11

dwain.c (12/1/2013)
Where in the world did you come up with such a thing?

Heh... after spending hours on posts like this one wondering why the hell the code won't short circuit like I want it to. 😛 A CASE statement worked a treat but I very much like the shorter code. I'd run into this problem in real life before but forgot about it until I spent an hour trying to figure out why your code wasn't short circuiting and then I kicked myself for forgetting. I don't care for COALESCE because of the datatype problems it can cause, the fact that it's a touch slower than ISNULL, and requires at least one cup of coffee for me to type correctly. 😀

To be honest, they could actually change COALESCE so that it doesn't short circuit (I believe that's as undocumented as the non-short circuit problem with ISNULL is) as it does now and we'd never know except for all the performance problems that would suddenly rear their ugly head. It would make the code much more bullet-proof if we used CASE like in the following...

--===== The CASE method DOES provide "Short Circuiting".

-- The CSQ is only executed when needed.

-- It's a bit messier than just using COALESCE, though

-- it may be more obvious to read/troubleshoot.

SELECT hi.[Time]

,hi.[SensorID]

, [Value] =

CASE

WHEN [Value] IS NOT NULL THEN [Value]

ELSE

( --=== The Correlated Sub Query (CSQ)

SELECT TOP 1 lo.[Value]

FROM #SensorHistory lo

WHERE lo.[SensorID] = hi.[SensorID]

AND lo.[time] < hi.[time]

AND lo.[Value] IS NOT NULL

ORDER BY lo.[time] DESC

)

END

FROM #SensorHistory hi

;

You'll be able to tell that I spent some time on this again from the following code (still kicking myself for forgetting in the first place). These are the final "demo" code snippets that I saved for posterity. They'll figure prominently in the article to come.

--===== Create the test table.

-- This is not a part of the solution.

CREATE TABLE #SensorHistory

(

[Time] DATETIME,

[SensorID] INT,

[Value] Float

)

;

--===== Populate the test table with some test data

-- This is not a part of the solution.

INSERT INTO #SensorHistory

([Time], [SensorID], [Value])

SELECT '2013-09-24 08:20:00', 01, 1000 UNION ALL

SELECT '2013-09-24 08:21:00', 01, 1003 UNION ALL

SELECT '2013-09-24 08:23:00', 01, NULL UNION ALL

SELECT '2013-09-24 08:24:00', 01, 1009 UNION ALL

SELECT '2013-09-24 08:25:00', 01, NULL UNION ALL

SELECT '2013-09-24 08:27:00', 01, NULL

;

--===== Create the expected Clustered Index or at least one that

-- will enhance the performance of code without causing

-- Page-Splits during inserts.

-- If this index is missing, then this IS a part of the solution.

CREATE UNIQUE CLUSTERED INDEX IXC_SensorHistory_Time_SensorID

ON #SensorHistory ([Time],[SensorID])

;

--===== The ISNULL method DOESN'T provide "Short Circuiting".

-- The CSQ is executed n-1 times, which can be very expensive.

SELECT hi.[Time]

,hi.[SensorID]

, [Value] =

ISNULL([Value],

( --=== The Correlated Sub Query (CSQ)

SELECT TOP 1 lo.[Value]

FROM #SensorHistory lo

WHERE lo.[SensorID] = hi.[SensorID]

AND lo.[time] < hi.[time]

AND lo.[Value] IS NOT NULL

ORDER BY lo.[time] DESC

)

FROM #SensorHistory hi

;

GO

--===== The COALESCE method DOES provide "Short Circuiting".

-- The CSQ is only executed when needed.

SELECT hi.[Time]

,hi.[SensorID]

, [Value] =

COALESCE([Value],

( --=== The Correlated Sub Query (CSQ)

SELECT TOP 1 lo.[Value]

FROM #SensorHistory lo

WHERE lo.[SensorID] = hi.[SensorID]

AND lo.[time] < hi.[time]

AND lo.[Value] IS NOT NULL

ORDER BY lo.[time] DESC

)

FROM #SensorHistory hi

;

GO

--===== The CASE method DOES provide "Short Circuiting".

-- The CSQ is only executed when needed.

-- It's a bit messier than just using COALESCE, though

-- it may be more obvious to read/troubleshoot.

SELECT hi.[Time]

,hi.[SensorID]

, [Value] =

CASE

WHEN [Value] IS NOT NULL THEN [Value]

ELSE

( --=== The Correlated Sub Query (CSQ)

SELECT TOP 1 lo.[Value]

FROM #SensorHistory lo

WHERE lo.[SensorID] = hi.[SensorID]

AND lo.[time] < hi.[time]

AND lo.[Value] IS NOT NULL

ORDER BY lo.[time] DESC

)

END

FROM #SensorHistory hi

;

GO

--===== The CROSS APPLY method DOESN'T provide "Short Circuiting"

-- because everything in the FROM clause is executed before anything else.

-- This is why a CSQ is sometimes much better than a CROSS APPLY.

-- To get it to "Short Circuit", you'd have to add a CASE or a similar

-- COALESCE to the CROSS APPLY.

SELECT hi.[Time]

,hi.[SensorID]

, [Value] = COALESCE(hi.[Value],ca.[Value])

FROM #SensorHistory hi

CROSS APPLY

(

SELECT TOP 1 lo.[Value]

FROM #SensorHistory lo

WHERE lo.[SensorID] = hi.[SensorID]

AND lo.[time] < hi.[time]

AND lo.[Value] IS NOT NULL

ORDER BY lo.[time] DESC

) ca ([Value])

;

GO

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Dwain Camps SSC Guru Points: 86913 More actions · Answer 12

Your answers to short-circuiting the CSQ is something I've been wondering about for quite some time, without really knowing how to go about testing for it.

So thanks!

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St