How to find number in series is not entered in table

Question

Post reply

How to find number in series is not entered in table

amitsingh308

Ten Centuries

Points: 1287
More actions
March 31, 2010 at 11:51 pm

#100124

Hello Every body
I have a table in which there is a column which hold a series of numbers 1-10000 entered by the users but problem is that some numbers are not entered such as 99, 755,6870,..... n so many it is very difficult to find these missing number manually. so is it possible to find these numbers with the help of query??

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply

Arjun Sivadasan SSCertifiable Points: 5902 More actions · Answer 1

You can use tally table. Do a a not in with the tally table and your table.

You can read about tally table here - http://qa.sqlservercentral.com/articles/T-SQL/62867/

Ping back if you have any doubt.

- arjun

https://sqlroadie.com/

Paul White SSC Guru Points: 150457 More actions · Answer 2

There are many ways to do this (tally table as suggested by Arjun, recursive CTE, EXCEPT) here is a demonstration of one of these possible methods (EXCEPT):

-- For demonstration purposes only

USE tempdb;

GO

-- Drop the test table if it exists

IF OBJECT_ID(N'dbo.Test', N'U')

IS NOT NULL

DROP TABLE dbo.Test;

GO

-- Create the test table

CREATE TABLE dbo.Test

(

row_id INTEGER NOT NULL

PRIMARY KEY CLUSTERED

WITH (FILLFACTOR = 100),

);

GO

-- Add 10,000 rows numbered 1-10,000

INSERT dbo.Test

(row_id)

SELECT TOP (10000)

ROW_NUMBER() OVER (

ORDER BY (SELECT 0))

FROM master.sys.columns C1,

master.sys.columns C2,

master.sys.columns C3;

GO

-- Delete 50 random rows

-- and output the row_ids deleted

WITH Random (rn)

AS (

SELECT DISTINCT

TOP (50)

FLOOR(RAND(CHECKSUM(NEWID())) * 10000 + 1)

FROM master.sys.columns C1,

master.sys.columns C2,

master.sys.columns C3

)

DELETE dbo.Test

OUTPUT deleted.row_id

WHERE row_id = ANY

(

SELECT R.rn

FROM Random R

);

GO

WITH Tally (n)

AS (

SELECT TOP (10000)

ROW_NUMBER() OVER (

ORDER BY (SELECT 0))

FROM master.sys.columns C1,

master.sys.columns C2,

master.sys.columns C3

)

-- Find the missing row_ids

SELECT n

FROM Tally

EXCEPT

SELECT row_id

FROM dbo.Test T

-- Tidy up

DROP TABLE dbo.Test;

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

WayneS SSC Guru Points: 95405 More actions · Answer 3

You might want to check out this recent thread (click here). It seems to be very similiar to what you are asking.

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

Arjun Sivadasan SSCertifiable Points: 5902 More actions · Answer 4

Paul, can you explain this? Especially, the ROW_NUMBER() OVER (ORDER BY (SELECT 0)) part. I'm sorry if i am bugging you.

SELECT TOP (10000)

ROW_NUMBER() OVER (

ORDER BY (SELECT 0))

FROM master.sys.columns C1,

master.sys.columns C2,

master.sys.columns C3;

- arjun

https://sqlroadie.com/

Paul White SSC Guru Points: 150457 More actions · Answer 5

Arjun Sivadasan (4/1/2010)
Paul, can you explain this? Especially, the ROW_NUMBER() OVER (ORDER BY (SELECT 0)) part. I'm sorry if i am bugging you.
SELECT TOP (10000)
ROW_NUMBER() OVER (
ORDER BY (SELECT 0))
FROM master.sys.columns C1,
master.sys.columns C2,
master.sys.columns C3;
- arjun

It's a shorthand way to generate a 'virtual' tally table.

It is a CROSS JOIN of three tables which are guaranteed to exist. The ROW_NUMBER just assigns a sequential number to each resulting row. The ORDER BY (SELECT 0) just means that I don't care about the ordering of rows. Row numbers are generated sequentially, and the TOP limits the processing to 10,000 numbers. Note that the TOP stops execution as soon as 10,000 is reached - the full CROSS JOIN is not fully processed, just the first 10,000 rows.

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

Arjun Sivadasan SSCertifiable Points: 5902 More actions · Answer 6

I was wondering why you chose sys.columns. Thanks Paul. It makes sense now.

- arjun

https://sqlroadie.com/

andersonrj18 SSC Veteran Points: 264 More actions · Answer 7

andersonrj18

SSC Veteran

Points: 264

April 1, 2010 at 8:21 am

#1144019

Hehehe ..

The Paul's code is the best!!

!!!

Paul White SSC Guru Points: 150457 More actions · Answer 8

andersonrj18 (4/1/2010)
You might improve, but this is the idea!

Ha ha very funny...April 1^st I get it.

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

Alocyte SSChasing Mays Points: 617 More actions · Answer 9

Hi guys - great work determining missing rows.

I was wondering what the best way to go from the list of missing rows, to a list containing ranges of missing rows, is.

All the existing structure is true; but instead of results being 1, 3, 4, 5, 7 outputting two columns, [1,1], [3,5], [7,7], etc.

I have a very convoluted process that does the missing row determination as a left join from a known list of ID's; and an indexed Tally table (10mill rows). Taking this list of missing numbers (1,3,4,5,7) I do a self-join on it, with a - 1, to determine the minimum rowid's missing, and update this to a summary table. I do the same again to determine the maximum of the missing group, and update that to the summary table, after joining with a + 1. From this I have a list of range starts and range ends... This is applied to each row in the range begin and end rows by two update creating two tempory tables, with an Identity type id and a list of either the begins or the ends - then joining the missing rows with the results of a joining of the range-begin and range-end on id (as each beginning has a corresponding ending) - where the missing id is between the range-begin and range-end...

it works fast enough and all, but is very messy.

i attach code that can emulate the results I want; but the volume this is live on is up to 4 million rows... that makes it a tricky one.

Any suggestions, hints or ideas welcome...

Thanks

So long, and thanks for all the fishpaste 😉

Paul White SSC Guru Points: 150457 More actions · Answer 10

Alocyte (4/1/2010)
Hi guys - great work determining missing rows.

Thanks. Why not post this question in a new thread though?

I answered a very, very similar problem (ranges of groups) just recently.

I'll see if I can find it.

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

andersonrj18 SSC Veteran Points: 264 More actions · Answer 11

not me my friend .. you are funny!!

Your code doesn't work properly..

Tell me, what happens if I have two missing row_id in sequence?

765 and 766..

your code will show me just 765, where's the 766 ...humm ..could you explain??

thanks!!

Paul White SSC Guru Points: 150457 More actions · Answer 12

andersonrj18 (4/1/2010)
Your code doesn't work properly...

Have you run it? Works fine for me 😛

Your cursor code seems to have disappeared. All good!

Paul

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

andersonrj18 SSC Veteran Points: 264 More actions · Answer 13

not worked, you know that!!

This is my code ... it works!!

create table #temp (id int)

declare @cont int

set @cont=1

while @cont <= 1000

begin

insert into #temp values (@cont)

set @cont=@cont+1

end

select t1.id from #temp t1

where not exists

( select 1 from #tb_temporaria t2

where t1.id = t2.id)

drop table #temp

Paul White SSC Guru Points: 150457 More actions · Answer 14

andersonrj18 (4/1/2010)
not worked, you know that!!
This is my code ... it works!!

:laugh: A WHILE loop is no better than your original cursor! :laugh:

Try it on a million rows - your code and mine.

See how you go. 😛

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi