Multi Column Sort

Question

Multi Column Sort

Viewing 15 posts - 136 through 150 (of 180 total)

You must be logged in to reply to this topic. Login to reply

SwePeso SSC-Dedicated Points: 39725 More actions · Answer 1

I think that sample data is adjusted to OP needs.

OP fourth post (first on second page) still doesn't answer the question of "3, NULL, 9" sample data. OP just rephrase himself without giving a definitive answer.

This can also be interpreted as not all records should be returned. Just return all records that satisfies the rules and omit the ones that doesn't.

Until OP leaves a definitive answer to "3, NULL, 9" sample data enigma, we will not know.

N 56°04'39.16"
E 12°55'05.25"

john.arnott SSChampion Points: 11882 More actions · Answer 2

Since this seems to be a classic example of (whose signature line has this?) "You need an answer? NO. You need a question.", I'd like to offer a hypothetical case as a guess at the underlying requirements. I, like some people, can attack a problem more easily with a concrete model in mind than if it's presented in the abstract. I know you-all will let me know if I've mis-interpreted this.

Let's say that the columns each represent a real-time reading of cumulative rainfall from various measuring stations and that those individual metrics on any one row were gathered at the same time. Because those stations are in remote areas, sometimes they cannot send a reading. When that happens, there's a NULL for that station's column, but any subsequent reading that's recorded for that station will have a higher value than any previous value in that column.

Now, let's say that some time into the gathering of data, it's realized that the table was not provided with a timestamp column or identity column. The ID column is, for whatever reason, not to be trusted for ordering. Perhaps the designer assumed that since there'd be a row every day, it was known when the first row was recorded, and since the data for each column is always ascending, there'd be no need. Or maybe s/he assumed that data may be retrieved in the order inserted, a typical beginner's error with SQL. (Not necessarily what really happened with the OP's situation, but it's my hypothetical to make the nature of the data understandable).

So, now we need to report the data in sequence, and since there are NULLs in some rows for col1, we cannot simply order by Col1. Or any other column as the data stands.

****EDIT****

RBarry points out (in very next post after this) that in the original problem, the values in any one row are specfied as always being in ascending sequence. I would think that this is a peripheral characteristic of the data as given and may or may not affect the solution, depending on whether you want to allow for indeterminate order caused by impossible rows.

RBarryYoung SSC Guru Points: 143329 More actions · Answer 3

First, I think that it is important to understand that this is actually a SQL puzzle or theoretical problem. It is not a real-world problem. As such, the rules of math & comp-sci puzzles apply. In particular is "assume the problem is complete/answerable as stated."

Here is what the OP said:

The values always increment across the row such that, when the rows are in order, the values in all columns also increment top to bottom. There are random cells that do not have a value (zero in this case) and are not considered in the sort.

Poorly stated though it is, there is still only one logically consistent way to interpret this statement, which the following rules:

1) When the rows are "in order", values will monotonically increase left to right and top to bottom.

2) This places implicit restrictions on the values of columns within a row: they must always be increasing left to right.

3) This also place restrictions on the make-up of the datasets: It must be possible to order them this way. If not, then the meaning of "in order" is indeterminate

Note that there are no provisions in any of this that would allow for dropping or excluding any rows. Under the rules of puzzles & theoretical problems we must then conclude that this is not allowed.

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

SwePeso SSC-Dedicated Points: 39725 More actions · Answer 4

I agree with most points, but I strongly disagree with creating sample data that fulfills OP's wanted output. I don't believe in some divine intervention that sample data is perfectly built the way OP want it sorted.

N 56°04'39.16"
E 12°55'05.25"

john.arnott SSChampion Points: 11882 More actions · Answer 5

RBarryYoung (4/8/2009)
. . . .
1) When the rows are "in order", values will monotonically increase left to right and top to bottom.
2) This places implicit restrictions on the values of columns within a row: they must always be increasing left to right.
3) This also place restrictions on the make-up of the datasets: It must be possible to order them this way. If not, then the meaning of "in order" is indeterminate
Note that there are no provisions in any of this that would allow for dropping or excluding any rows. Under the rules of puzzles & theoretical problems we must then conclude that this is not allowed.

I would think that this is a peripheral characteristic of the data as given and may or may not affect the solution, depending on whether you want to allow for indeterminate order caused by impossible rows. It is a valid point to consider, and I will edit my "hypothetical" post to note this aspect of the data.

RBarryYoung SSC Guru Points: 143329 More actions · Answer 6

john.arnott (4/8/2009)
RBarryYoung (4/8/2009)
. . . .
1) When the rows are "in order", values will monotonically increase left to right and top to bottom.
2) This places implicit restrictions on the values of columns within a row: they must always be increasing left to right.
3) This also place restrictions on the make-up of the datasets: It must be possible to order them this way. If not, then the meaning of "in order" is indeterminate
Note that there are no provisions in any of this that would allow for dropping or excluding any rows. Under the rules of puzzles & theoretical problems we must then conclude that this is not allowed.
I would think that this is a peripheral characteristic of the data as given and may or may not affect the solution, depending on whether you want to allow for indeterminate order caused by impossible rows. It is a valid point to consider, and I will edit my "hypothetical" post to note this aspect of the data.

Whether you allow for it or not, the correct order in any dataset that does not conform to these rules is "undefined" and thus meaningless. Tests are only valid on the datasets that conform to these rules, so there is no real point in testing any other datasets.

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

RBarryYoung SSC Guru Points: 143329 More actions · Answer 7

Peso (4/8/2009)
I agree with most points, but I strongly disagree with creating sample data that fulfills OP's wanted output. I don't believe in some divine intervention that sample data is perfectly built the way OP want it sorted.

My generator generates the data according to the specified rules and is always unambigously, deterministically sortable. No divine intervention necessary. If you find any flaw in it please let me know.

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 8

RBarryYoung (4/8/2009)
Peso (4/8/2009)
I agree with most points, but I strongly disagree with creating sample data that fulfills OP's wanted output. I don't believe in some divine intervention that sample data is perfectly built the way OP want it sorted.
My generator generates the data according to the specified rules and is always unambigously, deterministically sortable. No divine intervention necessary. If you find any flaw in it please let me know.

The relative position of nulls and data isn't entirely random 😛

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

john.arnott SSChampion Points: 11882 More actions · Answer 9

OK. If the puzzle's rules specify that the given data always has increasing values in the respective columns of each row, so be it. Does that affect one's solution? If you put the rows in such an order that every column is in order, disregarding nulls, does the left-to-right order (fixed in each row and not altered by the sort) matter to how you achieve the ordering?

With data that fits the rules, an ordered set might be:

col1 col2 col3 col4

1 2 _ 5

3 6 7 _

_ 7 8 9

4 _ 9 _

Would the solution algorithm be different with "illegal" data that could still be in order for every column?

col1 col2 col3 col4

_ 3 _ 2

3 5 _ 3

_ 6 2 _

4 _ 3 6

RBarryYoung SSC Guru Points: 143329 More actions · Answer 10

john.arnott (4/8/2009)
OK. If the puzzle's rules specify that the given data always has increasing values in the respective columns of each row, so be it. Does that affect one's solution? If you put the rows in such an order that every column is in order, disregarding nulls, does the left-to-right order (fixed in each row and not altered by the sort) matter to how you achieve the ordering?
With data that fits the rules, an ordered set might be:
col1 col2 col3 col4
1 2 _ 5
3 6 7 _
_ 7 8 9
4 _ 9 _
Would the solution algorithm be different with "illegal" data that could still be in order for every column?
col1 col2 col3 col4
_ 3 _ 2
3 5 _ 3
_ 6 2 _
4 _ 3 6

As far as I can tell, the left-to-right thing is just a distraction.

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

Christopher Stobbs SSC-Insane Points: 21098 More actions · Answer 11

Am I correct in saying the values need to stay in the same rows?

or could I sort this sample as follows:

Sample

Col1 Col2 Col3

9 1 6

null 7 2

8 null 4

result 1

Col1 Col2 Col3

1 2 4

6 7 8

9 null null

Or does it have to be like the this?

Result 2

Col1 Col2 Col3

1 6 9

2 7 null

4 8 null

This sounds like an awesome puzzle

----------------------------------------------
Try to learn something about everything and everything about something. - Thomas Henry Huxley

:w00t:
Posting Best Practices[/url]
Numbers / Tally Tables[/url]

SQL-4-Life

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 12

john.arnott (4/8/2009)
OK. If the puzzle's rules specify that the given data always has increasing values in the respective columns of each row, so be it. Does that affect one's solution? If you put the rows in such an order that every column is in order, disregarding nulls, does the left-to-right order (fixed in each row and not altered by the sort) matter to how you achieve the ordering?
With data that fits the rules, an ordered set might be:
col1 col2 col3 col4
1 2 _ 5
3 6 7 _
_ 7 8 9
4 _ 9 _
Would the solution algorithm be different with "illegal" data that could still be in order for every column?
col1 col2 col3 col4
_ 3 _ 2
3 5 _ 3
_ 6 2 _
4 _ 3 6

Hi John

With no solution yet for Barry's "killer data" I reckon we need every clue we can get 🙂

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

john.arnott SSChampion Points: 11882 More actions · Answer 13

Christopher Stobbs (4/9/2009)
Am I correct in saying the values need to stay in the same rows?
or could I sort this sample as follows:
Sample
Col1 Col2 Col3
9 1 6
null 7 2
8 null 4
result 1
Col1 Col2 Col3
1 2 4
6 7 8
9 null null
Or does it have to be like the this?
Result 2
Col1 Col2 Col3
1 6 9
2 7 null
4 8 null
This sounds like an awesome puzzle

By my understanding of the conditions of the puzzle, your sample data is illegal. Each row is to remain intact and the result has to find every column in ascending sequence after ignoring nulls. With your sample data, it's impossible to have both col2 and col3 in order at the same time.

SwePeso SSC-Dedicated Points: 39725 More actions · Answer 14

Well, I have thought out a proper use-case.

Think of the eight columns as a freeway with 8 lanes.

All 8 lanes are always open.

At the "end" of the freeway (starting from top to bottom), there comes a toll-booth. This is marked by the columns having a value.

You want to use the tool-booth where the cost is the lowest for passing by.

So you need to find the next record to continue the travel.

If you start with CorrectSequence 1, you see that toll-booh 1 and 2 are open. So you find next record which has at least one of toll-both 1 and 2 open. There can be other open too, but to pass toll-booth 1 or toll-booth 2 has to be used, and för any matching record you choose the one with the smallest difference in fees. Difference is between records.

So you find CorrectSequence 2 (by passing toll-booth 2). All traffic passes that toll-booth and spread over the 8 lanes again. When this section/record has ended (CorrectSequence 2) you find that now is toll-booths 2 and 3 are open. So you need to find another record which has at least toll-booth 2 or 3 open.

So you find CorrectSequence 3 because it has the smallest difference of fees. Traffic passes and spreads across all 8 lanes.

Then you find CorrectSequence 4 because that record has toll-booth 1 and 6 open. Traffic passes and then you find CorrectSequence 5 which has toll-booth 1 open and this toll-booth has the smalles difference of fees.

Does this explanation make anything clearer?

N 56°04'39.16"
E 12°55'05.25"

john.arnott SSChampion Points: 11882 More actions · Answer 15

Peso,

I'm not sure that I'm following your toll-booth analogy properly. I don't see how it fits with the data you posted as coming from the OP:

SELECT 1, 2.35, 3.01, 3.49, 4.25, 4.79, 5.36, 5.82, 6.31

UNION SELECT 2, null, 3.59, 4.32, null, 5.21, null, null, 7.12

UNION SELECT 3, NULL, null, null, 5.23, 5.41, 6.33, 6.89, 7.99

UNION SELECT 4, null, 5.46, 5.31, 5.92, null, 6.87, 7.34, null

UNION SELECT 5, 5.12, 5.66, 6.31, 6.92, null, 7.87, 8.34, null

By saying "toll booths 1 and 2 are open", do you mean you have non-null data in them? So then what? We find the row with the next higher value for col1 or col2? So far, so good -- we have found row 2. But then what happens... we see that in row 2 we have values in col2 and col3. If we just look for the next higher value in those two columns, we'll get row 4 and miss row 3.