Normalization

Question

Normalization

Viewing 15 posts - 61 through 75 (of 144 total)

You must be logged in to reply to this topic. Login to reply

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 1

Nice post.

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

TomThomson SSC Guru Points: 104773 More actions · Answer 2

Steven993 (4/24/2011)
When you write Dace, I guess you mean Date.
I actually own almost every books Chris published. He wrote hundred of pages on NULL topic.
Maybe I'll come later and write something about the history of NULL in relational theory and why it's not allowed in relationland.

Ouch. Yes. My symptoms of old age are becoming more obvious.

There is not only one reason to exclude NULL from relational theory, they are a lot.
Here is my favorite one, quickly:
As we can both agree, relational theory is a branch of applied mathematics, it mainly take it roots from set theory and predicate logic.
A relation header can be seen as a predicate. It is the meaning of the relation.
Take this relation header:
SUPPLIER{S#, SNAME, CITY}
The predicate of the relation can be read as follow:
The SUPPLIER S# is named SNAME and is located in CITY.

Assuming <S#> is a key, that's only slightly different from my version:

For supplier S# the information we have about his name is indicated by SNAME and the information we have about his location is indicated by CITY.

More complicated - yes, because it represents the real world, not an ideal world with all the difficult bits ignored. But not at all difficult to understand.

Each attribute of the header is a variable of the predicate.
What is a tuple ?
A tuple is what you get when you assign values to the predicate's variables. Tuples are truth propositions.
Here are some propositions of that relation :
The SUPPLIER S127 is named Steven and is located in Paris.
The SUPPLIER S876 is named Chris and is located in London.

let's change that to

We have the information that the SUPPLIER S127 is named Steven and is located in Paris.

What append if you introduce NULL ?
The SUPPLIER S671 is named ??? and is located in Madrid.

We have no information about the name of supplier S671 and we have the information that he is located in Paris.

Is that a truth proposition ? Does it make sense ? No.

A very Date-like argument. Put up a straw man that effectively denies that NULL has any meaning, and then demonstrate as a consequence of this that using NULL resdults in meaninglessness. You can only arrive at that result by assuming as a premise what you are trying to prove. Codd must have written dozens of times that the meaning of NULL is that the information isn't available, as must many others. But Date (and apparently you) refuse to assign it that (or indeed any) meaning and then claim that because NULL makes no sense when you refuse to assign it any meaning it must be inherently wrong.

Concerning the address "problem".

As I pointed out, there isn't one - because one can always assign suitable defaults (or at least, in my experience of internation shipments using both goverment controlled postal services and other deliverers there isn't a problem) and the example was chosen to demonstrate that there are classes of thing where which attributes are meaningful varies greatly. Certainly it can be done with a single new relation, [address line], as you describe, provided that the optional components do occupy lines (I think Address Line is sometimes a bad idea for the name, actually, because sometimes they don't; but when in your address they always do it's not a bad idea).

This is a database design question.
It depends on the business rules you have to follow, to represent.
In France, the addresses are like that :
Steven Relational
16 rue de la Paix
75000 Paris
Name and FirstName.
Then one to 5 lines, containing whatever we want, street, house, region, door number, floor....
Then ZipCode and City
In databases we just want to save First name, Last name, ZipCode, City and the One to Five lines.
That's enough to send product to clients and bills to suppliers. No more details.
How to deal with the possible 5 address lines ?
I consider the address lines has multivalued properties of the Address.
Here is my database schema (primary key underlined, # for foreign keys):
Customer(cust_id, cust_lastName, cust_firstName)
Address(add_id, add_Name, add_zipCode, add_City, cust_id#)
AddressLine(add_id#, line_id, line_Content)
Then you can add the fact that a customer can have multiple address of different kind, etc.. no problem.
That's what I need 99% of the time.
If my business rules forces me to go deeper in details on address properties and manage 10, 18 or 50 properties like you proposes, I'll change my design and do something like that:
Address(add_id, add_Name, cust_id#)
Properties(pro_id, pro_Name) -- City, Street, ZipCode, Country, Floor, ....
AdressProperties(add_id#, pro_id#, ap_value)
I follow my business rules with simple design, I never see one NULL.

If wouldn't expect to see a null handling addresses other than in cases where you don't know all of the address and need to indicate that you don't know -"don't ship the goods because we have an incomplete shipping address, instead email and ask him to update us with the missing bits of the address" is an action that you sometimes have to take, it requires you to know that you have incomplete knowledge or of course you would never take it, so you either use a NULL or have an attribute that says "we don't know the value of that other attribute and it is a component of the shipping address", ie NULL for yourself instead of using one provided by the relational system. Do you have a way around that? A way that doesn't involve inventing a disguised NULL at the level above the relational level so that you may end up complicating life by inventing many different sorts of disguised NULL above the relational level, where in reality you can get by with maybe just one of maybe just two (whether it's one or two is another potentially amusing argument) in the relational level.

With good database design we can avoid NULL very easily.

In a world where there are never attributes that you need but don't know you can avoid null (but it isn't always easy). Unless you count having attributes that say "if we had NULL that other attribute over there would be NULL, but we don't like NULL so it can't be so we need this attribute here to tell you whether we have its value or not" as not having a NULL you can't avoid nulls in a world with missing information.

Tom

GSquared SSC Guru Points: 260824 More actions · Answer 3

Steven: You do realize that your address lines table violates 1NF, right? Rows are specifically ordered within the data (you even have a column for that), which Date says is a violation of 1NF (in "What First Normal Form Really Means", pp. 127–8).

Seems kind of "risky" to me, to go on about nulls being a violation of relational theory, whilst advocating violations of 1NF.

Tom is right that you either need to violate normal forms, or allow nulls, or build a table (relation) for each possible combination of columns.

As per Codd, there really need to be two kinds of nulls, and they both need to be allowed in relational data. One is "we don't know" and the other is "we know this has no value".

For example, I have no suffix on my name. If you have a "NameSuffix" column in a database, you can't easily indicate that it's definitely true that this should be empty in my case. For other people, you might have known positive values, like "Jr", "Sr", "III", et al. For others, you might not be certain if there's a value for that or not, and leave it null. But you really can't indicate "known empty" in any RDBMS that I know of. You can kind of fake it in string data with a zero-length string indicating "known no-value", but you can't do that for numeric, date, time, etc., columns.

If you avoid having a nullable "NameSuffix" style column, by violating 1NF and turning it into ordered rows, does a missing row mean there's definitely no value (in my case), or potentially missing data (e.g.: you simply don't know if Joe is Joe Jr or Joe Sr or just Joe, etc.)? In other words, all you've done is taken the "null" value and replaced it with an equally ambiguous "missing row" issue. It doesn't actually solve the problem, it just hides it. And hides it in a way that's hard to spot, because while a query of a table will easily show "hey, there's a NULL in the delivery date column" (for example), a query of a pivoted table won't leap out at the human eye as readily.

As a suggestion on the pivoted way of hiding nulls (your address lines table), I suggest adding flags for "1st element" and "last element". This helps with knowing not just the sequence, but also the completeness of the values. This helps with determing whether you have a complete dataset, or part of one. In the "NameSuffix" example, if you have "Smith" as position 2 and as "Last Element", then you know that there isn't a suffix. This solves the "known no-value" issue. Still violates 1NF, but at least it solves that other issue. I found that had practical value when I had to create a "PeopleNames" table with a similar structure, in order to handle names with elements outside of the usual Anglo-Saxon structures, and ended up going outside of normal forms into a more object-oriented approach (as you've done with address lines).

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Steven993 Mr or Mrs. 500 Points: 588 More actions · Answer 4

What the ??

What is violating 1NF, and what is your definition of 1NF ?

I give up, good luck guys 😉

sqlvogel SSCrazy Eights Points: 9526 More actions · Answer 5

GSquared (4/25/2011)
Steven: You do realize that your address lines table violates 1NF, right? Rows are specifically ordered within the data (you even have a column for that), which Date says is a violation of 1NF (in "What First Normal Form Really Means", pp. 127–8).

Are you seriously claiming that tuples that include numbers that can collectively be understood as any sort of sequence is a violation 1NF? That would be absurd and extraordinary but it isn't remotely true. It certainly isn't what Date means. 1NF is simply the definition of a relation, which means an unordered set of tuples. 1NF stipulates nothing about the values within attributes within tuples. As soon as you start placing restrictions on the type of values being used within relations then you are departing from the general notion of normalization and simply expressing design preferences. Those preferences may be good or bad or indifferent advice but they are nothing to do with what normalization is about.

SQL tables are of course not relation variables. By definition they have ordered columns, they permit duplicate rows and sometimes have columns with duplicate names, unnamed columns, etc, etc. So when we talk about normalization in SQL we need to be concerned first with whether a given table accurately represents a relvar or not. If a SQL table has unique column names, at least one uniqueness constraint and does not permit nulls then it accurately represents a relvar and satisfies the requirements of 1NF.

As per Codd, there really need to be two kinds of nulls, and they both need to be allowed in relational data. One is "we don't know" and the other is "we know this has no value".

There are more than two possible types of null however. At least several hundred seem to have been proposed (not sure if there is even an accurate tally of all of them). Probably there is no limit to the number of different types of null there could be. Fortunately we can always do without any of them, which is after all what the pioneers of the relational model, including Codd, did for at least the first decade or so of its history. It's also still what many people continue to do successfully today. Codd criticised the SQL implementation of nulls so it's particularly odd to see his name used to support arguments in favour of SQL-style nulls.

Nulls don't solve the problems of missing data. Neither Codd nor most other respected advocates of nulls claim that they do. Codd went to some trouble to point out the errors and contradictions that arise from the use of nulls, especially in SQL where they lead to inevitably problematic results. Those problems are not solved and probably cannot be - they can only be tolerated.

It's worrying that some people seem unable to understand how to design effective databases without nulls. Perhaps that's not true of the people contributing to this thread, but I do see and hear people claiming that the nulls are "inherent in the data" or inevitable because of the "real world" rather than being a design decision to be made by the database designer. It's worrying if database designers aren't trained how to design databases without nulls. It's worrying because database designers ought to take proper responsibility for the decision to use a null or not and that decision ought to be based on proper information about what the different options entail. They should not push the responsibility for their decisions onto the customer or the business problem or the data.

Steven993 Mr or Mrs. 500 Points: 588 More actions · Answer 6

@tom.Thomson

Tom.Thomson (4/25/2011)
Assuming <S#> is a key, that's only slightly different from my version:
For supplier S# the information we have about his name is indicated by SNAME and the information we have about his location is indicated by CITY.
More complicated - yes, because it represents the real world, not an ideal world with all the difficult bits ignored. But not at all difficult to understand.

I don't see any difference between the two predicate, they have the same meaning.

What ideal world are you talking about..... what difficult bits ??? Seriously.

Tom.Thomson (4/25/2011)
let's change that to
We have the information that the SUPPLIER S127 is named Steven and is located in Paris.

You can do that, as I said the meaning is the same.

Tom.Thomson (4/25/2011)
What append if you introduce NULL ?
The SUPPLIER S671 is named ??? and is located in Madrid.
We have no information about the name of supplier S671 and we have the information that he is located in Paris.

You can't do THAT !!

You'r changing the predicate. You modified the meaning of the relation !

That's a totally different relation. That predicate dosen't apply to others proposition.

You'r actually describing the following relation :

SUPPLIER_WITHOUT_NAME{S#, CITY}

Which by the way, is a nice way to handle missing information...............................

Tom.Thomson (4/25/2011)
If wouldn't expect to see a null handling addresses other than in cases where you don't know all of the address and need to indicate that you don't know -"don't ship the goods because we have an incomplete shipping address, instead email and ask him to update us with the missing bits of the address" is an action that you sometimes have to take, it requires you to know that you have incomplete knowledge or of course you would never take it, so you either use a NULL or have an attribute that says "we don't know the value of that other attribute and it is a component of the shipping address", ie NULL for yourself instead of using one provided by the relational system. Do you have a way around that? A way that doesn't involve inventing a disguised NULL at the level above the relational level so that you may end up complicating life by inventing many different sorts of disguised NULL above the relational level, where in reality you can get by with maybe just one of maybe just two (whether it's one or two is another potentially amusing argument) in the relational level.

I'm not sure to understand but you add some business rules saying : There is a minimum number of addresses properties I have to know before shipping. If I don't have thoses properties, send email to customer.

Is that right ?

That's so easy.

Add a table to my schema, named Required properties, which contains the required properties......

Address(add_id, add_Name, cust_id#)

Properties(pro_id, pro_Name) -- City, Street, ZipCode, Country, Floor, ....

AdressProperties(add_id#, pro_id#, ap_value)

RequiredProperties(pro_id)

With a simple query you know if a specific address has all the properties to valid shipping.

In a world where there are never attributes that you need but don't know you can avoid null (but it isn't always easy). Unless you count having attributes that say "if we had NULL that other attribute over there would be NULL, but we don't like NULL so it can't be so we need this attribute here to tell you whether we have its value or not" as not having a NULL you can't avoid nulls in a world with missing information.

I'm not sure to understand, but as I said, good database design can avoid NULL. Give an example if you want.

And I will never use some default values, like string without characters, etc....

That's just another king of NULL.....

@david-2 Portas

It's good to read you, in all this nonsense. :hehe:

Thomas Abraham SSChampion Points: 10761 More actions · Answer 7

I think the point to remember about theory vs practice is that theory explains the ideal, and CONSEQUENCES of not following the rules laid out by it. Practice takes what it can from the ideal that can be implemented AND used, with some level of understanding, by the non-coddian unwashed.

No current commercial DBMS (of which I am aware) implements pure "relational" theory. They all cut some corners.

As users of these systems, it is up to us to compare them to the relational ideal and determine the cost of the corners cut.

Yes, I use second address lines in my database designs, but I know the potential problems I introduce when I do so.

Ideally, Chris Date's vision of an OO-Relational DB will emerge, with, among other things, a formally defined range for each attribute and an explicit value representing UNKNOWN, with operators to handle it. Until then, we're stuck with NULLs if we want to deal with the real world.

Don't even get me started about tristate logic. 🙂

[font="Verdana"]Please don't go. The drones need you. They look up to you.[/font]
Connect to me on LinkedIn

Steven993 Mr or Mrs. 500 Points: 588 More actions · Answer 8

tabraham 36699 (4/26/2011)
Ideally, Chris Date's vision of an OO-Relational DB will emerge, with, among other things, a formally defined range for each attribute and an explicit value representing UNKNOWN, with operators to handle it. Until then, we're stuck with NULLs if we want to deal with the real world.

In "Chris Date's vision" (in relational theory in fact) there is no "explicit value representing UNKNOWN". There is no UNKNOWN.

Relational theory isn't an "ideal" thing that came from nowhere.

Every details is there for mathematical and practical reasons; for the logical user point of view.

Look at the Information Principle for example.

sqlvogel SSCrazy Eights Points: 9526 More actions · Answer 9

tabraham 36699 (4/26/2011)
I think the point to remember about theory vs practice is that theory explains the ideal, and CONSEQUENCES of not following the rules laid out by it. Practice takes what it can from the ideal that can be implemented AND used, with some level of understanding, by the non-coddian unwashed.

Theory is practical. If a hypothesis is demonstrated to be insufficient or incorrect in practice then it is simply wrong. The result of that knowledge is a revision to the theory so that it does work in practice. That is scientific method.

Thomas Abraham SSChampion Points: 10761 More actions · Answer 10

Theory is practical. If a hypothesis is demonstrated to be insufficient or incorrect in practice then it is simply wrong. The result of that knowledge is a revision to the theory so that it does work in practice. That is scientific method.[/quote]

I never said that theory wasn't practical. What isn't always practical (no matter how DESIRABLE it might be) is getting mere mortals to follow a simple set of instructions. Or, assuming that lack of mental ability isn't the problem, expediencies get in the way.

I do not have infinite time to solve all problems I face. Therefore, I implement suboptimal solutions to problems all the time. To do so, I evaluate many factors about the problem, such as: potential cost to me of not solving it at all, time to implement solution A, potential worst case cost of solution A, likelihood of worst case occurring, and so on for other solutions. Then, given my limited resources, I select the best cost/benefit pairing I can see, hoping that my estimates of costs and risks and probabilities are close to reality.

My boss does the same with me when I present him with a potential problem solution. Many times we implement a "good enough" solution. We know what we would LIKE to do, or perhaps SHOULD do. We just don't have the time/money/etc. to do it. (Sometimes that comes back to bite us in the *** as we have underestimated the long-term cost of the suboptimal solution selected. But, that's another issue for another day.)

How does this relate to theory vs practice? Well, Relational Theory can be thought of as a kind of optimal solution. The actual implementations of RDBMs are someone's idea of a good enough solution, given the constraints. When you select an RDBMs for use, you really have chosen to adopt one suboptimal solution in favor of all the others. Here's where knowledge of relational theory matters for someone that isn't developing a RDBMs: I need to know what can happen to my data given the corners cut for expediency in the design/implementation of the specific RDBMs (and interface tools/languages etc.) that I use. In fact, I need to use this in the selection process, as well as in daily use of the system.

Theory IS very practical. The implementation of it is NOT always so, but someone thought so when they did it. We do they best we can as mere mortals.

[font="Verdana"]Please don't go. The drones need you. They look up to you.[/font]
Connect to me on LinkedIn

GSquared SSC Guru Points: 260824 More actions · Answer 11

Steven993 (4/25/2011)
What the ??
What is violating 1NF, and what is your definition of 1NF ?
I give up, good luck guys 😉

Here's the definition I'm citing:

According to Date's definition of 1NF, a table is in 1NF if and only if it is "isomorphic to some relation", which means, specifically, that it satisfies the following five conditions:
1.There's no top-to-bottom ordering to the rows.
2.There's no left-to-right ordering to the columns.
3.There are no duplicate rows.
4.Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else).
5.All columns are regular [i.e. rows have no hidden components such as row IDs, object IDs, or hidden timestamps].

I already cited the source for this, but apparently researching that from the citation was too much work.

In reply to Tom's question about data about sequential numbers within the rows, no, that's obviously not what's being refered to here.

HOWEVER, what breaking the address data down into variable rows does is make non-atomic data where each row is actually dependent on row-sequencing to be able to reassemble it into a consumable form. This means row sequence is now actually part of the tuple definition, which does violate the whole basis of relational data in that row/column order is mechanically not necessary for the data to be used.

Is there anything really wrong with that? Heck no! But it is a violation of 1NF, and hence not a "relational" data structure. It actually follows the OODBMS model more closely, by assigning properties to objects, which require reassembly to become atomic data.

A tuple that has "sequence 1, value 'Joe'" isn't atomic data. It's not a tuple. It is, however, a valid property of a parent object.

It just seems odd to me to be arguing about the validity of using NULL in a database, based on an evangelical, orthodoxical interpretation of relational theory, whilst simultaneously advocating violations of the whole definition of "relation" and "tuple".

If it helps, think of it this way: The data in such a structure is not dependent on "the key, the whole key, and nothing but the key". It is also dependent on the value(s) in the row(s) before/after it. "Smith, position 2" isn't a complete datum. "Smith" can be a complete datum, atomic and dependent on nothing else, but "position 2" is incomplete, and depends on other rows, and the order of them, in order to be meaningful. Hence, non-atomic, hence not 1NF.

Again, I would like to stress that I consider this whole thing silly. It's a debate based on authority, not on provable natural law, and thus is arbitrary. Same as the argument about NULLs. NULLs are useful, so are ordered rows, in my opinion, and I view my opinion as being just as valid as anyone else's opinion. That's the way opinions work.

If the only way to avoid NULLs is to violate the very religion that is being cited as mandating that they are evil, then that's all I'm pointing out.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Steven993 Mr or Mrs. 500 Points: 588 More actions · Answer 12

Well we have a problem here.

I think you didn't understand what 1NF is.

Let's take data example to make it clear !!

What you are citing are the properties a TABLE must follow, how it must be read to be considered as a relation.

Address(add_id, add_Name, add_zipCode, add_City, cust_id#)

1 Personal 75000 Paris 3

2 Work 75000 Paris 3

AddressLine(add_id#, line_id, line_Content)

1 1 '16 rue de la Paix'

2 1 '4 rue de la Plaine'

2 2 '1st Floor, 2nd door'

That's 2 tables representing two relations.

What the definition of 1NF is saying is that there is no importance in the row order you see. Because a relation is a set of tuples, and a set dosen't have order.

I am obliged to order the row, to present them as a table. But in the relation behind that table there is no order !!!

AddressLine(add_id#, line_id, line_Content)

1 1 '16 rue de la Paix'

2 2 '1st Floor, 2nd door'

2 1 '4 rue de la Plaine'

AddressLine(add_id#, line_id, line_Content)

2 2 '1st Floor, 2nd door'

1 1 '16 rue de la Paix'

2 1 '4 rue de la Plaine'

Thoses tables above are differents tables !!!! Because obviously the row order is different.

But they represent the SAME relation. Beacause order dosen't matter.

This is what 1NF say about order. It has nothing to tell about the values.............

Same goes for attributes order.

.....

sqlvogel SSCrazy Eights Points: 9526 More actions · Answer 13

GSquared (4/26/2011)
Here's the definition I'm citing:
According to Date's definition of 1NF, a table is in 1NF if and only if it is "isomorphic to some relation", which means, specifically, that it satisfies the following five conditions:
1.There's no top-to-bottom ordering to the rows.
2.There's no left-to-right ordering to the columns.
3.There are no duplicate rows.
4.Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else).
5.All columns are regular [i.e. rows have no hidden components such as row IDs, object IDs, or hidden timestamps].
I already cited the source for this, but apparently researching that from the citation was too much work.
In reply to Tom's question about data about sequential numbers within the rows, no, that's obviously not what's being refered to here.
HOWEVER, what breaking the address data down into variable rows does is make non-atomic data where each row is actually dependent on row-sequencing to be able to reassemble it into a consumable form. This means row sequence is now actually part of the tuple definition, which does violate the whole basis of relational data in that row/column order is mechanically not necessary for the data to be used.

There is no sequence in the table structure being described by Steven. Self-evidently that's the case because SQL doesn't permit sequencing within tables and of course the relational model doesn't either. The relational model certainly does permit the relation described by Steve. If it is a relation then it is in 1NF. The "definition" you just cited simply explains some aspects of what a relation is: a data structure or a mathematical construction with the properties described - interpretation of the meaning of that structure is not what 1NF is about.

If the only way to avoid NULLs is to violate the very religion that is being cited as mandating that they are evil, then that's all I'm pointing out.

Now you are being silly. No one said nulls are "evil" or objected to them for "religious" or dogmatic reasons. The objections are about the practical disadvantages of nulls, the fact that they cause incorrect or contradictory results and break the principles of normalization. Those are reasons supported by Codd, Date and many other experts as well as a wealth of real practical evidence. Even Microsoft agrees (in case you value their opinion).

Let's keep the discussion to a sensible, practical evaluation of the pros and cons of nulls, rather than throw empty accusations around.

TomThomson SSC Guru Points: 104773 More actions · Answer 14

Steven993 (4/26/2011)
You can't do THAT !!
You'r changing the predicate. You modified the meaning of the relation !
That's a totally different relation. That predicate dosen't apply to others proposition.

Then you disagree with my previous changes that you said you agreed with? They too reworded things in terms of describing what information we have rather than describing something else. So why do you object to that rewording when there's a NULL around? Just because you don't like NULL, it seems; and that dislike, judging by the contrast between your reaction to the previous rewording of a case where there wasn't a NULL and your reaction to the rewording (with exactly the same semantic content as the previous rewording) where there is a NULL, has no rational basis.

The relation tells us what information we have, and that can happily include telling us that some information isn't available. That's exactly what Codd said the relation was about, too, so I don't think I'm out on some bizarre distant limb here.

You're actually describing the following relation :
SUPPLIER_WITHOUT_NAME{S#, CITY}
Which by the way, is a nice way to handle missing information...............................

And off we go right down the track leading to a ridiculous number of relations and and a complexity so great that there's no chance of seeing good results.

Tom.Thomson (4/25/2011)
If wouldn't expect to see a null handling addresses other than in cases where you don't know all of the address and need to indicate that you don't know -"don't ship the goods because we have an incomplete shipping address, instead email and ask him to update us with the missing bits of the address" is an action that you sometimes have to take, it requires you to know that you have incomplete knowledge or of course you would never take it, so you either use a NULL or have an attribute that says "we don't know the value of that other attribute and it is a component of the shipping address", ie NULL for yourself instead of using one provided by the relational system. Do you have a way around that? A way that doesn't involve inventing a disguised NULL at the level above the relational level so that you may end up complicating life by inventing many different sorts of disguised NULL above the relational level, where in reality you can get by with maybe just one of maybe just two (whether it's one or two is another potentially amusing argument) in the relational level.
I'm not sure to understand but you add some business rules saying : There is a minimum number of addresses properties I have to know before shipping. If I don't have thoses properties, send email to customer.
Is that right ?
That's so easy.
Add a table to my schema, named Required properties, which contains the required properties......
Address(add_id, add_Name, cust_id#)
Properties(pro_id, pro_Name) -- City, Street, ZipCode, Country, Floor, ....
AdressProperties(add_id#, pro_id#, ap_value)
RequiredProperties(pro_id)
With a simple query you know if a specific address has all the properties to valid shipping.

Sure. More tables, bigger joins, increased complexity. And a table telling you whether data in another table is complete or not, which is just a complex way of expressing NULL! If that's what you want, you can do it that way. As I said, you can always avoid NULL by having an attribute somewhere that says something isn't available. But I would rather be economical with complexity and use Occam's razor to reduce the number of entities that I have to consider, while you will happily introduce who knows how many extra relations and how much extra code just to avoid having NULL and using a multi-valued logic.

In a world where there are never attributes that you need but don't know you can avoid null (but it isn't always easy). Unless you count having attributes that say "if we had NULL that other attribute over there would be NULL, but we don't like NULL so it can't be so we need this attribute here to tell you whether we have its value or not" as not having a NULL you can't avoid nulls in a world with missing information.
I'm not sure to understand, but as I said, good database design can avoid NULL. Give an example if you want.

You appear to me to have demonstrated pretty clearly that bad database design can avoid NULL. But nothing you have written suggests to me that good database design can always avoid it.

And I will never use some default values, like string without characters, etc....
That's just another king of NULL.....

Oh, so that's something on which you disagree completely with Chris Date, is it? After all, it was he who claimed you can always avoid NULL by using default values.

Tom

Steven993 Mr or Mrs. 500 Points: 588 More actions · Answer 15

Obviously you don't know what a predicate is, or you'r just ignoring predicate variables. Nice 😉

You'r calling Occam's razor on the number of tables ? What a joke... Why are you normalizing in EKNF ? Just make an Excel sheet for your datas.

Tom.Thomson (4/26/2011)
Oh, so that's something on which you disagree completely with Chris Date, is it? After all, it was he who claimed you can always avoid NULL by using default values.

Source ?

Always good to read you.