Foreign Keys

  • Hugo Kornelis (4/12/2011)


    Tom.Thomson (4/11/2011)


    On a somewhat related topic, I occassionally wonder how best to put together a proposal to allow Foreign Keys to reference multiple columns (when I'm not wondering that it's because I've forgotten - again - that MS doesn't support this) - the current T-SQL syntax defined here doesn't allow multiple columns and this forces use of a surrogate key any time a natural key with more than one column is the natural target of a reference constraint.

    You should read the entire syntax descriptiopn.

    The first constraint syntax doesn't support creating a multi-column foreign key coonstraint, as that is the syntax for creating column-constrains. And no column-constraint can be multi-column. That syntax does not allow you to create a two-column primary key or unique constraint either.

    But if you scroll down, you'll find that the syntax for creating table-constraints allows you to create a multi-column foreign key constraint just fine. As David's example demonstrates.

    Thanks for reminding me - obviously my forgettery is working the opposite way to what I thought yesterday, it's when I'm thinking about putting togther a proposal that I've forgotten what MS does or doesn't support, not when I'm thinking they already support it.

    Tom

  • Tom.Thomson (4/12/2011)


    I think you've misunderstood horribly what I said.

    Maybe. Or maybe I just did not express myself clear enough.

    I'm not suggesting squeezing anything into a column: what I am saying is that by forcing a column to mean one of "this attribute is know to be applicable and this is its value", "the attribute is known to be applicable and its value is not present", or "this attribute is known to be inapplicable and its value is not present" - what Codd suggested in RM2 - you arrive at a situation where another column can't (without introducing a clear and direct inconsistency) permit the meaning "we don't know whether that attribute over there is applicable or not" because whatever value (or version of NULL) you put over there implies one of "we know it's applicable" and "we know it is not applicable", so the two columns would contradict each other.

    I don't really understand the argument you make here. With or without these RM2 suggestions, it is always possible to introduce inconsistent or even contradictory information in a database. That's what we have constraints for. In the example you give, the second column would be superfluous, because every conceivable state of the first column allows one and only one value in the second.

    The real reason I dislike both the RM2 model and the NULL-i / NULL-a idea becomes very clear by the phrases you use above to describe the three possibilities:

    * "this attribute is know to be applicable and this is its value" - there is an "and" in this phrase, which clearly indicates that it contains not one, but TWO facts. In this case, that is not really a problem, because one fact logically follows from the other. So you could rephrase this as a single fact plus an implied fact: "this is the value of this attribute (which implies that it is known to be applicable)".

    * "the attribute is known to be applicable and its value is not present" - again, the word "and" signals that there are two facts here. And in this case, rephrasing will not help: "the value of this attribute is not present (which implies nothing at all about it being applicable or not)" is correct, but not exactly the same information as the original phrase. However you describe it, this phrase combines two independent facts: the absence of a value of an attribute, and the applicability of the same attribute.

    * "this attribute is known to be inapplicable and its value is not present" - the same applies as for the previous phrase.

    So where the presence of a value does imply it's applicability, it's absence does not. If the applicability needs to be stored explicitly (which is not always the case; it may derive logically from other attributes), then a seperate attribute is needed, for the attribute and the applicability of the attribute are two different facts, that have to be stored in two different columns.

    There are three ways out: one is to have a single NULL whose sole meaning is "the value of this attribute is not present in your view of the database"; another column can then indicate reason (if any) for absence of the value; the other is use a different two null system (one null meaning inapplicable, the other meaning only "not present in your view"); and the third way is to use Codd's two NULLs and add a third which means only "not present in your view".

    I fail to see the need for a specific "not present in your view" marker. There either is or is not a "real" value for the attribute ("real" meaning "not NULL" - regardless of the number of different NULLs). If there is a value, then clearly it is present in my view. If there is no value, then clearly it is not present in my view. Or am I overlooking some situation where there is no value, yet it is not true that there it is not present in my view?


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo Kornelis (4/12/2011)


    I fail to see the need for a specific "not present in your view" marker. There either is or is not a "real" value for the attribute ("real" meaning "not NULL" - regardless of the number of different NULLs). If there is a value, then clearly it is present in my view. If there is no value, then clearly it is not present in my view. Or am I overlooking some situation where there is no value, yet it is not true that there it is not present in my view?

    What you are missing is the possibility that the value is present in the database but not present in your view of the database. Whether such a situation can exist will depend on the security model enforced by the database, so whether the qualification "in your view" is needed is dependent on what that security model is. When the qualification is not needed, adding it doesn't change the meaning. When it is needed, failing to add it is seriously detrimental to the semantics.

    If you think that is strange, try considering a database in which you can see the pay rates of people in certain departments up to certain department-dependent levels, but not the pay rates of people at higher levels or in other departments; some companies have policies that require this sort of security to be forced. It can't be solved by making the pay rate attribute invisible to you, because you have to be able to see some pay rates (those on which you have a power of reccomendation or decision, and those which you are allowed to see to give you valid comparisons with the rates you control). There are also non-comercial applications where this type of security is required.

    You could also consider attributes affected by a mandatory security policy (what used to be the DoD coloured books, although that description is long out of date) at levels B2 or higher; there the classification of an attribute may depend on the process by which its value was determined, and that process may have been different for the corresponding attribute in different rows.

    Tom

  • Hugo Kornelis (4/12/2011)


    Tom.Thomson (4/12/2011)


    I think you've misunderstood horribly what I said.

    Maybe. Or maybe I just did not express myself clear enough.

    I think your new response makes it quite clear that you still misunderstand.

    I'm not suggesting squeezing anything into a column: what I am saying is that by forcing a column to mean one of "this attribute is know to be applicable and this is its value", "the attribute is known to be applicable and its value is not present", or "this attribute is known to be inapplicable and its value is not present" - what Codd suggested in RM2 - you arrive at a situation where another column can't (without introducing a clear and direct inconsistency) permit the meaning "we don't know whether that attribute over there is applicable or not" because whatever value (or version of NULL) you put over there implies one of "we know it's applicable" and "we know it is not applicable", so the two columns would contradict each other.

    I don't really understand the argument you make here. With or without these RM2 suggestions, it is always possible to introduce inconsistent or even contradictory information in a database. That's what we have constraints for. In the example you give, the second column would be superfluous, because every conceivable state of the first column allows one and only one value in the second.

    So it's quite clear: the second column would be superfluous because it is precluded from contradicting the other column; but the other column does not permit a world in which we don't know whether someone is applicable or not. So this version or the two sorts of NULL game prevents us from representing the fact that we don't know whether something is applicable or not (for example whether we know whether or not someone for whom no data of death is recorded is dead or not).

    The real reason I dislike both the RM2 model and the NULL-i / NULL-a idea becomes very clear by the phrases you use above to describe the three possibilities:

    If it isn't absolutely clear to you from my last message that I don't like the RM2 model of NULLs then your level of English comprehension is vastly inferior to what I believe it to be - I will go as far as to say that if you claim that that dislike is not clear to you I have to conclude that you are intentionally searching for a way to misinterpret my message.

    * "this attribute is know to be applicable and this is its value" - there is an "and" in this phrase, which clearly indicates that it contains not one, but TWO facts. In this case, that is not really a problem, because one fact logically follows from the other. So you could rephrase this as a single fact plus an implied fact: "this is the value of this attribute (which implies that it is known to be applicable)".

    * "the attribute is known to be applicable and its value is not present" - again, the word "and" signals that there are two facts here. And in this case, rephrasing will not help: "the value of this attribute is not present (which implies nothing at all about it being applicable or not)" is correct, but not exactly the same information as the original phrase. However you describe it, this phrase combines two independent facts: the absence of a value of an attribute, and the applicability of the same attribute.

    I'm glad to see that despite claiming to disagree with me you have recognised the problem I was pointing out in Codd's proposal: this NULL-a is not useful unless it becomes part of a larger set, which would be detrimental to attribute atomicity. It is precidely this unsuitable definition that prevents an RM2 database from describing the real world (in which we may not know whether something whose value we don't have is inapplicable or not, even when we get a NULL of some sort as the result of an outer join).

    * "this attribute is known to be inapplicable and its value is not present" - the same applies as for the previous phrase.

    No it doesn't; the same applies as to the first phrase, not the second: there is only one independent fact here, the fact that the attribute is inapplicable; it is a consequence of inapplicability that no value is present.

    So where the presence of a value does imply it's applicability, it's absence does not. If the applicability needs to be stored explicitly (which is not always the case; it may derive logically from other attributes), then a seperate attribute is needed, for the attribute and the applicability of the attribute are two different facts, that have to be stored in two different columns.

    I think that is pretty much what I said in a previous message - you need to have this extra information in extra columns if you want a reasonable system. Of course it doesn't matter whether columns are introduced with the sole purpose of conveying that information or whether it falls out as a consequence of other information.

    Tom

  • Great question.

    I also thought that FKs were made for PKs only. :w00t:

    Thank you.

    Best regards,

    Best regards,

    Andre Guerreiro Neto

    Database Analyst
    http://www.softplan.com.br
    MCITPx1/MCTSx2/MCSE/MCSA

Viewing 5 posts - 46 through 49 (of 49 total)

You must be logged in to reply to this topic. Login to reply