January 11, 2013 at 3:42 pm
It's been a long time since I've dealt with complex database design, and in this case, I'd like to ask the opinions of others who are more current than I am.
I need to design a good structure to store and enforce the rules, if practical on two types of information.
Constraint: CLR is disallowed.
Constraint: SQL Server 2012 SP1 is the DB being targeted.
Physical object
No physical object relates to any other physical row.
Physical objects have at most one descendent virtual object row.
Physical objects have "Physical" type data, which is unrelated to the data virtual objects have (mostly non-overlapping)
Virtual object
All virtual objects have a parent of EITHER a physical object, OR a virtual object, never both.
No virtual object is its own parent.
All virtual objects relate either directly or indirectly (through ancestor virtual objects) to a physical object.
Virtual objects have zero or more descendent virtual object row(s).
Virtual objects have "Virtual" type data, which is unrelated to the data physical objects have (mostly non-overlapping)
I.e. example objects:
Phys1
Phys2
Phys3
Virt1 - parent Phys1
Virt2 - parent Phys2
Virt3 - parent Virt2
Virt4 - parent Virt3
Virt5 - parent Virt3
My initial idea is one I really don't like, as I don't like conditional field choice (use field A except when you use field B), but I hate fields that join to more than one table even more, which resulted in:
DB init
CREATE DATABASE [PhysVirtTest]
CONTAINMENT = NONE
ON PRIMARY
( NAME = N'PhysVirtTest', FILENAME = N'YourPath\PhysVirtTest.mdf' , SIZE = 8MB , FILEGROWTH = 4MB )
LOG ON
( NAME = N'PhysVirtTest_log', FILENAME = N'YourPath\PhysVirtTest_log.ldf' , SIZE = 4MB , FILEGROWTH = 4MB )
GO
ALTER DATABASE [PhysVirtTest] SET COMPATIBILITY_LEVEL = 110
GO
ALTER DATABASE [PhysVirtTest] SET ANSI_NULL_DEFAULT OFF
GO
ALTER DATABASE [PhysVirtTest] SET ANSI_NULLS OFF
GO
ALTER DATABASE [PhysVirtTest] SET ANSI_PADDING OFF
GO
ALTER DATABASE [PhysVirtTest] SET ANSI_WARNINGS OFF
GO
ALTER DATABASE [PhysVirtTest] SET ARITHABORT OFF
GO
ALTER DATABASE [PhysVirtTest] SET AUTO_CLOSE OFF
GO
ALTER DATABASE [PhysVirtTest] SET AUTO_CREATE_STATISTICS ON
GO
ALTER DATABASE [PhysVirtTest] SET AUTO_SHRINK OFF
GO
ALTER DATABASE [PhysVirtTest] SET AUTO_UPDATE_STATISTICS ON
GO
ALTER DATABASE [PhysVirtTest] SET CURSOR_CLOSE_ON_COMMIT OFF
GO
ALTER DATABASE [PhysVirtTest] SET CURSOR_DEFAULT GLOBAL
GO
ALTER DATABASE [PhysVirtTest] SET CONCAT_NULL_YIELDS_NULL OFF
GO
ALTER DATABASE [PhysVirtTest] SET NUMERIC_ROUNDABORT OFF
GO
ALTER DATABASE [PhysVirtTest] SET QUOTED_IDENTIFIER OFF
GO
ALTER DATABASE [PhysVirtTest] SET RECURSIVE_TRIGGERS OFF
GO
ALTER DATABASE [PhysVirtTest] SET DISABLE_BROKER
GO
ALTER DATABASE [PhysVirtTest] SET AUTO_UPDATE_STATISTICS_ASYNC OFF
GO
ALTER DATABASE [PhysVirtTest] SET DATE_CORRELATION_OPTIMIZATION OFF
GO
ALTER DATABASE [PhysVirtTest] SET PARAMETERIZATION SIMPLE
GO
ALTER DATABASE [PhysVirtTest] SET READ_COMMITTED_SNAPSHOT OFF
GO
ALTER DATABASE [PhysVirtTest] SET READ_WRITE
GO
ALTER DATABASE [PhysVirtTest] SET RECOVERY SIMPLE
GO
ALTER DATABASE [PhysVirtTest] SET MULTI_USER
GO
ALTER DATABASE [PhysVirtTest] SET PAGE_VERIFY CHECKSUM
GO
ALTER DATABASE [PhysVirtTest] SET TARGET_RECOVERY_TIME = 0 SECONDS
GO
USE [PhysVirtTest]
GO
IF NOT EXISTS (SELECT name FROM sys.filegroups WHERE is_default=1 AND name = N'PRIMARY') ALTER DATABASE [PhysVirtTest] MODIFY FILEGROUP [PRIMARY] DEFAULT
GO
Old-fashioned way
-- The first try, without using HierarchyID - output still needs to be consolidated with FOR XML or other tricks
USE [PhysVirtTest];
DROP TABLE dbo.VirtEx1;
DROP TABLE dbo.PhysEx1;
CREATE TABLE dbo.PhysEx1
( ID INT IDENTITY(1,1) NOT NULL
,Data VARCHAR(32)
,CONSTRAINT PKP1 PRIMARY KEY CLUSTERED (ID)
);
CREATE TABLE dbo.VirtEx1
( ID INT IDENTITY(1,1) NOT NULL
,VirtParentID INT NULL CONSTRAINT FKVV1 FOREIGN KEY REFERENCES PhysVirtTest.dbo.VirtEx1(ID)
,PhysParentID INT NULL CONSTRAINT FKVP1 FOREIGN KEY REFERENCES PhysVirtTest.dbo.PhysEx1(ID)
,Data VARCHAR(32)
,CONSTRAINT CKOnePar1 CHECK ((VirtParentID IS NOT NULL AND PhysParentID IS NULL) OR (VirtParentID IS NULL AND PhysParentID IS NOT NULL))
,CONSTRAINT CKVDup1 CHECK (VirtParentID <> ID)
,CONSTRAINT PKV1 PRIMARY KEY CLUSTERED (ID)
);
CREATE UNIQUE NONCLUSTERED INDEX UNIVP1 ON VirtEx1
(PhysParentID)
WHERE PhysParentID IS NOT NULL;
-- Insert Good Data
INSERT INTO PhysEx1 (Data) VALUES ('Phys1'), ('Phys2'), ('Phys3');
INSERT INTO VirtEx1 (VirtParentID, PhysParentID, Data) VALUES(NULL,1,'Virt1'), (NULL,2,'Virt2'), (2,NULL,'Virt3'),(3,NULL,'Virt4'),(3,NULL,'Virt5')
-- Basic data display
SELECT * FROM PhysEx1
SELECT * FROM VirtEx1
-- Virtual object parental display
-- THIS TECHNIQUE ALLOWS STRICTLY LIMITED NESTING (but doesn't loop)
-- If you don't have ALL NULLs in the right-most ancestor column,
-- you may be missing some ancestors!
SELECT Vdesc.*
,COALESCE(Vpar4.Data, Ppar4.DATA, Vpar3.Data, Ppar3.DATA, Vpar2.Data, Ppar2.DATA, Vpar1.Data, Ppar1.DATA) AS PhysHost
,COALESCE(Vpar1.Data, Ppar1.DATA) AS Parent
,COALESCE(Vpar2.Data, Ppar2.DATA) AS GrandParent
,COALESCE(Vpar3.Data, Ppar3.DATA) AS GreatGrandParent
,COALESCE(Vpar4.Data, Ppar4.DATA) AS GreatGreatGrandParent
FROM VirtEx1 Vdesc
LEFT OUTER JOIN VirtEx1 Vpar1
ON Vpar1.ID = Vdesc.VirtParentID
LEFT OUTER JOIN PhysEx1 Ppar1
ON Ppar1.ID = Vdesc.PhysParentID
LEFT OUTER JOIN VirtEx1 Vpar2
ON Vpar2.ID = Vpar1.VirtParentID
LEFT OUTER JOIN PhysEx1 Ppar2
ON Ppar2.ID = Vpar1.PhysParentID
LEFT OUTER JOIN VirtEx1 Vpar3
ON Vpar3.ID = Vpar2.VirtParentID
LEFT OUTER JOIN PhysEx1 Ppar3
ON Ppar3.ID = Vpar2.PhysParentID
LEFT OUTER JOIN VirtEx1 Vpar4
ON Vpar4.ID = Vpar3.VirtParentID
LEFT OUTER JOIN PhysEx1 Ppar4
ON Ppar4.ID = Vpar3.PhysParentID
-- Test Bad Data prevention
INSERT INTO VirtEx1 (VirtParentID, PhysParentID, Data) VALUES(NULL,1,'DuplicatePhysParent')
INSERT INTO VirtEx1 (VirtParentID, PhysParentID, Data) VALUES(NULL,NULL,'NoParent')
INSERT INTO VirtEx1 (VirtParentID, PhysParentID, Data) VALUES(1,1,'TooManyParent')
INSERT INTO VirtEx1 (VirtParentID, PhysParentID, Data) VALUES(111,NULL,'VirtParentNotExist')
INSERT INTO VirtEx1 (VirtParentID, PhysParentID, Data) VALUES(NULL,111,'PhysParentNotExist')
Possible hierarchyid based, but this is my very first foray into hierarchyid, and trying to get the Data values for ancestors isn't as simple as I'd hoped.
-- The second try, using HierarchyID - output is still not coded at all
USE [PhysVirtTest];
DROP TABLE dbo.VirtEx2;
DROP TABLE dbo.PhysEx2;
CREATE TABLE dbo.PhysEx2
( ID INT IDENTITY(1,1) NOT NULL
,Data VARCHAR(32)
,CONSTRAINT PKP2 PRIMARY KEY CLUSTERED (ID)
);
CREATE TABLE dbo.VirtEx2
( ID HIERARCHYID NOT NULL
,PhysParentID INT NULL CONSTRAINT FKVP2 FOREIGN KEY REFERENCES PhysVirtTest.dbo.PhysEx2(ID)
,Data VARCHAR(32)
--,CONSTRAINT CKOnePar2 CHECK ((VirtParentID IS NOT NULL AND PhysParentID IS NULL) OR (VirtParentID IS NULL AND PhysParentID IS NOT NULL))
,CONSTRAINT PKV2 PRIMARY KEY CLUSTERED (ID)
);
CREATE UNIQUE NONCLUSTERED INDEX UNIVP2 ON VirtEx2
(PhysParentID)
WHERE PhysParentID IS NOT NULL;
-- Insert Good Data
INSERT INTO PhysEx2 (Data) VALUES ('Phys1'), ('Phys2'), ('Phys3');
-- Hierarchyid guideance from http://stackoverflow.com/questions/1352778/sql-2008-hierarchyid-with-multiple-root-nodes
INSERT INTO VirtEx2 (ID, PhysParentID, Data) VALUES(hierarchyid::GetRoot().GetDescendant(NULL,NULL),1,'Virt1')
INSERT INTO VirtEx2 (ID, PhysParentID, Data) VALUES(hierarchyid::GetRoot().GetDescendant((select MAX(ID) from VirtEx2 where ID.GetAncestor(1) = hierarchyid::GetRoot()),NULL),2,'Virt2')
DECLARE @VirtParent hierarchyid
DECLARE @VirtNew hierarchyid
SELECT @VirtParent = ID FROM VirtEx2 WHERE Data = 'Virt2';
INSERT INTO VirtEx2 (ID, PhysParentID, Data) VALUES(@VirtParent.GetDescendant(NULL,NULL),NULL,'Virt3')
SELECT @VirtParent = ID FROM VirtEx2 WHERE Data = 'Virt3';
SELECT @VirtNew = MAX(ID) FROM VirtEx2 WHERE ID.GetAncestor(1) = @VirtParent;
-- Method that works only for the first node: INSERT INTO VirtEx2 (ID, PhysParentID, Data) VALUES(@VirtParent.GetDescendant(NULL,NULL),NULL,'Virt4')
INSERT INTO VirtEx2 (ID, PhysParentID, Data) VALUES(@VirtParent.GetDescendant(@VirtNew,NULL),NULL,'Virt4')
SELECT @VirtNew = MAX(ID) FROM VirtEx2 WHERE ID.GetAncestor(1) = @VirtParent;
INSERT INTO VirtEx2 (ID, PhysParentID, Data) VALUES(@VirtParent.GetDescendant(@VirtNew,NULL),NULL,'Virt5')
-- Basic data display
SELECT * FROM PhysEx2
SELECT ID.ToString() AS IDString, ID.GetLevel() AS IDLevel, * FROM VirtEx2
-- ??? Advanced data display - parents and grandparents.
-- Test Bad Data prevention
--'DuplicatePhysParent'
--'NoParent'
--'TooManyParent'
--'VirtParentNotExist'
--'PhysParentNotExist'
Thank you all for your time in reading and/or replying.
January 11, 2013 at 7:46 pm
I haven't done any database design that really matters for a few years, but I would do something very different from your idea.
Lets start with fundamental basics.
I want the design to be fairly proof against new releases of the RDBMS software, in this case SQL Server; and I want not to have it preclude the use of recently added (eight years ago?) features by using bizarre configuration. So I would change this block of code to have ON in every case where tyou have OFF.ALTER DATABASE [PhysVirtTest] SET ANSI_NULL_DEFAULT OFF
GO
ALTER DATABASE [PhysVirtTest] SET ANSI_NULLS OFF
GO
ALTER DATABASE [PhysVirtTest] SET ANSI_PADDING OFF
GO
ALTER DATABASE [PhysVirtTest] SET ANSI_WARNINGS OFF
GO
ALTER DATABASE [PhysVirtTest] SET ARITHABORT OFF
GO
ALTER DATABASE [PhysVirtTest] SET CURSOR_CLOSE_ON_COMMIT OFF
GO
ALTER DATABASE [PhysVirtTest] SET CONCAT_NULL_YIELDS_NULL OFF
GO
ALTER DATABASE [PhysVirtTest] SET NUMERIC_ROUNDABORT OFF
GO
ALTER DATABASE [PhysVirtTest] SET QUOTED_IDENTIFIER OFF
GOI'm not sure that I haven't included a single ine that could sensibly have OFF, but certainly the vast majority should have ON.
Then why are there two tables (virtual objects and physical objects) instead of one? It doesn't help at all, and it does hinder! Also, what is the attitude to NULL? It seems to me that avoiding NULL is the only imaginable reason for having two tables instead of one, but your schema has NULLs anyway. Depending on how NULL-averse you are you might have more than one table, but the extra tables over and above the basic one table would be link tables, not base object tables.
So I would end up with something like CREATE TABLE dbo.PhysorVirtEx
( ID INT IDENTITY(1,1) NOT NULL Primary key Clustered
,ParentID INT NULL REFERENCES PhysorVirtEx(ID),
,Data VARCHAR(32)
,IsVirtual BIT NOT NULL,
,CONSTRAINT CKOnePar1 CHECK (ParentID is not NULL or Isvirtual = 0)
,CONSTRAINT CKVDup1 CHECK (VirtParentID <> ID)
);As you can see, I'm not so null-averse as to be silly about it.
Tom
January 13, 2013 at 8:08 am
Physical objects have at most one descendent virtual object row.
Virtual objects have zero or more descendent virtual object row(s).
Which might be more common, someone asking to which physical object does this virtual object relate?? or which virtual objects refer to this physical object?.
One more question, if a physical object has 10 virtual descendants which would be more common, a separation of 10 levels between the physical object and the leaf-level virtual object (i.e. each virtual object has only one descendant) or a separation of far less that 10, maybe 2 (e.g. physical object has one virtual child and that virtual object has 9 virtual object children).
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
January 14, 2013 at 9:23 am
L' Eomot Inversé (1/11/2013)
I haven't done any database design that really matters for a few years, but I would do something very different from your idea.Lets start with fundamental basics.
I want the design to be fairly proof against new releases of the RDBMS software, in this case SQL Server; and I want not to have it preclude the use of recently added (eight years ago?) features by using bizarre configuration. So I would change this block of code to have ON in every case where tyou have OFF.
ALTER DATABASE [PhysVirtTest] SET ANSI_NULL_DEFAULT OFF
...I'm not sure that I haven't included a single ine that could sensibly have OFF, but certainly the vast majority should have ON.
Then why are there two tables (virtual objects and physical objects) instead of one? It doesn't help at all, and it does hinder! Also, what is the attitude to NULL? It seems to me that avoiding NULL is the only imaginable reason for having two tables instead of one, but your schema has NULLs anyway. Depending on how NULL-averse you are you might have more than one table, but the extra tables over and above the basic one table would be link tables, not base object tables.
So I would end up with something like
CREATE TABLE dbo.PhysorVirtEx
( ID INT IDENTITY(1,1) NOT NULL Primary key Clustered
,ParentID INT NULL REFERENCES PhysorVirtEx(ID),
,Data VARCHAR(32)
,IsVirtual BIT NOT NULL,
,CONSTRAINT CKOnePar1 CHECK (ParentID is not NULL or Isvirtual = 0)
,CONSTRAINT CKVDup1 CHECK (VirtParentID <> ID)
);As you can see, I'm not so null-averse as to be silly about it.
First, thank you very much for responding.
The ANSI settings were 100% the defaults SQL 2012 gave because I didn't care about the sample database, but I wanted one for your ease in cleaning up after helping me.
I'll edit the first post to add this information about why two tables:
Physical and Virtual objects have mostly very different data, which in a real table will be sets of columns - the "Data" field here is merely a placeholder for simplicity. This is why I have two tables, so that non-key fields are facts about the key.
I apologize for not including that information originally.
opc.three (1/13/2013)
Physical objects have at most one descendent virtual object row.
Virtual objects have zero or more descendent virtual object row(s).
Which might be more common, someone asking to which physical object does this virtual object relate?? or which virtual objects refer to this physical object?.
One more question, if a physical object has 10 virtual descendants which would be more common, a separation of 10 levels between the physical object and the leaf-level virtual object (i.e. each virtual object has only one descendant) or a separation of far less that 10, maybe 2 (e.g. physical object has one virtual child and that virtual object has 9 virtual object children).
Thank you for responding as well!
I expect that going from physical to virtual, and from virtual to physical, will be more or less equally common. Both will convey valuable and necessary data for certain tasks, and all of those tasks will be required on a day to day basis.
Most often, a physical object will have from 0 to 6 direct virtual descendants at a leaf level (i.e. the first level down). Rarely, a physical object will have a completely unknown tree structure going 2-N levels down, where N is in actuality unbounded, but almost certainly less than 5. These cases cannot be easily predicted in advance - perhaps they're wide and deep, perhaps narrow and deep. I'd guess they'd be narrow and deep more often, but I cannot guarantee that.
Alas, the rare/exception cases are always a pain to deal with.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply