load data and foreign keys

  • Since I am new to setting up a database on SQL Server 2008 R2 and creating tables that are referenced by foreign keys, I would like to ask the following about the 4 tables that I created below:

    1. Is the way I set up the foreign keys OK? Why or why not? Can you explain and/or point me to a reference that explains a better solution?

    2. In 3 of the tables, the data will not change much after it is initially loaded. However, in the table called Rej_History, I will be loading data daily. For all the tables, can you tell me if I need to drop the indexes every time I load data? Basically, can you tell me and/or point me to a reference that shows how to set up a script to load the data?

    CREATE TABLE [dbo].[Rej_Contacts](
        [MRC_Contact_ID] [int] IDENTITY(1,1) NOT NULL,
        [MRC_Prefix] [varchar](30) NULL,
        [MRC_Last_Name] [varchar](50) NULL,
        [MRC_First_Name] [varchar](50) NULL,
        [MRC_Phone_Number] [varchar](25) NULL,
        [MRC_Email] [varchar](150) NULL,
        [MRC_Address] [varchar](100) NULL,
        [MRC_City] [varchar](50) NULL,
        [MRC_State] [varchar](2) NULL,
        [MRC_Zip] [varchar](10) NULL,
        [MRC_Update_Date] [datetime] NULL,
        [MRC_Updated_By] [varchar](50) NULL,
        CONSTRAINT [PK_Rej_Contacts] PRIMARY KEY CLUSTERED ([MRC_Contact_ID] ASC)
            WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

    ------------

    CREATE TABLE [dbo].[Rej_History](
        [MRH_Id] [int] IDENTITY(1,1) NOT NULL,
        [MRH_Plan_Id] [int] NULL,
        [MRH_Create_Date] [datetime] NULL,
        [MRH_Code_ID] [int] NULL,
        [MRH_Tran_Count] [numeric](18, 0) NULL,
        [MRH_Batch_Size] [numeric](18, 0) NULL,
        [MRH_Tran_Code_Description] [varchar](max) NULL,
        CONSTRAINT [PK_Rej_History] PRIMARY KEY CLUSTERED ([MRH_Id] ASC)
            WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO

    ALTER TABLE [dbo].[Rej_History] WITH NOCHECK ADD CONSTRAINT [PK_MRH_RNumber] FOREIGN KEY([MRH_Plan_Id])
    REFERENCES [dbo].[Rej_Plans] ([MRP_Plan_Id])
    GO

    ALTER TABLE [dbo].[Rej_History] WITH NOCHECK ADD CONSTRAINT [PK_MRH_Tran_Codes] FOREIGN KEY([MRH_Code_ID])
    REFERENCES [dbo].[Tran_Codes] ([TRC_Code_Id])
    GO

    ----------------

    CREATE TABLE [dbo].[Rej_Plans](
        [MRP_Plan_Id] [int] IDENTITY(1,1) NOT NULL,
        [MRP_PLan_Number] [varchar](10) NULL,
        [MRP_Contact_Id] [int] NOT NULL,
        [MRP_Parent_Organization_Name] [varchar](100) NOT NULL,
        [MRP_Update_Date] [datetime] NULL,
        [MRP_Updated_By] [varchar](50) NULL,
        CONSTRAINT [PK_MRP_RNumber] PRIMARY KEY CLUSTERED ([MRP_Plan_Id] ASC)
            WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO

    ----------------

    CREATE TABLE [dbo].[Tran_Codes](
        [TRC_Code_Id] [int] IDENTITY(1,1) NOT NULL,
        [TRC_Tran_Code] [numeric](6, 0) NOT NULL,
        [TRC_Type] [char](1) NULL,
        [TRC_Title] [nvarchar](75) NULL,
        [TRC_Long_Definition] [varchar](max) NULL,
        CONSTRAINT [PK_Tran_Codes] PRIMARY KEY CLUSTERED ([TRC_Code_Id] ASC)
            WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

    -----------------

  • wendy elizabeth (3/29/2012)


    Since I am new to setting up a database on SQL Server 2008 R2 and creating tables that are referenced by foreign keys, I would like to ask the following about the 4 tables that I created below:

    1. Is the way I set up the foreign keys OK? Why or why not? Can you explain and/or point me to a reference that explains a better solution?

    In general I recommend using WITH CHECK instead of WITH NOCHECK as you have it. There are specific cases where WITH NOCHECK is acceptable as an interim step, but eventually all FKs should be "checked" so they are trusted; otherwise they can lead to suboptimal execution plans.
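    A sketch using one of your own constraints:

    -- Adding the constraint WITH CHECK validates existing rows up front
    -- and marks the FK as trusted, so the optimizer can rely on it.
    ALTER TABLE [dbo].[Rej_History] WITH CHECK
        ADD CONSTRAINT [PK_MRH_RNumber] FOREIGN KEY ([MRH_Plan_Id])
        REFERENCES [dbo].[Rej_Plans] ([MRP_Plan_Id]);
    GO

    -- If you already added it WITH NOCHECK, you can validate it later.
    -- The doubled CHECK is intentional: WITH CHECK validates existing
    -- rows, CHECK CONSTRAINT enables the constraint.
    ALTER TABLE [dbo].[Rej_History] WITH CHECK CHECK CONSTRAINT [PK_MRH_RNumber];
    GO

    -- Untrusted FKs show up here with is_not_trusted = 1.
    SELECT name, is_not_trusted
    FROM sys.foreign_keys
    WHERE parent_object_id = OBJECT_ID(N'dbo.Rej_History');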

    2. In 3 of the tables, the data will not change much after it is initially loaded. However, in the table called Rej_History, I will be loading data daily. For all the tables, can you tell me if I need to drop the indexes every time I load data? Basically, can you tell me and/or point me to a reference that shows how to set up a script to load the data?

    Whether you drop and recreate indexes before and after you load data will depend on how important performance is for your process. I find building indexes (or any persistent DDL, for that matter) in code tedious and a latent exposure to problems down the line. That said, some processes benefit greatly from this practice, while others see no measurable benefit at all. Test it both ways and see which you are happier with.
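    If you do test it, one variation that avoids keeping full index definitions in the load script is to disable and rebuild the nonclustered indexes around the load. A rough sketch, with a hypothetical nonclustered index name:

    -- Disable nonclustered indexes before the load; note the clustered
    -- index (the PK here) must stay enabled or the table is inaccessible.
    ALTER INDEX [IX_Rej_History_Create_Date] ON [dbo].[Rej_History] DISABLE;

    -- ...run the load here (BULK INSERT, SSIS, INSERT...SELECT, etc.)...

    -- REBUILD re-enables the index and defragments it in one step.
    ALTER INDEX [IX_Rej_History_Create_Date] ON [dbo].[Rej_History] REBUILD;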

    Code re-posted with formatting:

    CREATE TABLE [dbo].[Rej_Contacts](
        [MRC_Contact_ID] [int] IDENTITY(1,1) NOT NULL,
        [MRC_Prefix] [varchar](30) NULL,
        [MRC_Last_Name] [varchar](50) NULL,
        [MRC_First_Name] [varchar](50) NULL,
        [MRC_Phone_Number] [varchar](25) NULL,
        [MRC_Email] [varchar](150) NULL,
        [MRC_Address] [varchar](100) NULL,
        [MRC_City] [varchar](50) NULL,
        [MRC_State] [varchar](2) NULL,
        [MRC_Zip] [varchar](10) NULL,
        [MRC_Update_Date] [datetime] NULL,
        [MRC_Updated_By] [varchar](50) NULL,
        CONSTRAINT [PK_Rej_Contacts] PRIMARY KEY CLUSTERED ([MRC_Contact_ID] ASC)
            WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

    ------------

    CREATE TABLE [dbo].[Rej_History](
        [MRH_Id] [int] IDENTITY(1,1) NOT NULL,
        [MRH_Plan_Id] [int] NULL,
        [MRH_Create_Date] [datetime] NULL,
        [MRH_Code_ID] [int] NULL,
        [MRH_Tran_Count] [numeric](18, 0) NULL,
        [MRH_Batch_Size] [numeric](18, 0) NULL,
        [MRH_Tran_Code_Description] [varchar](max) NULL,
        CONSTRAINT [PK_Rej_History] PRIMARY KEY CLUSTERED ([MRH_Id] ASC)
            WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO

    ALTER TABLE [dbo].[Rej_History] WITH NOCHECK ADD CONSTRAINT [PK_MRH_RNumber] FOREIGN KEY([MRH_Plan_Id])
    REFERENCES [dbo].[Rej_Plans] ([MRP_Plan_Id])
    GO

    ALTER TABLE [dbo].[Rej_History] WITH NOCHECK ADD CONSTRAINT [PK_MRH_Tran_Codes] FOREIGN KEY([MRH_Code_ID])
    REFERENCES [dbo].[Tran_Codes] ([TRC_Code_Id])
    GO

    ----------------

    CREATE TABLE [dbo].[Rej_Plans](
        [MRP_Plan_Id] [int] IDENTITY(1,1) NOT NULL,
        [MRP_PLan_Number] [varchar](10) NULL,
        [MRP_Contact_Id] [int] NOT NULL,
        [MRP_Parent_Organization_Name] [varchar](100) NOT NULL,
        [MRP_Update_Date] [datetime] NULL,
        [MRP_Updated_By] [varchar](50) NULL,
        CONSTRAINT [PK_MRP_RNumber] PRIMARY KEY CLUSTERED ([MRP_Plan_Id] ASC)
            WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO

    ----------------

    CREATE TABLE [dbo].[Tran_Codes](
        [TRC_Code_Id] [int] IDENTITY(1,1) NOT NULL,
        [TRC_Tran_Code] [numeric](6, 0) NOT NULL,
        [TRC_Type] [char](1) NULL,
        [TRC_Title] [nvarchar](75) NULL,
        [TRC_Long_Definition] [varchar](max) NULL,
        CONSTRAINT [PK_Tran_Codes] PRIMARY KEY CLUSTERED ([TRC_Code_Id] ASC)
            WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • For tables with millions of rows you should really be careful with indexes. It can take literally hours (on commodity hardware) to delete, say, 10k rows from a heavily indexed table. In that case I'd go as far as dropping the indexes, doing your deletes/updates, and recreating the indexes. It would still take less time, and you'd get your indexes "defragmented", i.e. rebuilt, as a "free" bonus.

  • This history table will have one index, based on an identity column, and 2 columns that are foreign keys to two other tables. There will be about 4,000 records loaded per day. Can you tell me and/or point me to a script that shows the best method to handle this?

  • Z1024 (3/29/2012)


    For tables with millions of rows you should really be careful with indexes. It can take literally hours (on commodity hardware) to delete, say, 10k rows from a heavily indexed table. In that case I'd go as far as dropping the indexes, doing your deletes/updates, and recreating the indexes. It would still take less time, and you'd get your indexes "defragmented", i.e. rebuilt, as a "free" bonus.

    "Literally hours" sounds "literally ridiculous". The hours part of it had to be attributed to SQL Server finding the rows in the first place, i.e. the JOIN or WHERE clause on the DELETE query was what was taking hours, not the actual deletion of the rows.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • wendy elizabeth (3/30/2012)


    This history table will have one index, based on an identity column, and 2 columns that are foreign keys to two other tables. There will be about 4,000 records loaded per day. Can you tell me and/or point me to a script that shows the best method to handle this?

    The best method, only method really, is to test the different ways of loading using representative data on hardware similar to what you'll be using in production.

    How many rows are in the history table? Please post the DDL for the history table and the format of the incoming feed.

    How important is it to get the absolute best performance out of the load process? For 4,000 rows, unless you are under a strict SLA, I would not worry much. Barring some extenuating circumstances involving page splits and LOBs you likely won't see enough of a difference across loading methods to warrant messing with the indexes.
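    If it helps as a starting point for the load script, a common shape is: bulk-load the feed into a staging table, then insert into the real table, resolving the surrogate FK ids as you go. Everything below about the staging table, file path, and delimiters is an assumption about your feed:

    -- Hypothetical staging table matching the incoming feed layout.
    CREATE TABLE [dbo].[Stage_Rej_History](
        [Plan_Number] [varchar](10)    NULL,
        [Create_Date] [datetime]       NULL,
        [Tran_Code]   [numeric](6, 0)  NULL,
        [Tran_Count]  [numeric](18, 0) NULL,
        [Batch_Size]  [numeric](18, 0) NULL
    );
    GO

    TRUNCATE TABLE [dbo].[Stage_Rej_History];

    -- File path and delimiters are placeholders for the real feed.
    BULK INSERT [dbo].[Stage_Rej_History]
    FROM 'C:\feeds\rej_history.txt'
    WITH (FIELDTERMINATOR = '|', ROWTERMINATOR = '\n', TABLOCK);

    -- Translate the natural keys in the feed to the surrogate FK ids.
    INSERT INTO [dbo].[Rej_History]
        ([MRH_Plan_Id], [MRH_Create_Date], [MRH_Code_ID], [MRH_Tran_Count], [MRH_Batch_Size])
    SELECT p.[MRP_Plan_Id], s.[Create_Date], c.[TRC_Code_Id], s.[Tran_Count], s.[Batch_Size]
    FROM [dbo].[Stage_Rej_History] s
    JOIN [dbo].[Rej_Plans]  p ON p.[MRP_PLan_Number] = s.[Plan_Number]
    JOIN [dbo].[Tran_Codes] c ON c.[TRC_Tran_Code]   = s.[Tran_Code];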

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Here in my shop we have a history table with no PK on it; we just have FKs to the pertinent data and insert into it.

    Since it is based on the date and time of the action, we have a clustered index on the datetime and nonclustered indexes that make sense for the queries we run against the table.

    Also, since your table is going to be a history table, why delete anything from it? Wouldn't that defeat the purpose of having a history table?
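
    Something like this, mapped onto your history table (the index names are made up, and since your posted DDL clusters the PK on [MRH_Id], you would first have to drop it or recreate it as nonclustered):

    -- Cluster on the action datetime so daily inserts append at the end
    -- of the table and date-range queries scan contiguous pages.
    CREATE CLUSTERED INDEX [CIX_Rej_History_Create_Date]
        ON [dbo].[Rej_History] ([MRH_Create_Date]);
    GO

    -- Nonclustered indexes to support joins through the two FK columns.
    CREATE NONCLUSTERED INDEX [IX_Rej_History_Plan_Id]
        ON [dbo].[Rej_History] ([MRH_Plan_Id]);
    CREATE NONCLUSTERED INDEX [IX_Rej_History_Code_ID]
        ON [dbo].[Rej_History] ([MRH_Code_ID]);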



  • capn.hector (3/30/2012)


    Here in my shop we have a history table with no PK on it; we just have FKs to the pertinent data and insert into it.

    Since it is based on the date and time of the action, we have a clustered index on the datetime and nonclustered indexes that make sense for the queries we run against the table.

    Also, since your table is going to be a history table, why delete anything from it? Wouldn't that defeat the purpose of having a history table?

    Wendy never mentioned deleting anything from her history table. The delete comment was from Z1024.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (3/30/2012)


    capn.hector (3/30/2012)


    Here in my shop we have a history table with no PK on it; we just have FKs to the pertinent data and insert into it.

    Since it is based on the date and time of the action, we have a clustered index on the datetime and nonclustered indexes that make sense for the queries we run against the table.

    Also, since your table is going to be a history table, why delete anything from it? Wouldn't that defeat the purpose of having a history table?

    Wendy never mentioned deleting anything from her history table. The delete comment was from Z1024.

    Quite right; time to get up and rest my eyes so I can see straight.



  • opc.three (3/30/2012)


    Z1024 (3/29/2012)


    For tables with millions of rows you should really be careful with indexes. It can take literally hours (on commodity hardware) to delete, say, 10k rows from a heavily indexed table. In that case I'd go as far as dropping the indexes, doing your deletes/updates, and recreating the indexes. It would still take less time, and you'd get your indexes "defragmented", i.e. rebuilt, as a "free" bonus.

    "Literally hours" sounds "literally ridiculous". The hours part of it had to be attributed to SQL Server finding the rows in the first place, i.e. the JOIN or WHERE clause on the DELETE query was what was taking hours, not the actual deletion of the rows.

    Sorry, but I'm talking from actual experience. On a client's DB, deleting about 30k records from a history table of about 5 million records was taking hours. Maybe more, because they never let it finish. And no, joins had nothing to do with these "literally ridiculous" deletion times. This is a well-known problem affecting not only SQL Server but probably any other RDBMS as well.

    Basically, when SQL Server deletes records from a table it logs each deleted row in the transaction log, and this slows the deletion process dramatically. Indexes are good for selects and joins, but in this case they are more of a burden.

    The solution is to delete records in smaller batches, about 1k rows each, and commit after each delete. Dropping the indexes that were not used in the delete join further improved the deletion times. In the end it took about 10 or 20 minutes, if memory serves.

    Granted, the hardware was very old, a 32-bit OS with only 2 or 3 GB of RAM, but still, this is a common problem, so even better hardware would choke on a similar task; only the numbers would be bigger - say a 100 million record table with 500k rows to be deleted.
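
    For anyone who hits this via a search, a minimal sketch of that batching pattern against a history table like the OP's (the batch size and the retention rule in the WHERE clause are only illustrative):

    DECLARE @BatchSize int = 1000;

    WHILE 1 = 1
    BEGIN
        -- Each batch is its own (autocommit) transaction, so it commits
        -- immediately and keeps the transaction log from ballooning.
        DELETE TOP (@BatchSize)
        FROM [dbo].[Rej_History]
        WHERE [MRH_Create_Date] < DATEADD(YEAR, -5, GETDATE());  -- illustrative retention rule

        IF @@ROWCOUNT < @BatchSize BREAK;  -- the last, partial batch was just deleted
    END;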

  • capn.hector (3/30/2012)


    Also, since your table is going to be a history table, why delete anything from it? Wouldn't that defeat the purpose of having a history table?

    But it can't grow indefinitely, right? Quite often you're only required to keep, say, 3 or 5 years of history data at most, not 10 years or more. Old rows take up disk space, tape space, backup time, etc. This might not apply in this particular case, but it's something to keep in mind when dealing with very large tables.

  • Z1024 (3/30/2012)


    opc.three (3/30/2012)


    Z1024 (3/29/2012)


    For tables with millions of rows you should really be careful with indexes. It can take literally hours (on commodity hardware) to delete, say, 10k rows from a heavily indexed table. In that case I'd go as far as dropping the indexes, doing your deletes/updates, and recreating the indexes. It would still take less time, and you'd get your indexes "defragmented", i.e. rebuilt, as a "free" bonus.

    "Literally hours" sounds "literally ridiculous". The hours part of it had to be attributed to SQL Server finding the rows in the first place, i.e. the JOIN or WHERE clause on the DELETE query was what was taking hours, not the actual deletion of the rows.

    Sorry, but I'm talking from actual experience. On a client's DB, deleting about 30k records from a history table of about 5 million records was taking hours. Maybe more, because they never let it finish. And no, joins had nothing to do with these "literally ridiculous" deletion times. This is a well-known problem affecting not only SQL Server but probably any other RDBMS as well.

    Basically, when SQL Server deletes records from a table it logs each deleted row in the transaction log, and this slows the deletion process dramatically. Indexes are good for selects and joins, but in this case they are more of a burden.

    The solution is to delete records in smaller batches, about 1k rows each, and commit after each delete. Dropping the indexes that were not used in the delete join further improved the deletion times. In the end it took about 10 or 20 minutes, if memory serves.

    Granted, the hardware was very old, a 32-bit OS with only 2 or 3 GB of RAM, but still, this is a common problem, so even better hardware would choke on a similar task; only the numbers would be bigger - say a 100 million record table with 500k rows to be deleted.

    And how does that have any bearing on the OP's question about foreign keys and loading data? A history table should rarely have deletes.

    To the OP: I would not worry too much about changing the indexes; it depends on how the data is loaded. Is it 4,000 rows all at once, or spread out through the day? Our history table gets between 100,000 and 3 million rows added throughout each day depending on activity. We just rebuild the indexes every night and let it go from there.
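
    The nightly step can be as simple as a rebuild in a scheduled job; the WITH option here is illustrative:

    -- Rebuild all indexes on the history table after the day's loads
    -- are finished, cleaning up insert fragmentation in one pass.
    ALTER INDEX ALL ON [dbo].[Rej_History]
    REBUILD WITH (SORT_IN_TEMPDB = ON);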

    EDIT: since this popped up while I was typing my last:

    Z1024 (3/30/2012)


    capn.hector (3/30/2012)


    Also, since your table is going to be a history table, why delete anything from it? Wouldn't that defeat the purpose of having a history table?

    But it can't grow indefinitely, right? Quite often you're only required to keep, say, 3 or 5 years of history data at most, not 10 years or more. Old rows take up disk space, tape space, backup time, etc. This might not apply in this particular case, but it's something to keep in mind when dealing with very large tables.

    When dealing with old data, you take care of it during downtime, whether over a holiday break in usage of your DB or some other noticeable lull, and you schedule the downtime to take care of removing the old data. Before the downtime, you figure out the most efficient way to handle the problem so as to limit the downtime.

    With that said, the OP can cross that bridge when she comes to it.



  • Z1024 (3/30/2012)


    opc.three (3/30/2012)


    Z1024 (3/29/2012)


    For tables with millions of rows you should really be careful with indexes. It can take literally hours (on commodity hardware) to delete, say, 10k rows from a heavily indexed table. In that case I'd go as far as dropping the indexes, doing your deletes/updates, and recreating the indexes. It would still take less time, and you'd get your indexes "defragmented", i.e. rebuilt, as a "free" bonus.

    "Literally hours" sounds "literally ridiculous". The hours part of it had to be attributed to SQL Server finding the rows in the first place, i.e. the JOIN or WHERE clause on the DELETE query was what was taking hours, not the actual deletion of the rows.

    Sorry, but I'm talking from actual experience. On a client's DB, deleting about 30k records from a history table of about 5 million records was taking hours. Maybe more, because they never let it finish. And no, joins had nothing to do with these "literally ridiculous" deletion times. This is a well-known problem affecting not only SQL Server but probably any other RDBMS as well.

    Basically, when SQL Server deletes records from a table it logs each deleted row in the transaction log, and this slows the deletion process dramatically. Indexes are good for selects and joins, but in this case they are more of a burden.

    The solution is to delete records in smaller batches, about 1k rows each, and commit after each delete. Dropping the indexes that were not used in the delete join further improved the deletion times. In the end it took about 10 or 20 minutes, if memory serves.

    Granted, the hardware was very old, a 32-bit OS with only 2 or 3 GB of RAM, but still, this is a common problem, so even better hardware would choke on a similar task; only the numbers would be bigger - say a 100 million record table with 500k rows to be deleted.

    Sorry, but you must have been running into something else. The mechanics of deleting 10K rows will not take hours, seriously, no matter how lowly the hardware. You were either fighting against a bad WHERE-clause on your delete, major fragmentation, a trigger, or something else you were not aware of.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • capn.hector (3/30/2012)


    And how does that have any bearing on the OP's question about foreign keys and loading data? A history table should rarely have deletes.

    Fair enough, but that was my response to opc.three's "sounds literally ridiculous" comment. My intention was to warn the OP, and any readers who might come here via a search, about this potential issue with large tables, not to argue with anybody. And that 3-liner seemed adequate.

  • opc.three (3/30/2012)


    Sorry, but you must have been running into something else. The mechanics of deleting 10K rows will not take hours, seriously, no matter how lowly the hardware. You were either fighting against a bad WHERE-clause on your delete, major fragmentation, a trigger, or something else you were not aware of.

    Sorry, but I was running into exactly what I described. Google search: delete large number rows from big table

    As you can see, this is a very common problem on at least SQL Server, Oracle, and MySQL, and the usually recommended solution worked fine.
