Best way to handle database merging

  • Hi all, great forum; it has been very helpful.

    I am using SQL Server 2005 and can't use the SQL merge functionality for a number of reasons I won't go into at this time.

    I have a database which gets copied and sent out onto remote servers. When these servers come back, the files are pulled off them and loaded onto the SQL Server instance where the original database is located. This will be replaced by an XML export in the near future.

    Once loaded, I need to bring all the new and modified records into the primary database. As this application runs as an attachment to a vendor database which I do not control, I have to be very careful with relationships between the databases.

    Anyway, I am looking at generating a unique row identifier for each record. I was using a uniqueidentifier, BUT in testing I found two cases where the uniqueidentifier duplicated a value already in the primary database.

    So I have been looking at using a ServerName column (default value @@SERVERNAME) and an identity column (auto number) as a composite primary key, and then having an insert trigger which generates a value which is basically the server name plus the identity value.

    This will give me a guaranteed unique ID for each row regardless of server, and means I can simply bring in all new records and those with a later updated-by date.
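
    For concreteness, a minimal sketch of that approach (table and column names here are illustrative, not from the actual schema; a persisted computed column stands in for the insert trigger described above):

        -- Hypothetical table illustrating the composite-key approach.
        CREATE TABLE dbo.RemoteTransaction
        (
            -- Which server created the row; defaults to the local instance name.
            ServerName sysname NOT NULL DEFAULT (@@SERVERNAME),
            LocalId    int IDENTITY(1,1) NOT NULL,
            -- Server name + identity value, computed instead of trigger-generated.
            RowKey AS (ServerName + N'-' + CAST(LocalId AS nvarchar(10))) PERSISTED,
            UpdatedOn  datetime NOT NULL DEFAULT (GETDATE()),
            CONSTRAINT PK_RemoteTransaction PRIMARY KEY (ServerName, LocalId)
        );

    The composite key stays unique across every detached copy because no two copies share a server name.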

    So my question is, do you think this is the best option / way ahead? Or is there something else I should be looking at?

    Any input or comments would be greatly appreciated.

  • Hi Brad,

    I think we might need a little bit more info on this, but if I understand you correctly:

    1. You send out a copy of the MASTER DB to remote sites, e.g. at 00:00:01

    2. These remote sites' DB rows get updated/added/deleted

    3. Once a day, the remote site DB is uploaded to the master server, e.g. at 23:59:59

    4. An audit procedure then runs between the uploaded and master DBs

    5. Then step 1 starts again.

    I suppose my questions would be logistical.

    a. Is there any reason why you can't use Replication Services?

    b. Why would you upload the complete DB and not just the changes, isolated by the auditing procedure on both sides?

    | If in Doubt...don't do it!! |

  • Hi thlubbe,

    Thanks for the response.

    OK, we have two DBs: one is a vendor application and the primary (we will call it DB1), and the other (DB2) is one we own and have built to let us record additional information, transactions, etc.

    DB1 raises the transactions and generates the keys which are used in DB2 to relate to the DB1 record. DB2 then stores additional info about the transaction in DB1 as well as having its own functionality.

    Now we often send teams out without any connectivity and they may be gone for many months. When they return we bring in all their transaction data and make any updates based on a heap of rules.

    As I do not control DB1, I can only work with what I have. DB1 does not generate any sort of external sequential key, and while a copy of DB1 is away, the DB1 back on the network continues to generate ids which are the same as those on the detached copy (as expected).

    On return I copy a lot of the info across from DB1 into DB2, so that the records in DB2 which relate to DB1 don't get confused when the DB1 reference changes from the detached copy to the master copy.

    Anyway, the issue I have found with all the replication/merge services is that they do not take into account primary and foreign key constraints across multiple databases.

    I cannot make any additions to DB1's schema.

    I currently upload the complete DB because, if in doubt, having more data is a good thing. Once I have it all sorted I will bring in only the changes, and then I will grow it to export the changes to XML and import from XML. This will let users merge the data from the client machine rather than relying on the DBA to add it to the server.
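
    As a rough illustration of that XML step (table and column names are placeholders carried over from the earlier sketch, and @LastSync is an assumed cut-off), SQL Server 2005 can produce the change set with FOR XML:

        DECLARE @LastSync datetime;
        SET @LastSync = '20080101';  -- placeholder cut-off date

        -- Export rows changed since the last sync as a single XML document.
        SELECT ServerName, LocalId, UpdatedOn
        FROM dbo.RemoteTransaction   -- hypothetical table name
        WHERE UpdatedOn > @LastSync
        FOR XML PATH('Row'), ROOT('Changes');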

    Does that make sense?

  • Hmmm, a unique situation!

    But like all SQL queries, they usually are.

    "So I have been looking at using a ServerName Column (default value @@ServerName) and a Identity Column (Auto Number) as a composite primary key and then having an insert trigger which generates a value which is basically the ServerName + Identify Value.

    This will give me a certain UniqueID for each row regardless of server etc and mean I can simply bring in all new records and those which have a later updated by date.

    So my question is, do you think this is the best option / way ahead ? Or is there something else I should be looking at ?"

    From the information you supplied, and as far as I can gather, it sounds like you are on the right track.

    We had a similar issue a while back, so we ended up implementing a sort of staging area for the IDs where you can reference, manage and synchronise them.

    This staging area exists in both DBs, and the IDs are referenced whenever an insert/update/delete occurs. This made it easy for us to synchronise at the end of each day.

    For example:

    [SyncTable]
    DB1_ID (AutoIncrement)
    DB2_ID (AutoIncrement)
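
    In DDL terms, that mapping table might look something like this (a sketch only; note SQL Server allows just one IDENTITY column per table, so each ID here is a plain column populated from its own database's identity value):

        CREATE TABLE dbo.SyncTable
        (
            DB1_ID   int      NOT NULL,  -- key generated by the vendor DB (DB1)
            DB2_ID   int      NOT NULL,  -- key generated by our own DB (DB2)
            SyncedOn datetime NOT NULL DEFAULT (GETDATE()),
            CONSTRAINT PK_SyncTable PRIMARY KEY (DB1_ID, DB2_ID)
        );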

    Hope this helps.

    | If in Doubt...don't do it!! |

  • OK, cheers. The biggest issue we run into is that at any time we may have ten or so servers out and about, which left at various times and stay out for various periods. We never know when one will be back or when the next will leave.

    Just to make it worse, we have no connection to them once they have left, so we can only update/sync/merge when they return.

    We really hit issues if a new user is added while a server is away, as this is done in the vendor application and the user is given an id which we cannot reproduce on the primary if the same id has been allocated there since the server left.
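
    For what it's worth, a check along these lines (database, table and column names are all hypothetical) could flag the colliding ids when a server returns:

        -- Ids allocated on the detached copy that already exist on the primary
        -- but refer to a different user.
        SELECT r.UserId, r.UserName, p.UserName AS PrimaryUserName
        FROM ReturnedDB1.dbo.Users AS r   -- the uploaded copy (hypothetical name)
        JOIN DB1.dbo.Users AS p           -- the master vendor DB (hypothetical name)
          ON p.UserId = r.UserId
        WHERE p.UserName <> r.UserName;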

    Oh, the joy of working with vendor systems...
