Data Drift

  • Hi SSC,

    The project I work on has a sort of weird setup. We run 3 data centers, hot/hot/hot and traffic is load balanced to each data center. The way we keep them in sync is to basically run any data loading and/or procedures in each data center, but we inevitably get drift. A proc will run out of sequence from another process... a data load job will run an out of date file in one data center, or any of a number of little things that can cause data to get out of sync between the three data centers.

    What ends up happening is each team ends up building some very custom SSIS/SQL/C#/etc. solution to resolve THEIR drift issus, but none of them work very well, and certainly don't work universally across all apps and tables.

    Everyone I've spoken to about SQL Replication says its great when it works, but it can be a nightmare to recover from if it goes awry.

    And that's where our conversations usually stop, and I think it's ridiculous that we don't have an answer to this question.

    Are there any suggestions, or great reads out there which can address how to run mutliple data centers and keep them consistant (or even eventually consistant is fine)?

    BTW, I'm a developer, not a DBA, so I'm not very knowledgable about things like always on, replication, network topology, etc.

    Thanks in advance for any thoughts on the topic!

    - Gabe

    Executive Junior Cowboy Developer, Esq.[/url]

  • Xedni (8/11/2015)


    Hi SSC,

    The project I work on has a sort of weird setup. We run 3 data centers, hot/hot/hot and traffic is load balanced to each data center. The way we keep them in sync is to basically run any data loading and/or procedures in each data center, but we inevitably get drift. A proc will run out of sequence from another process... a data load job will run an out of date file in one data center, or any of a number of little things that can cause data to get out of sync between the three data centers.

    What ends up happening is each team ends up building some very custom SSIS/SQL/C#/etc. solution to resolve THEIR drift issus, but none of them work very well, and certainly don't work universally across all apps and tables.

    Everyone I've spoken to about SQL Replication says its great when it works, but it can be a nightmare to recover from if it goes awry.

    And that's where our conversations usually stop, and I think it's ridiculous that we don't have an answer to this question.

    Are there any suggestions, or great reads out there which can address how to run mutliple data centers and keep them consistant (or even eventually consistant is fine)?

    BTW, I'm a developer, not a DBA, so I'm not very knowledgable about things like always on, replication, network topology, etc.

    Thanks in advance for any thoughts on the topic!

    - Gabe

    Have you considered "log shipping" ? I don't know if fixing problems with it is any more or less complex than fixing problems with replication, but I can't imagine not at least taking a look-see. However, the driver on this probably needs to be the knowledge that says just how much drift you can afford to have. Just how "in sync" do they need to be? What; exactly; constitutes "sufficiently in sync"? That should be what drives the decision.

  • Thanks, Steve.

    So let me ask a more general question, assuming many companies don't run solely out of a single data center, what do people REALISTICALLY do to keep things in sync? As I said, a lot of people I talked to about SQL replication seem to shy away from it. Log shipping sounds useful, but I get the feeling it's only part of the picture.

    Can anyone give me examples of how they've done it? Maybe what your "dream" configuration would be? Some of the pitfalls you've run into?

    Again, thanks in advance for anything you can share.

    Executive Junior Cowboy Developer, Esq.[/url]

  • Xedni (8/12/2015)


    Thanks, Steve.

    So let me ask a more general question, assuming many companies don't run solely out of a single data center, what do people REALISTICALLY do to keep things in sync? As I said, a lot of people I talked to about SQL replication seem to shy away from it. Log shipping sounds useful, but I get the feeling it's only part of the picture.

    Can anyone give me examples of how they've done it? Maybe what your "dream" configuration would be? Some of the pitfalls you've run into?

    Again, thanks in advance for anything you can share.

    I wish I could offer you a direct example, but I have yet to work in such a scenario. Decisions about such things often depend rather highly on the nature of the reason for maintaining multiple synched copies of the same data. A lot of companies have trouble justifying the often sizable expense associated with anything other than a DR plan that's well thought out, tested, and exercised often enough to get the wrinkles out. That's expensive enough. Having multiple operational copies is usually rather considerably more expensive. You may want to have your company contact Gartner Group, and find some actual data on who is doing what. It will have a cost, but for something this complex, it might well be worth it.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply