Best practices for replication (?) from Production to reporting dbs, also joins between on/off-prem dbs?

  • Hello. I have a set of Production OLTP databases and a set of internally developed and used applications/services, and I'm looking for ways to make the system more stable and perform faster.

    The first problem I'd like to solve is offloading the read-heavy queries to a reporting database. We would index this db for heavy reads. We can update our C# code and SQL objects to run queries against the reporting db (side question: should those reporting db queries all use WITH NOLOCK, to avoid blocking with the load operations?). What's the best way to get up-to-date data (doesn't have to be immediate, but not much lag) from the production dbs to the reporting dbs? Transactional replication would be good, but it can be flaky and difficult to manage, right? Mirroring leaves the target (reporting) db in recovery mode, so it can't be queried. Log shipping would require the target db to be in recovery mode while the logs are being applied, which would hurt the ability to query it during those times. What are the best/common practices for this situation?

    The second problem is our plan to move one of two main databases off-prem, to a data center. The two dbs have objects and ad-hoc queries that join between the dbs, and this works well with both dbs on the same instance, but I imagine performance will suffer tremendously if they work at all once those joins have to cross servers...and the WAN. I'm thinking of replicating the important/used tables from each db into a sibling db located on the other server. DB1 on Server1 gets a new DB1_REPLSubscriber on Server2, etc. DB1_REPLSubscriber gets no writes at all from the applications and services, it updates from its sibling via replication of some sort. All queries that run in DB2 on Server2 would be updated to join with DB1_REPLSubscriber, keeping those joins local, not across the WAN. Are there better ways of handling this scenario?

    Thanks for any help.

  • I'm liking replication for your requirements. You could also write a custom ETL job that runs frequently.

  • Thanks for the reply. Still researching, but transactional replication does seem like a good option.

  • TX Replication can be a pain. It's not the best solution, but I know a lot of people move to Availability Groups so they can get readable secondaries without the replication hassles.

  • Replication in 2008r2 is rock solid for me in my environment and we use it just for the purpose you are looking for, a read only copy for read only queries like a reporting server.

    The thing to note is to keep the publications as small as possible, only replicate the data you need and if you need your entire database then use multiple publications to keep the replication for that particular publication small.

    I would also recommend a separate distribution server if you are expecting heavy load.

  • Transactional replication is a viable choice but implementing, maintaining and troubleshooting Replication is not a trivial task. If you go that route with production servers then push subscriptions are the only way to go.

    If you are considering replication over a WAN I would only consider doing so if your network and WAN are solid. I once managed replication between Michigan and some town in Brazil with a flaky WAN connection and it was an awful experience.

    In my opinion - SSIS is really the way to go or even a SQL agent/Stored Proc solution. I think it would be easier to setup & maintain, and it is certainly easier to get help as there are more people that know SSIS than those that know replication.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply