Seeing logshipping degradation - any idea why?

  • Hello!

    I am logshipping from my production SQL 2005 server to a DR server offsite. There are 7 databases in sizes up to 150Gb. Due to business requirements, we are running every 15 minutes. This configuration seemed to run fine for a while, however recently the two largest databases (mostly, although occasionally more) have been dragging behind. The size of the logs that are shipped for the largest database typically are in the 50Mb range, however occasionally there will be one well over 100Mb. The standard ones load without issue, however the exceptions will take a while to load, and set the entire process back - we have seen up to 8-10 hours. There have been no changes to the server or the network between locations of late. Servers are both rebooted on a weekly basis. Is there something else potentially I could look at? Thanks for your help!

  • Have you identified what part(s) is taking so much time?

    Is the step to copy the file to your DR site part of the problem?

    Are you compressing the files?

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • We are not compressing anything, no. The file copy is seldom the bottleneck. Yes, a large logfile takes a little while to copy, however that is not the place we are seeing the issue. The issue is completely in the restore job; restoring the file on the DR server into the DR version of the database.

    As another note, I do not believe it could be server related. The DR server was built to exactly the same specs as the production server, and both are 64 bit 2005 Standard edition servers, with 16 CPUs and 8Gb of memory. The issue almost appears to be a degredation ... somehow. I just cannot figure out what would be causing that type of behavior.

  • When you do the restore job are you using with standby. If you don't need to use the dr server as a reporting server then you don't need to use with standby which will slow it down, use norecovery in that case. If you are using standby and want to use it for reporting then check what you have configured for active connections. You cant restore if there are active connections. You can configure it to kill active connections but then users won't always be able to run reports. I'd guess active connections is what's causing it to take hours though.

  • We are actually using (Standby/Read-Only) as there is a group reporting against that server, opposed to against production. Of course, they cannot as the restores are taking too long for them to have a reporting window. DR is the main purpose, however we tried giving them an alternate solution for reporting, and that is not working. It does not seem to be an issue for the database to grab single user mode to do the restores. Really, our long term solution for the reporting users is a separate reporting database, however the issue we are having is a question for the DR side of things -- the databases are not staying in synch with production as they once had ... so trying to come up with a possible answer as to why.

  • There are a number of causes for what you are seeing. If you need to QUICKLY get this resolved, find a good performance tuning consultant and get them on board to assist.

    Outside of that I would first do a file IO stall and waitstats analysis to determine cause(s) of slowness/delay.

    Also consider getting backup compression in place (preferably Hyperbac (now owned by Red-Gate)) so you can move those files around more quickly yet still use native backup/restore commands.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply