Snapshot of mirrored database taking a very long time when it shouldn't

  • Hi everyone,

    We are having issues taking a snapshot of a mirrored database.

    We are dropping the snapshot and then recreating it on a daily basis. This is after we do a data load on the db. The data load is finished and tlog backups taken and mirror checked to make sure its synchronized before the snapshot is taken. But sometimes it takes hours and then fails.
    The mirror is set to high safety (synchronous).
    It is my understanding that if there are no open transaction, the mirror is up to date, tlog backups have been taken etc, that the snapshot of that mirrored db should be almost instant.

    Why does it sometimes take over an hour and then fail?
    Later on in the day, I can run the create snapshot job, and it only takes a few seconds.

  • I think the snapshot needs to handle any open transactions coming from the primary. If you're sure there aren't any, then I think you might have some subsystem issues. There is the need to create a new file that is used to store the original pages as they changes.

    Not sure this helps, but have you checked fragmentation and file system integrity on the server?

  • Thanks Steve,

    We've been looking into it and we do have a large amount of unrestored log on the secondary side for the mirror.
    42,402,530 KB with an estimated restore time of 2.5 hours right now.
    What would be factors influencing the speed at which that unrestored log is restored?
    We do have a snapshot on the secondary which we take each day after the unrestored log has caught up.  Could that be impacting the restore time?
    If not, what other options do we have to increase performance?
    Having a look at the disk perfmon stats, I can see that the avg disk sec/write for the disk is hovering around 8ms. The avg disk write queue length around 2, and % idle time around 10%
    Those stats on the surface look not too bad to me, so is it really a disk subsystem issue?

    Any help would be greatly appreciated.

  • Restoring the log should be a CPU/RAM/disk usage. The system needs to read the log and redo or unwind transactions as appropriate. If this is taking a long time, something seems wrong. Especially if you don't see CPU/disk pressure. Can you restart the server and see if there might some process running on the side? Any AV?

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply