how to speed up replication for a huge batch job

  • one of our billing cycles results in over 40 million commands applied to the production database that needs to be replicated. so far my record is 14 hours to replicate it to our fastest subscriber after the end of the job.

    publisher, distributor and subscriber are all SQL 2005 64 bit with 32, 8 and 20 GB of RAM. The publisher to distributor part is fast and i'm trying to speed up the part to the subscriber.

    on the subscriber the disks are in RAID 1 + 0 or whatever it's called where it's RAID 1 volumes in a stripe. there is a separate RAID 0+1 set of disks for the transaction log.

  • I can only assume this is transactional repl? I would attempt the following:

    1) remove indexes from table at subscriber - reapply when delivery complete

    2) drop article from pub, when process complete at pub add article back & let snapshot agent bcp data to subscriber

    - only drawback - if you have a 200 mill row table & only 40 mill rows affected, all 200 will need to be delivered

    3) drop article from pub, use ssis to move only changed rows, add article back w/ nosync option

    my 2 cents...

    ChrisB MCDBA

    MSSQLConsulting.com

    Chris Becker bcsdata.net

  • this is a monthly process so i was hoping to speed it up while keeping it only as SQL replication and not dropping the replication. table is close to 300 million rows

    i want to avoid running a new snapshot because we audit the billing as soon as it is done and i was hoping to get the auditors out of the production server and onto a subscriber. but they need access to the data as soon as possible

  • Transactional replication isn't built for batch processing like this, hence the name 'transactional'. Your options are limited if you are forced to keep in trans repl. You will get some benefit from removing indexes at subscriber but best to look for other tools to solve. I bet your distributor is brought to it's knees when this process occurs...

    cb

    Chris Becker bcsdata.net

  • the old one was, but the new 64 bit one is nice. we combined several distributors to make this new one and it's faster than the old two combined

  • To speed up the commands from publisher to subscriber,

    1) You can drop the indexes if exists. This will give better performance.

    2) You can increase log reader agents batch size and distributor agents batch size.

    Thanks,

    Balaji L

  • I've found that having PULL subscriptions helps because the subscriber has to do alot of the work. With PUSH subscriptions the distributor has to do absolutely everything.

    I would also try editing your replication jobs so that they DON't run continuously. Set the logreader job to run every minute and the distribution job to run every 5 minutes and see if that helps.

  • if you can make that job into a set of stored procedures you can publish the "EXECUTION" of the stored procedure and it will effectively bypass the distribution database activity.


    * Noel

  • Yep, Noeld's suggestion will be by far the best option if it's suitable for your setup. Replicate the batch execution itself, not the changing data.

    Regards,

    Jacob

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply