Pulling data from oracle, SSIS vs OPENQUERY

  • I have an interesting scenario here, I can't find what the problem is

    I pull data from oracle every night into a table. I truncate the table and populate it with the data from the table in oracle.

    I use an OleDB source and destination in SSIS and it takes over an hour to pull the table across.

    The step the SSIS package follows are as follows.

    Truncate Table

    Drop all indexes

    Pull the data accross

    Create the indexes.

    The pulling of the data takes over an hour. However if I do this through SSMS and I write a query as such

    INSERT INTO Table1

    SELECT * FROM OPENQUERY(OracleDB,'SELECT * FROM OracleTable')

    This takes about 4 minutes.

    Now 4 minutes vs over an hour in the same conditions is a bit of a gap. What would the reasons for this be. I don't think SSIS should perform that much worse than the OPENQUERY.

    It is doing and INSERT BULK in the background, but surely that can't be the reason.

    Any help would be appreciated.

    Regards.

  • On the destination do you have "Table or View - fast load" selected, along with a table lock? If you are inserting into a heap table you should be able to get minimal logging and a decent write speed.

    To troubleshoot a little further you might want to go into your data flow and take out the destination and any transformations and add a Row Count transform. Then just run the package reading the source data into the Row Count transform. That will give you a baseline for how quickly your package can read from the source and the balance of the hour run time is in the xforms and insert.

  • Thanks, I will give that a try and see what happens.

  • Ryan Naude (1/14/2009)


    Thanks, I will give that a try and see what happens.

    So could you try it?

    I use a similar system, get enourmous data from oracle to sqlserver.but IF SSIS takes 4 minutes openquery takes about 20 or more minutes for same insert transaction,for instance sure.

    this topic is very very interesting for me . i would like to learn your experience about this .

    thanks

  • Unfortunately due to project contraints I didn't get chance to try this. I changed the way SSIS was pulling the data to pull a small chunk every day as appossed to the entire table. This had to happen regardless, but I am still curious as to the problem.

  • Hi,

    Would be interesting to see if anyone knows how to handle that kind of things... I have exactly the same scenario with the exactly flip-flopped results though:

    - SSIS task runs Stored Procedure that uses OPENQUERY to read approx. 400,000 rows from linked server (Oracle) and insert them into SQL table (truncating destination first) - takes from 4 to 6 mins to complete

    - Trying to speed things up, I created a DTS import package in SSMS and saved it as dtsx. That dtsx now has SQL task to Truncate and Data Flow consisting of Source (reads from Oracle) and Destination (puts data into SQL) - takes forever now, I stopped it after 40 mins or so

    Anyone?

    Thanks!

  • Hi

    We have several Oracle tables that I going to extract to our MS SQL 2005 datawarehouse for business modeling.

    I just did one test where I created a DTS 2000 package and migrated it into SSIS.

    Running the SSIS package took less then 3 minutes. That waa on 17 million records.

    Trying to do the same with our etl tool provided by our vendor gives 18 minutes on the same extract.

    The latter one uses ODBC, while my SSIS package uses OLE DB.

    Looking forward to test our biggest table of 60 million records.

    Dan

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply