No records in db after importing a 60GB db over the network

  • When I used the Import/Export Wizard to import a 60GB database, I received an error at the very end saying this:

    * Error 0xc0202009: Data Flow Task: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005.

    An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80004005 Description: "The transaction log for database 'ObjectenArch' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases".

    (SQL Server Import and Export Wizard)

    Result:

    - I can see the imported db in SQL Server

    - I now have a 35 GB .mdf file and a 25 GB .ldf on the HDD

    - But when I query the database I'm getting 0 records...

    Questions:

    - Is there a way to 'repair' the imported db? The data seems to be physically there on the HDD, and the row import process ran to 100%

    - Is there a smarter/better way to get the huge db imported?

    - How can I prevent the transaction log error from happening the next time? After the import there was still 6 GB of free space left on the HDD

    - Why is the log_reuse_wait_desc column saying literally "NOTHING"?
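
    For reference, the check the error message points at is simply this, run on the destination server:

    SELECT name, log_reuse_wait_desc
    FROM sys.databases
    WHERE name = 'ObjectenArch';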

  • ss_nl (7/23/2010)


    - Is there a way to 'repair' the imported db? The data seems to be physically there on the HDD, and the row import process ran to 100%

    It's not broken. Once the 'transaction log full' error occurred, the entire import would have been rolled back, resulting in an empty DB. The files wouldn't have shrunk back down though.

    - How can I prevent the transaction log error from happening the next time? After the import there was still 6 GB of free space left on the HDD

    Could be that the log couldn't grow fast enough, or that the requested growth increment was more than 6 GB and hence the growth failed.

    - Why is the log_reuse_wait_desc column saying literally "NOTHING"?

    Because there is nothing now that's preventing the log space from being reused. The transaction failed and was rolled back; after that the log space was marked as reusable.

    If you're copying the entire DB, why not backup/restore?
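
    A minimal sketch of the backup/restore route, assuming a share both machines can reach; the file paths and logical file names here are placeholders, not the real ones:

    -- On the source server:
    BACKUP DATABASE ObjectenArch
    TO DISK = N'\\sourcepc\share\ObjectenArch.bak';

    -- On the destination server:
    RESTORE DATABASE ObjectenArch
    FROM DISK = N'\\sourcepc\share\ObjectenArch.bak'
    WITH MOVE N'ObjectenArch' TO N'D:\Data\ObjectenArch.mdf',
         MOVE N'ObjectenArch_log' TO N'D:\Data\ObjectenArch_log.ldf';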

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (7/23/2010)


    ss_nl (7/23/2010)


    - Is there a way to 'repair' the imported db? The data seems to be physically there on the HDD, and the row import process ran to 100%

    It's not broken. Once the 'transaction log full' error occurred, the entire import would have been rolled back, resulting in an empty DB. The files wouldn't have shrunk back down though.

    - How can I prevent the transaction log error from happening the next time? After the import there was still 6 GB of free space left on the HDD

    Could be that the log couldn't grow fast enough, or that the requested growth increment was more than 6 GB and hence the growth failed.

    - Why is the log_reuse_wait_desc column saying literally "NOTHING"?

    Because there is nothing now that's preventing the log space from being reused. The transaction failed and was rolled back; after that the log space was marked as reusable.

    If you're copying the entire DB, why not backup/restore?

    That explains it... I was kind of hoping all the records would still be there, with just the transaction log being incomplete.

    I'm actually trying to copy one huge table from a network PC to a newly created database in SQL Server...

  • Questions:

    - Is there a way to 'repair' the imported db? The data seems to be physically there on the HDD, and the row import process ran to 100%

    - Is there a smarter/better way to get the huge db imported?

    [font="Arial"]You can create SSIS package with a dataflow and set the maximum commit insert size to 20000 or so in the ole db destination adapter so that rows would committed in batches instead of committing all the records in one batch.

    You can further improve the process by tweaking the SSIS package to pull only the records that are not yet in the target table; basically a restartability feature.
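
    As a rough T-SQL analogue of what committing in batches does (table and column names here are hypothetical): each batch is its own transaction, so under the SIMPLE recovery model the log space can be reused between batches instead of one huge open transaction holding everything.

    DECLARE @batch INT = 20000;
    WHILE 1 = 1
    BEGIN
        -- Copy the next batch of rows that are not yet in the target
        INSERT INTO dbo.target_table (id, payload)
        SELECT TOP (@batch) s.id, s.payload
        FROM dbo.source_table AS s
        WHERE NOT EXISTS (SELECT 1 FROM dbo.target_table AS t
                          WHERE t.id = s.id)
        ORDER BY s.id;

        IF @@ROWCOUNT < @batch BREAK;  -- last (partial) batch done
    END;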

    [font="Arial"]BASKAR BV
    http://geekswithblogs.net/baskibv/Default.aspx
    In life, as in football, you won’t go far unless you know where the goalposts are.[/font]

  • Baskar B.V (7/23/2010)


    Questions:

    - Is there a way to 'repair' the imported db? The data seems to be physically there on the HDD, and the row import process ran to 100%

    - Is there a smarter/better way to get the huge db imported?

    [font="Arial"]You can create SSIS package with a dataflow and set the maximum commit insert size to 20000 or so in the ole db destination adapter so that rows would committed in batches instead of committing all the records in one batch.

    You can further improve the process by tweaking the SSIS package to pull only the records that are not yet in the target table; basically a restartability feature.

    OK, I've now created an SSIS package with the maximum insert commit size set to 20000.

    Is it also possible/necessary to adjust the max size of the transaction log? If yes, where can I do that?

    And will this data flow task be able to resume if the connection breaks or the data source PC is restarted?

  • Is it also possible/necessary to adjust the max size of the transaction log? If yes, where can I do that?

    And will this data flow task be able to resume if the connection breaks or the data source PC is restarted?

    You need to check your database properties if you want to increase the log size or add more files to the transaction log.
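
    For example (a sketch; the logical log file name below is an assumption, so check sys.master_files for the real one first):

    SELECT name, size, growth, is_percent_growth
    FROM sys.master_files
    WHERE database_id = DB_ID('ObjectenArch');

    ALTER DATABASE ObjectenArch
    MODIFY FILE (NAME = N'ObjectenArch_log',
                 MAXSIZE = UNLIMITED,
                 FILEGROWTH = 512MB);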

    For restartability, what you can do is identify an incremental field (like updt_dt or a timestamp) or the primary key in the source table, and then pull the incremental records using a query something like the one below and pump that into your target table.

    Suppose the source table is table1 and the target is table2; then the source query is something like below...

    select * from table1
    where table1.field > (select max(table2.field) from table2)

    If both databases are on different servers, you can store the result set of the query below in a variable using an Execute SQL Task:

    select max(table2.field) from table2

    and then use that variable in the source query.

    This way, if the connection breaks in the middle of the load, you can restart it by rerunning the job, and it will load only the missing records.
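
    A sketch of the two steps (table and column names are placeholders):

    -- Execute SQL Task, run against the target server; map the result
    -- into an SSIS variable, e.g. User::LastLoaded:
    SELECT MAX(field) AS LastLoaded FROM table2;

    -- OLE DB source query on the source server, with the variable
    -- mapped to the ? parameter:
    SELECT * FROM table1 WHERE field > ?;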

    [font="Arial"]BASKAR BV
    http://geekswithblogs.net/baskibv/Default.aspx
    In life, as in football, you won’t go far unless you know where the goalposts are.[/font]

  • Baskar B.V (7/26/2010)


    Is it also possible/necessary to adjust the max size of the transaction log? If yes, where can I do that?

    And will this data flow task be able to resume if the connection breaks or the data source PC is restarted?

    You need to check your database properties if you want to increase the log size or add more files to the transaction log.

    For restartability, what you can do is identify an incremental field (like updt_dt or a timestamp) or the primary key in the source table, and then pull the incremental records using a query something like the one below and pump that into your target table.

    Suppose the source table is table1 and the target is table2; then the source query is something like below...

    select * from table1
    where table1.field > (select max(table2.field) from table2)

    If both databases are on different servers, you can store the result set of the query below in a variable using an Execute SQL Task:

    select max(table2.field) from table2

    and then use that variable in the source query.

    This way, if the connection breaks in the middle of the load, you can restart it by rerunning the job, and it will load only the missing records.

    Unfortunately the PC had been shut down 🙁

    Both databases are on different 'servers', but I have no idea how to use an Execute SQL Task.

    Can't I just use an SSIS transformation to import the missing records, for instance the Merge Join transformation?

  • When I use this query

    SELECT *
    FROM [SERVER_B].database.dbo.table
    EXCEPT
    SELECT *
    FROM database.dbo.table

    to select the right data, I'm getting "Xml data type is not supported in distributed queries". I know there is a workaround by casting the xml columns to varchar, but since there are 4 million records with multiple xml columns I don't think this is smart...
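
    If I did go that route, it would look something like this (the key and xml column names are placeholders; the cast has to happen on the remote side, e.g. via OPENQUERY):

    SELECT id, xml_col
    FROM OPENQUERY([SERVER_B],
        'SELECT id, CAST(xml_col AS NVARCHAR(MAX)) AS xml_col
         FROM database.dbo.table')
    EXCEPT
    SELECT id, CAST(xml_col AS NVARCHAR(MAX))
    FROM database.dbo.table;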

    Any other suggestions?

  • If you have the flexibility of editing the SSIS package when you need to restart the process, then you can manually get the max value of the records that were loaded so far and hard-code that value in the source SQL query in the transformation.

    That way you don't need to write distributed queries. You can add a data flow source and a data flow destination, and with the manually edited query in the data flow source you can load the data without any conversion.

    [font="Arial"]BASKAR BV
    http://geekswithblogs.net/baskibv/Default.aspx
    In life, as in football, you won’t go far unless you know where the goalposts are.[/font]

  • Baskar B.V (7/27/2010)


    If you have the flexibility of editing the SSIS package when you need to restart the process, then you can manually get the max value of the records that were loaded so far and hard-code that value in the source SQL query in the transformation.

    That way you don't need to write distributed queries. You can add a data flow source and a data flow destination, and with the manually edited query in the data flow source you can load the data without any conversion.

    SSIS is also just using SQL under the hood, right? Why don't I have to use distributed queries in SSIS then?

    This afternoon I just reused the same data flow, but redirected the error output rows of the destination to nowhere. This way I was hoping it would skip the records which are already present in the destination... So far it's running fine, 2 million rows left to import...

  • SSIS is also just using SQL under the hood, right? Why don't I have to use distributed queries in SSIS then?

    You can use distributed queries, but then you need to do the conversion of xml to varchar data types, and you would also need to take performance into consideration.

    This afternoon I just reused the same data flow, but redirected the error output rows of the destination to nowhere. This way I was hoping it would skip the records which are already present in the destination... So far it's running fine, 2 million rows left to import...

    Unless you have a primary key defined on the target table, you would not get any errors, and it would cause duplicate rows in your target table.
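
    A quick way to check for that afterwards (the key column name is a placeholder):

    SELECT id, COUNT(*) AS copies
    FROM dbo.table2
    GROUP BY id
    HAVING COUNT(*) > 1;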

    [font="Arial"]BASKAR BV
    http://geekswithblogs.net/baskibv/Default.aspx
    In life, as in football, you won’t go far unless you know where the goalposts are.[/font]
