Clean database data, make a new database - Best practices??

  • We have a web app with its data in SQL Server.  Several organizations use the web app, so the database contains the data of all the organizations.
    Now the biggest organization asks to host the app on their own servers.  We need to back up the database and restore it on their server, but first we need to erase the data belonging to the other organizations.  It's a big database, about 200 GB, of which the big organization owns about 40 GB.  We did clean the database, but the size is still 200 GB and the log file is very big.   So what are the good practices here?  How can we reduce the size of the database to its real value, and what about the indexes?

  • Hi

    When you said that "the log file is very big", I would guess that the bulk of the database size (check with sp_spaceused) is due to a huge transaction log file.

    To decrease the size of the transaction log file, perform a transaction log backup first, then shrink the log. Below is a link that shows how.
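    As a rough T-SQL sketch of that sequence (the database name, backup path, and logical log file name below are placeholders for your environment):

    ```sql
    -- Back up the transaction log first (path and database name are placeholders).
    BACKUP LOG MyAppDb TO DISK = N'D:\Backups\MyAppDb_log.trn';

    -- Find the logical name and current size of the log file.
    USE MyAppDb;
    SELECT name, type_desc, size * 8 / 1024 AS size_mb
    FROM sys.database_files;

    -- Shrink the log file to roughly 1 GB (logical name is an assumption).
    DBCC SHRINKFILE (MyAppDb_log, 1024);
    ```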

    https://www.mssqltips.com/sqlservertutorial/3311/how-to-shrink-the-transaction-log/

    With regards to the data files, I suggest not shrinking them. It will cause massive index fragmentation, and it is ugly. (Refer to the link below for more details.)

    https://www.brentozar.com/archive/2009/08/stop-shrinking-your-database-files-seriously-now/

    Just leave the free space in the SQL Server data files.

    I hope this helps.

    Best Regards,
    Ricky

  • As this is a one-off task, the "recommendation" not to shrink the files does not apply.
    And since this is going to a new database, I assume that you are restoring the current database to a new one and then performing the deletes.

    To reduce the size of both the log and data files, the command is the same for both: DBCC SHRINKFILE (filename, target_size_mb).

    Now, about the log file being too big: is it normally that big, or did it grow when you deleted the data that does not belong to the organization that is moving to their own server?
    If it grew as part of the deletes, that could be because the database was in full recovery mode at that point. Again, as this is a one-off, I would switch recovery to simple before doing the deletes.

    After doing the shrinks, run rebuilds on all tables/indexes.
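    The shrink-plus-rebuild sequence above could be sketched like this (database name, logical file names, and target sizes are illustrative, not your actual values):

    ```sql
    -- One-off cleanup: simple recovery limits log growth during the deletes.
    ALTER DATABASE MyAppDb SET RECOVERY SIMPLE;

    -- ... run the DELETEs for the other organizations here ...

    -- Shrink both files to a target size in MB.
    DBCC SHRINKFILE (MyAppDb_data, 51200);  -- data file down to ~50 GB
    DBCC SHRINKFILE (MyAppDb_log, 1024);    -- log file down to ~1 GB

    -- Rebuild all indexes afterwards to undo the fragmentation the shrink caused
    -- (sp_MSforeachtable is undocumented but commonly used for one-off tasks).
    EXEC sp_MSforeachtable 'ALTER INDEX ALL ON ? REBUILD';
    ```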

    One alternative to the above, which may even be faster and would not cause the log growth, is to do as follows:

    create empty database with simple recovery mode
    create all required tables with no indexes and no foreign key constraints
    for each table used by the target organization
    loop

    copy data across to the new database - SSIS/BCP/C# with SqlBulkCopy - with the option to keep identity values
    recreate the required indexes

    end loop
    add all foreign key constraints
    change the database to full recovery (if that is how you need it set)

    Note that the data copy could be done in parallel if your server has enough power for it.
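    A minimal T-SQL sketch of those steps, assuming hypothetical database and table names (SSIS/BCP/SqlBulkCopy would replace the plain INSERT for the larger tables):

    ```sql
    -- Target database with minimal logging during the load.
    CREATE DATABASE NewOrgDb;
    ALTER DATABASE NewOrgDb SET RECOVERY SIMPLE;

    -- (Create each table in NewOrgDb without indexes or foreign keys first.)

    -- Per table: copy the organization's rows, keeping identity values.
    SET IDENTITY_INSERT NewOrgDb.dbo.Orders ON;
    INSERT INTO NewOrgDb.dbo.Orders (OrderId, CustomerId, OrderDate)
    SELECT OrderId, CustomerId, OrderDate
    FROM MyAppDb.dbo.Orders
    WHERE OrganizationId = 42;  -- placeholder organization id
    SET IDENTITY_INSERT NewOrgDb.dbo.Orders OFF;

    -- Then recreate indexes per table, and add the foreign keys
    -- only once all tables are loaded.
    ```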

  • frederico_fonseca - Sunday, November 12, 2017 2:16 AM

    As this is a one-off task, the "recommendation" not to shrink the files does not apply.

    It does apply regardless of whether it's one time or not. Recommend that you push the user data into a new file group, release the unused space and then push it back into the original file group.
    😎

  • frederico_fonseca - Sunday, November 12, 2017 2:16 AM

    create empty database with simple recovery mode
    create all required tables with no indexes and no foreign key constraints
    for each table used by the target organization
    loop

    copy data across to the new database - SSIS/BCP/C# with SqlBulkCopy - with the option to keep identity values
    recreate the required indexes

    end loop
    add all foreign key constraints
    change the database to full recovery (if that is how you need it set)

    Note that the data copy could be done in parallel if your server has enough power for it.

    Thanks, I like the avenue of creating an empty database and copying the data.   What would be the easiest/cleanest way to do the copy?

  • Eirikur Eiriksson - Sunday, November 12, 2017 3:13 AM

    Recommend that you push the user data into a new file group, release the unused space and then push it back into the original file group.
    😎

    Can you tell us more about this option?  How do you push the data into a new file group and release the space?   When should I delete the unwanted data?

  • dubem1-878067 - Sunday, November 12, 2017 6:43 AM

    Eirikur Eiriksson - Sunday, November 12, 2017 3:13 AM

    Recommend that you push the user data into a new file group, release the unused space and then push it back into the original file group.
    😎

    Can you tell us more about this option?  How do you push the data into a new file group and release the space?   When should I delete the unwanted data?

    This is very straightforward: add a new filegroup to the database, clean up the data, then move the tables there and drop any unneeded indices. That should leave the Primary filegroup with all the system data and the new filegroup with the user data. Now you can release the unused space in the Primary and whichever other filegroups you originally had. Then you have the option of either leaving it as is or moving the user data back into whichever filegroup it originally resided in.
    😎
    There are some articles out there on the subject; they should not be hard to find. I'm busy working, so I don't have time to write detailed instructions.
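    One way that approach might look in T-SQL (the filegroup, file, table, and index names here are assumptions, not from the original database):

    ```sql
    -- Add a new filegroup and a file for it (names and path are placeholders).
    ALTER DATABASE MyAppDb ADD FILEGROUP UserData;
    ALTER DATABASE MyAppDb ADD FILE
        (NAME = MyAppDb_userdata, FILENAME = N'D:\Data\MyAppDb_userdata.ndf')
    TO FILEGROUP UserData;

    -- Move a table by rebuilding its clustered index onto the new filegroup.
    CREATE UNIQUE CLUSTERED INDEX PK_Orders
        ON dbo.Orders (OrderId)
        WITH (DROP_EXISTING = ON)
        ON UserData;

    -- After all user tables are moved, release unused space in the primary file.
    DBCC SHRINKFILE (MyAppDb_data, 1024);
    ```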

  • A shrink with EMPTYFILE will move the data pages as they are onto the other file - the main .mdf file will still remain as is with the system objects - and the shrink may even throw the error "Cannot move all contents of file to other places to complete the emptyfile operation."

    So if, after you do all your deletes, you do the EMPTYFILE shrink, you will still need to do index rebuilds in any case, especially if you have some tables that are heaps.
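    For reference, the EMPTYFILE variant being discussed would look roughly like this (the logical file name is an assumption):

    ```sql
    -- Move all data out of the named file into the other files in its filegroup.
    DBCC SHRINKFILE (MyAppDb_userdata, EMPTYFILE);
    -- Once empty, the file can be dropped from the database.
    ALTER DATABASE MyAppDb REMOVE FILE MyAppDb_userdata;
    ```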

    If you move the data back to the main file (after shrinking it), then you are moving the data twice.

    Doing the EMPTYFILE move also means you need extra space on the filesystem - enough to hold the 40 GB of data you say that organization has in the main database.

    A normal shrink + index rebuild is the easiest and probably the fastest to do.
    EMPTYFILE shrink + index rebuild comes next - but I don't advise it, as per the above.

    New database + copy data is slower to set up if you don't already have the utilities and the knowledge to do it.

    The following script could be used for it, but it does not copy foreign keys (which is good while the copy is being done): https://sqljana.wordpress.com/2017/06/12/powershell-copy-sql-server-tables-structure-data-indexes-to-another-databaseinstance/. Note that this one does create table, copy data, create indexes - which is the desired order.

    For the foreign keys, the following script could be run: https://www.mssqltips.com/sqlservertip/3347/drop-and-recreate-all-foreign-key-constraints-in-sql-server/ - remove the EXECs at the bottom, run it on the main database, then copy the contents of the PRINT @create output and execute it on the new database.

    Also, because it is only 40 GB of data, it may be acceptable to use a plain INSERT INTO ... SELECT for each table. Your transaction log will grow a bit while doing it, but it can be shrunk with no issues after the full process is finished. In this case, make sure you do not have the foreign keys on the final tables and only recreate them at the end.
