Importing a CSV file to SQL server

  • I am trying to figure out how to import a CSV file to a SQL server and I am not having any luck. I am trying to use the DTS wizard, but it does not work.  I am not sure which sources to select. I have the CSV file on my computer, and when I select it I get errors.  Can anyone help me through this process?  Any help is appreciated.

    Thanks,        

    Nick


  • Nick,

    Select source type Text. Not "Microsoft Text-Treiber", just Text; it is almost at the bottom of the list. Then:

    1. The File Name textbox will let you select your CSV file. Click Next.

    2. The Select File Format screen will come up. Select Delimited and specify the Row Delimiter; by default it will be CR/LF, which is fine. Click Next.

    3. Specify the Column Delimiter; by default I had Comma, which is what you need. Click Next.

    4. Provide your authentication and the database to import to. Click Next.

    5. Click the Transform button and check the fields. You may change column names here, and you can also change the table name in the Destination field next to the Transform button before clicking Transform. Click OK or Cancel to exit the Column Mappings and Transformations screen if you did click Transform.

    6. Click Next on the Select Source Tables and Views screen.

    7. Click Run Immediately or schedule the package, then Next. Verify the settings on the next screen, click Finish, and watch it run.
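    If you would rather script the load than click through the wizard, T-SQL's BULK INSERT statement can do the same job. A minimal sketch, assuming the destination table already exists (MyTable and the file path here are made-up placeholders; substitute your own):

    ```sql
    -- Load a comma-delimited file into an existing table.
    -- MyTable and the file path are hypothetical placeholders.
    BULK INSERT MyTable
    FROM 'C:\data\import.csv'
    WITH (
        FIELDTERMINATOR = ',',   -- column delimiter
        ROWTERMINATOR = '\n'     -- row delimiter (handles CR/LF files)
    )
    ```

    Note the table must already exist with columns matching the file layout; the wizard, by contrast, can create the destination table for you.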

    Let us know if you need more help.

    Yelena

    Regards, Yelena Varsha

  • That works great, but I have one problem.  The data in the CSV file will be updated periodically, and I will set it up so the import runs on a schedule.  How do I make sure the data is not written twice?  The import runs fine the first time, but when I run it again, the data is put in the table twice.  I want just the updated data to be changed in the table and everything else that has not been updated to stay unchanged.  Thank you very much for your response.

  • Hi,

    It is not clear what you would like: to add an updated row as a new record or update the existing record. I also don't know if the file contains all your information or just incrementals. The following could be used:

    1. If the file contains all your info, DTS Wizard in the Transform screen has check boxes for dropping and re-creating a table or for truncating or appending data

    2. You may want to pre-process the file before it gets imported by comparing the new and the old file with a script and importing only the difference

    3. The DTS package can include steps that will run a script or a stored procedure that will do that.

    4. Import the whole thing into a temp table, then run a simple insert/update statement as a step of the DTS package. Something like this:

    Insert MyTable

    Select temptable.Field1, temptable.Field2 from temptable

    join MyTable on temptable.Field3 = MyTable.Field3

    where <your conditions for what an updated row means in your case>

    You can use outer joins to get rows that exist in temptable but don't exist in your primary MyTable.

    Yelena

    Regards, Yelena Varsha

  • I tried dropping the table and making a new table, but the CSV file will not have all of the data in it.  Let me try to explain better.  The CSV file will only have updated data or new data in it.  In the CSV file are names of individuals and their addresses.  Each row in the CSV file will either be an updated address or a completely new entry (name and address).

    How would I pre-process the file?  I am not sure I understand what you are talking about.

    Thanks again for your help

  • Nick,

    You may pre-process if you have yesterday's file and something changed in it today, so you can parse the text and find out what changed. From what you just explained, that is not the case here. You do need to do it in the DTS package. I would really import into a temp table. Steps for the job or package:

    1. Truncate Table temptable to remove yesterday's data

    2. Import data from the CSV file to temptable

    3. You can just run a stored procedure there that you create in advance. Compare the key field for each record in temptable with what is in your primary table. I would create a cursor for all rows in temptable: open the cursor, move to the first row, read the key field into a variable (@CurrentKey, for example), and then do this:

    If Exists (SELECT MyTable.MyKeyField from MyTable where MyTable.MyKeyField = @CurrentKey)

    Begin

    Update MyTable (do an update for your row here from the temp table)

    End 

    Else

    Begin

    Insert MyTable (do an insert for this row)

    End

    Regards, Yelena Varsha

  • I will try this and we will see what happens.  I will probably have some questions for you tomorrow.  Thank you very much for your help.

    Nick

  • Nick,

    I am taking a day off tomorrow but may login from home once or twice. I will be working Friday morning and then we have Monday off. I will try to monitor this topic anyway and please, do post your questions.

    I was working with someone today on exactly the opposite problem: exporting to a CSV file.

    What is your database for? Is it something like a contact management system, since you mention names of individuals and addresses? The database app I mentioned is exactly that. The name is GoldMine, a third-party product that many companies use. If by chance you are talking about GoldMine, it does have another third-party data-import plug-in that can do what you need.

    Yelena

     

    Regards, Yelena Varsha

  • Yelena,

    The database is for a card access system I am working on. I am not familiar with GoldMine.  Can you step me through doing it with DTS?  I am fairly new to this and I am not sure what to do for what you said above.  The data in the CSV file will have the Last Name, First Name, Title, Phone Number, Extension, and Address, each in its own column.  The last name and first name need to be used as the primary key.  If the name changes, then it will be a new row.  However, if the title, phone number, extension, or address changes, I need to replace the old data with the updated data.  I think the temp table is the way to do it, but I am not sure how to go about it.  I see the steps you listed above, but some of them I don't understand.  What do you mean when you say truncate the temp table, and why/how would I create a cursor for all rows in the temptable?  That is where I get lost, in step 3 of your post above.  Thanks again for your time and help.

    Nick

  • Nick,

    Cursors are like recordsets: you are positioned at a particular record, can analyze it, and can move to the next record. This is different from a default SQL result set, which comes back as a set. Please look it up in Books Online. I am at a computer that I just re-installed and don't have SQL Server on it, but the topic to see is "DECLARE CURSOR" and related topics. But... if you have not worked with cursors before, do not use them. Since I understand there are only 2 types of records in the CSV file, new and updated, you can go without cursors.
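    For reference, the cursor approach from the earlier post would look roughly like this. A minimal sketch only, using the placeholder table and column names (MyTable, temptable, MyKeyField, Field1-Field3) from the example above:

    ```sql
    -- Sketch: loop over temptable row by row and upsert into MyTable.
    -- All table and column names are placeholders from the earlier post.
    DECLARE @CurrentKey int

    DECLARE KeyCursor CURSOR FOR
        SELECT Field3 FROM temptable

    OPEN KeyCursor
    FETCH NEXT FROM KeyCursor INTO @CurrentKey
    WHILE @@FETCH_STATUS = 0
    BEGIN
        IF EXISTS (SELECT 1 FROM MyTable WHERE MyKeyField = @CurrentKey)
            -- key already present: update the row from the temp table
            UPDATE MyTable
            SET Field1 = t.Field1, Field2 = t.Field2
            FROM MyTable JOIN temptable t ON MyTable.MyKeyField = t.Field3
            WHERE MyTable.MyKeyField = @CurrentKey
        ELSE
            -- key not present: insert the new row
            INSERT MyTable (MyKeyField, Field1, Field2)
            SELECT Field3, Field1, Field2 FROM temptable
            WHERE Field3 = @CurrentKey

        FETCH NEXT FROM KeyCursor INTO @CurrentKey
    END
    CLOSE KeyCursor
    DEALLOCATE KeyCursor
    ```

    As the post says, the two set-based statements are simpler and usually much faster than this row-by-row loop.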

    Do this: import the data into the temptable, then run not one but two independent statements. The first one will select rows where first and last names match and update those records in the main table. The second one will select rows where the names do not exist in the main table and insert those records. You may be able to combine them in one batch using the CASE function (statement), but if you have not worked with CASE, try to keep it simple.

    About truncating: see the "TRUNCATE TABLE" statement in Books Online. You do not need to use it if your temp table is a real SQL Server temporary table with a name starting with #, like #mytemptable, because it will be dropped right after the connection is dropped. You need to truncate (delete all data after the processing) only if your so-called temp table is a normal table, just used for temp data.
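    The two options look like this in T-SQL; a minimal sketch with placeholder names and column types:

    ```sql
    -- Option 1: a permanent staging table, emptied before each daily load.
    TRUNCATE TABLE TempData

    -- Option 2: a true temporary table; the # prefix scopes it to the
    -- current connection, and it is dropped when that connection closes.
    CREATE TABLE #mytemptable (
        FirstName varchar(50),
        LastName  varchar(50),
        CustData  varchar(50)
    )
    ```

    Keep in mind that a #-prefixed table disappears with its connection, so it only works if the import and the update run on the same connection; a scheduled DTS package with separate steps usually needs the permanent staging table.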

    Hope to be able to help more when I am in the office Tuesday. What are your timelines?

    Yelena

    Regards, Yelena Varsha

  • Yelena,

    The timeline for the project is to have something importing correctly by the end of next week, Jan 14th.  I think I am getting closer, but I can't get the first statement to work.  Can you give me an example of a statement you would use?  Mine just keeps inserting the entire file again, so as a result I have two copies of the data.  Thank you for your help.

  • My application's DB data comes from daily flat-file images of ocean shipping data. After a lot of trial and error, this is the process I use to import the file and then integrate the data into the production tables:

     

    http://www.vinnychi.com/vince/IBFF_ImportProcess.htm

     

    It gets messy towards the end (the diagram does), but by then I didn't care anymore.

  • Nick,

    Here is what you have to try. For example I created 2 tables:

    BIGTABLE (FirstName, LastName, CustData) - this is a prototype of your production table

    TEMPDATA (FN, LN, CD) - this is a prototype of your table containing the daily data from the file

    Your TEMPDATA table will contain the daily import and will be truncated right before the import. I used some sample data from PUBS to populate both tables; see the scripts below.

    Your script will first run: TRUNCATE TABLE TEMPDATA

    Then the line that invokes your DTS job to populate TEMPDATA from the CSV file:

    EXEC msdb.dbo.sp_start_job 'JobNameGoesHere'

    Then you have to wait for the job to finish importing data into TEMPDATA. After that you may insert into and update the main BIGTABLE. You may even implement it as 2 separate job steps: the first one truncates TEMPDATA and starts the import job; the second one, which runs "On Success" of the first, does the insert and update.

    ******DISCLAIMER: The code below is just a sample, is incomplete, and was barely tested on one example*********

    You will run 2 statements similar to these for the second step. The first one will look up and insert the new data, and the second one will update the existing customers. Please do not forget to use qualifiers if the field names are the same, and check your ANSI settings if you have trouble selecting data where something is NULL.

    Insert BigTable (FirstName, LastName, CustData)

    Select t.FN, t.LN, t.CD from TempData t

    Left Outer Join BigTable b

    ON b.FirstName = t.FN and b.LastName = t.LN

    Where b.FirstName IS Null and b.LastName IS Null

    Update b

    Set CustData = t.CD

    FROM BigTable b Join TempData t

    ON b.FirstName = t.FN

    AND b.LastName = t.LN

    Where b.CustData <> t.CD

    The following script (only for a development environment) will populate the sample tables with data from Pubs and modify some rows in TempData so you have something to play with:

    Insert BigTable (Firstname,Lastname,CustData)

    Select top 10 au_fname,au_lname,city from pubs.dbo.authors

    Insert TempData(FN,LN,CD)

    Select top 4 * from BigTable

    Update TempData set CD = 'Boston'

    where CD ='Berkeley'

    Insert TempData (FN,LN,CD)

     Values('New','Customer','Portland')

    Yelena

    Regards, Yelena Varsha
