Bulk Insert Grief

  • I am trying to import a fixed col width input file, with 558 columns to a SQL Server database. (I refuse to hand type 558 columns into SSIS.)

    My first attempt was to build a SCHEMA.ini file and get the file from an attached directory of .txt files. That almost worked, but seems to choke on anything above 150 columns.

    Next attempt is to try BULK INSERT with a text .FMT file. Not much luck there either, so for a test I tried importing the entire file to a file with a single varchar(max) column without a .fmt file. That seems to work:

    create table ##C (

    [JUNK] varchar(max)

    )

    bulk insert ##C

    from '\\K021ch\New Input Files\T102D.txt'

    with (ROWTERMINATOR = '')

    (66533 row(s) affected)

    And when I check len(JUNK) I get 2016 chars in each of the 66533 rows imported.

    But when I try to duplicate this 1-column load using a format file, I get this error:

    bulk insert ##C

    from '\\K021ch\New Input Files\T102D.txt'

    with (ROWTERMINATOR = '\r'

    ,FORMATFILE = '\\K021ch\New Input Files\T102RawJunkCol.fmt')

    Msg 9422, Level 16, State 48, Line 1

    XML parsing: line 2, character 0, incorrect document syntax

    Here is my (what should be the leanest ever) .fmt file:

    9.0

    1

    1 SQLCHAR 0 2016 1 "\r" [JUNK] ""

    (The "\r" above is really backslash-r-backslash-n but gets corrupted by this formatter.)

    I think I've tried all possible combinations of fmt file and BULK INSERT parameters with no luck -- if anyone has a clue I'd be eternally grateful!

    (I have edited the path names above, but the full paths work.)

  • I don't know the answer to your problem but it does remind me of setting up the import of a few files using SSIS.

    I hope they add the ability to import the column settings to SSIS.



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Jim,

    Two things, I think the format file you have is not right. You have

    9.0

    1

    1 SQLCHAR 0 2016 1 "\r" [JUNK] ""

    Should be

    9.0

    1

    1 SQLCHAR 0 2016 "\r" 1 [JUNK] ""

    Second, in your test you specified the rowterminator as "" but in the failing one the row terminator is defined as "\r". That will not cause the error you are getting but will likely not import the data (assuming there is no "\r" in the original file)

    Hope that helps.

    T

  • Thanks, Alvin, for the sympathy. I think it is a HUGE flaw the SSIS expects (and can only handle) hand typed column definitions. That combined with the FTP inability to work with IBM mainframes (where path names do not start with a "/") makes SSIS just about useless for working with flatfile mainframe oututs.

  • toddE: Thank you. You are right that the field separator comes before the table column number. (The "\r" as a field separator got lost by the editor here, I really have backslash-r-backslasn-n.)I fixed the table column number and tried again, but with the same result:

    bulk insert ##C

    from '\\K021ch\New Input Files\T102D.txt'

    with (FORMATFILE = '\\K021ch\New Input Files\T102RawJunkCol.fmt')

    Msg 9422, Level 16, State 48, Line 1

    XML parsing: line 2, character 0, incorrect document syntax

    .fmt file:

    9.0

    1

    1 SQLCHAR 0 2016 "\r" 1 [JUNK] ""

    Also just replaced the CR/LF with "", with no change in behavior.

    This silly thing is driving me to drink (at least that sounds like a good idea!)

  • You don't need a format file for this nor even BCP. Just do a BULK INSET and specify the FIELDTERMINATOR = ''.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Also, let me know if you need any help splitting that pig. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (11/18/2008)


    You don't need a format file for this nor even BCP. Just do a BULK INSET and specify the FIELDTERMINATOR = ''.

    Bringing in the entire file as a single VARCHAR(max) column without the format file is the one thing that does work. And I did that only as a first step to be sure that BULK INSERT would work in it's simplest form. Step 2 was to try a minimal one-long-field format file to duplicate that simple 1-column import. That step 2, with the 1-column import definition in a format file, is the part I have not been able to get to work.

    Step 3, if I can get step 2 to work, is to use a format file for all

    500+ fixed position, non-delimited fields which need to be parsed into as many columns. Actually, I have generated the format file with all columns defined, and the failure of that to work (step 0, I guess) is what led me back to step 1 to try to diagnose the problem.

    (After trying to use the Jet engine/ODBC linked ascii file approach, and finding that a SCHEMA.ini file chokes after 150-200 columns, I turned to BULK INSERT, although with the fear that I might hit a similar limit. I think one column should be doable; once I get that to work, I'll worry about a limit in the BULK INSERT .fmt file.)

    Also, let me know if you need any help splitting that pig.

    I sure do, any insights you can offer would be appreciated. (Just give me the address to which I should send my first born!)

  • Jim,

    I was able to get the following code to run ok. I am running SQL 2005 SP2.

    drop table ##c

    go

    create table ##C (

    [FLD1] varchar(5)

    ,[FLD2] varchar(6)

    ,[FLD3] varchar(7)

    )

    go

    bulk insert ##C

    from 'C:\Users\tedwards\documents\data.txt'

    with ( ROWTERMINATOR = '\r'

    --,FIELDTERMINATOR = ','

    ,FORMATFILE='C:\Users\tedwards\documents\data1.fmt')

    go

    select * from ##C

    go

    Format File

    9.0

    3

    1 SQLCHAR 0 5 "" 1 [FLD1]SQL_Latin1_General_CP1_CI_AS

    2 SQLCHAR 0 6 "" 2 [FLD2] SQL_Latin1_General_CP1_CI_AS

    3 SQLCHAR 0 7 "\r" 3 [FLD3] SQL_Latin1_General_CP1_CI_AS

    Data File

    Test1TEST11TEST11x

    Test2TEST22TEST22x

    Test3TEST33TEST33x

    Not sure this is going to help any. I could not get it to fail with the XML parsing error at all, I got other errors as I was testing but nothing like what you got. I did notice that it seemed to have traouble with just one column. Bu that was processing/truncation issue and not the XML error.

    Hpe this helps

    Todd

  • ToddE (11/19/2008)


    Jim,

    with ( ROWTERMINATOR = '\r'

    3 SQLCHAR 0 7 "\r" 3 [FLD3]

    Todd, thanks. When I posted, I noticed that "backslash-n" was getting lost in this editor. In your two references to row/last field terminator above, did you use backslash-r only, or was it backslash-r-backslash-n when you ran it.

    I tried, without success, to mimic your format file in mine. Since that failed, I am going to try to use your statements in their entirety to see if it is me, or my system. (Anyone aware of any recent fixes to BULK INSERT?)

    Thanks very much for your help and time!

  • Jim,

    They were backslash-r, backslash-n in the format file and the bulk insert.

    Todd

  • Wonderful!

    I first just used \r as it appeared in your posting, and got at least the first row loaded, with messages that the second and third rows failed. But after inserting the (lost) backslash-n in the two places where they were needed, I duplicated your results.

    I still have no clue why my import fails, but, filled with new hope and inspiration, I will look further for any differences that might explain my failure.

    Thanks so much!

  • Jim,

    So my code worked for you? If so, just substitute your files in place of mine and see what happens.

    Also, may be completely unrelated, but I have had problems with flat files that are saved as Unicode format. They look ok in an editor like notepad, but there are extra characters that mess things up.

    Todd

  • Jim Russell (11/19/2008)


    Jeff Moden (11/18/2008)


    You don't need a format file for this nor even BCP. Just do a BULK INSET and specify the FIELDTERMINATOR = ''.

    Bringing in the entire file as a single VARCHAR(max) column without the format file is the one thing that does work. And I did that only as a first step to be sure that BULK INSERT would work in it's simplest form. Step 2 was to try a minimal one-long-field format file to duplicate that simple 1-column import. That step 2, with the 1-column import definition in a format file, is the part I have not been able to get to work.

    Step 3, if I can get step 2 to work, is to use a format file for all

    500+ fixed position, non-delimited fields which need to be parsed into as many columns. Actually, I have generated the format file with all columns defined, and the failure of that to work (step 0, I guess) is what led me back to step 1 to try to diagnose the problem.

    (After trying to use the Jet engine/ODBC linked ascii file approach, and finding that a SCHEMA.ini file chokes after 150-200 columns, I turned to BULK INSERT, although with the fear that I might hit a similar limit. I think one column should be doable; once I get that to work, I'll worry about a limit in the BULK INSERT .fmt file.)

    Also, let me know if you need any help splitting that pig.

    I sure do, any insights you can offer would be appreciated. (Just give me the address to which I should send my first born!)

    Any chance of you attaching the first 10 or 20 lines from the file in question... I've got an awesome idea that'll make all this go away... if the file looks right. It'll also help me test. Of course, nothing personal or proprietary in the file, please.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • The file does contain personnel information, so I can't simply send a few lines. If necessary, and if my current path of attack does not pan out, I'll scramble it somehow.

    I think I have decided that the error message I get:

    Msg 9422, Level 16, State 48, Line 1

    XML parsing: line 2, character 0, incorrect document syntax

    is talking about the data file, instead of the .fmt file as I first assumed. If the trouble is with the first character of the second line that makes me want to investigate the row length and row terminator further. (It's an error that Todd didn't get, and his sample file works for me, so it is likely my data file.)

    The data file comes from a FTP download of a mainframe flat file, and perhaps just \r or is being used. Notepad does see the line breaks, so maybe not, but I'll have to inspect the data closer with VBA or something and see if I get any more hints. (Where is a Hex/ascii file viewer when I need one?)

    Thanks, Jeff (and Todd), for sticking with me on this.

Viewing 15 posts - 1 through 15 (of 24 total)

You must be logged in to reply to this topic. Login to reply