importing word document into sqlserver

  • chapter1

    Delegates officially are to nominate Cheney as the GOP's vice presidential candidate before he addresses the group Wednesday night.

    chapter2

    "I think that the vice president's speech tonight is going to be about big issues, the big issues of this campaign -- the war on global terror, the president's education policy, the fact that the economy is turning up again," she told CNN's "American Morning."

    chapter3

    She said she had known her husband since he was 14 and planned to share anecdotes that many people have not heard before in her introduction.

    Chapter4

    Maverick Democratic Sen. Zell Miller of Georgia is scheduled to deliver Wednesday's keynote address -- a role he also played at the 1992 Democratic National Convention, which nominated President Clinton.

    Chapter5

    In the earlier speech, Miller, then governor of Georgia, said that "for 12 dark years, the Republicans have dealt in cynicism and skepticism. They have mastered the art of division and diversion, and they have robbed us of our hope."

     

    I wanted to import the above word doc into my sql server DB.

    I have two columns in the table

    1.Chaptereid

    2.chapter_notes

    chater1,chater2,chapte2,chapter3,chapter4 and chapter5 should go into chaptered column and the text followed by the chapter id should be imported into chapter_notes column.

     

    So how can I import this document.

     

    Thanks.

  • I assume you're using Text for the notes and some type of character field for the chapter.

    If this is in Word, either write some parsing routine in VBScript/Vb.NET, C#, etc. or cut and paste it in.

  • Once you've parsed the single MS Word document into mutiple chapter-specifc MS Word documents (or text files), you can use Textcopy.exe to import either the MS Word or text file into either a column defined as TEXT or if you're using MS Word import it into a column defined as IMAGE. For demostration purposes, assum FTSTable is your table name and that chapter_notes is in this case ImageCol and you are importing MS Word into this column:

    use pubs

    go

    if exists (select * from sysobjects where id = object_id('FTSTable'))

      drop table FTSTable

    go

    CREATE TABLE FTSTable (

      KeyCol int IDENTITY (1,1) NOT NULL

        CONSTRAINT FTSTable_IDX PRIMARY KEY CLUSTERED,

      TextCol text NULL,

      ImageCol image NULL,

      ExtCol char(3) NULL, -- can be either sysname or char(3)

      TimeStampCol timestamp NULL

    ) ON [PRIMARY]

    go

    -- Insert data... (Note: Initalizing IMAGE column with 0xFFFFFFFF for use with TextCopy.exe)

    INSERT FTSTable values('Test TEXT Data for row 1', 0xFFFFFFFF, 'doc', NULL)

    go

    -- Select data

    SELECT * from FTSTable

    go

    declare @query varchar(200)

    -- Insert MS_Word.doc into Row 1

    -- NOTE: Ensure the correct path for textcopy.exe!!

    set @query = 'D:\MSSQL80\MSSQL$SQL80\Binn\textcopy /s '+@@servername+' /u <user_id> /p <password> /d pubs /t FTSTable /c ImageCol /f D:\SQLFiles\Shiloh\<MS_Word>.doc /i /k 5000 /w "where KeyCol=1"'

    print @query

    exec master..xp_cmdshell @query

    go

    -- Select data

    SELECT * from FTSTable

    go

    -- Full-text Enable the database, if not already done

    use pubs

    go

    -- do this only once

    exec sp_fulltext_database 'enable'

    go

    exec sp_fulltext_catalog 'FTSCatalog','create'

    exec sp_fulltext_table 'FTSTable','create','FTSCatalog','FTSTable_IDX'

    exec sp_fulltext_column 'FTSTable','ImageCol','add', 0x0409, 'ExtCol'

    exec sp_fulltext_column 'FTSTable','TextCol','add'

    exec sp_fulltext_table 'FTSTable', 'activate' 

    go

    -- Start FT Indexing...

    exec sp_fulltext_catalog 'FTSCatalog','start_full'

    go

    -- Wait for FT Indexing to complete and check NT/Win2K Application log for success/errors..

    -- Search for search_word_here in MS_Word file..

    select KeyCol, ImageCol  from FTSTable where contains(*,'<search_word_here>') order by KeyCol

    go

    -- Remove FT Indexes & Catalog & table..

    exec sp_fulltext_table 'FTSTable','drop'

    exec sp_fulltext_Catalog 'FTSCatalog','drop'

    drop table FTSTable

    The above method will also work if you import a text file into the TextCol and it allows you to search both on textual contents of both a text file as well as binary MS Word file stored in SQL Server.

    Thanks,


    John T. Kane

  • And one more question is that what would be the best data type to store the data of chapter_notes.And how can i have the same format in my database as in the word document.

     

    Thanks.

  • A. That would be the IMAGE (or sometimes referred to as a BLOB) datatype. Specificly, in the SQL code example above it would be column: ImageCol image. For Full-Text Search purposes, it will also be necessary to define a 'file extension' column (ExtCol char(3)) and populate it with 'doc' to represent the MS Word file extension of .doc. To display the MS Word document, you would need to use the MS Word plugin for IE or MS Word viewer that can be download for free from Microsoft.

    Regards,


    John T. Kane

  • Thanks for the info.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply