Performance issues with SQL 2005 for ETL: flat file size around 1 GB

  • Hi,

    We are in the process of upgrading from SQL 2000 to SQL 2005. I have read a lot of posts that talk about SSIS doing its processing in memory for the entire pipeline, and about performance issues when the entire pipeline can't be handled within the available memory.

    Is there any correlation between available memory and the file size that can be processed? We are talking about a file roughly 1 GB in size containing about 2 million records. Within the same pipeline, if I split the file into smaller pieces and use a Foreach Loop to process all the smaller files, does that matter as far as memory use is concerned? There is a potential for the file size to double in the near future, so I want to make sure that even if it is not a problem today, it doesn't become an issue later down the road.

    Thanks.

    Peter Kennedy

  • peterk1961 (3/16/2009)



    Peter,

    The SSIS pipeline is designed to process data in blocks (buffers). The buffer size is configurable, and the idea is NOT to have all the data in memory at once. You should not split your files into smaller pieces, because the pipeline already does this by itself 😉 There are components, such as the Sort and Lookup components, that require certain data to stay in memory, but if the package is designed properly this shouldn't be a problem.
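    In SSIS 2005 the relevant knobs are the Data Flow Task properties DefaultBufferMaxRows (10,000 rows by default) and DefaultBufferSize (about 10 MB by default), which cap how much data sits in one buffer at a time. To make the idea concrete, here is a minimal Python sketch (not SSIS code) of the same streaming pattern: rows move through in fixed-size batches, so peak memory tracks the batch size rather than the file size. The file name, pipe delimiter, and batch size are assumptions for illustration only.

        # Minimal sketch of buffered streaming, assuming a pipe-delimited file.
        # This is NOT the SSIS implementation, only the concept it relies on.
        def read_in_batches(path, batch_rows=10_000):
            """Yield lists of parsed rows without holding the whole file in memory."""
            batch = []
            with open(path, encoding="utf-8") as f:
                for line in f:
                    batch.append(line.rstrip("\n").split("|"))  # hypothetical row format
                    if len(batch) == batch_rows:
                        yield batch      # hand one "buffer" to the next stage
                        batch = []
            if batch:
                yield batch              # flush the final partial buffer

        def load_batch(rows):
            """Stand-in for a destination component; replace with a real bulk load."""
            pass

        for buffer in read_in_batches("bigfile.txt"):  # hypothetical file name
            load_batch(buffer)  # memory use stays flat regardless of total file size

    This is why splitting the 1 GB file yourself buys you nothing as long as the data flow contains no fully blocking components: the engine already works on one buffer's worth of rows at a time.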

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • Hi,

    I have a flat file whose size is more than 2.5 GB. The file is not formatted appropriately, so I have to read it in as one column [text stream] for later parsing. When the file is smaller, I have no problem; the whole process works fine. But when it grows beyond 2 GB, there are performance issues: the Flat File Source reads it in very small blocks and the process runs too long (45-60 min).

    Are there any possibilities for performance tuning? Which properties should be changed to make it run faster (buffer size, etc.)?

  • volkorna (3/20/2009)



    You should have created a new post, because this is not related to the original question, only similar to it. What is the format of the data? Please provide a more extensive example of the input data and how it has to be processed.
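    In the meantime, the usual starting points are the Data Flow Task's DefaultBufferSize and DefaultBufferMaxRows properties: with a single very wide text column, only a handful of rows fit into each buffer, so raising the buffer size can reduce the number of tiny reads. Outside of SSIS, the same idea looks like the following Python sketch, which reads a multi-GB file in large chunks and parses logical records incrementally; the 1 MB chunk size, the newline record delimiter, and parse_record are assumptions standing in for your real layout.

        # Sketch of large-chunk streaming over a file too big to load at once.
        # Chunk size, record delimiter, and parser are illustrative assumptions.
        CHUNK = 1024 * 1024  # read 1 MB at a time instead of many small reads

        def parse_record(raw):
            """Hypothetical per-record parser; adapt to the actual file layout."""
            return raw.split("|")

        def stream_records(path):
            leftover = ""
            with open(path, encoding="utf-8") as f:
                while True:
                    text = f.read(CHUNK)
                    if not text:
                        break
                    records = (leftover + text).split("\n")  # assume newline-delimited
                    leftover = records.pop()   # last piece may be cut mid-record
                    for rec in records:
                        yield parse_record(rec)
            if leftover:
                yield parse_record(leftover)   # flush the final record

        for rec in stream_records("hugefile.txt"):  # hypothetical file name
            pass  # hand each parsed record to the next processing stage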

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/
