BULK DATA PROCESSING

  • Hi Friends,

    I have a query that joins two huge tables (44+ million vs. 65+ million records) and two small tables. I want to process the query's output in subunits: the first 1 million records, then the second million, and so on. The query runs fine if we specify a limit like TOP 500000, but it cannot produce the whole result set because that exceeds the size of tempdb (10 GB). Is there any solution other than increasing the size of tempdb? Is there any way to specify a buffer size for the result set? I only need to run it once (we are migrating to DB2).

    FYI: the query itself is already well tuned.

    Your help is appreciated.

    Thanks,

    Ragesh Chavarattil

  • Assuming your two big tables are being joined by an integer key, you could try something like this:

    -- walk the integer key range in one-million-row slices
    declare @n1 int, @n2 int

    select @n1 = min(pk), @n2 = max(pk) from bigtable1

    while @n1 <= @n2 begin

       -- process only the rows whose keys fall in the current slice
       insert into ....
       select ....
       from bigtable1 b1
       inner join bigtable2 b2 on b1.pk = b2.pk
       join ...
       where b1.pk between @n1 and @n1 + 999999
         and b2.pk between @n1 and @n1 + 999999

       -- move the window forward to the next million keys
       set @n1 = @n1 + 1000000

    end
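    One refinement worth considering for a one-shot migration like this: if the destination database can run in simple recovery, committing and issuing a CHECKPOINT after each slice keeps the transaction log from growing the way tempdb did. A minimal sketch of the same loop (dest_table and its column list are placeholder names, not from your original query):

    declare @n1 int, @n2 int

    select @n1 = min(pk), @n2 = max(pk) from bigtable1

    while @n1 <= @n2 begin

       begin tran

       insert into dest_table (pk, col1, col2)   -- placeholder names
       select b1.pk, b1.col1, b2.col2
       from bigtable1 b1
       inner join bigtable2 b2 on b1.pk = b2.pk
       where b1.pk between @n1 and @n1 + 999999
         and b2.pk between @n1 and @n1 + 999999

       commit tran

       -- under simple recovery this lets the log space used by the
       -- committed slice be reused before the next one starts
       checkpoint

       set @n1 = @n1 + 1000000

    end

    The BETWEEN ranges are what keep each join small enough to stay inside tempdb; the per-slice transaction and checkpoint just bound the log on the destination side.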

    If the key is non-numeric, you could do something like this instead:

    declare @n1 char(15), @n2 char(15), @n3 char(15)

    select @n1 = min(pk), @n2 = max(pk) from bigtable1

    while @n1 <= @n2 begin

       -- @n3 = the key value one million rows ahead of the current start
       select @n3 = max(pk) from (
          select top 1000000 pk from bigtable1
          where pk >= @n1 order by pk) x

       insert into ....
       select ....
       from bigtable1 b1
       inner join bigtable2 b2 on b1.pk = b2.pk
       join ...
       where b1.pk between @n1 and @n3
         and b2.pk between @n1 and @n3

       -- done once the slice reaches the highest key
       if @n2 = @n3 break

       -- otherwise start the next slice at the first key past @n3
       select top 1 @n1 = pk from bigtable1
       where pk > @n3 order by pk

    end
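    Either way, once the loop has finished it is worth a quick sanity check before starting the DB2 load, e.g. comparing the row count the join produces against what actually landed. A sketch (dest_table is again a placeholder, and this counts only the two big tables; add the small-table joins if they filter rows):

    declare @src int, @dst int

    -- row count the full big-table join would produce
    select @src = count(*)
    from bigtable1 b1
    inner join bigtable2 b2 on b1.pk = b2.pk

    -- rows actually written by the batched loop
    select @dst = count(*) from dest_table

    if @src <> @dst
       print 'Row count mismatch: join = ' + cast(@src as varchar(12))
           + ', destination = ' + cast(@dst as varchar(12))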

