SSIS Lookup and caching

  • Hi,

    Wondering if anyone can give me some guidance on this one. I have a source table with approx 1 million rows in it and I need to perform lookups onto 6 reference tables that each hold fewer than 200 rows. All but one of the reference tables have start/end date columns so I need to perform the lookup based on a source column matching a reference column and a source column (a date) being between the start and end date.

    The way I achieved this was to link the source lookup date to the start column in the reference table and then change the caching SQL on the advanced tab so that it ensured that the source date was between the start and end date rather than equal to the start date.

    I have another lookup table that does a straight lookup based on matching a single column value and the progress list shows that this lookup caches rows but the others don't, even though I don't have caching switched on for it.

    So, it looks as if something I've done has prevented SSIS from caching the small reference tables. Is it because I'm trying to do a lookup based on BETWEEN instead of straight joins?

    Anyone have any ideas?

    Thanks.

  • I think Jamie Thompson touched on something similar to this @ his Blog Site - http://blogs.conchango.com/jamiethomson/default.aspx. I think he discussed the idea of getting the data you need to lookup and writing it into a RAW file. Then use that RAW file for your lookups and it will be semi-cached. IIRC, performance was pretty decent and in some cases was actually better than the defaults of doing the lookup and keeping things in memory. May be worth checking out.

    -Pete

  • Thanks for that, I'll look into it. I've rewritten the package to use set based operations and the 90 minutes has now come down to 10, which the business are happy about.

    Ian

  • Just had a chance to look into this but I can't see how to use a RAW file as the source for a lookup. Am I missing something obvious?

    Ian

  • Honestly don't know as I haven't had a chance to use it. If RAW isn't working, perhaps just a flat file would work. I just vaguely remember Jamie (or someone similar) writing on it at their blog site. I also remember being somewhat dissatisfied with caching of data in a loop. I had a table that I needed to cache for a lookup, but every iteration of the loop would pull all of the data again. A Flat File/RAW file was suggested as something that would read the data once, then loop through the file much faster to get the data than going back to the database.

    -Pete

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply