ETL or ELT

  • ZZartin - Wednesday, April 4, 2018 11:39 AM

    Some people seem to get to hung up on having a very rigid structure of ETL or ELT where either everything is a straight copy from source to staging then all transforms are done to the target or everything is done in transfer process and loaded into the target without anything else being done.

    Where I'm working now it's fairly common to do a mixture of both, as well as putting views on top the source data and do some work there as well, so i guess that's doing transforms as part of the extract?

    Sums it up for me.  Young developers think they've found THE answer, old developers think they've found AN answer

  • RonKyle - Wednesday, April 4, 2018 5:34 AM

    It is not uncommon that the culprit is poorly designed transformation inside the ETL that can be done more efficiently in SQL Server.

    I do not understand why there seems to be this conception that "T" must take place outside of SQL Server.  All my "T" is done in a SQL Server staging database using set-based TSQL queries to format the data as required before it's "L'd" into the target tables.  T != RBAR.  And T as the middle step does not mean no SQL Server database.

    I'd add that I often am assigned to speed up ETL processes.  One common way I've found is by taking T's that happen after Ls and move them to a staging table so that the T can occur before the L.

    Yep, yep! Unfortunately many SSIS Developers I have worked with come from .NET and procedural programming backgrounds and are unaware of what SQL Server does well. Often I see a lot of string manipulation on data that could be more efficiently manipulated using an iTVF, persisted computed column, or, as you said, in staging tables. The "T" is much easier when you aren't extracting garbage.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Yep, yep! Unfortunately many SSIS Developers I have worked with come from .NET and procedural programming backgrounds and are unaware of what SQL Server does well.

    That at least explains why so many seem to think that not doing the T between the E and L avoids doing RBAR work.  I know what a cursor is, but I have yet to use one.  (Almost thought I had to once, but turned out I was wrong.)

Viewing 3 posts - 31 through 32 (of 32 total)

You must be logged in to reply to this topic. Login to reply