The Glue that Binds

  • cs_troyk (9/22/2008)


    In addition to the automatic parameterization issues mentioned earlier, I have seen cases where the SQL emitted by a LINQ expression contains a cartesian product (CROSS JOIN) where it's not necessary. Consider the implications of running something like that against your order history table.

    Can you show an example of this? That would certainly be a major bug... I've seen a CROSS JOIN used, but the join predicate was specified in the WHERE clause. In that case the CROSS JOIN is optimized as an INNER JOIN. More concerning to me is that I've seen OUTER APPLY used in a few cases. Unlike a filtered CROSS JOIN, OUTER APPLY forces a specific optimization (nested loops), so that is certainly an area I'll be watching as I continue to explore the ins and outs of this technology.
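    For reference, the benign pattern I'm describing looks something like this (table and column names invented for illustration):

    ```sql
    -- LINQ to SQL sometimes emits the join predicate in the WHERE clause:
    SELECT o.OrderID, c.CustomerName
    FROM dbo.Customers AS c
    CROSS JOIN dbo.Orders AS o
    WHERE c.CustomerID = o.CustomerID;

    -- The optimizer treats that filtered CROSS JOIN exactly like an INNER JOIN:
    SELECT o.OrderID, c.CustomerName
    FROM dbo.Customers AS c
    INNER JOIN dbo.Orders AS o
        ON c.CustomerID = o.CustomerID;
    ```

    Both forms should produce the same plan; a true Cartesian product only happens when no equating predicate exists anywhere in the query.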

    --
    Adam Machanic
    whoisactive

  • cs_troyk (9/22/2008)


    In addition to the automatic parameterization issues mentioned earlier, I have seen cases where the SQL emitted by a LINQ expression contains a cartesian product (CROSS JOIN) where it's not necessary. Consider the implications of running something like that against your order history table.

    I'm also not sure if someone has mentioned the direct table access (i.e., security) issue yet, but it's there.

    In my opinion, LINQ to SQL is not yet ready for prime time, at least as far as enterprise apps go. You could probably get away with using it for a small "utility" app or something, though. Now, using LINQ to derive information from in-memory programming objects is pretty cool. A nice way to take a step toward declarative programming in the app.

    And one last thing -- it seems that time and time again, MS comes out with the "next great thing" for improving data access. I've seen it in just about every rev of their development environment going all the way back to VB for DOS. These features demo really well at dev conferences, but never seem to gain the wide adoption that's expected. It will be interesting to see if LINQ is different in this regard.

    TroyK

    Agreed. Still - my biggest objection to it is the separation of duties issue. Since we're back to putting what essentially boils down to dynamic SQL in the presentation layer, any time something starts misbehaving, the DBA now needs to go review the developer's code to figure out why this is creating bad SQL calls. And that's assuming it's even possible to tweak the LINQ call enough to put it back on the right track. It screws with the separation of layers, screws up who needs to be reviewing what, etc... In short - it has a tendency to torpedo your existing processes, which doesn't equate to time savings in my book.

    And what about portability, you say? Well - it's kind of nifty to have code that performs over any number of engines/storage mechanisms. But then again, if I wanted to create code that operates equally poorly in any number of RDBMS engines - I could go use ANSI-standard SQL calls......:)

    Being able to code something quickly is awesome. Really - it is. But the whole point of good code is that you shouldn't have to be developing it continuously/messing with it. So - the performance of the code once you're DONE developing it is way more important than how it behaves while you're coding. After all - if you keep having to screw with your code, you aren't doing it right.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (9/22/2008)


    Agreed. Still - my biggest objection to it is the separation of duties issue. Since we're back to putting what essentially boils down to dynamic SQL in the presentation layer ...

    Why would you put this in the presentation layer? For me the best practice will stay the same: Treat a LINQ to SQL-backed DAL just like any other DAL. Encapsulate it, and expose business objects, not data, to the outer layers.
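    A rough sketch of what I mean (all names are hypothetical):

    ```csharp
    // Hypothetical DAL class: the DataContext never leaks past this layer.
    public class CustomerRepository
    {
        public IList<CustomerSummary> GetActiveCustomers()
        {
            using (var db = new OrdersDataContext())
            {
                // The LINQ to SQL query lives here, and only here.
                var query = from c in db.Customers
                            where c.IsActive
                            select new CustomerSummary
                            {
                                Id = c.CustomerID,
                                Name = c.CustomerName
                            };
                return query.ToList();  // materialize before the context is disposed
            }
        }
    }

    // Plain business object exposed to the outer layers -- no LINQ, no DataContext.
    public class CustomerSummary
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }
    ```

    The outer layers see only `CustomerSummary`; whether the repository is implemented with LINQ to SQL, stored procedures, or raw ADO.NET is an implementation detail.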

    --
    Adam Machanic
    whoisactive

  • cs_troyk (9/22/2008)


    In addition to the automatic parameterization issues mentioned earlier, I have seen cases where the SQL emitted by a LINQ expression contains a cartesian product (CROSS JOIN) where it's not necessary. Consider the implications of running something like that against your order history table.

    I'm also not sure if someone has mentioned the direct table access (i.e., security) issue yet, but it's there.

    In my opinion, LINQ to SQL is not yet ready for prime time, at least as far as enterprise apps go. You could probably get away with using it for a small "utility" app or something, though. Now, using LINQ to derive information from in-memory programming objects is pretty cool. A nice way to take a step toward declarative programming in the app.

    And one last thing -- it seems that time and time again, MS comes out with the "next great thing" for improving data access. I've seen it in just about every rev of their development environment going all the way back to VB for DOS. These features demo really well at dev conferences, but never seem to gain the wide adoption that's expected. It will be interesting to see if LINQ is different in this regard.

    TroyK

    I pretty much agree with everything you've said based on my experience with Linq. I haven't run into the cross join issue you saw because I've limited my usage of Linq in directly accessing tables -- like you mentioned, security issue, I'd rather use stored procedures to control access. The one time I have used it to access tables I noticed it was converting my tinyint and smallint parameters to int in the where clause. I have no idea why since the base columns it's comparing against are defined as tinyint and smallint, and my Linq .dbml file has the columns defined as byte and short.
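    What I saw in the trace looked roughly like this (reconstructed from memory, so take the details with a grain of salt):

    ```sql
    -- Column StatusCode is defined as tinyint, but the emitted parameter is int,
    -- which can force an implicit conversion on the column side of the predicate:
    exec sp_executesql
        N'SELECT ... FROM dbo.SomeTable WHERE StatusCode = @p0',
        N'@p0 int',
        @p0 = 3
    ```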

    But yeah, once you get the data over -- via stored procedures, heh -- Linq shines! Before Linq I would often join results to lookup tables on the database side in order to return descriptions of things because doing the join on the client side was too much hassle. Now doing the joins with Linq is easy, and as we all know programmers take the path of least resistance. 😛

    To be precise, the aspect of Linq I like is LINQ to Objects (e.g., joining in-memory objects), as opposed to LINQ to SQL (querying your database).
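    For example, the kind of client-side lookup join that used to be a hassle is now a few lines (types invented for illustration):

    ```csharp
    // LINQ to Objects: join rows returned from a stored procedure
    // to an in-memory lookup list -- no round trip to the database.
    var ordersWithStatus = from o in orders        // e.g. List<Order> from a proc
                           join s in statusLookup  // e.g. List<StatusCode>
                               on o.StatusId equals s.Id
                           select new
                           {
                               o.OrderId,
                               StatusDescription = s.Description
                           };
    ```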

  • Adam Machanic (9/22/2008)


    Matt Miller (9/22/2008)


    Agreed. Still - my biggest objection to it is the separation of duties issue. Since we're back to putting what essentially boils down to dynamic SQL in the presentation layer ...

    Why would you put this in the presentation layer? For me the best practice will stay the same: Treat a LINQ to SQL-backed DAL just like any other DAL. Encapsulate it, and expose business objects, not data, to the outer layers.

    True - but it's funny that not a single "officially" provided example actually sets it up that way. The LINQ calls just get inlined within whatever functionality you're building, and there is no separation of layers (all right - I should have actually said Business layer instead of presentation). I guess I'm on to ranting about how LINQ is being presented/marketed, but from the presentations I've seen, it's just another way to quickly roll out large quantities of sub-par code.

    If you're going to encapsulate your data calls in a "true" DAL - what advantage do you get out of LINQ? Why not finish encapsulating the calls the way that they were intended to be encapsulated, and create the stored procs to do what you need done? How is that extra overhead that remains long after the dev cycle worth having in a production environment?

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Adam Machanic (9/22/2008)


    cs_troyk (9/22/2008)


    In addition to the automatic parameterization issues mentioned earlier, I have seen cases where the SQL emitted by a LINQ expression contains a cartesian product (CROSS JOIN) where it's not necessary. Consider the implications of running something like that against your order history table.

    Can you show an example of this? That would certainly be a major bug... I've seen a CROSS JOIN used, but the join predicate was specified in the WHERE clause. In that case the CROSS JOIN is optimized as an INNER JOIN. More concerning to me is that I've seen OUTER APPLY used in a few cases. Unlike a filtered CROSS JOIN, OUTER APPLY forces a specific optimization (nested loops), so that is certainly an area I'll be watching as I continue to explore the ins and outs of this technology.

    Hi Adam;

    I had a repro of this issue laying around somewhere, but I can't put my hands on it right now. I have communicated with Steve (Jones) about it, so let me shoot him a quick message and see if he has the code handy. If not, it should be pretty simple for me to redo it.

    Once I have something up and running (hopefully this week sometime), I'll post it on my website at 42and5.com.

    TroyK

  • Matt Miller (9/22/2008)


    If you're going to encapsulate your data calls in a "true" DAL - what advantage do you get out of LINQ? Why not finish encapsulating the calls the way that they were intended to be encapsulated, and create the stored procs to do what you need done? How is that extra overhead that remains long after the dev cycle worth having in a production environment?

    I'm not sure which extra overhead you're referring to; LINQ to SQL shouldn't have much if the solution is properly defined and if you make proper use of compiled queries, etc. So disregarding that piece, as a straight comparison between LINQ to SQL and a "standard" ADO.NET DAL, what is the benefit? That's a difficult question for me to answer, but I can think of the following arguments:

    A) The main argument from the MS camp seems to be that LINQ to SQL should increase developer productivity, because apparently developers don't like writing SQL, and apparently they will like writing LINQ to SQL queries a lot more, and will therefore be better at doing so. This assertion I have a major problem with. First of all, I haven't yet met a developer who really didn't like writing SQL. I'm sure they're out there, but I've consulted for about a dozen companies over the past two years and I haven't found one yet. The developers may not be as good at writing SQL as a DBA, but they often don't even realize that they're not good at it. LINQ to SQL, in my opinion, has just as sharp a learning curve as SQL--you can learn the basics in a few hours, but true mastery will take years. So I don't think that this argument applies at all.

    B) LINQ to SQL should increase developer productivity, because it handles type conversions for you, provides better tools support, and does compile-time checking. OK, I'll buy that.

    C) LINQ to SQL will make your application somewhat more secure, if you're using ad hoc SQL already and you convert it. This is due to the fact that all queries are parameterized. OK, that's fair enough, but it would be much quicker and much easier just to parameterize the queries than to convert an entire app over to use LINQ to SQL.

    D) LINQ to SQL will eliminate a lot of "bad" SQL. I think this is a fairly strong argument, actually -- think of how many developer types overuse cursors and temp tables. With LINQ to SQL this will no longer be as big an issue -- but I won't say no longer an issue at all, because if they do something like pull down an entire 20 million row table just to loop and find one row, that's just as bad. Plus, correctly writing complex LINQ to SQL queries is just as difficult as correctly writing complex SQL queries, so in the end this may be a wash. But we'll have to wait and see on that one.

    So in the end, the only argument that really holds water is the compile-time checking and type conversion. And that certainly is a promising feature, in my opinion. Now if we could just get stored procedures that exposed well-defined output contracts, the sweet spot would be clear: LINQ to SQL to stored procedures.
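    (On argument C, the fix I mean is just the standard parameterized ADO.NET call -- no rewrite of the app required; names here are illustrative:)

    ```csharp
    // Ad hoc string concatenation -- injectable and plan-cache-unfriendly:
    // var cmd = new SqlCommand("SELECT ... WHERE CustomerID = " + custId, conn);

    // Parameterized version of the same query -- safe, and the plan is reused:
    using (var cmd = new SqlCommand(
        "SELECT OrderID, OrderDate FROM dbo.Orders WHERE CustomerID = @CustomerID",
        conn))
    {
        cmd.Parameters.Add("@CustomerID", SqlDbType.Int).Value = custId;
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                // process rows...
            }
        }
    }
    ```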

    --
    Adam Machanic
    whoisactive

  • Adam Machanic (9/22/2008)


    I'm not sure which extra overhead you're referring to; LINQ to SQL shouldn't have much if the solution is properly defined and if you make proper use of compiled queries, etc.

    The overhead I'm talking about is comparing LINQ-enabled perf vs not. From what I can tell - there's a non-negligible perf penalty, just by adding in LINQ. Even the MS presenters who demoed it around my area were saying "LINQ should be 80% as performant as straight SQL solutions, if implemented correctly". The penalty I observed wasn't that high, but I suspect they had a fancier way of checking this than I had.

    At the end of the day - I don't disagree that LINQ has some serious potential. I just think it's a. not fully mature yet, and b. that it's being misrepresented (as far as what it can do, and who should be using it). The whole "this is for people who don't like to write SQL, and who don't much care about the data calls" comment that kept popping up is just like blood in the water in most Dev/DBA shops I know of....

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (9/22/2008)


    The overhead I'm talking about is comparing LINQ-enabled perf vs not. From what I can tell - there's a non-negligible perf penalty, just by adding in LINQ. Even the MS presenters who demoed it around my area were saying "LINQ should be 80% as performant as straight SQL solutions, if implemented correctly". The penalty I observed wasn't that high, but I suspect they had a fancier way of checking this than I had.

    Rico Mariani says that it performs 93% as well as a straight SqlDataReader.

    http://blogs.msdn.com/ricom/archive/2007/07/16/dlinq-linq-to-sql-performance-part-5.aspx

    That's not much overhead, in my opinion. And given that a lot of people misuse DataSets and end up totally destroying their data access performance, it could even be seen as a performance benefit. So in the worst of cases, 93% of a SqlDataReader could actually mean much better performance than they were getting with their previous solutions. Of course that may not apply to those of us who are already concerned enough about performance to play by the rules, so again the audience needs to be considered here...

    --
    Adam Machanic
    whoisactive

  • Adam Machanic (9/22/2008)


    Now if we could just get stored procedures that exposed well-defined output contracts, the sweet spot would be clear: LINQ to SQL to stored procedures.

    Unless I'm missing something in what you're talking about you can already use stored procedures with Linq. They're added as functions in the designer and accessed as methods on the DataContext.
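    E.g., after dragging a proc onto the designer surface, the call looks something like this (names made up):

    ```csharp
    using (var db = new OrdersDataContext())
    {
        // GetOrdersByCustomer was generated from a dbo.GetOrdersByCustomer proc;
        // the designer infers the result shape and wraps it in a typed method.
        foreach (var order in db.GetOrdersByCustomer(customerId))
        {
            Console.WriteLine("{0}: {1}", order.OrderID, order.OrderDate);
        }
    }
    ```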

  • Timothy (9/22/2008)


    Adam Machanic (9/22/2008)


    Now if we could just get stored procedures that exposed well-defined output contracts, the sweet spot would be clear: LINQ to SQL to stored procedures.

    Unless I'm missing something in what you're talking about you can already use stored procedures with Linq. They're added as functions in the designer and accessed as methods on the DataContext.

    You're right - you can. But it's static, and a manual process (whereas the "direct data access" method for creating the context based on the tables is easy to automate both on creation and on update). In an extensive data model, that becomes a reasonably grueling process. And you can't "map" the same object to multiple stored procs, etc... so the object model doesn't work anymore (since you'd have multiple "objects" instead of one data object being served by multiple procs, etc....).

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Adam Machanic (9/22/2008)


    Rico Mariani says that it performs 93% as well as a straight SqlDataReader.

    http://blogs.msdn.com/ricom/archive/2007/07/16/dlinq-linq-to-sql-performance-part-5.aspx

    Hmm - missed that one. Thanks for the link; I will check out.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Timothy (9/22/2008)


    Adam Machanic (9/22/2008)


    Now if we could just get stored procedures that exposed well-defined output contracts, the sweet spot would be clear: LINQ to SQL to stored procedures.

    Unless I'm missing something in what you're talking about you can already use stored procedures with Linq. They're added as functions in the designer and accessed as methods on the DataContext.

    Sure, but how will it deal with the following perfectly valid stored procedure:

    CREATE PROC x
        @i INT
    AS
    BEGIN
        SET NOCOUNT ON

        IF @i = 0
        BEGIN
            SELECT ColA
            FROM SomeTbl
        END
        ELSE
        BEGIN
            SELECT ColB, ColC
            FROM SomeOtherTbl
        END
    END

    This stored procedure mutates its output based on the input argument and therefore cannot be bound properly by LINQ to SQL. Is this kind of code a good idea? Probably not, but I've certainly seen code like this in production environments. What I would like to see in SQL Server is the ability to define a rigid output contract, perhaps via output TVPs, something like:

    CREATE TYPE t1 AS TABLE (ColA INT)
    CREATE TYPE t2 AS TABLE (ColB VARCHAR(20), ColC VARCHAR(20))

    CREATE PROC x
        @i INT,
        @t1 t1 OUTPUT,
        @t2 t2 OUTPUT
    AS
    BEGIN
        SET NOCOUNT ON

        IF @i = 0
        BEGIN
            INSERT @t1
            SELECT ColA
            FROM SomeTbl
        END
        ELSE
        BEGIN
            INSERT @t2
            SELECT ColB, ColC
            FROM SomeOtherTbl
        END
    END

    ... this would allow LINQ and other ORM-type tools to bind to stored procedures without trying to determine the output using FMTONLY and other hacks. It would make the whole thing much safer and more verifiable. Alas, this is just a pipe dream at the moment, but I hope the SQL Server team will get the message and implement something like this in a coming version.

    --
    Adam Machanic
    whoisactive

  • Adam Machanic (9/22/2008)


    Sure, but how will it deal with the following perfectly valid stored procedure:

    CREATE PROC x
        @i INT
    AS
    BEGIN
        SET NOCOUNT ON

        IF @i = 0
        BEGIN
            SELECT ColA
            FROM SomeTbl
        END
        ELSE
        BEGIN
            SELECT ColB, ColC
            FROM SomeOtherTbl
        END
    END

    This stored procedure mutates its output based on the input argument and therefore cannot be bound properly by LINQ to SQL. Is this kind of code a good idea? Probably not, but I've certainly seen code like this in production environments. What I would like to see in SQL Server is the ability to define a rigid output contract, perhaps via output TVPs, something like:

    CREATE TYPE t1 AS TABLE (ColA INT)
    CREATE TYPE t2 AS TABLE (ColB VARCHAR(20), ColC VARCHAR(20))

    CREATE PROC x
        @i INT,
        @t1 t1 OUTPUT,
        @t2 t2 OUTPUT
    AS
    BEGIN
        SET NOCOUNT ON

        IF @i = 0
        BEGIN
            INSERT @t1
            SELECT ColA
            FROM SomeTbl
        END
        ELSE
        BEGIN
            INSERT @t2
            SELECT ColB, ColC
            FROM SomeOtherTbl
        END
    END

    ... this would allow LINQ and other ORM-type tools to bind to stored procedures without trying to determine the output using FMTONLY and other hacks. It would make the whole thing much safer and more verifiable. Alas, this is just a pipe dream at the moment, but I hope the SQL Server team will get the message and implement something like this in a coming version.

    In those cases you'd have to write some code yourself in the designer; it wouldn't be automatic, but I don't think it would be especially difficult. MSDN has a how-to article about it: http://msdn.microsoft.com/en-us/library/bb399371.aspx. I used that technique to return multiple results from a stored procedure. Instead of just returning IMultipleResults from that VariableResultShapes method you could interrogate the results to determine what kind of results you're getting. But then you'd have the issue of what type to return from the method call...meh, never mind.

  • Timothy (9/22/2008)


    In those cases you'd have to write some code yourself in the designer; it wouldn't be automatic...meh, never mind.

    Exactly; doing that work would defeat the whole purpose of LINQ. And it still wouldn't be compile-time verifiable.

    --
    Adam Machanic
    whoisactive

Viewing 15 posts - 31 through 45 (of 51 total)
