Blog Post

Let’s Play, “Why did it do that?”

,

On my current project I decided to be flexible and allow the use of Linq to SQL to try to speed up development.  This will not be a large, heavily used database, so squeezing every drop of performance out of the database is not a big issue.  One of the reasons I agreed to it was so that I could see Linq to SQL in a real-world project and the queries generated.  Well, I saw my first “Why did it do that?” query.

Here’s a similar, simplified schema:

CREATE TABLE dbo.students
(
student_id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
last_name VARCHAR(20),
first_name VARCHAR(30),
birth_date smalldatetime
)
CREATE INDEX IX_last_name_first_name ON dbo.Students(last_name, first_name);

CREATE TABLE dbo.classes
(
class_id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
class_name VARCHAR(25)
)
CREATE INDEX IX_class_name ON dbo.classes(class_name);

CREATE TABLE dbo.student_classes
(
student_class_id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
student_id INT REFERENCES dbo.students(student_id),
class_id INT REFERENCES dbo.classes(class_id)
)
CREATE UNIQUE INDEX UX_student_id_class_id ON dbo.student_classes(student_id, class_id)

Here’s what Linq generated (modified to fit the new schema):

exec sp_executesql 
N'SELECT
[t0].[student_ID] AS [Id],
[t0].[First_Name] AS [FirstName],
[t0].[Last_Name] AS [LastName],
[t1].[student_class_ID] AS [Id2],
[t1].[class_id] AS [ClassId],
[t1].[class_name] as [ClassName]
(
SELECT
COUNT(*)
FROM
[dbo].[student_classes] AS [t3] INNER JOIN
[dbo].[classes] AS [t4] ON
[t3].[class_id] = [t4].[class_id]
WHERE
[t3].[Student_ID] = [t0].[Student_ID]
) AS [value]
FROM
[dbo].[students] AS [t0] LEFT OUTER JOIN
(
[dbo].[student_classes] AS [t1] INNER JOIN
[dbo].[classes] AS [t2] ON
[t1].[class_id] = [t2].[class_id]
) ON
[t1].[student_ID] = [t0].[student_ID]
WHERE
([t0].[Last_Name] LIKE @p0) AND
([t0].[First_Name] LIKE @p1)
ORDER BY
[t0].[student_ID],
[t1].[student_class_ID],
[t2].[class_id]'
,
N'@p0 varchar(8000),@p1 varchar(8000)',
@p0 = 'Abb%', @p1 = 'Abb%'
Note the correlated sub-query to return the count of classes by student.  This is not in the Linq query in .NET, but is added by the framework because of the one-to-many relationship between students and student_classes.  Basically the student object contains an EntitySet of student_class objects and it looks like L2S needs to know the number of objects in the list.  If this is needed you can return it by replacing the correlated sub-query with a CTE or derived table. Here’s an example of what I’d write:
 
SELECT
S.student_id,
S.last_name,
S.first_name,
SC.student_class_id,
SC.class_id,
C.class_id,
C.class_name,
CC.class_count
FROM
dbo.students AS S LEFT JOIN
(
dbo.student_classes AS SC JOIN
dbo.classes AS C
ON SC.class_id = C.class_id
)
ON S.student_id = SC.student_id LEFT JOIN
(
SELECT
SC2.student_id,
COUNT(SC2.student_class_id) AS class_count
FROM
dbo.student_classes AS SC2
GROUP BY
SC2.student_id) CC
ON S.student_id = CC.student_id
WHERE
S.last_name LIKE 'Abb%' AND
S.first_name LIKE '%'
ORDER BY
S.student_id,
SC.student_class_id,
C.class_id

This eliminates the correlated sub-query and eliminates a second scan of the classes table.  I’m still trying to understand why Linq needs the count.  Any other suggestions?

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating