August 22, 2007

 LINQ to SQL: Many-To-Many Tables and Joins

Still having fun with LINQ to SQL over here. One quick note, I've found that the best way to test LINQ queries is to have a unit test class ready to go. Makes things much easier.

So what I've been playing with lately is using LINQ to SQL for querying against Many-To-Many tables. For an illustration, I'll use a table structure like this: (click on it to see a bigger version)


Once imported into your Linq to Sql file type in visual studio, and viewed through a class diagram, you get this:



In my Customer table I have 5 customers, I also have 5 products (Product ID=1 is "Computer"). The CustomerProduct table has about 20 records.

Now we can start having fun with some queries. The question I wanted to answer was: "Which customers have a particular product?". Simple enough.

I found, as with many things, there are multiple ways of getting the same answer. My first query looked like this:


int Computer = 1;
LinqTestsDataContext cx = new LinqTestsDataContext();
var customersWithProduct = from c in cx.Customers
from cp in c.CustomerProducts
where cp.ProductID == Computer
select c;


OK, background note. This code is using the LinqTestsDataContext object to query the Customer table and the CustomerProduct table to find all of the customers with a Computer (product id = 1). What is returned is a IQueryiable object. If I want to parse through each Customer object indivitually, can call customersWithProduct.ToList() and get a List object returned.

This lovely piece of Linq generates the following SQL code:


{SELECT [t0].[ID], [t0].[Name]
FROM [dbo].[Customer] AS [t0], [dbo].[CustomerProduct] AS [t1]
WHERE ([t1].[ProductID] = @p0) AND ([t1].[CustomerID] = [t0].[ID])}


How do I know that is the SQL that is generated? After I run that line I can mouse over the variable (customersWithProduct) and the tooltip displays the generated SQL. I cant change it (that I know of), but at least I can look at it.

Anyway, that is not what I would call the best SQL I have ever seen -- and it is slow.

Next came attempt two at Linq to SQL. I wrote this:


int Computer = 1;
LinqTestsDataContext cx = new LinqTestsDataContext();
var customersWithProduct = from c in cx.Customers
join cp in cx.CustomerProducts on c.ID equals cp.CustomerID
where cp.ProductID == Computer
select c;


In the previous example, I used that Customer.CustomerProducts object to filter the products. This time I am explicitly joining the Customers and the CustomerProducts tables together in Linq. The only odd part of the query was the "equal" keyword that you have to use in the join.

The SQL generated was much better:


{SELECT [t0].[ID], [t0].[Name]
FROM [dbo].[Customer] AS [t0]
INNER JOIN [dbo].[CustomerProduct] AS [t1] ON [t0].[ID] = [t1].[CustomerID]
WHERE [t1].[ProductID] = @p0}


Look: an actual join. Trust me, this works much faster. The first query took 3.24 seconds, the second took 0.06 seconds. I call that significant. Especially considering the amount of data I am querying (not much). You add some real data (thousand and millions of records) and you could be talking about some significant downtime.

Lots more to discover here. All a matter of time.

Labels: , ,

0 Comments:

Post a Comment

<< Home