August 24, 2007

 LINQ to SQL: Interfaces on your Domain objects

There has been some groaning about the lack of interfaces built into the domain objects produced by the LINQ to SQL code generator.

Note: a domain object, as I am using the term here, is an object that maps column-to-property with a table. For example, if you have a Customer table with an ID and a Name field, you will have a Customer class with an ID and Name property.

I found that interfaces are trivially easy to add. Every entity class that LINQ to SQL generates is a partial class. So create a second partial class declaration with the same name as the generated one (there is no good way to say that, I swear). Then you can use the built-in "Extract Interface" refactoring to create an interface that declares the properties of the domain object.

The key point that I did not realize is this: you can assign an interface to a class in any of its partial class declarations.

So, in LINQ to SQL, keeping with our Customer table paradigm, your generated partial class will look like this:

public partial class Customer : INotifyPropertyChanging, INotifyPropertyChanged

Now your other partial class (in an entirely different file) will look like this:

public partial class Customer : ICustomer

Both declarations list interfaces, and the compiler combines them into one class. I didn't know you could do that.

Now the cool part is that the interface can be defined anywhere. If you are trying to create a true separation between layers, you will probably want to define it in an assembly other than the one that the domain classes are defined in.
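Here is a minimal sketch of the whole idea in one file, so it can be compiled on its own. The ICustomer interface and the property bodies are my stand-ins (assumptions); the real generated class is much longer and also implements INotifyPropertyChanging and INotifyPropertyChanged.

```csharp
using System;

// Hypothetical interface; in a layered design it could live in another assembly.
public interface ICustomer
{
    int ID { get; set; }
    string Name { get; set; }
}

// Stand-in for the generated half of the class.
public partial class Customer
{
    public int ID { get; set; }
    public string Name { get; set; }
}

// Hand-written half: it only adds the interface to the class.
public partial class Customer : ICustomer
{
}

public static class InterfaceDemo
{
    // Code written against the interface never sees the concrete class.
    public static string Describe(ICustomer customer)
    {
        return customer.ID + ": " + customer.Name;
    }

    public static void Main()
    {
        ICustomer c = new Customer { ID = 1, Name = "Acme" };
        Console.WriteLine(Describe(c));
    }
}
```

The public properties in the first half satisfy the interface named in the second half, which is why the hand-written half can stay empty.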

Anyway, that is my discovery for today. Have a good weekend.


 August 22, 2007

 LINQ to SQL: Many-To-Many Tables and Joins

Still having fun with LINQ to SQL over here. One quick note, I've found that the best way to test LINQ queries is to have a unit test class ready to go. Makes things much easier.

So what I've been playing with lately is using LINQ to SQL for querying against many-to-many tables. For an illustration, I'll use three tables: Customer (ID, Name), Product, and a CustomerProduct table that links the two through CustomerID and ProductID.


Once imported into a LINQ to SQL file in Visual Studio, and viewed through a class diagram, you get a class for each table, with a CustomerProducts collection on Customer.



In my Customer table I have 5 customers, and I also have 5 products (Product ID=1 is "Computer"). The CustomerProduct table has about 20 records.

Now we can start having fun with some queries. The question I wanted to answer was: "Which customers have a particular product?". Simple enough.

I found, as with many things, there are multiple ways of getting the same answer. My first query looked like this:


int Computer = 1;
LinqTestsDataContext cx = new LinqTestsDataContext();
var customersWithProduct = from c in cx.Customers
                           from cp in c.CustomerProducts
                           where cp.ProductID == Computer
                           select c;


OK, background note. This code is using the LinqTestsDataContext object to query the Customer table and the CustomerProduct table to find all of the customers with a Computer (product id = 1). What is returned is an IQueryable<Customer> object. If I want to work through each Customer object individually, I can call customersWithProduct.ToList() and get a List<Customer> object returned.

This lovely piece of Linq generates the following SQL code:


{SELECT [t0].[ID], [t0].[Name]
FROM [dbo].[Customer] AS [t0], [dbo].[CustomerProduct] AS [t1]
WHERE ([t1].[ProductID] = @p0) AND ([t1].[CustomerID] = [t0].[ID])}


How do I know that is the SQL that is generated? After I run that line I can mouse over the variable (customersWithProduct) and the tooltip displays the generated SQL. I can't change it (that I know of), but at least I can look at it.

Anyway, that is not what I would call the best SQL I have ever seen -- and it is slow.

Next came attempt two at Linq to SQL. I wrote this:


int Computer = 1;
LinqTestsDataContext cx = new LinqTestsDataContext();
var customersWithProduct = from c in cx.Customers
                           join cp in cx.CustomerProducts on c.ID equals cp.CustomerID
                           where cp.ProductID == Computer
                           select c;


In the previous example, I used the Customer.CustomerProducts collection to filter the products. This time I am explicitly joining the Customers and the CustomerProducts tables together in LINQ. The only odd part of the query was the "equals" keyword that you have to use in the join.

The SQL generated was much better:


{SELECT [t0].[ID], [t0].[Name]
FROM [dbo].[Customer] AS [t0]
INNER JOIN [dbo].[CustomerProduct] AS [t1] ON [t0].[ID] = [t1].[CustomerID]
WHERE [t1].[ProductID] = @p0}


Look: an actual join. Trust me, this works much faster. The first query took 3.24 seconds, the second took 0.06 seconds. I call that significant, especially considering the amount of data I am querying (not much). Add some real data (thousands and millions of records) and you could be talking about some significant downtime.
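If you want to compare the two query shapes without a database, here is a rough sketch that runs them against in-memory lists. The class shapes, names, and sample rows are made up for illustration. Since this is LINQ to Objects, it says nothing about the generated SQL or the timings above; it only shows that both shapes return the same customers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class CustomerProduct
{
    public int CustomerID { get; set; }
    public int ProductID { get; set; }
}

public class Customer
{
    public int ID { get; set; }
    public string Name { get; set; }
    public List<CustomerProduct> CustomerProducts { get; set; }
}

public static class JoinDemo
{
    const int Computer = 1;

    // Made-up link rows: customers 1 and 3 own a Computer.
    public static List<CustomerProduct> SampleLinks()
    {
        return new List<CustomerProduct>
        {
            new CustomerProduct { CustomerID = 1, ProductID = 1 },
            new CustomerProduct { CustomerID = 1, ProductID = 2 },
            new CustomerProduct { CustomerID = 2, ProductID = 2 },
            new CustomerProduct { CustomerID = 3, ProductID = 1 }
        };
    }

    // Builds three customers and attaches each one's link rows.
    public static List<Customer> BuildCustomers(List<CustomerProduct> links)
    {
        var names = new[] { "Ann", "Bob", "Cid" };
        return names.Select((name, i) => new Customer
        {
            ID = i + 1,
            Name = name,
            CustomerProducts = links.Where(l => l.CustomerID == i + 1).ToList()
        }).ToList();
    }

    // First shape: walk the child collection with a second "from".
    public static string[] ViaNavigation(List<Customer> customers)
    {
        var q = from c in customers
                from cp in c.CustomerProducts
                where cp.ProductID == Computer
                select c.Name;
        return q.ToArray();
    }

    // Second shape: explicit join on the key columns.
    public static string[] ViaJoin(List<Customer> customers, List<CustomerProduct> links)
    {
        var q = from c in customers
                join cp in links on c.ID equals cp.CustomerID
                where cp.ProductID == Computer
                select c.Name;
        return q.ToArray();
    }

    public static void Main()
    {
        var links = SampleLinks();
        var customers = BuildCustomers(links);
        Console.WriteLine(string.Join(",", ViaNavigation(customers)));
        Console.WriteLine(string.Join(",", ViaJoin(customers, links)));
    }
}
```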

Lots more to discover here. All a matter of time.


 August 20, 2007

 LINQ to SQL: SQL IN clause

Here is a problem I recently had to figure out with LINQ: how do you do an IN with LINQ to SQL?

In standard SQL your query would look like this:

SELECT myColumn FROM myTable
WHERE myColumn IN (myVal1, myVal2, myVal3)

But let's put another kink in, shall we? I'm not using the LINQ query syntax, I'm using Lambda.

It turns out my savior was Contains.

Every generic list (List<T>) has a Contains method, and with System.Linq available every IEnumerable<T> also gets a Contains<T> extension method (this is .NET 3.5 we are talking about here), and that is what you use.

So, when creating a new query via LINQ to SQL in Lambda, you get an IQueryable<T> interface with a Where extension method ...

Oh crap, here is the code, this will take too long to fully explain:

MyDataDataContext cx = new MyDataDataContext(); // from the LINQ to SQL dbml
IQueryable<MyTable> q = cx.MyTables.AsQueryable(); // creates the query object

List<int> listOfData = new List<int> { 1, 2, 3, 4 };

// Where returns a new query object; it does not change q itself
IQueryable<MyTable> q2 = q.Where(x => listOfData.Contains(x.MyIntValue));

var result = q2.Select(x => x.MyIntValue);

So what do we have, and what was created? Well, we just used LINQ to SQL to generate a query that will look like this:

SELECT [t0].[MyIntValue]
FROM [dbo].[MyTable] AS [t0]
WHERE [t0].[MyIntValue] IN (@p1, @p2, @p3, @p4)

There is a @p[number] for every value in the list 'listOfData' above. And to me, the generated SQL is pretty good. That is probably what I would write.
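If you don't have a database handy, the same Where/Contains shape can be tried against an in-memory list. This is only a sketch: the MyTable class here is a hand-written stand-in for the generated entity, and AsQueryable() over a list runs as LINQ to Objects, so no SQL is produced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the generated entity class (assumption).
public class MyTable
{
    public int MyIntValue { get; set; }
}

public static class InDemo
{
    // Keeps only the rows whose MyIntValue appears in the allowed list.
    public static List<int> Filter(IEnumerable<MyTable> rows, List<int> allowed)
    {
        IQueryable<MyTable> q = rows.AsQueryable();
        IQueryable<MyTable> q2 = q.Where(x => allowed.Contains(x.MyIntValue));
        return q2.Select(x => x.MyIntValue).ToList();
    }

    public static void Main()
    {
        var rows = new List<MyTable>
        {
            new MyTable { MyIntValue = 0 },
            new MyTable { MyIntValue = 2 },
            new MyTable { MyIntValue = 5 },
            new MyTable { MyIntValue = 4 }
        };
        var listOfData = new List<int> { 1, 2, 3, 4 };

        foreach (int value in Filter(rows, listOfData))
            Console.WriteLine(value);
    }
}
```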


 July 28, 2007

 C# Partial Methods

Update: Orcas Beta 2 is out, and I found a problem in my syntax. Now fixed.

I was just looking through the Orcas Beta 1, and I saw something I have been waiting to talk about...Partial Methods!

This allows you to create a method stub in a partial class so you can provide the implementation in another part of that partial class.

A case where I would use this: in code generation. You generate a data object as a partial class. Now say you want to customize the constructor a bit more. Currently you have to use inheritance to do that -- but then you have to remember to use the inherited class and not the generated class. But with partial methods you can use the main class and provide that logic. Also, if the partial method is not implemented in the other partial class, it is taken out by the compiler. Nice.

A second use: logging. You can stub out methods where you would like logging to occur, but you don't want the partial class to have to deal with the implementation of logging, making it easier to switch out later.

I hope you see why I am excited about this enhancement.

A couple of caveats though,
  1. The method must return void (or be a Sub for you VB folks).
  2. out parameters are not allowed -- but parameters are allowed in general.
  3. Access modifiers are not allowed; a partial method is implicitly private.
  4. They can only be defined and implemented in a partial class.

Below is a stub implementation:

// my generated class in file1.cs
public partial class Class1
{
    // a partial method: declaration only, no body
    partial void CalledByInit();

    public void Init()
    {
        CalledByInit();
    }
}

// my implementation class in file2.cs
public partial class Class1
{
    // the implementation repeats "partial" and takes no access modifier
    partial void CalledByInit()
    {
        // do my stuff here
    }
}
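For the logging case mentioned earlier, here is a slightly fuller sketch that compiles as a single file. The Order class, the method names, and the Log list are all invented for the example. One partial method gets an implementation and the other does not, so you can see that the unimplemented call simply disappears.

```csharp
using System;
using System.Collections.Generic;

// "Generated" half: declares the hooks and calls them.
public partial class Order
{
    // Collected in a list only so the sketch has something to show.
    public readonly List<string> Log = new List<string>();

    partial void OnSaving(string reason);
    partial void OnSaved();

    public void Save(string reason)
    {
        OnSaving(reason);
        // ... persist the order here ...
        OnSaved(); // never implemented, so the compiler removes this call
    }
}

// Hand-written half: implements only the hook it cares about.
public partial class Order
{
    partial void OnSaving(string reason)
    {
        Log.Add("saving: " + reason);
    }
}

public static class PartialDemo
{
    public static void Main()
    {
        var order = new Order();
        order.Save("test");
        Console.WriteLine(order.Log[0]);
    }
}
```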

Now why is this better than:
1. just declaring a delegate in the middle of your class?
2. passing in the code as a predicate parameter?

The answer:
1. A delegate can be assigned by anyone, and potentially have multiple implementations. That isn't always good. The partial method can only have one implementation, and only in another part of the same partial class.
2. Aesthetically that makes a really ugly call, plus it requires that the caller have intimate knowledge of the workings of the method that it shouldn't have to have. Now you can localize that knowledge in one place.

Again, think about the main use case for the partial method: it should be used in conjunction with code generators. If any of you have used SubSonic, Typed Datasets, NHibernate, or other technologies, right now you are using a mixture of inheritance and partial classes. With this you might be able to drop the inherited class altogether. And frankly, I'm for anything that reduces the number of classes lying around a project.

If you are interested in other new features in C# 3.0, I found a nice PowerPoint from Raj that you can check out.

Partial Methods are mentioned on LukeH's blog.
And they are talked about on the VB Team blog.


 July 24, 2007

 3 months, 3 LINQ presentations

That is right, in the past 3 months I have given the same basic presentation three times. I don't know if that counts as a groove or a rut. But there are some nice things about doing that: no need for a new PowerPoint, for one. But more importantly: you get more questions, and they make you question things more.

Starting off, I'm no LINQ expert. Yet I have to distill it to the group. Luckily LINQ is an easy sell. There is something there for everyone in LINQ. But when it gets right down to it, what is LINQ about? I put it like this: FOR loops are evil and LINQ is the cure.

FOR loops are not "run for cover and grab a Bible" evil, more of a general GOTO type of evil. It isn't as if GOTO is evil in itself; in some languages the GOTO is a required statement. But like all inherently benign language constructs, in the wrong hands it can go really badly.

For myself, I've seen some of the worst code in my life nestled in for loops. And even worse, most people don't even know it. How would they? You could say that they just don't know any better. But in reality, there often isn't a better way.

What is the FOR loop but a structured GOTO? Really, that is it. And it isn't a very thick abstraction. If you don't believe me, go check out assembly language. Same with WHILE.

Next, what are you doing in the FOR loop (looping through a list -- DUH)? Sorry, I need a better question: what are you trying to accomplish with the loop? Now, look at the loop, and how easy is it to figure that out after the fact?

There is a reason people don't like assembly anymore: it is too hard to understand after the fact AND it is too hard to write in the first place. There are too many moving parts. Even adding two numbers (registers) is a multi-step operation. Things you do in loops have many of the same qualities.

Here is an example: find the largest value in an array of integers.

First the array:
int[] i = new int[] { 1, 2, 3, 4, 5, 6, 7 };

Here is what you write in C#:

int iMax = i[0];
foreach (int j in i)
{
    if (iMax < j)
        iMax = j;
}

Here is what you would write thanks to LINQ (and Extension Methods):

int iMax = i.Max();

How many ways are there for the first code to go wrong? There are 6 lines, and 4 of them have code. There is one obvious bug in the code anyway... what if the array has no items? You will get an index out of bounds error right there. But there are many ways to incorrectly write this code. This is also a simplistic example, so imagine how bad this can get when doing real code.

In the second example: I can't find one. Plus, there is very little chance that you, or someone else, will not understand what the code is doing.

Now this is simplistic, which is bad because it hides the true power that is hiding underneath. There is more to LINQ than a grab bag of small statistical functions (e.g. Sum, Min, Max, Count). Add in a complex object and the Where method and we begin to see.

Visualize a customer object. It will have properties like FirstName, LastName, Address, City, etc. This is in a CSV file that is coming from Sales and Marketing.

First part: load the CSV into your program. No problem, we have all done that from time to time. Now find me all of the people in Idaho. Crap.

Not in LINQ. If you loaded your data into a list (List<Customer> list) you would write code like this:

var idahoCustomers = from c in list
                     where c.State == "ID"
                     select c;

Want that in Lambda?

var idahoCustomers = list.Where(c => c.State == "ID");
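To make that concrete, here is a small end-to-end sketch: parse a few CSV lines into customers, then pull out the Idaho ones. The column order (first name, last name, state), the sample rows, and the naive Split(',') parsing are all assumptions for illustration; a real CSV with quoted fields needs a proper parser.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string State { get; set; }
}

public static class CsvDemo
{
    // Naive parsing: one customer per line, fields split on commas.
    public static List<Customer> Parse(IEnumerable<string> lines)
    {
        return lines
            .Select(line => line.Split(','))
            .Select(parts => new Customer
            {
                FirstName = parts[0].Trim(),
                LastName = parts[1].Trim(),
                State = parts[2].Trim()
            })
            .ToList();
    }

    // The Lambda form of the Idaho query, projected down to last names.
    public static List<string> IdahoLastNames(List<Customer> list)
    {
        return list.Where(c => c.State == "ID")
                   .Select(c => c.LastName)
                   .ToList();
    }

    public static void Main()
    {
        var lines = new[]
        {
            "Ann, Smith, ID",
            "Bob, Jones, WA",
            "Cid, Brown, ID"
        };

        foreach (string name in IdahoLastNames(Parse(lines)))
            Console.WriteLine(name);
    }
}
```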

Something you should know about now, there are two ways of doing the same thing, and you should probably know both. First is LINQ. If you see "from blah blah where blah blah select blah blah" -- you are looking at LINQ. If you see a "=>" you are looking at Lambda.

Personally, I love Lambda more than LINQ. Lambda can do everything LINQ can do, plus everything else. Another way of saying that is "LINQ is a subset of Lambda."

Anyway, this post could run on and on about the wonders of LINQ and Lambda -- but there are plenty of other people doing that. Hopefully, you have already read some of that. Where I want to finish off with is a few suggestions for anyone looking to get a grip on all of this.

First, there are a lot of new things to learn these days. WPF, WF, WCF, LINQ, Lambda, etc. Is this different? Yes it is. You need to learn LINQ. Personally I will be asking interview questions based on LINQ in the future.

Second, considering you have limited time, what should you concentrate on? My answer is Lambda and Extension Methods. The more you learn about Extension Methods the more you will be able to do with Lambda. (Warning, if you are going to learn Extension Methods, you should probably learn about Predicates as well).

And some words of warning. Watch your return value types. You will see a lot of IQueryable, IEnumerable, and other strange interfaces as return types. These will often be hidden in 'var's. Be warned, each has its own capabilities, and you should know how to convert between them.

For example: on a List<T> object, you get the ForEach method. You don't get that with IEnumerable<T> or IQueryable<T>. But you can get there by calling ToList() on either of those object types.

Finally: measure. Grab a profiler and run with it. Just like FOR saved you from the ugliness of GOTO, LINQ saves you from the complexity of FOR. But it doesn't get you away from the costs. There will still be times when it is better to write the loop yourself. A good profiler will tell you when.
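A quick sketch of that conversion, using made-up data. Where hands back a lazy IEnumerable<string>, which has no ForEach, so the sequence is copied into a List<string> first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ForEachDemo
{
    // Upper-cases every word longer than three characters.
    public static List<string> Shout(IEnumerable<string> words)
    {
        var result = new List<string>();

        // Lazy sequence: nothing is filtered until it is enumerated.
        IEnumerable<string> longWords = words.Where(w => w.Length > 3);

        // longWords.ForEach(...) would not compile; ToList() gives us a List<string>.
        longWords.ToList().ForEach(w => result.Add(w.ToUpper()));
        return result;
    }

    public static void Main()
    {
        var words = new[] { "linq", "for", "lambda" };
        Shout(words).ForEach(Console.WriteLine);
    }
}
```

Keep in mind that ToList() runs the query and copies the results, which is one of the costs worth measuring.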



Oh, one final note: I'm using Visual Studio.NET 2008 Beta 1 like everyone else. All code samples are subject to change when Beta 2 releases this week.


 June 28, 2007

 NetDug last week...

I thought this was funny.

Last week I gave a presentation on C# 3.0 and LINQ to NetDug. Mind you, this is a really well run group. We meet at the ProClarity ---erm Microsoft -- building in downtown Boise. Usually someone from Microsoft lets us in, guards over us, and things go really well. Not this time.

It started with one of the group leaders sending me an email, the day before the meeting, telling me that he could not get onto their server, ergo: they couldn't send an email to the group telling them that the meeting was on, and what we were talking about.

Not great, but I can work around that. I sent out a message to the BSDG group, which is also largely Boise area developers and told them. It is largely the same people anyway.

One day goes by.

OK, day of the meeting. I show up and there are already people there waiting outside the building. I'm early so I don't think much of it. The Microsoft building here in Boise is actually a nice spot. There is this cool little spot that had some places to sit in the shade with lots of trees, which was needed because it was over 80 degrees at the time. So we just sat there until someone was going to open the building for us.

Janitors walked in, a few people walked out. I even knew a few of them. It was getting really close to the time of the meeting so I started talking to one of them as they came out. Asking if certain people who are usually there to open the building for us are still in the building to open the building so we can have our meeting. (yes, that is a run on sentence, but appropriate since the person I talked to was a tech writer -- who taught tech writing). They were not there.

Great. First there is no email to tell anyone about the meeting we were supposed to have, and now we don't even have a room to have it in. This is getting better all the time.

By this time there were about eight people there. That is a small gathering for this group. It usually gets 20 people. But they were still interested in what I had to show them. We were outside, though, and my laptop is useless outside running on battery power -- so no PowerPoint. There are worse things in life than giving a talk with no PowerPoint, really. So next best option: wing it with a pen and paper.

This is where it is a good thing that there were only eight people there. So I started the meeting, outside, and started talking. One thing that does happen when giving a talk like this: you cut out all of the extraneous stuff.

Array initializers -- didn't talk about it.
Extension Methods -- yes, but just enough
Object and Collection initializers -- just barely
Expression Trees -- mentioned that I didn't know anything about them.

But we did talk quite a bit about the var keyword, anonymous types, LINQ, and Lambda. All via pen and paper (which I now refer to as the original PowerPoint).

So, obviously, I wasn't trying to get the attendees to really grok the material, but I think they did capture some of the general zen. Which, as far as I can tell, is to rethink how and when you use a for loop on a list. With LINQ and Lambda, we should be seeing a lot fewer of them.

In the process we also talked about PLINQ, XLINQ, DLINQ, LINQ to SQL, NHibernate, SubSonic, and Log4Net. It was a good meeting. Not bad, considering the circumstances.

Then to close off the evening for myself, my mom and brother were attending a dairy conference a few blocks away, so I snuck into there and bored myself to sleep. They were talking about whey futures (as in stock market like futures).


I thought I had given a reasonable presentation with no slides earlier, on the street, in 80 degree weather. Here was a guy giving a presentation inside, with a huge projector (20 foot screen - at least) to 50 people and doing it badly.



Now all of this comes from my own general preferences. There is an art to displaying lots of numerical data on a slide. There is also an art to showing charts on a slide. This guy knew about neither, and probably never read anything by Edward Tufte.

Note: I have read Edward Tufte, but please don't blame my bad slides on him -- they are my fault for not reading his books enough.

All of his slides were white. All of his text was black. There was no variation. They could have been printed on a black and white printer and no one would have known the difference. Imagine trying to decipher slide after slide filled with large grids of numbers, each row having a different type of number, and only a thin black line between them. Not good. Then to show emphasis on a particular number -- out comes the laser pointer.

I about made my brother buy me a beer after that. And dinner.

My mom did instead.


 June 13, 2007

 This is painful

There is just something painful about listening to the sound of your own voice. Almost as bad as having someone point out all of my grammatical mistakes. Then there is the added pain of watching yourself on video. You get to hear and SEE how goofy you really are.

Anyway, we recorded the last half of our BSDG meeting last Thursday where I was covering C# 3.0. The video misses all of the good stuff (like 'var', collection initializers, object initializers, extension methods, and such) and just focuses on LINQ and Lambda Expressions.

This being the web, there are far better sources of information than this video, but if you are really starved for information -- have at it.

Link to the video.
