July 18, 2007

 Where are the DSLs?

DSL stands for Domain Specific Language. It took me a couple of years to really wrap my head around what the term means. Now I have a bad understanding -- but where are the languages?

First off, my understanding of a DSL might not match the proper definition. Frankly, I don't care if the language is a full-on language, with its own syntax and control structures, or just a set of very specific libraries that add classes and methods to an existing language -- like C# or VB.NET -- so that it is easy to read what is going on.

Personally, I like the second one. I would take the inspiration from SubSonic and NHibernate. Heck, throw in NUnit if you want. The nice thing about using these as your examples is that you know things are easily extendable.
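To make the "library as a language" idea concrete, here is a minimal sketch in Python. It is not SubSonic's or NHibernate's actual API; the Query class and its method names are made up for illustration. The point is only that chained methods on an ordinary class can read almost like a sentence.

```python
class Query:
    """Hypothetical fluent query builder; each method returns self so calls chain."""

    def __init__(self, table):
        self.table = table
        self.conditions = []
        self.ordering = None

    def where(self, condition):
        # Collect conditions; they are joined with AND when the SQL is built.
        self.conditions.append(condition)
        return self

    def order_by(self, column):
        self.ordering = column
        return self

    def to_sql(self):
        sql = f"SELECT * FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(self.conditions)
        if self.ordering:
            sql += f" ORDER BY {self.ordering}"
        return sql


sql = (
    Query("Customer")
    .where("Country = 'US'")
    .where("Active = 1")
    .order_by("Name")
    .to_sql()
)
print(sql)
```

A real library would use parameters instead of pasting condition strings together, but the calling code is the part that matters here: it stays inside the host language and is still easy to read.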

There are a couple that I am thinking of right now. I've been playing around with MSBuild and NAnt lately. Those two tools have convinced me that XML sucks as a language format. How do you define a loop in XML? You can't do it in any terse format, I can tell you that. XML is for data, not logic. That is why JavaScript doesn't look like HTML.

So no, I don't buy XML as a language. Personally, I have a hard time seeing through all of the angle brackets -- really, I do. They just seem to create a lot of unnecessary noise to me. Readability stinks, which hurts the overall expressiveness of the language -- not to mention aesthetics -- and maintainability is also terrible. Not a whole lot of syntax highlighters for XML these days. Debuggability: None. (Is debuggability a word?) Either it works or it doesn't. If it doesn't...may the force be with you.

Anyway, back to the build scripts example. Basically, even if you have sample build scripts to work off of, it will easily waste one day of a developer's life to get a build script off the ground. Just to get started. Then countless more hours keeping the stupid thing up to date.

Why don't we have a build script DSL? Some already exist; Ruby's Rake is one. I'm considering learning it. I would like a C# based one, but I'll take Ruby if I have to.
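Here is a rough sketch of what I mean by a build script DSL living inside an ordinary language. This is Python, not Rake, and everything in it (the task decorator, the run function, the task names) is hypothetical. Tasks are plain functions, and dependencies are declared by name.

```python
# Hypothetical mini build DSL: tasks are plain functions registered by a decorator.
TASKS = {}


def task(*depends_on):
    """Register a function as a build task with optional prerequisite task names."""
    def register(func):
        TASKS[func.__name__] = (func, depends_on)
        return func
    return register


def run(name, done=None, log=None):
    """Run a task after its prerequisites, each at most once. Returns the run order."""
    done = set() if done is None else done
    log = [] if log is None else log
    if name in done:
        return log
    func, depends_on = TASKS[name]
    for dependency in depends_on:
        run(dependency, done, log)
    func()
    done.add(name)
    log.append(name)
    return log


@task()
def clean():
    print("removing old build output")


@task("clean")
def compile_sources():
    print("compiling sources")


@task("compile_sources")
def unit_tests():
    print("running unit tests")


@task("compile_sources", "unit_tests")
def package():
    print("zipping the release")


order = run("package")
print(order)
```

There is no cycle detection or error handling here, so treat it as a toy. Still, the loop that is so painful in XML is just a for statement, and a debugger works on it like on any other code.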

Another use that I have thought of often has to do with ETL. Now I find I'm not alone. Ayende has recently been bitten by the bug. Which is good. Because given his track record he might just be able to do it. Would you rather deal with the Integration Services GUI or a language specifically designed for that purpose?

Bear in mind, these examples break down at one crucial point: threading. Both of the examples mentioned, ETL and Build Scripts, can benefit heavily from working in parallel. And that is one area that our current languages don't help us with a whole lot yet. It is easy to tell Integration Services to load a bunch of tables at the same time in the GUI -- that is a lot more work in C#.

Now it can all be mocked with the Unit of Work pattern so everything can be batched together. That would hide the complexity from the user of the language at the very least. But that still leaves a lot of complexity behind the scenes.
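As a sketch of that batching idea (in Python, with made-up names, and only loosely in the spirit of the Unit of Work pattern), the caller registers independent steps and calls commit; the threading stays hidden inside commit. The load_table function is a stand-in for a real extract-and-load step.

```python
from concurrent.futures import ThreadPoolExecutor


class UnitOfWork:
    """Collects independent steps, then runs them together on commit."""

    def __init__(self, max_workers=4):
        self.steps = []
        self.max_workers = max_workers

    def register(self, func, *args):
        self.steps.append((func, args))
        return self

    def commit(self):
        # Submit every step to the pool, then collect results in registration order.
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            futures = [pool.submit(func, *args) for func, args in self.steps]
            results = [future.result() for future in futures]
        self.steps = []
        return results


def load_table(name):
    """Stand-in for a real extract-and-load step."""
    return f"{name}: loaded"


work = UnitOfWork()
for table in ["Customer", "Project", "Invoice"]:
    work.register(load_table, table)
results = work.commit()
print(results)
```

This assumes the steps really are independent. Ordering between dependent loads, retries, and partial failure are exactly the behind-the-scenes complexity that this sketch leaves out.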

Which leads me to one of the new great hopes coming from Microsoft sometime in the future: PLINQ. Aka: Parallel LINQ. LINQ is all about building a better FOR loop. But it is still iterating over a list one item at a time. PLINQ takes things that next crucial step: multi-threading the iteration. And all without changing the syntax of LINQ. Now that is flippin' cool. Unfortunately, I have heard no release date for PLINQ, and it probably won't come out with .NET 3.5 at all.

Anyway, those are my thoughts. I could go on, but I need to get back to work.


 October 03, 2006

 Golden Rules of OLAP

I've been meaning to say this for a while now...

When working with an OLAP tool (say Microsoft Analysis Services 2005 -- or 2000 -- or Hyperion -- or any OLAP tool under the sun) there are three guiding principles.

#1. Know your data. It doesn't matter how snazzy your tools are, how good looking your web site/reports are if all the data is meaningless to you. Yes, it might mean something to your customer, but until it means something to you, you are going to have a hard time really helping the customer.

#2. Know your display tools. Depending on what tool your customers are going to use to see the data, it will change how you construct your cubes and dimensions. There are things you can get away with when your client is ProClarity that are a bad idea for Reporting Services, and a terrible idea for Excel.

Case in point is the naming of attribute dimensions. You might have a dimension named Project with an attribute hierarchy named "Name". Seems logical. Then you also have a Customer dimension with a "Name" attribute hierarchy. So in ProClarity or Reporting Services you will see the nice Project.Name and Customer.Name hierarchies. In Excel (via pivot tables), you will see "Name" and "Name". Not very helpful. So you have to name your attribute hierarchies "Project Name" and "Customer Name" to keep the Excel pivot table people off your back.
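If you have a lot of dimensions, a quick check for clashing names can save some grief. The sketch below is plain Python over a list of (dimension, attribute) pairs; it does not talk to Analysis Services, and the excel_safe_names function is made up for illustration. It prefixes the dimension name only where the bare attribute name would be ambiguous.

```python
from collections import Counter


def excel_safe_names(hierarchies):
    """Prefix an attribute name with its dimension when the bare name is ambiguous.

    `hierarchies` is a list of (dimension, attribute) pairs.
    """
    counts = Counter(attribute for _, attribute in hierarchies)
    return [
        f"{dimension} {attribute}" if counts[attribute] > 1 else attribute
        for dimension, attribute in hierarchies
    ]


names = excel_safe_names(
    [("Project", "Name"), ("Customer", "Name"), ("Customer", "Region")]
)
print(names)
```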

#3. Regardless of what tool your customer says they are going to use, always test the data in Excel -- especially Date dimension data. Why? Because some customer will always want to see the data in Excel. They don't care that all the data is already in some other tool, nice and formatted and pretty; if the data isn't in Excel it isn't useful to them. Just deal with it. You can't change their minds.

Also, be sure to test how your date dimension data looks when exported to Excel. Excel has this helpful habit of looking at your data, seeing a date, and then formatting it incorrectly. What starts out as April 02 (for April 2002) is suddenly transformed into April 2, 2006. And you will be blamed for this. Not Excel. No, not our precious Excel that can do no wrong (quiet next time, it might hear you). You are in the wrong and must fix this error. And the fix is a four-digit year, which will anger someone else -- but you will get used to that.
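For what it is worth, the four-digit fix is trivial wherever you generate the member names. A small Python sketch, with made-up function names, showing the label that causes trouble next to the one that does not:

```python
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def two_digit_label(year, month):
    """The troublesome form: 'April 02' can be read as a day of the month."""
    return f"{MONTHS[month - 1]} {year % 100:02d}"


def month_label(year, month):
    """The safer form: a four-digit year cannot be mistaken for a day."""
    return f"{MONTHS[month - 1]} {year:04d}"


print(two_digit_label(2002, 4))
print(month_label(2002, 4))
```

Whatever label you choose, open the exported file in Excel and look at it before the customer does.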

I hope this is helpful to someone; it was a painful process for me to get to this point myself.
