AJ's blog

July 26, 2008

LINQ 2 SQL — Less is More (More or Less)

Filed under: .NET, .NET Framework, C#, LINQ, Software Architecture, Software Development — ajdotnet @ 2:21 pm

Some time ago I promised some additional thoughts about how LINQ 2 SQL affects (web) application architectures. Well, here we go…

There are two defining characteristics of LINQ 2 SQL that may affect the architecture of a web application:

  1. The DataContext class, being "the main entry point for the LINQ to SQL framework" (as the documentation states).
  2. The way code generation works.

While the first point affects what features of LINQ 2 SQL we are able to use in web applications, the second one affects the application architecture.

Note: if you need to catch up with LINQ 2 SQL, have a look at my earlier post.

DataContext Effects

In this post I’ll focus on the DataContext class generated by LINQ 2 SQL.

The DataContext class is not only an anchor to access the tables, views, and stored procedures. It also offers more interesting features like change tracking, maintaining object identities, optimistic concurrency, and deferred loading. Consequently this is what most people I talked to attributed to LINQ 2 SQL: the possibility of reading some data, manipulating it, and calling SubmitChanges, while LINQ 2 SQL handles all the gory details of mapping the data, efficiently talking to the database (including load on demand), dealing with concurrent changes, and so on.

This read/manipulate/submit approach is what the documentation describes as being "designed to last for one 'unit of work'." And in case the consequence escaped you: this implies a stateful programming model. The very DataContext instance that was used to read the data has to be available until every potential change or deferred access has been settled.

This generally does not agree with web applications, as read/manipulate/submit regularly spans several requests: the data is read, presented to the user, and then the request dies, and in a stateless scenario the DataContext dies with it. The next request comes in, say an update request, but there is no DataContext available that carries information about the data read before. In order to work with the intended programming model, one would have to create a new DataContext, reread the data, and then apply the changes and submit them. Optimistic locking is the first victim, because it depends on the information from the previous request. And deferred loading degenerates to "optional loading", because either you need the data (no benefit in deferring then) or you don't.
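
A minimal sketch of that stateless round trip, using the MyDataContext and Customer names that appear in the samples further down:

// hypothetical update request handler: the DataContext from the
// read request is long gone, so the data has to be reread first
using (MyDataContext dc = new MyDataContext())
{
    // an extra round trip to the database, just to restore state
    var customer = dc.Customer.Single(c => c.ID == id);

    // apply the changes posted by the client
    customer.Name = newName;

    // without the originally read values, LINQ 2 SQL cannot detect
    // changes made concurrently between the two requests
    dc.SubmitChanges();
}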

How about maintaining the DataContext instance across requests? No way either: since it is not serializable, you can't put it into the session. (And that's just the technical reason; whether one should do that even if it were possible is a different matter altogether.)

To make a long story short, let me rephrase and emphasize the first sentence:
For web applications the generated DataContext class is only an anchor to access the tables, views, and stored procedures.

BTW: Services are frequently bound to be stateless, too. Thus the same reasoning applies to them as well. Additionally they face another problem: they hand out the entities and get fresh ones from the outside world via serialization. Thereby the entities are detached from the DataContext, which is kind of unforgiving.

Less is More

Now we have concluded that the stateful approach doesn't work. Reverting to a stateless employment of LINQ 2 SQL, however, leaves out a lot and raises the question whether there is anything left at all. Well, there certainly is. We still have code generation for entity classes (based on tables and views) and for stored procedures and functions.

Many web applications are really quite simple in that they are mainly "browsers over a (single) data store". The complexity of this kind of application lies more in sophisticated presentation than in coding the business logic. In this scenario, and to really benefit from the productivity of LINQ 2 SQL, I have had good experience using LINQ 2 SQL with just two simple tweaks:

  • Limited number of result set types (i.e. the set of columns):
    Rather than having my SELECTs (e.g. in stored procedures) produce different projections (i.e. tailored to the particular stored procedure's task), I let them produce the same set of columns if they work on the same table. This way I get a Customer entity and respective instances rather than a CustomerWithAddress, CustomerWithRating, CustomerWithName, CustomerWithAllInfo, … . And if I get that customer, I can always be sure it's not some half-filled entity.
  • Explicit result set types:
    It’s quite common to work on JOINed data. LINQ 2 SQL handles this quite well and will figure out the entity type even for stored procedures as “auto-generated type”. In order to make this type explicit in the database I would define a view containing the JOIN and have the stored procedure work against this view. And of course I use the view to generate a respective entity with LINQ 2 SQL. 
  • I also refrain from having stored procedures return more than one result set or varying result set types depending on some condition. (I don’t consider this to be a tweak, because I would do that anyway.)

Of course the database may need some tuning, as some of these tweaks may hurt performance. But optimize when optimization is due… And while I like the stored procedure approach, those tweaks will also work with ad hoc statements in code.

These tweaks are not meant as rules but as guidelines. Feel free to break them at leisure ;-) ; their intention is merely to support the LINQ 2 SQL code generation. However… I experienced that the define-your-entities-and-the-operations-on-top approach encourages more OO-like thinking than the traditional define-your-operations-and-let-them-return-the-result-at-hand approach.

So, after dragging and dropping the tables, views, and stored procedures, LINQ 2 SQL will happily generate the respective classes and methods. Getting type safe data and DB access was never easier.

If you are in the “use CRUD statements to access the database” camp, the usual round trip for an update would be: read the data that is to be changed, change it, submit changes. Like this:

using (MyDataContext dc = new MyDataContext())
{
    var customer = (from c in dc.Customer where c.ID == id select c).First();
    customer.Name = "new name";
    dc.SubmitChanges();
}

Compared to ADO.NET this still eases db access a lot.

If you follow the “use stored procedures to access the database” approach it gets even simpler:

using (MyDataContext dc = new MyDataContext())
{
    dc.ChangeCustomerName(id, name);
}

This is kind of "anti-sophisticated" but quite efficient. And it allows handling optimistic concurrency in stored procedures if you maintain the version field on the client and provide it during the update call. I found this to be much more concise and better suited to the way web applications work.
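
As a sketch: assuming the ChangeCustomerName procedure from above is extended to update only where the ID and the provided version still match, and to return the number of affected rows (both assumptions for illustration), the call could look like this:

using (MyDataContext dc = new MyDataContext())
{
    // 'version' was read in an earlier request and round-tripped through the page
    int rowsAffected = dc.ChangeCustomerName(id, name, version);
    if (rowsAffected == 0)
        throw new InvalidOperationException("Concurrent change detected.");
}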

One word of warning (or rather a disclaimer if you like): This works for me. And it works for a certain kind of web applications. And it only works because I’m willing to trade some academic architectural demands for the KISS principle, productivity, and maintainability. So please don’t follow this advice blindly.

Oh, by the way: I used this approach in a project using an Oracle database as well. Writing a simple code generator that read the system tables and generated wrappers around stored procedures was no rocket science; the only caveat was dealing with REF CURSORs.

The next post will take a closer look at LINQ 2 SQL code generation and how it affects the architecture.

That’s all for now folks,
AJ.NET


April 27, 2008

I can’t help but notice…

… that functional programming might be the next big shift in general purpose development paradigms.

The other day I realized a trend that may be decisive for the evolution of our development environment in general: Functional Programming! It is not that I did not see the symptoms. It is just that it took me that long to realize the breadth of it. (I'm in good company, though.)

Yesterday…

… it started quite unobtrusively.

We got generics and with them (hardly mentioned as more than syntactic sugar) type inference.

In retrospect it even seems misleading that generics were discussed as something akin to C++ templates. The technical differences may seem superficial at first, yet some advanced techniques with templates, such as orthogonal libraries like the STL (only recently available as STL/CLR for managed code) and template meta-programming, have never been broadly considered for generics. Rather, type inference plays a major role with generics, effectively shifting the burden of type safety from developer to compiler.

We got anonymous methods, which was at that time hardly worth the effort and not at all a justified language change — if looked at in isolation.

We got iterators, which was again a feature no one had missed, much less asked for.

Several features, seemingly unrelated, some rather tiny unmotivated additions to the language.

Today (the obvious)…

… we have LINQ.

LINQ was introduced at PDC 05 by Anders Hejlsberg et al. as a kind of "compiled SQL statements", "type safe SQL", and "unified query language", thus the name of course.

The language features that make up LINQ rely on the features above. And with LINQ those pieces suddenly fall into place: type inference and anonymous types, with generics as the general foundation; anonymous methods for lambdas; iterators for deferred execution. A concise system.
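
A small made-up fragment to show these pieces interlocking (assuming using System.Linq is in place):

string[] names = { "Burke", "Connor", "Frank" };

// lambdas (anonymous methods grown up), type inference (var),
// and an anonymous type, all in one statement:
var pairs = names
    .Where(n => n.Length == 5)
    .Select(n => new { Name = n, Upper = n.ToUpper() });

// nothing has executed yet; the iterators behind Where and Select
// run only when the sequence is actually enumerated:
foreach (var p in pairs)
    Console.WriteLine(p.Upper);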

And this is how LINQ is seen by many people: Type safe queries. Something to replace ugly ADO.NET code over SQL strings with beautiful and compiler checked C# statements. Who can blame them, that’s how it got introduced.

What those people miss is the less obvious force behind LINQ: functional programming, as Anders himself explains in "The Evolution Of LINQ And Its Impact On The Design Of C#" — not the other way round.

It's not that the trend towards functional programming is kept secret. Matter of fact, the term is mentioned quite frequently in relation to LINQ. And there's other evidence, like the new .NET language F#, incidentally written by Don Syme, the man who built generics for .NET. And there are enough bloggers out there talking about the topic.
It's more that the average corporate developer, the kind that has more than enough to do getting his job done, is not directly exposed to this information. Therefore he picks up occasional hints at best, not connected, not nearly enough to identify a larger trend. Many may not even be aware of the degree to which they have already been exposed to functional programming bits and pieces.

Today (the not so obvious)…

… there is a lot of talk about functional programming going on in the blogosphere. Some examples:

Regarding lambdas, have a look at Wes' blog to learn about currying, continuations, and monads. Other bloggers have written about other language features (e.g. extension methods). Luca tries to bring F# features to C#.

Also there has been a lot of noise around one of the tenets of functional programming: immutable data types. Just as we got new collection classes with generics, we may get immutable ones sometime in the future.

If you like to read up on immutable data types, Roy has compiled a list of related blog posts; I found Eric's series to be the best, as he addresses not only the purpose but also the feasibility (consider the insolence: a list that does not change, how can that be feasible?). And he addresses null values (not exactly part of the functional universe) better than the others.
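
To make the term concrete, a minimal hand-rolled immutable type might look like this (my own sketch, not one of the proposed BCL classes):

public sealed class Money
{
    private readonly decimal amount;
    private readonly string currency;

    public Money(decimal amount, string currency)
    {
        this.amount = amount;
        this.currency = currency;
    }

    public decimal Amount { get { return amount; } }
    public string Currency { get { return currency; } }

    // "changing" the value returns a new instance instead of mutating this one
    public Money Add(decimal delta)
    {
        return new Money(amount + delta, currency);
    }
}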

Another interesting application is MoQ (if not in itself, then as an example of the functional approach), a LINQ based mock framework. It may serve as an indication that even stuff that is not directly related to functional programming may come in a functional shape.

Tomorrow…

… — and this is strictly a personal guess — the trend will continue. If anything, it will diverge into different areas. Functional programming in general and F# idioms in particular may contribute to our languages, the patterns we use, and the libraries. F# may not only be the first functional language to decisively enter the .NET environment, it may be the very first functional language with at least the potential to become a mainstream language.

The driving force behind this is something that Anders hinted at in his interview above, and that Pat Helland pointed out quite clearly at TechEd: The future is parallel. And functional programming is so much better suited to parallel processing than imperative programming. Matter of fact, Parallel LINQ (PLINQ) is already working in that direction.

Right now…

… that's enough gazing into the crystal ball. What's the consequence (provided you share my view)? Today, probably nothing. None of my daily work is affected, beyond using LINQ as a nice tool. Tomorrow, perhaps still nothing, as I may be totally wrong. All I can recommend is this: Try to see the bigger picture behind LINQ and keep watching for the trend. For if it's the other way round, you may be tomorrow's next VB guy, like the one in the next cubicle that never mastered inheritance and exceptions.

As an afterthought: I'm not aware that this functional programming trend has happened anywhere outside the Microsoft space. This is contrary to the dynamic languages trend. And I'm honestly not sure what the implication is.

That’s all for now folks,
AJ.NET


April 20, 2008

LINQ to SQL Reality Check

Filed under: .NET, .NET Framework, C#, LINQ, Software Architecture, Software Development — ajdotnet @ 2:37 pm

After the introduction and checking LINQ to Objects, this is the second part of the Reality Check story.

Employment of LINQ to SQL

OK, off with the gloves and on to hard core LINQ ;-)

But first some background: LINQ to SQL manifests itself as a .dbml file, supported in VS2008 by a respective designer. Via the Server Explorer one can add tables, stored procedures, and other DB objects to the designer. SQLMetal, the code generation tool behind the scenes of LINQ to SQL, is used to generate code for the data context class with methods for stored procedures. Tables are mapped to entity classes, columns to properties; the mapping is conceptually trivial, but the generated code is not. E.g. the ID column:

[Column(Storage="_CustomerID", DbType="UniqueIdentifier NOT NULL")]
public System.Guid CustomerID
{
    get
    {
        return this._CustomerID;
    }
    set
    {
        if ((this._CustomerID != value))
        {
            if (this._Customer.HasLoadedOrAssignedValue)
            {
                throw new System.Data.Linq.ForeignKeyReferenceAlreadyHasValueException();
            }
            this.OnCustomerIDChanging(value);
            this.SendPropertyChanging();
            this._CustomerID = value;
            this.SendPropertyChanged("CustomerID");
            this.OnCustomerIDChanged();
        }
    }
}

As you can see, the set method goes to some length to announce the change of the value. The generated entity properties support various "events" by means of partial methods, and the entity classes implement the respective interfaces (INotifyPropertyChanging and INotifyPropertyChanged).
In addition to properties for columns, the entity classes provide properties for relationships (supporting lazy loading) if they have been modeled accordingly.

All generated classes are realized as partial classes and can be adorned with custom code without problems. All classes are created in one big source file; one can however influence the class names, namespaces, and some other parameters that affect code generation.
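
As a sketch of how these partial classes and partial methods can be put to use (the class name Order is an assumption; the member names follow the generated snippet above):

// hand-written file, extending the generated entity
public partial class Order
{
    // implementation of the partial method declared by the generated code;
    // if no implementation is provided, the compiler removes the calls
    partial void OnCustomerIDChanging(System.Guid value)
    {
        if (value == System.Guid.Empty)
            throw new System.ArgumentException("CustomerID must not be empty.");
    }
}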

Now back to the prototype application…

The prototype application uses "poor man's O/R mapping", i.e. data readers to access the DB and manual mapping to POCOs. It doesn't even use reflection (dude, it's a prototype application), thus it is by no means representative.

Anyway, I'll just present old and new code as before; again, relevant parts in bold.

3. Example: Reading a customer (single object) from the database.

const string SELECT_CUSTOMER_BY_ACCOUNT = "select * from customer where MainAccount=@MainAccount";

Customer FindCustomer(int mainAccount)
{
    Customer c;
    using (SqlConnection con = OpenConnection())
    {
        using (SqlCommand command = new SqlCommand(SELECT_CUSTOMER_BY_ACCOUNT, con))
        {
            AddParameter(command, "MainAccount", mainAccount);

            using (SqlDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    c = FillCustomer(reader);
                    return c;
                }
            }
        }
    }
    return null;
}

private static Customer FillCustomer(SqlDataReader reader)
{
    Customer c = new Customer();
    string name;
    for (int i = 0; i < reader.FieldCount; ++i)
    {
        name = reader.GetName(i).ToLower();
        switch (name)
        {
            case "id": c.Id = reader.GetGuid(i); break;
            case "mainaccount": c.MainAccount = reader.GetInt32(i); break;
            case "name": c.Name = reader.GetString(i); break;
        }
    }
    return c;
}

Nothing but boilerplate code. And the rewrite using LINQ to SQL:

CustomerLinq FindCustomerLinq(int mainAccount)
{
    DataClassesDataContext dc = new DataClassesDataContext();
    var customerList = from cc in dc.CustomerLinqs where cc.MainAccount == mainAccount select cc;
    return customerList.FirstOrDefault();
}

About 35 LOC (still 20 LOC if the mapping is left out) vs. 6 LOC. No need to comment on that I guess…

To provide a complete picture (and because I didn’t want to change the whole application to new entity classes) I also tried the “separate entity assembly” approach. This code also maps the generated entities to layer independent entities:

Customer FindCustomerLinq2(int mainAccount)
{
    DataClassesDataContext dc = new DataClassesDataContext();
    var customerList = from cc in dc.CustomerLinqs where cc.MainAccount == mainAccount select cc;
    var cl = customerList.FirstOrDefault();
    if (cl != null)
        return new Customer() { Id = cl.ID, MainAccount = cl.MainAccount, Name = cl.Name, TotalBalance = cl.TotalBalance };
    return null;
}

As there was no appropriate constructor, object initializers kicked in nicely. Of course this amounts to yet another mapping method, but it is not too complicated and, contrary to the data reader mapping above, it is fully type safe (unless you employed reflection, of course).

4. Example: Read all accounts (i.e. a list) for a customer.

const string SELECT_ACCOUNTS_BY_CUSTOMER = "select * from vAccount where CustomerID=@CustomerID order by SubAccount";

IList<Account> IAccountTable.GetAccounts(Guid customerId)
{
    IList<Account> list = new List<Account>();
    Account a;
    using (SqlConnection con = OpenConnection())
    {
        using (SqlCommand command = new SqlCommand(SELECT_ACCOUNTS_BY_CUSTOMER, con))
        {
            AddParameter(command, "CustomerID", customerId);

            using (SqlDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    a = FillAccount(reader);
                    list.Add(a);
                }
            }
        }
    }
    return list;
}

private static Account FillAccount(SqlDataReader reader)
{
    Account a = new Account();
    string name;
    for (int i = 0; i < reader.FieldCount; ++i)
    {
        name = reader.GetName(i).ToLower();
        switch (name)
        {
            case "id": a.Id = reader.GetGuid(i); break;
            case "mainaccount": a.MainAccount = reader.GetInt32(i); break;
            case "subaccount": a.SubAccount = reader.GetInt32(i); break;
            case "balance": a.Balance = reader.GetDecimal(i); break;
        }
    }
    return a;
}

With LINQ:

IList<AccountLinq> GetAccountsLinq(Guid customerId)
{
    DataClassesDataContext dc = new DataClassesDataContext();
    var accountsList = from aa in dc.AccountLinqs where aa.CustomerID == customerId orderby aa.SubAccount select aa;
    return accountsList.ToList();
}

The same LOC relation as above. Mapping the generated entities may take a little more effort, but still far less than in the classical version. And since I called ToList() I didn't even have to change my interfaces. With separate entities one would have to traverse the list and translate each entry, as sketched below. But again, no big deal and less work than without LINQ.
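
A sketch of that traversal, along the lines of FindCustomerLinq2 above (the AccountLinq property names are assumptions):

IList<Account> GetAccountsLinq2(Guid customerId)
{
    DataClassesDataContext dc = new DataClassesDataContext();
    var accountsList = from aa in dc.AccountLinqs
                       where aa.CustomerID == customerId
                       orderby aa.SubAccount
                       select aa;
    // translate each generated entity into the layer independent one
    return accountsList
        .AsEnumerable()
        .Select(aa => new Account
        {
            Id = aa.ID,
            MainAccount = aa.MainAccount,
            SubAccount = aa.SubAccount,
            Balance = aa.Balance
        })
        .ToList();
}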

Conclusion:

The rate of code reduction with LINQ to SQL is even higher than with LINQ to Objects. This is something that’s very hard to ignore.

On the other hand, there is the architectural issue to be solved. In my opinion the biggest obstacle with LINQ to SQL is the fact that all code, entities and data context alike, is created in one big source file, so I cannot separate them. I cannot put the entities in an entity assembly, as mandated by the architecture, without also putting the data context class into that assembly, exposing the database to anyone unduly interested. The alternative would be a second set of entities: one for data access and one to flow across the layers. And respective mapping code. Doable, as I checked, but not a nice prospect.

BTW, this time I used nearly every aspect of LINQ: The code contained Query Syntax, Extension Methods, Lambdas, Object Initializers, and Type Inference. And with System.Data.Linq a library is used that heavily relies on Expression Trees. Anonymous Types are the only missing part.

Final judgement:

All in all, LINQ proved to be such a useful addition to the developer's toolbox that it cannot be ignored, even if it has its own problems. And while it is often seen as "better SQL", this view is shortsighted. My earlier posts, as well as all the fuss about functional programming in general, should be indication enough that this is just where it begins.

That’s all for now folks,
AJ.NET


April 13, 2008

LINQ to Objects Reality Check

Filed under: .NET, .NET Framework, C#, LINQ, Software Development — ajdotnet @ 2:38 pm

The following post is actually a translation of a post I wrote last year for our internal company blog. It has already gained some moss, so to speak (LINQ is released; isn't there some new technology to blog about? ;-) ), but it may still serve as motivation for those who have successfully avoided the LINQ hype so far. And I kind of need it as an introduction to some other posts I have in mind.

Here we go…

To me, LINQ seemed to be quite interesting, but until recently it was exactly that: an impression — and one that had to be proved. I didn't want another bunch of test code showing the exact problems that LINQ was built to solve, and LINQ (surprise!) solving those problems. I wanted to see how LINQ fared in a real world scenario, i.e. based on existing code not built with LINQ in mind.

I have a little prototype application at my disposal. Not exactly a production application, but near enough for my purpose. It's a simple 3-tier web application, as outlined in an earlier post.

My experiment took three steps:

  1. Migration to VS2008 and .NET 3.5
  2. Employment of LINQ to Objects
  3. Employment of LINQ to SQL

Here are my experiences — the LINQ to Objects part in this post, the LINQ to SQL part in the next one. Valuations are purely subjective ;-)

The migration ran smoothly enough; unit test assemblies had to be migrated manually, and code analysis spit out a lot of new warnings (which may be an issue in bigger projects and if you like "warnings as errors"). Otherwise the application ran without problems.

Employment of LINQ to Objects

LINQ to Objects is only a using statement away. All you have to do is add the following line (and the respective assembly reference, if not done already):

using System.Linq;

Afterwards I scanned the existing code for rewrite candidates. You don't need to understand the gory details of the following samples; the important thing is the relation between the non-LINQ and the LINQ versions. The relevant parts of the samples are in bold.

1. Example: A code fragment that reads all accounts of a customer and calculates the balance.

Customer c = …
IList<Account> al = AccountTable.GetAccounts(c.Id);

c.TotalBalance = 0;
foreach (Account a in al)
    c.TotalBalance += a.Balance;

The rewrite with LINQ:

Customer c = …
IList<Account> al = AccountTable.GetAccounts(c.Id);

c.TotalBalance = al.Sum(a => a.Balance);

Effect: Calculating the sum is down from three lines of code to one — and quite a bit more readable as well.

2. Example: Find a sub account in a sub account list.

IList<Account> list1 = AccountTable.GetAccounts(c.Id);

Account a = null;
foreach (Account a1 in list1)
{
    if (a1.SubAccount == subAccount)
    {
        a = a1;
        break;
    }
}

Another "classical" version could have been provided via an anonymous delegate, though this is only available on the class List<>, not the interface IList<>:

IList<Account> list1 = AccountTable.GetAccounts(c.Id);

List<Account> list2 = list1 as List<Account>;
a = list2.Find(
    delegate(Account test)
    {
        return (test.SubAccount == subAccount);
    });

And the LINQ version:

IList<Account> list1 = AccountTable.GetAccounts(c.Id);

Account a = list1.FirstOrDefault(test => test.SubAccount == subAccount);

The effect: LOC down from 9 (6 with the delegate) to 1. And more readable. And applicable to the interface (rather than the List<> class).

Conclusion:

In the cases shown above, LINQ to Objects provides more efficient code: less code and more readable code. The cases were all related to list operations, and this is by no means tied to database operations, i.e. any code may profit from this. On the other hand, I didn't find any use cases besides list operations in the existing code.

At a closer look, only a subset of the LINQ features was involved: extension methods, lambdas, and the System.Linq library. This is not an issue, just a fact worth stating. It's not that the query syntax and other features aren't available without a database; it's just that I didn't need them.

End of part 1, the short and easy one, actually. But very convincing so far. The next post will look into LINQ to SQL.

That’s all for now folks,
AJ.NET


March 21, 2008

Using Extension Methods for Fluent Interfaces

Filed under: .NET, .NET Framework, C#, LINQ, Software Developers, Software Development — ajdotnet @ 4:53 pm

The last post was a bit dry; this time let’s do some coding first and add the justification afterwards.

I now and then have to create XML within code. Just to have an example to work on, take this configuration file fragment:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
    <appSettings>
        <add key="email" value="nospam@mysite.com"/>
        <add key="email2" value="nospam@othersite.com"/>
    </appSettings>
</configuration>

To create this little XML on the fly, using plain .NET BCL means, the code would look like this:

private static XmlDocument CreateAppConfig_Manual()
{
    XmlDocument doc = new XmlDocument();
    doc.AppendChild(doc.CreateXmlDeclaration("1.0", Encoding.UTF8.BodyName, "yes"));

    XmlElement eConf = doc.CreateElement("configuration");
    doc.AppendChild(eConf);
    XmlElement eSettings = doc.CreateElement("appSettings");
    eConf.AppendChild(eSettings);

    XmlElement eAdd = doc.CreateElement("add");
    eSettings.AppendChild(eAdd);
    eAdd.SetAttribute("key", "email");
    eAdd.SetAttribute("value", "nospam@mysite.com");

    eAdd = doc.CreateElement("add");
    eSettings.AppendChild(eAdd);
    eAdd.SetAttribute("key", "email2");
    eAdd.SetAttribute("value", "nospam@othersite.com");

    return doc;
}

It’s quite obvious that this C# method creates the XML above, is it not? Well, it does, as long as I did not forget to append that element over there after creating it. And appended it to the correct parent node. Which had to be created and appended to… which node?

Improving the code with helper methods

This is exactly the situation in which I usually come up with a little helper class:

public static class XmlHelper
{
    public static XmlDocument CreateXmlDocument(Encoding enc)
    {
        XmlDocument doc = new XmlDocument();
        doc.AppendChild(doc.CreateXmlDeclaration("1.0", enc.BodyName, "yes"));
        return doc;
    }

    public static XmlElement CreateElement(XmlDocument parent, string name)
    {
        XmlElement e = parent.CreateElement(name);
        parent.AppendChild(e);
        return e;
    }

    public static XmlElement CreateElement(XmlElement parent, string name)
    {
        XmlElement e = parent.OwnerDocument.CreateElement(name);
        parent.AppendChild(e);
        return e;
    }
}

Tiny little methods, not doing very much, thus not worth very much? On the contrary. (In reality I would add some parameter checks, but that is not my point right now.) Have a look at the polished-up code:

private static XmlDocument CreateAppConfig_Helper()
{
    XmlDocument doc = XmlHelper.CreateXmlDocument(Encoding.UTF8);
    XmlElement eConf = XmlHelper.CreateElement(doc, "configuration");
    XmlElement eSettings = XmlHelper.CreateElement(eConf, "appSettings");

    XmlElement eAdd = XmlHelper.CreateElement(eSettings, "add");
    eAdd.SetAttribute("key", "email");
    eAdd.SetAttribute("value", "nospam@mysite.com");

    eAdd = XmlHelper.CreateElement(eSettings, "add");
    eAdd.SetAttribute("key", "email2");
    eAdd.SetAttribute("value", "nospam@othersite.com");

    return doc;
}

To me this looks way better. It's shorter, more readable, and the code states what I want to do, not how to do it.

Improving the code with extension methods

OK, plain helper methods were yesterday; today we have extension methods. But be careful, or you'll step into the same trap as I did!

I started naively by adding the "this" keyword to all first parameters and changing the calls accordingly. It compiled smoothly, ran smoothly… and produced an empty document. Why? Because CreateElement(this XmlDocument) collides with the existing method XmlDocument.CreateElement(), which creates the element but does not append it. And neither IntelliSense, nor the compiler, nor the runtime bothered to tell me.

Lesson learned: Be careful with extension method names!

Anyway, all this change would buy me is two slightly shorter calls to CreateElement(), which is not worth the effort. Actually, the thing that bothers me most about the above code is all those temporary variables. But the only reason I need them is that I have to apply several calls to each of them. Enter fluent interfaces!

Improving the code with fluent interfaces

The very idea of fluent interfaces is that a method that would otherwise be declared as void returns a reference that can instantly be used for the next call. So, what do you think about this little beauty:

private static XmlDocument CreateAppConfig_Fluent()
{
    return XmlHelper.CreateXmlDocument(Encoding.UTF8)
        .NewElement("configuration")
            .NewElement("appSettings")
                .NewElement("add")
                    .SetAttrib("key", "email")                 // renamed to avoid a name collision
                    .SetAttrib("value", "nospam@mysite.com")
                    .Parent()                                  // type safe version of ParentNode
                .NewElement("add")
                    .SetAttrib("key", "email2")
                    .SetAttrib("value", "nospam@othersite.com")
                    .OwnerDocument;
}

No more unnecessary temporaries, and way more readable, as the code indentation even reflects the XML structure.

Of course I had to provide respective extension methods to allow this method to be written:

  • I changed the CreateElement() extension methods to NewElement() to avoid the name collision. I had another name clash with the SetAttribute() method, which I needed to provide because… 
  • every method had to return a value (to pass the reference along).
  • In order not to “break the chain” I had to add methods to “navigate” in cases were the returned value was not the one to work on afterwards, e.g. I had to provide the Parent() method.
  • And finally I needed the correct return type, i.e. the most derived type, in order to apply the specific extension methods. This was why I could not use XmlNode.ParentNode for “navigation”. It has the wrong type of XmlNode rather than the required XmlElement.

Here’s the respective extension class:

public static class XmlHelper
{
    public static XmlElement NewElement(this XmlDocument parent, string name)
    {
        XmlElement e = parent.CreateElement(name);
        parent.AppendChild(e);
        return e;
    }

    public static XmlElement NewElement(this XmlElement parent, string name)
    {
        XmlElement e = parent.OwnerDocument.CreateElement(name);
        parent.AppendChild(e);
        return e;
    }

    public static XmlElement SetAttrib(this XmlElement element, string name, string value)
    {
        element.SetAttribute(name, value);
        return element;
    }

    public static XmlElement Parent(this XmlElement element)
    {
        return (XmlElement)element.ParentNode;
    }
}

See how every method but one extends XmlElement and at the same time returns an XmlElement? This could probably be extended to XmlAttribute and other related classes. As long as I can build a chain in which the returned type is subsequently used as the extended type, the fluent interface idea should work.

But I never liked fluent interfaces…

The term fluent interface was coined by Martin Fowler in his blog in December 2005; more information on the term may be found at Wikipedia. He did not exactly invent the idea, however; he merely took a C++ idiom a little further. I remember having returned *this in C++ in the 90s, and I certainly did not invent it then.

However, I also grew very wary of this idiom, and with C# I became an advocate against "call chains" (where every return value is instantly used to call the next method) in general, and fluent interfaces in particular. For just one simple reason: NullReferenceException. If one appeared, who could tell which particular call was the culprit, and why? With single calls, each in its own source line, the call stack and debug information tell you which call failed. And running in a debugger, you instantly have the temporary variables to tell you the current state of affairs.

Thus I concluded that fluent interfaces lack robustness. Then came LINQ.

This example is taken from an early introduction by none other than Don Box and Anders Hejlsberg:

IEnumerable<string> query = from s in names where s.Length == 5 orderby s select s.ToUpper();

translates to:

IEnumerable<string> query = names.Where(s => s.Length == 5).OrderBy(s => s).Select(s => s.ToUpper());

If that is not what I call a “call chain”, I don’t know what is. And it certainly is a very nice application of fluent interfaces. Thus it should be subject to the concerns I had. It is not. And not just because I wouldn’t dare to contradict Don or Anders (oh boy, I would have a field day if I could do that).

The reason I have no problem with LINQ is that it does not only employ a fluent interface programming style, it is also backed up by functional programming concepts. And functional programming demands that methods do not return null values, but null objects. (Ironically we had that back in the old C++ days, as C++ references are not supposed to be null in the first place, contrary to pointers.)

Apart from this theoretical justification, extension methods are static methods that can check the "this" reference and throw an ArgumentException that tells exactly which call was wrong. Something that is not possible with instance methods.
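
For illustration, the NewElement() extension from above could guard itself like this:

public static XmlElement NewElement(this XmlElement parent, string name)
{
    // a plain instance method would raise a NullReferenceException here,
    // with no hint which link of the chain failed; the extension method
    // can name the culprit explicitly
    if (parent == null)
        throw new ArgumentNullException("parent",
            "NewElement(\"" + name + "\") was called on a null element.");

    XmlElement e = parent.OwnerDocument.CreateElement(name);
    parent.AppendChild(e);
    return e;
}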

So, if you spice fluent interfaces up with some functional pepper, they become a very digestible meal.

Wrap up

Fluent interfaces provide a compelling concept that has been used in isolated spots, such as with StringBuilder. With LINQ it is also backed up with functional programming concepts and in turn becomes a more broadly usable concept. While LINQ itself is a great example of the employment of fluent interfaces, extension methods can be used to implement fluent interfaces in more general cases.

Does that mean I need extension methods to implement fluent interfaces? It does not! When I had already written much of this post I stumbled over "XmlDocument fluent interface". Mark's approach was a little different: rather than extending the existing XmlElement class, he returned his own tool class. This way he could avoid the naming collisions and restrict the callable methods to just the fluent interface and nothing else. And he does not use extension methods.

My example shouldn't be taken too literally, as it was only meant to highlight the concept, not to provide a robust implementation. It should at least have shown that fluent interfaces are a feasible thing to do with extension methods. In many cases, all it takes is a "return x;". And without extension methods, extending existing classes this way would not have been possible.

That’s all for now folks,
AJ.NET


March 16, 2008

Using extension methods for decomposition

Filed under: .NET, .NET Framework, C#, LINQ, Software Developers, Software Development — ajdotnet @ 12:18 pm

My last post dug into some technical details of extension methods. Now I’ll have a look at what extension methods can do for me; more to the point, whether extension methods may support new patterns or idioms. Occasional hints in other blogs and articles as well as my own findings revealed two general patterns:

  • Using extension methods for decomposition or separation
  • Using extension methods to support fluent interfaces

I’ll start with the first one because I think it may address a variety of issues and ideas. Fluent interfaces will be covered in another post.

Let’s go through some use cases.

Interface based coding

Did you ever sort a collection? What method did you use? Array.Sort()? List.Sort()? Why not IList.Sort()? Because there is no IList.Sort()!

One may argue that Microsoft simply forgot to put Sort() into the IList interface or to provide an ISortedList interface. But what about Project(), RemoveInvalidValues(), Normalize()? Matter of fact, it is simply not possible to foresee every demand, and not feasible to put everything into general purpose interfaces.

The price to pay is that implementing sorting in the list classes sacrifices interface based abstraction and binds the calling code unnecessarily to implementation classes. But what other options did we have anyway?

The sorting algorithm could easily be implemented working on IList, independent of the implementation class. Yet the method would have to be put somewhere, usually in some helper class (like ListHelper.Sort(IList)). Microsoft did not do that because it violates ease of use principles outlined in “Framework Design Guidelines” (indeed a very valid reason).

However, since .NET 3.5 we can easily solve that problem by declaring our static Sort method as an extension method, making it appear as if it were a method of the interface:

static void Sort(this IList list) {…} // in a static class ListHelper
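
A minimal sketch of such an extension class, using the generic IList<T> and a simple copy-sort-copy-back approach for brevity:

public static class ListHelper
{
    // sorts any IList<T> implementation in place,
    // without binding the caller to a concrete list class
    public static void Sort<T>(this IList<T> list) where T : IComparable<T>
    {
        List<T> copy = new List<T>(list);
        copy.Sort();
        for (int i = 0; i < copy.Count; ++i)
            list[i] = copy[i];
    }
}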

OK, one step back! Doesn't that violate OO principles? Shouldn't the list be able to sort itself? Just like a deck of cards? Well, last time I played cards I had to sort … I mean shuffle them. Sorting and shuffling are algorithms working on lists, not intrinsic features of said lists.

Thus we have a possible idiom:

  1. Provide interfaces that describe core characteristics of data structures (such as IEnumerable, IList, ISerializable, …).
  2. Put methods that implement algorithms working on core characteristics in extension classes for the respective interface.

This helps keep interfaces lean, and it improves the reuse of algorithms.

Segregate context specific service methods

Take an entity class, say Address. There are certain things one could do with an address, e.g. provide a formatted string for printing, check if it is a valid address, and so on. I said "do with", not "do to"; again I'm talking about algorithms working on the address rather than manipulating it. And they don't need access to the inner workings of the object.

Still, we could implement the methods within the address class; the motivation for the previous pattern was reuse of the algorithm, but how many address classes are we going to have?

So let's put the Validate() method (used by the business logic) into the entity, the FormatForPrinting() method (used by the UI), the GenerateUniqueId() method (used by the data access layer), … damn, that format method produced HTML but this client needs XAML… And is this IsReadOnly() method for the UI or the data access layer?
Got the idea? There is good reason not to put those methods into the entity class: they do not only depend on the entity, they also belong to their specific context, in this case a certain layer.

Putting those methods into a helper class could have been done before, but again, how many helper classes shall we have to remember? Extension methods can solve that particular problem.

So, I have another idiom:

  1. Keep context dependent methods out of classes which are used in different contexts.
  2. Put those context specific methods into context specific extension classes.

Doing this, the entity class stays clean and pure: no undue features, no pesky helper methods, only nice little properties.
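
A sketch of how this could look for the Address example (all names and properties made up):

// in the UI layer, e.g. namespace MyApp.UI
public static class AddressUIExtensions
{
    public static string FormatForPrinting(this Address address)
    {
        return address.Street + Environment.NewLine
             + address.ZipCode + " " + address.City;
    }
}

// in the business layer, e.g. namespace MyApp.BusinessLogic
public static class AddressValidationExtensions
{
    public static bool Validate(this Address address)
    {
        return !String.IsNullOrEmpty(address.Street)
            && !String.IsNullOrEmpty(address.City);
    }
}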

Separation of core implementation and service methods

Take one of my favorite classes, a factory class. It probably provides a CreateObject(…) method taking certain parameters that does the main job of creating objects. Of course it would take the maximum of available parameters, and of course it would have certain overloads to make things easier to use:

object CreateObject(Type type, params object[] args);
// Service methods:
T CreateObject<T>(params object[] args);
object CreateObject(Type type);
T CreateObject<T>();
object CreateObject(string typeName, params object[] args);
object CreateObject(string typeName);

I wouldn't have a problem adding further type safe overloads. What's the result? While there is just one relevant method, a bunch of additional methods complicates the class, broadens its surface, needs to be tested, may do things a little differently… and in the end the overload I just needed would still be missing.

Extension methods solve this problem in two respects: First, they allow moving the convenience methods out of the factory class without sacrificing usability from the caller's point of view. Secondly, I can add my special incarnation of a CreateObject method in my private extension class, and it will seamlessly appear beside (or even replace) all the other methods, as the sketch below shows.
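
A sketch of the first aspect (the factory class name ObjectFactory is made up; the core method matches the signature list above):

public static class ObjectFactoryExtensions
{
    // a convenience overload moved out of the factory class; to the caller
    // it looks just like one of the factory's own methods
    public static T CreateObject<T>(this ObjectFactory factory, params object[] args)
    {
        return (T)factory.CreateObject(typeof(T), args);
    }
}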

So, I have another idiom:

  1. Keep the implementation of classes to the core functionality.
  2. Put additional convenience methods into extension classes.

Mixin classes

The idea of using extension methods for mixins was introduced in "Create Mixins with Interfaces and Extension Methods". That article uses extension methods working on a special interface, the combination of the two forming the mixin. A class that needs some orthogonal functionality would simply implement that interface and benefit from the provided implementation.

The idiom described looks like:

If you have an algorithm or feature you want to apply in an egalitarian way to different classes…

  1. use an interface to provide uniform access to the classes…
  2. and provide the algorithm as extension method to that interface.

I may have seen those two points before…?!

As a side note: Wikipedia defines mixins as follows:

“[...] a mixin is a class that provides a certain functionality to be inherited by a subclass, but is not meant to stand alone. Inheriting from a mixin is not a form of specialization but is rather a means to collect functionality.” http://en.wikipedia.org/wiki/Mixin

Seen from a C++ point of view, the mixin approach relies on several tenets: Based on inheritance of functionality, the mixin class is a fully fledged class that provides actual code (as opposed to interfaces). This quite often includes code placed in constructors and the destructor, thus implicitly adding behavior. At the same time, the reuse-by-inheritance approach does not interfere with the regular inheritance hierarchy, as C++ supports multiple inheritance. And it does not have to interfere with the interface of the class either (i.e. it can be hidden in case that is undesired), as C++ allows protected and private inheritance.

Of course this is not possible with .NET, as multiple inheritance is restricted to interfaces and is always public. Thus I am inclined to call the mentioned approach a very poor realization of mixins. It falls short of covering any aspect I used mixins for with C++: I don't inherit functionality, say by providing default implementations. I don't add behavior, as no logic is executed until I explicitly call the extension method. And I cannot even dream of making this an implementation detail of the class, as it solely relies on changing the public interface of that class.

Putting things together…

These usage scenarios may have provided different motivations, but in essence they have followed the same ideas and led to similar idioms. In a more generalized form:

  1. Keep methods out of your class or interface …
    • if they implement algorithms working on the class rather than features of the class
    • if they depend not only on the class/interface, but also on the context
    • if they merely add convenience overloads
    • if they could not have been put there in the first place (as with interfaces)
  2. Provide extension classes …
    • that capture algorithms working on objects and especially implementations working on interfaces
    • that cluster context specific features
    • that add shortcuts, convenience methods, and helper methods.

Now, all this could have been done before with static helper classes. And having extension methods at our disposal doesn't change the work on the provider's side at all. The big difference, however, is on the user's side. While the classes he works with remain simpler and more concise at first glance, just by adding a using statement he can dynamically enhance those classes with exactly the bunch of features he currently needs. This is almost like dynamically extending class definitions in dynamic languages.

Putting it together, this has (or rather may have) several effects:

  • It supports the information hiding principle for abstract data types as only the absolutely necessary methods get access to the inner workings.
  • It elevates separation of concerns as classes and algorithms are separated.
  • It improves cohesion as context dependent features are moved to their respective context.
  • Reusability should improve if the extension methods can be implemented to work against interfaces and thus become more applicable.
  • Testing those algorithms should improve as those interfaces may also be implemented by mock objects.
  • Testing the classes will be easier due to the smaller and more concise surface to test.
  • And finally classes and interfaces that have a smaller surface are less complex and easier to keep consistent.

If you ask me, I didn't expect such a compelling list of benefits when I started looking into this topic. Whether they can be realized remains to be seen, but after having worked on this post, I am willing to give it a try.

That is, as long as nobody points out any flaws I overlooked. (OK guys, that’s a challenge!)

That’s all for now folks,
AJ.NET


February 10, 2008

Extensively using extension methods

Filed under: .NET, .NET Framework, C#, LINQ, Software Developers, Software Development — ajdotnet @ 3:00 pm

What was meant as a single post now seems to become another little series: I have started looking at the new C# language features (introduced with/for LINQ) in isolation, trying to assess whether they can live up to being features in their own right.

I have covered partial methods (strictly speaking not a LINQ related feature). Anastasios has written a blog post about anonymous types, so I’ll cover that aspect simply by referral.

Now for extension methods…

If you need a recap on extension methods, one of the many introductions available online might suit you.

Now let’s look at a few characteristics and consequences; nothing special, most of it should be obvious anyway.

Basic usage

Example: Check if an array is empty…

public static class ArrayExtensions
{
    public static bool IsEmpty(this Array array)
    {
        return (array==null) || (array.Length==0);
    }
}

Now look at this code using the extension method:

int[] ia = new int[] { 1, 2, 3 };
if (ia.IsEmpty())
    Console.WriteLine("ia is empty");

The thing to note is that a property IsEmpty would be so much nicer. Alas, extension methods — at least in the current implementation — are limited to plain instance methods.

Let’s stress that:

public static class ArrayExtensions
{
    public static int[] CreateNumbers(this Array array, int start, int end)
    {
        return new int[] { start, end }; // if you find that peculiar, you're right ;-)
    }
}

This is a factory method, a method that creates an instance. It would be so nice to call it on the type, e.g.

int[] ib = (int[]).CreateNumbers(1, 5);

And yet, this would require static extension methods. Something that has been discussed in a forum thread; well, rebuffed would be a better description… :-( I also could find uses for "extension const values", but there's no sense in starting that discussion.

Another piece of code calling into extension methods:

int[] ic = null;
if (ic.IsEmpty())
    Console.WriteLine("ic is empty");
if (ArrayExtensions.IsEmpty(ic))
    Console.WriteLine("ic is empty");

There are a few things to point out:

  • Look at the second line in the above code. This code runs nicely, no NullReferenceException! Despite the fact that extension methods are called via instances, the “this” argument may still be null and the extension method should make sure it can handle that (thus the array==null check above).
  • Look again at the second line. Partly a consequence of the previous point, there is no way to know whether IsEmpty() is a call to an ordinary instance method (which would certainly throw an exception) or to an extension method (which would not). Thus you also cannot decide whether this is an erroneous code fragment or not.
  • Now look at the second “if” statement. It still is possible to call extension methods like you would with ordinary static helper methods.

What I make of these findings is that instance methods and extension methods may look the same at the call site, yet there is a semantic difference (i.e. different behavior at runtime) one should be aware of. And the fact that this difference is not apparent in the calling code somewhat blurs the code if you look at it from a reviewer's point of view.

Ambiguity

Let’s count some numbers:

public static class ArrayExtensions
{
    public static int GetSafeCount(this Array array)
    {
        // always check the null value!
        return (array == null) ? 0 : array.Length;
    }
}

public static class CollectionExtensions
{
    public static int GetSafeCount<T>(this ICollection<T> col)
    {
        // always check the null value!
        return (col == null) ? 0 : col.Count;
    }
}

Suppose you import the first class, then you write the following code:

if (ia.GetSafeCount() == 0)
    Console.WriteLine("ia is empty");

After you import the second class, the compiler will greet you with the following message:

D:\…\BasicTests.cs(68,17): error CS0121: The call is ambiguous between the following methods or properties: ‘Extensions.ArrayExtensions.GetSafeCount(System.Array)’ and ‘Extensions.CollectionExtensions.GetSafeCount<int>(System.Collections.Generic.ICollection<int>)’

The bad news is: Just by importing another namespace you introduce an ambiguity. This is nothing new; if you happened to code against Control with using System.Web.UI; and you added using System.Windows.Forms;, you would run into the same problem. With two notable differences: One, this ambiguity is on class level, not on method level. Two, you could circumvent it with a using WebControl = System.Web.UI.Control; directive. This is not possible with extension methods.

The good part is that the compiler will whine until you make your intentions clear. Or does it? Look at this:

public class Test1
{
    public void Foo()
    {
        this.Test();
    }
}

public class Test2 : Test1
{
}

public static class TestExtensions
{
    public static void Test(this Test1 test)
    {
        Console.WriteLine("Test 1 called!");
    }
}

One class derived from another, an extension method for the base class. This is valid code and of course it calls the extension method:

Test2 t2 = null;
t2.Test();

Now import another extension method:

public static void Test(this Test2 test)
{
    Console.WriteLine("Test 2 called!");
}

Guess what? For the call to Test() in the above example IntelliSense will show both methods — without any way to distinguish them. And the compiler will not complain, yet from now on the new extension method will be called, because its type matches better. Thus, by simply adding a using directive you have changed the semantics of your code radically, calling different methods and perhaps not even noticing it.

BTW: In case you wonder what the Foo() method in Test1 is for: I just wanted to stress the fact that a class can call extension methods aimed at itself. All it needs to do is invoke them via the this reference.

Importing extension methods

I already talked a little about importing extension methods. But there’s something that should be made very clear:

You don't import some extension methods. You don't import some classes holding extension methods. You import all extension methods in all classes in a particular namespace.

Lemma 1: It is not possible to import single extension methods.
Lemma 2: It is not possible to import a single class carrying extension methods. In other words, the following using directive does not import the extension methods in that particular class:

// introducing only one particular class's extensions won't work
using ImportExtensions = Extensions.ArrayExtensions;
// we have to introduce all extension methods in all classes in that namespace
using Extensions;

Lemma 3: It is not possible to import a namespace without importing the extension methods as well.

As a consequence it may be wise to keep extension classes in separate namespaces. More to the point, don't put extension methods for common classes (such as Object, String, or Control) in commonly used namespaces. It may even be a good idea — considering that the class containing the extension methods matters less than usual — to put extension methods for certain purposes in distinct namespaces, i.e. use the last namespace part the way you would normally use a class name.
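
For illustration, such a layout might look like this (names made up):

namespace MyLib.Extensions.Xml       // opt-in via "using MyLib.Extensions.Xml;"
{
    public static class XmlElementExtensions { /* ... */ }
}

namespace MyLib.Extensions.Strings   // a separate opt-in for string helpers
{
    public static class StringExtensions { /* ... */ }
}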

Reflection

The obvious part regarding reflection is that the class being extended does not know anything about the extension methods. Calling typeof(MyClass).GetMethods() will always return the same result, always excluding the extension methods. So one might think of extension methods as a pure compiler feature that never makes it into the IL.

This is however only partly true and only part of the truth…

First the truth: The calling IL code is identical for extension method calls and simple static method calls:

L_006b: call bool Extensions.ArrayExtensions::IsEmpty(class [mscorlib]System.Array)

L_0085: call bool Extensions.ArrayExtensions::IsEmptyNonExt(class [mscorlib]System.Array)

Now the untrue part: There actually is some difference, otherwise Reflector would not be able to decompile the following fragment:

if (ic.IsEmpty())
    Console.WriteLine("ic is empty");
if (ic.IsEmpty())
    Console.WriteLine("ic is empty");
if (ArrayExtensions.IsEmptyNonExt(ic))
    Console.WriteLine("ic is empty");

So there has to be some information about extension methods in the IL code; however not at the calling side, for otherwise Reflector would have presented the second call correctly as the plain static call that was in the original code.

The answer lies in the missing part: The extension method itself is marked as such, using an attribute. This is the IL (as Reflector shows it):

.method public hidebysig static bool IsEmpty(class [mscorlib]System.Array array) cil managed 
{ 
    .custom instance void [System.Core]System.Runtime.CompilerServices.ExtensionAttribute::.ctor() 
    .custom instance void [mscorlib]System.Diagnostics.DebuggerStepThroughAttribute::.ctor() 
    .maxstack 2

I put the DebuggerStepThroughAttribute in for a reason. The same method presented by Reflector as C# code looks like this:

[DebuggerStepThrough] 
public static bool IsEmpty(this Array array)

As you can see, Reflector silently replaces the ExtensionAttribute with the this keyword.

There is another interesting thing about the attribute. If you look it up in the documentation it states the usual “This API supports the .NET Framework infrastructure and is not intended…”, but it also tells us that it targets not only methods, but classes and assemblies as well. Consequently my test class carries the attribute (again, only visible in IL view in Reflector):

.class public abstract auto ansi sealed beforefieldinit ArrayExtensions 
    extends [mscorlib]System.Object 
{ 
    .custom instance void [System.Core]System.Runtime.CompilerServices.ExtensionAttribute::.ctor()

As well as the assembly (and hey! this time in C# view):

// Assembly ConsoleApplication1, Version 1.0.0.0    

[assembly: Extension]

My best guess is that this is meant to speed up the lookup of extension methods: a consumer can rule out whole assemblies and classes up front instead of inspecting every single method.
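I cannot say what the compiler actually does, but a lookup along those lines is easy to sketch with reflection; the helper below is purely hypothetical and checks the assembly first, then the classes, and only then the methods:

using System;
using System.Reflection;
using System.Runtime.CompilerServices;

static class ExtensionFinder
{
    // lists all extension methods in an assembly, skipping cheaply
    // wherever the ExtensionAttribute marker is missing
    public static void Dump(Assembly assembly)
    {
        if (!assembly.IsDefined(typeof(ExtensionAttribute), false))
            return; // no extension methods anywhere in this assembly

        foreach (Type type in assembly.GetTypes())
        {
            if (!type.IsDefined(typeof(ExtensionAttribute), false))
                continue; // this class declares none

            foreach (MethodInfo method in type.GetMethods(BindingFlags.Public | BindingFlags.Static))
            {
                if (method.IsDefined(typeof(ExtensionAttribute), false))
                    Console.WriteLine("{0}.{1}", type.FullName, method.Name);
            }
        }
    }
}

Called as ExtensionFinder.Dump(typeof(ArrayExtensions).Assembly); it would list IsEmpty but not IsEmptyNonExt.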

LINQ

I guess I cannot avoid talking about the relationship between LINQ and extension methods. Have a look at this code:

int[] ia = new int[] { 1, 2, 3 };
var result = from num in ia where num < 2 select num;

In a sense LINQ is all about translating query keywords into “lower level” language features. Reflector presents the relevant part of this code as follows:

CS$<>9__CachedAnonymousMethodDelegate1 = delegate (int num) { return num < 2; };
IEnumerable<int> result = ia.Where<int>(CS$<>9__CachedAnonymousMethodDelegate1);

As you can see, the where keyword is translated to a call to the extension method Where (declared in System.Linq.Enumerable).
(Note: If you look at that method in Reflector it appears to be empty, just like all extension methods in that class. I’ll leave that riddle to someone else…)
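By the way, nothing stops us from writing the translated form ourselves. The following sketch shows both forms side by side; they compile to the same call chain (apart from the cached delegate), which is a nice way to double-check what a query actually means:

using System;
using System.Collections.Generic;
using System.Linq;

static class QueryTranslation
{
    static void Main()
    {
        int[] ia = new int[] { 1, 2, 3 };

        // query syntax...
        IEnumerable<int> result1 = from num in ia where num < 2 select num;

        // ...and the method syntax the compiler turns it into; the trailing
        // identity select is dropped by the translation rules
        IEnumerable<int> result2 = ia.Where(num => num < 2);

        foreach (int num in result2)
            Console.WriteLine(num); // prints 1
    }
}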

Verdict

Extension methods are no doubt an interesting new language feature. They have their pitfalls, but what feature doesn’t? My main concern is that the IDE doesn’t tell me which extension method I’m calling. Tooltips as well as IntelliSense tell me that a method is an extension method, but not in which namespace/class it resides.

On the other hand it’s quite obvious that extension methods are not there as a feature in their own right but as a means to support LINQ. They actually do support LINQ very well, but they do not go beyond that. Had it been otherwise, there would be means to address at least some of the areas in which extension methods currently fall short:

  • Extension properties
  • Static extension methods
  • Better ways for controlling the import of extension methods (e.g. import classes rather than namespaces)

OK, that’s it. Seems I’m not capable of writing short posts…

That’s all for now folks,
AJ.NET

September 4, 2007

partial methods for partial developers

Filed under: .NET, .NET Framework, C#, Design Time, LINQ, Software Developers — ajdotnet @ 8:19 pm

The evolution of C# has anything but stopped. One addition with .NET 2.0 was partial classes. Probably meant to ease the use of code generators (keeping generated code separate from hand-written code), they are not limited to being “consumed” by the common developer; he may also actively employ them, e.g. to separate production code from debug code. With .NET 3.5 there already was much talk about the language additions around LINQ, such as … well, just have a look at The LINQ Project or ScottGu’s Blog. And of course, these features are meant to be actively employed by the common developer.

Now we have a relatively new addition: partial methods. (In case you haven’t heard, see In Case You Haven’t Heard; for the VB.NET camp, see Partial Methods.) With this addition I’m not so sure it’s meant for me…

For the record: A partial method is a method declared in a partial class (hence the name). It may or may not be implemented in the same class (usually in another file). If it is implemented, the compiler will emit the usual code. If it is not implemented, the compiler will emit nothing, not even metadata about that method. And it will also wipe out any calls to the method.

Ergo: Partial methods are purely a compiler feature. You need the source code of the declaring class, and it has to be prepared with declarations before partial method implementations can be supplied.

The usual (probably the only relevant) use case for partial methods is lightweight event handling in combination with code generation. Say you parse a database or an XML file and generate data classes. Tens of classes, hundreds of properties, all fairly generic, boilerplate stuff. Also fairly common is the need to hook into this method for validation, or into that property setter to update another one. So instead of manipulating the generated code or having hundreds of events and thousands of method calls at runtime (of which in most cases only a tiny fraction will actually do something), partial methods kick in. They are easier to declare and use than events (pay for what you need) and if not used they simply vanish (pay as you go).
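To make that concrete, here is a minimal sketch of such a generated data class with a partial method hook; the names are mine, not those of any actual code generator:

// Person.generated.cs -- the generated half
public partial class Person
{
    private string name;

    public string Name
    {
        get { return name; }
        set
        {
            // if no implementation exists, this call is wiped out completely
            OnNameChanging(value);
            name = value;
        }
    }

    // declaration only: implicitly private, must return void
    partial void OnNameChanging(string value);
}

// Person.cs -- the optional hand-written half
public partial class Person
{
    partial void OnNameChanging(string value)
    {
        if (string.IsNullOrEmpty(value))
            throw new System.ArgumentException("Name must not be empty.");
    }
}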

Any other relevant use case? None that I am aware of. In other words we have a language feature that is very efficient for a very limited number of use cases.

For whom are partial methods made?

Implementors of code generators are the ones to employ partial methods. Today this means Microsoft and LINQ-related stuff; tomorrow Microsoft may decide to use partial methods in other event-laden environments, such as ASP.NET and WinForms. In these cases the common developer only consumes partial methods. Let me rephrase this: The common developer has to consume partial methods. Why? With partial methods the code generator will certainly not generate any other means of extension, such as events or virtual methods. The very purpose of partial methods is to do away with these heavyweight means, right?

The pro: If you follow the designer-driven, RAD-style, code-based development that ASP.NET and WinForms have offered for quite some time, the designer will eventually handle partial methods transparently. The only difference for the common developer is that he knows the stuff he is doing will be more efficient at runtime.

The con: If you like metadata-driven applications (and have already struggled with ASP.NET databinding because there is no way to attach metadata)… well, prepare for some further loss of information. If you need some event on any property setter (for tracing or tracking), or if you need any kind of dynamic behaviour (e.g. to attach some kind of generic rules engine)… let’s just hope the developer of the code generator anticipated that use case (or prepare for some work). You have a clean layer separation and the entities should know nothing about the UI? Well, putting your code right into the data class will spoil that. But hey, you might use partial methods to implement real events. Manually.
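Continuing the hypothetical Person sketch from above, “manually” would look like this (and since a partial method may only be implemented once, this hand-written half replaces the validating one):

// Person.cs -- hand-written half that turns the lightweight hook into a real event
public partial class Person
{
    public event System.EventHandler NameChanging;

    partial void OnNameChanging(string value)
    {
        System.EventHandler handler = NameChanging;
        if (handler != null)
            handler(this, System.EventArgs.Empty);
    }
}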

So you and I, the common developer that is, will not gain very much from this use case. But we will become more dependent on the ability, the mercy, and the foresight of the developers of the code generators and designers.

Will we be able to employ partial methods (rather than only consuming them)? Let’s see… when did I last write a class that had to support a vast number of events? (That actually happens occasionally when I lay some infrastructure for the next project or work on framework code.) A class that was at the same time not intended to act as a base class? (OK, forget about the infrastructure!) Surely I have written code by hand (because there was no code generator) that looks exactly like a candidate? But then I simply wrote the methods I needed, as I needed them; no need for a partial declaration.

So, unless I enter the camp of the not-so-common code-generator-writing developers (it happens, but only rarely), I can see no relevant use case that allows me to employ partial methods. (I really don’t count the example of increasing the readability of conditional compilation, as presented in the VB post above, as relevant for a new language feature.)

Again, for whom are partial methods made? In my opinion they are made for Microsoft. To help them write code generators that generate more efficient code. Code that conforms to the constant shift from object-oriented (inheritance and virtual methods), component-oriented (interfaces, events), and metadata-driven (attributes) development to a more and more exclusively used code generation approach. Highly efficient, but really not meant for me.

Do I have a better solution for the problem partial methods solve? I don’t. Therefore I can’t blame Microsoft for putting them in. Do I have concerns about how that will affect my work? I certainly do. Therefore I do hope the developers employing them do it with utmost caution. And with the awareness that not everyone uses their tools in a point-and-click kind of fashion.

There has already been some concern about partial methods in the .NET community, and for other reasons than the ones I mentioned: language complexity, naming issues, other features higher on the wish list, and so on. I recommend reading the comments on the post above if you want to keep up with that. Whether partial methods are a good idea or not, they are easily the single most controversial language feature in C# so far.

That’s all for now folks,
AJ.NET
