AJ's blog

March 21, 2008

Using Extension Methods for Fluent Interfaces

Filed under: .NET, .NET Framework, C#, LINQ, Software Developers, Software Development — ajdotnet @ 4:53 pm

The last post was a bit dry; this time let’s do some coding first and add the justification afterwards.

Every now and then I have to create XML within code. Just to have an example to work on, take this configuration file fragment:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
    <appSettings>
        <add key="email" value="nospam@mysite.com"/>
        <add key="email2" value="nospam@othersite.com"/>
    </appSettings>
</configuration>

To create this little XML on the fly, using plain .NET BCL means, the code would look like this:

private static XmlDocument CreateAppConfig_Manual()
{
    XmlDocument doc = new XmlDocument();
    doc.AppendChild(doc.CreateXmlDeclaration("1.0", Encoding.UTF8.BodyName, "yes"));

    XmlElement eConf = doc.CreateElement("configuration");
    doc.AppendChild(eConf);
    XmlElement eSettings = doc.CreateElement("appSettings");
    eConf.AppendChild(eSettings);

    XmlElement eAdd = doc.CreateElement("add");
    eSettings.AppendChild(eAdd);
    eAdd.SetAttribute("key", "email");
    eAdd.SetAttribute("value", "nospam@mysite.com");

    eAdd = doc.CreateElement("add");
    eSettings.AppendChild(eAdd);
    eAdd.SetAttribute("key", "email2");
    eAdd.SetAttribute("value", "nospam@othersite.com");
   
    return doc;
}

It’s quite obvious that this C# method creates the XML above, is it not? Well, it does, as long as I did not forget to append that element over there after creating it. And appended it to the correct parent node. Which had to be created and appended to… which node?

Improving the code with helper methods

This is exactly the situation in which I usually come up with a little helper class:

using System.Text;
using System.Xml;

public static class XmlHelper
{
    public static XmlDocument CreateXmlDocument(Encoding enc)
    {
        XmlDocument doc = new XmlDocument();
        doc.AppendChild(doc.CreateXmlDeclaration("1.0", enc.BodyName, "yes"));
        return doc;
    }
   
    public static XmlElement CreateElement(XmlDocument parent, string name)
    {
        XmlElement e = parent.CreateElement(name);
        parent.AppendChild(e);
        return e;
    }
   
    public static XmlElement CreateElement(XmlElement parent, string name)
    {
        XmlElement e = parent.OwnerDocument.CreateElement(name);
        parent.AppendChild(e);
        return e;
    }
}

Tiny little methods, not doing very much, thus not worth very much? On the contrary. (In reality I would add some parameter checks, but this is not my point right now.) Have a look at the polished up code:

private static XmlDocument CreateAppConfig_Helper()
{
    XmlDocument doc = XmlHelper.CreateXmlDocument(Encoding.UTF8);
    XmlElement eConf = XmlHelper.CreateElement(doc, "configuration");
    XmlElement eSettings = XmlHelper.CreateElement(eConf, "appSettings");

    XmlElement eAdd = XmlHelper.CreateElement(eSettings, "add");
    eAdd.SetAttribute("key", "email");
    eAdd.SetAttribute("value", "nospam@mysite.com");

    eAdd = XmlHelper.CreateElement(eSettings, "add");
    eAdd.SetAttribute("key", "email2");
    eAdd.SetAttribute("value", "nospam@othersite.com");
   
    return doc;
}

To me this looks way better. It’s shorter, more readable, and the code states what I want to do, not how to do it.

Improving the code with extension methods

OK, plain helper methods were yesterday; today we have extension methods. But be careful, or you’ll step into the same trap as I did!

I started doing it naively by adding the “this” keyword to all first parameters and changing the calls accordingly. It compiled smoothly, ran smoothly… and produced an empty document. Why? Because the extension method CreateElement(this XmlDocument, string) collides with the existing instance method XmlDocument.CreateElement(string). When both are applicable, the compiler always binds to the instance method, so my extension method was never called and the root element was created but never appended. And neither IntelliSense, nor the compiler, nor the runtime bothered to tell me.

Lesson learned: Be careful with extension method names!
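
To see the trap in isolation, here is a minimal sketch. The class name CollidingXmlExtensions is made up for this illustration; the sketch relies only on the C# rule that an applicable instance method wins over an extension method of the same name.

```csharp
using System;
using System.Xml;

// Hypothetical extension class, named for this sketch only.
public static class CollidingXmlExtensions
{
    // Same name and parameters as the instance method XmlDocument.CreateElement(string).
    public static XmlElement CreateElement(this XmlDocument parent, string name)
    {
        // Binds to the instance method, so this is not a recursive call.
        XmlElement e = parent.CreateElement(name);
        parent.AppendChild(e);
        return e;
    }
}

public static class Program
{
    public static void Main()
    {
        var doc = new XmlDocument();

        // Looks like a call to the extension method, but the instance method is chosen:
        // the element is created and then dropped, never appended.
        doc.CreateElement("configuration");
        Console.WriteLine(doc.DocumentElement == null);   // True

        // Calling the extension method explicitly as a static method does append.
        CollidingXmlExtensions.CreateElement(doc, "configuration");
        Console.WriteLine(doc.DocumentElement.Name);      // configuration
    }
}
```

The explicit static call is the only way to reach the colliding extension method, which defeats the purpose of making it an extension method in the first place.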

Anyway, all this would change is two calls to CreateElement(), which is not worth the effort. Actually, the thing that bothers me most about the above code is all those temporary variables. But the (only) reason I need them is that I have to make several calls on them. Enter fluent interfaces!

Improving the code with fluent interfaces

The very idea of fluent interfaces is that a method that would otherwise be declared as void returns a reference that can instantly be used for the next call. So, what do you think about this little beauty:

private static XmlDocument CreateAppConfig_Fluent()
{
    return XmlHelper.CreateXmlDocument(Encoding.UTF8)
        .NewElement("configuration")
            .NewElement("appSettings")
                .NewElement("add")
                    .SetAttrib("key", "email")           // renamed: SetAttribute() would be a name collision
                    .SetAttrib("value", "nospam@mysite.com")
                    .Parent()                            // type safe version of ParentNode
                .NewElement("add")
                    .SetAttrib("key", "email2")
                    .SetAttrib("value", "nospam@othersite.com")
                    .OwnerDocument;
}

No more unnecessary temporaries, and way more readable as the code indentation even reflects the XML structure.

Of course I had to provide respective extension methods to allow this method to be written:

  • I changed the CreateElement() extension methods to NewElement() to avoid the name collision. I had another name clash with the SetAttribute() method, which I needed to provide because… 
  • every method had to return a value (to pass the reference along).
  • In order not to “break the chain” I had to add methods to “navigate” in cases where the returned value was not the one to work on afterwards, e.g. I had to provide the Parent() method.
  • And finally I needed the correct return type, i.e. the most derived type, in order to apply the specific extension methods. This was why I could not use XmlNode.ParentNode for “navigation”. It has the wrong type of XmlNode rather than the required XmlElement.

Here’s the respective extension class (CreateXmlDocument() stays as it was and is omitted here):

public static class XmlHelper
{
    public static XmlElement NewElement(this XmlDocument parent, string name)
    {
        XmlElement e = parent.CreateElement(name);
        parent.AppendChild(e);
        return e;
    }
   
    public static XmlElement NewElement(this XmlElement parent, string name)
    {
        XmlElement e = parent.OwnerDocument.CreateElement(name);
        parent.AppendChild(e);
        return e;
    }
   
    public static XmlElement SetAttrib(this XmlElement element, string name, string value)
    {
        element.SetAttribute(name, value);
        return element;
    }
   
    public static XmlElement Parent(this XmlElement element)
    {
        return (XmlElement)element.ParentNode;
    }
}

See how every method but one both extends XmlElement and returns an XmlElement? This could probably be extended to XmlAttribute and other related classes. As long as each returned type is also the type extended by the next method in the chain, the fluent interface idea should work.

But I never liked fluent interfaces…

The term “fluent interface” was coined by Martin Fowler in his blog in December 2005; more information on the term may be found at Wikipedia. He did not, however, exactly invent the idea; he merely took a C++ idiom a little further. I remember having returned *this in C++ in the 90s, and I certainly did not invent it then.

However, I also grew very wary of this idiom, and with C# I became an advocate against “call chains” (where every return value is instantly used to call the next method) in general, and fluent interfaces in particular. For just one simple reason: NullReferenceException. If one appeared, who could tell which particular call was the culprit, and why? With single calls, each on its own source line, you know from the call stack and debug information which call failed. And running in a debugger you instantly have the temporary variables to tell you the current state of affairs.

Thus I concluded that fluent interfaces lack robustness. Then came LINQ.

This example is taken from an early introduction by none other than Don Box and Anders Hejlsberg:

IEnumerable<string> query = from s in names where s.Length == 5 orderby s select s.ToUpper();

translates to:

IEnumerable<string> query = names.Where(s => s.Length == 5).OrderBy(s => s).Select(s => s.ToUpper());
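
To check that the two forms really are the same thing, here is a small runnable sketch. The contents of the names array are my own sample data, assumed for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    public static void Main()
    {
        // Sample data, assumed for this sketch.
        string[] names = { "Burke", "Connor", "Frank", "Everett", "Albert", "George", "Harris", "Clyde" };

        // Query expression syntax.
        IEnumerable<string> query =
            from s in names
            where s.Length == 5
            orderby s
            select s.ToUpper();

        // The same query written as a chain of extension method calls.
        IEnumerable<string> chain = names
            .Where(s => s.Length == 5)
            .OrderBy(s => s)
            .Select(s => s.ToUpper());

        Console.WriteLine(string.Join(", ", query));     // BURKE, CLYDE, FRANK
        Console.WriteLine(query.SequenceEqual(chain));   // True
    }
}
```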

If that is not what I call a “call chain”, I don’t know what is. And it certainly is a very nice application of fluent interfaces. Thus it should be subject to the concerns I had. It is not. And not just because I wouldn’t dare to contradict Don or Anders (oh boy, I would have a field day if I could do that).

The reason I have no problem with LINQ is that it does not only employ fluent interface programming style, it is also backed up by functional programming concepts. And functional programming demands that methods do not return NULL values, but null objects. (Ironically we had that back in the old C++ days, as C++ references are not supposed to be null in the first place, contrary to pointers.)
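
A quick sketch of what that means in practice, again with assumed sample data: a filter that matches nothing hands an empty sequence to the next call, so the chain never trips over a null reference.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    public static void Main()
    {
        // Sample data, assumed for this sketch.
        string[] names = { "Burke", "Connor", "Frank" };

        // No name has 10 characters, so Where() yields an empty sequence, not null,
        // and OrderBy() and Select() simply have nothing to do.
        IEnumerable<string> query = names
            .Where(s => s.Length == 10)
            .OrderBy(s => s)
            .Select(s => s.ToUpper());

        Console.WriteLine(query != null);   // True
        Console.WriteLine(query.Count());   // 0
    }
}
```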

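Apart from this theoretical justification, extension methods are static methods that can check the “this” reference and throw an ArgumentException (typically an ArgumentNullException) that tells exactly which call was wrong. An instance method never gets that chance: calling it on a null reference fails with a NullReferenceException before the method body is entered.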
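
Here is a sketch of how such a check might look for the XML helpers. SetAttribChecked() and ParentOrNull() are hypothetical variants of my SetAttrib() and Parent() methods, invented for this illustration.

```csharp
using System;
using System.Xml;

// Hypothetical checked variants of the fluent helpers; names invented for this sketch.
public static class CheckedXmlExtensions
{
    public static XmlElement SetAttribChecked(this XmlElement element, string name, string value)
    {
        // An extension method is a static method, so it is entered even if "this" is null.
        if (element == null)
            throw new ArgumentNullException(nameof(element), "SetAttribChecked() was called on a null element.");
        element.SetAttribute(name, value);
        return element;
    }

    // Returns null (instead of failing with an InvalidCastException) if the parent is not an element.
    public static XmlElement ParentOrNull(this XmlElement element)
    {
        if (element == null)
            throw new ArgumentNullException(nameof(element), "ParentOrNull() was called on a null element.");
        return element.ParentNode as XmlElement;
    }
}

public static class Program
{
    public static void Main()
    {
        var doc = new XmlDocument();
        XmlElement root = doc.CreateElement("configuration");
        doc.AppendChild(root);

        try
        {
            // The parent of the root element is the document, so ParentOrNull() yields null
            // and the next call in the chain reports the problem by name.
            root.ParentOrNull().SetAttribChecked("key", "email");
        }
        catch (ArgumentNullException ex)
        {
            Console.WriteLine(ex.ParamName);   // element
        }
    }
}
```

The exception message names the method that received the null reference, which is exactly the information a bare NullReferenceException in a call chain does not give me.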

So, if you spice fluent interfaces up with some functional pepper, they become a very digestible meal.

Wrap up

Fluent interfaces provide a compelling concept that has been used in isolated spots, such as with StringBuilder. With LINQ it is also backed up with functional programming concepts and in turn becomes a more broadly usable concept. While LINQ itself is a great example of the employment of fluent interfaces, extension methods can be used to implement fluent interfaces in more general cases.

Does that mean I need extension methods to implement fluent interfaces? It does not! When I had already written much of this post I stumbled over “XmlDocument fluent interface”. Mark’s approach was a little different. Rather than extending the existing XmlElement class, he returned his own tool class. This way he could avoid the naming collisions and restrict the callable methods to just the fluent interface and nothing else. And he does not use extension methods.
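
To illustrate the difference, here is a rough sketch of what such a tool class could look like. This is not Mark’s code; the class XmlBuilder and all of its members are invented for this illustration, and the XML declaration is left out for brevity.

```csharp
using System;
using System.Xml;

// Hypothetical builder class; every name here is invented for this sketch.
public sealed class XmlBuilder
{
    private readonly XmlDocument _doc = new XmlDocument();
    private XmlNode _current;

    private XmlBuilder()
    {
        _current = _doc;
    }

    public static XmlBuilder Create()
    {
        return new XmlBuilder();
    }

    // Appends a child element to the current node and descends into it.
    public XmlBuilder Element(string name)
    {
        XmlElement e = _doc.CreateElement(name);
        _current.AppendChild(e);
        _current = e;
        return this;
    }

    // Sets an attribute on the current element.
    public XmlBuilder Attribute(string name, string value)
    {
        XmlElement element = _current as XmlElement;
        if (element == null)
            throw new InvalidOperationException("Attribute() requires a current element.");
        element.SetAttribute(name, value);
        return this;
    }

    // Moves back to the parent of the current node.
    public XmlBuilder Up()
    {
        if (_current.ParentNode == null)
            throw new InvalidOperationException("Up() was called at the document level.");
        _current = _current.ParentNode;
        return this;
    }

    public XmlDocument Document
    {
        get { return _doc; }
    }
}

public static class Program
{
    public static void Main()
    {
        XmlDocument doc = XmlBuilder.Create()
            .Element("configuration")
                .Element("appSettings")
                    .Element("add").Attribute("key", "email").Attribute("value", "nospam@mysite.com").Up()
                    .Element("add").Attribute("key", "email2").Attribute("value", "nospam@othersite.com")
            .Document;

        Console.WriteLine(doc.SelectNodes("/configuration/appSettings/add").Count);   // 2
    }
}
```

Since every method returns the builder itself, only the builder’s own methods can be called in the chain, and names such as Element() or Attribute() cannot collide with anything on XmlElement.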

My example shouldn’t be taken too literally, as it was only meant to highlight the concept, not to provide a robust implementation. It should at least have shown that fluent interfaces are a feasible thing to do with extension methods. In many cases, all it takes is “return x;”. For existing classes that I cannot change, such as XmlElement, this would not have been possible without extension methods.

That’s all for now folks,
AJ.NET

March 16, 2008

Using extension methods for decomposition

Filed under: .NET, .NET Framework, C#, LINQ, Software Developers, Software Development — ajdotnet @ 12:18 pm

My last post dug into some technical details of extension methods. Now I’ll have a look at what extension methods can do for me; more to the point, whether extension methods may support new patterns or idioms. Occasional hints in other blogs and articles as well as my own findings revealed two general patterns:

  • Using extension methods for decomposition or separation
  • Using extension methods to support fluent interfaces

I’ll start with the first one because I think it may address a variety of issues and ideas. Fluent interfaces will be covered in another post.

Let’s go through some use cases.

Interface based coding

Did you ever sort a collection? What method did you use? Array.Sort()? List.Sort()? Why not IList.Sort()? Because there is no IList.Sort()!

One may argue that Microsoft simply forgot to put Sort() into the IList interface or to provide an ISortedList interface. But what about Project(), RemoveInvalidValues(), Normalize()? As a matter of fact, it is simply not possible to foresee every demand, and not feasible to put it all into general purpose interfaces.

The price to pay is that implementing sorting in the list classes sacrifices interface based abstraction and binds the calling code unnecessarily to implementation classes. But what other options did we have anyway?

The sorting algorithm could easily be implemented working on IList, independent of the implementation class. Yet the method would have to be put somewhere, usually in some helper class (like ListHelper.Sort(IList)). Microsoft did not do that because it violates ease of use principles outlined in “Framework Design Guidelines” (indeed a very valid reason).

However, since .NET 3.5 we can easily solve that problem by declaring our static Sort method as an extension method, making it appear as if it were a method of the interface:

public static void Sort(this IList list) {…}   // declared in the static class ListHelper

OK, one step back! Doesn’t that violate OO principles? Shouldn’t the list be able to sort itself? Just like a deck of cards? Well, last time I played cards I had to sort … I mean shuffle them. Sorting and shuffling are algorithms working on lists, not intrinsic features of said lists.

Thus we have a possible idiom:

  1. Provide interfaces that describe core characteristics of data structures (such as IEnumerable, IList, ISerializable, …).
  2. Put methods that implement algorithms working on core characteristics in extension classes for the respective interface.

This helps keep interfaces lean, and it improves reuse of algorithms.
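
A minimal sketch of this idiom, assuming the generic IList<T> rather than the non-generic IList mentioned above. The names ListExtensions and SortInPlace() are made up; choosing a name other than Sort() also sidesteps any clash with List<T>.Sort(). The algorithm is a plain insertion sort, picked for brevity, not for speed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical extension class; names invented for this sketch.
public static class ListExtensions
{
    // In-place insertion sort that only needs the indexer and Count of IList<T>.
    public static void SortInPlace<T>(this IList<T> list)
    {
        if (list == null)
            throw new ArgumentNullException(nameof(list));

        IComparer<T> comparer = Comparer<T>.Default;
        for (int i = 1; i < list.Count; i++)
        {
            T item = list[i];
            int j = i - 1;
            // Shift larger items one slot to the right.
            while (j >= 0 && comparer.Compare(list[j], item) > 0)
            {
                list[j + 1] = list[j];
                j--;
            }
            list[j + 1] = item;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // The calling code only knows the interface, not the implementation class.
        IList<int> fromList = new List<int> { 3, 1, 2 };
        IList<int> fromArray = new[] { 5, 4 };

        fromList.SortInPlace();
        fromArray.SortInPlace();

        Console.WriteLine(string.Join(", ", fromList));    // 1, 2, 3
        Console.WriteLine(string.Join(", ", fromArray));   // 4, 5
    }
}
```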

Segregate context specific service methods

Take an entity class, say Address. There are certain things one could do with an address, e.g. providing a formatted string for printing, checking whether it is a valid address, and so on. I said “do with”, not “do to”; again I’m talking about algorithms working on the address, rather than manipulating it. And they don’t need access to the inner workings of the object.

Still we could implement the methods within the address class; the motivation for the previous pattern was reuse of the algorithm, but how many addresses are we going to have?

So let’s put the Validate() method (used by the business logic) into the entity, the FormatForPrinting() method (used by the UI), the GenerateUniqueId() method (used by the data access layer), … damn, that format method produced HTML but this client needs XAML… And is this IsReadOnly() method for the UI or the data access layer?

Got the idea? There is a reason not to put those methods into the entity class: they do not only depend on the entity but also belong to their specific context, in this case a certain layer.

Putting those methods into a helper class could have been done before, but again, how many helper classes shall we have to remember? Extension methods can solve that particular problem.

So, I have another idiom:

  1. Keep context dependent methods out of classes which are used in different contexts.
  2. Put those context specific methods into context specific extension classes.

Doing this, the entity class stays clean and pure, no undue features, no pesky helper methods, only nice little properties.
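
As a sketch of how this could be laid out: all namespaces, class names and the concrete validation and formatting rules below are assumptions made up for this illustration.

```csharp
using System;

namespace Demo.Entities
{
    // The entity: nothing but properties.
    public class Address
    {
        public string Street { get; set; }
        public string PostalCode { get; set; }
        public string City { get; set; }
    }
}

namespace Demo.Business
{
    using Demo.Entities;

    // Business layer context: validation only.
    public static class AddressBusinessExtensions
    {
        public static bool IsValid(this Address address)
        {
            if (address == null) throw new ArgumentNullException(nameof(address));
            return !string.IsNullOrEmpty(address.Street)
                && !string.IsNullOrEmpty(address.PostalCode)
                && !string.IsNullOrEmpty(address.City);
        }
    }
}

namespace Demo.Ui
{
    using Demo.Entities;

    // UI layer context: formatting only.
    public static class AddressUiExtensions
    {
        public static string FormatForPrinting(this Address address)
        {
            if (address == null) throw new ArgumentNullException(nameof(address));
            return address.Street + ", " + address.PostalCode + " " + address.City;
        }
    }
}

namespace Demo.Client
{
    using Demo.Entities;
    using Demo.Ui;   // only the UI extensions are in scope here

    public static class Program
    {
        public static void Main()
        {
            var address = new Address { Street = "1 Main Street", PostalCode = "12345", City = "Springfield" };
            Console.WriteLine(address.FormatForPrinting());   // 1 Main Street, 12345 Springfield
            // address.IsValid() would not compile here: Demo.Business is not imported.
        }
    }
}
```

Which extension methods show up on Address is decided by the using directives of the calling code, so each layer sees only the methods that belong to its context.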

Separation of core implementation and service methods

Take one of my favorite classes, a factory class. It probably provides a CreateObject(…) method taking certain parameters that does the main job of creating objects. Of course it would take the maximum of available parameters, and of course it would have certain overloads to make things easier to use:

object CreateObject(Type type, params object[] args);
// Service methods:
T CreateObject<T>(params object[] args);
object CreateObject(Type type);
T CreateObject<T>();
object CreateObject(string typeName, params object[] args);
object CreateObject(string typeName);

I could easily add further type-safe overloads. What’s the result? While there is just one relevant method, a bunch of additional methods complicates the class, broadens its surface, needs to be tested, may do things a little differently… and in the end the overload I just needed would still be missing.

Extension methods solve this problem in two aspects: First they allow moving the convenience methods out of the factory class without sacrificing the usability from the caller’s point of view. Secondly, I can add my special incarnation of a CreateObject method in my private extension class and it would seamlessly appear beside (or even replace) all the other methods.

So, I have another idiom:

  1. Keep the implementation of classes to the core functionality.
  2. Put additional convenience methods into extension classes.
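
Sketched out for the factory example, this could look as follows. ObjectFactory and ObjectFactoryExtensions are hypothetical names, and delegating to Activator.CreateInstance() is just an assumption to keep the core method short.

```csharp
using System;
using System.Text;

// Hypothetical factory reduced to its single core method.
public class ObjectFactory
{
    public object CreateObject(Type type, params object[] args)
    {
        // Assumption for this sketch: plain reflection-based creation.
        return Activator.CreateInstance(type, args);
    }
}

// Convenience overloads live outside the factory class.
public static class ObjectFactoryExtensions
{
    public static T CreateObject<T>(this ObjectFactory factory, params object[] args)
    {
        if (factory == null) throw new ArgumentNullException(nameof(factory));
        return (T)factory.CreateObject(typeof(T), args);
    }

    public static object CreateObject(this ObjectFactory factory, string typeName, params object[] args)
    {
        if (factory == null) throw new ArgumentNullException(nameof(factory));
        Type type = Type.GetType(typeName, true);   // throws if the type cannot be found
        return factory.CreateObject(type, args);
    }
}

public static class Program
{
    public static void Main()
    {
        var factory = new ObjectFactory();

        // Type-safe overload, provided by the extension class.
        StringBuilder sb = factory.CreateObject<StringBuilder>("abc");
        Console.WriteLine(sb.Length);             // 3

        // Overload taking a type name, also provided by the extension class.
        object o = factory.CreateObject("System.Text.StringBuilder");
        Console.WriteLine(o.GetType().Name);      // StringBuilder
    }
}
```

From the caller’s point of view all three methods appear on the factory, yet only one of them has to be implemented and tested as part of the class itself.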

Mixin classes

The idea of using extension methods for mixins was introduced with “Create Mixins with Interfaces and Extension Methods”. This article uses extension methods working on a special interface, the combination of them forming the mixin. A class that needs some orthogonal functionality would simply implement that interface and benefit from the provided implementation.

The idiom described looks like:

If you have an algorithm or feature you want to apply in an egalitarian way to different classes…

  1. use an interface to provide uniform access to the classes…
  2. and provide the algorithm as extension method to that interface.

I may have seen those two points before…?!
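
For illustration, a tiny sketch of that idiom. The interface INamed, the classes and the method ToDisplayString() are invented here and are not taken from the mentioned article.

```csharp
using System;

// Hypothetical "mixin" interface: it only grants uniform access to the data.
public interface INamed
{
    string Name { get; }
}

// The shared functionality, provided once for every implementer.
public static class NamedExtensions
{
    public static string ToDisplayString(this INamed item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        return item.GetType().Name + ": " + item.Name;
    }
}

// Two otherwise unrelated classes opt in by implementing the interface.
public class Customer : INamed
{
    public string Name { get; set; }
}

public class Product : INamed
{
    public string Name { get; set; }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(new Customer { Name = "AJ" }.ToDisplayString());      // Customer: AJ
        Console.WriteLine(new Product { Name = "Widget" }.ToDisplayString());   // Product: Widget
    }
}
```

Note that the state (the Name property) still has to be implemented by each class; the extension class can only contribute behavior on top of it.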

As a side note: Wikipedia defines mixins as follows:

“[…] a mixin is a class that provides a certain functionality to be inherited by a subclass, but is not meant to stand alone. Inheriting from a mixin is not a form of specialization but is rather a means to collect functionality.” http://en.wikipedia.org/wiki/Mixin

Seen from a C++ point of view, the mixin approach relies on several tenets: Based on inheritance of functionality, the mixin class is a fully fledged class that provides actual code (as opposed to interfaces). This quite often included code placed in constructors and the destructor, thus implicitly adding behavior. At the same time, the reuse by inheritance approach does not interfere with the regular inheritance hierarchy, as C++ supports multiple inheritance. And it also does not have to interfere with the interface of the class (i.e. it can be hidden where exposing it is undesired), as C++ allows protected and private inheritance.

Of course this is not possible with .NET, as multiple inheritance is restricted to interfaces and always public. Thus I am inclined to call the mentioned approach a very poor realization of mixins. It falls short of covering any aspect I used mixins for with C++: I don’t inherit functionality, say by providing default implementations. I don’t add behavior, as no logic is changed until I explicitly call the extension method. And I cannot even dream of making this an implementation detail of the class, as it solely relies on changing the public interface of that class.

Putting things together…

These usage scenarios may have provided different motivations, but in essence they have followed the same ideas and led to similar idioms. In a more generalized form:

  1. Keep methods out of your class or interface …
    • if they implement algorithms working on the class rather than features of the class
    • if they depend not only on the class/interface, but also on the context
    • if they merely add convenience overloads
    • if they would not have been available in the first place
  2. Provide extension classes …
    • that capture algorithms working on objects and especially implementations working on interfaces
    • that cluster context specific features
    • that add shortcuts, convenience methods, and helper methods.

Now, all this could have been done before with static helper classes. And having extension methods at our disposal doesn’t change the work on the provider’s side at all. The big difference, however, is on the user’s side. While the classes he works with remain simpler and more concise at first glance, just by adding a using statement he can dynamically enhance those classes with exactly the bunch of features he needs. This is almost like dynamically extending class definitions in dynamic languages.

Putting it together, this has (or rather may have) several effects:

  • It supports the information hiding principle for abstract data types as only the absolutely necessary methods get access to the inner workings.
  • It elevates separation of concerns as classes and algorithms are separated.
  • It improves cohesion as context dependent features are moved to their respective context.
  • Reusability should improve if the extension methods can be implemented to work against interfaces and thus become more applicable.
  • Testing those algorithms should improve as those interfaces may also be implemented by mock objects.
  • Testing the classes will be easier due to the smaller and more concise surface to test.
  • And finally classes and interfaces that have a smaller surface are less complex and easier to keep consistent.

If you ask me, I didn’t expect such a compelling list of benefits when I started looking into this topic. Whether they can be realized remains to be seen, but after having worked on this post, I am willing to give it a try.

That is, as long as nobody points out any flaws I overlooked. (OK guys, that’s a challenge!)

That’s all for now folks,
AJ.NET
