The last post was a bit dry; this time let’s do some coding first and add the justification afterwards.
I now and then have to create XML within code. Just to have an example to work on, take this configuration file fragment:
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <appSettings>
    <add key="email" value="nospam@mysite.com"/>
    <add key="email2" value="nospam@othersite.com"/>
  </appSettings>
</configuration>
To create this little XML on the fly, using plain .NET BCL means, the code would look like this:
private static XmlDocument CreateAppConfig_Manual()
{
    XmlDocument doc = new XmlDocument();
    doc.AppendChild(doc.CreateXmlDeclaration("1.0", Encoding.UTF8.BodyName, "yes"));
    XmlElement eConf = doc.CreateElement("configuration");
    doc.AppendChild(eConf);
    XmlElement eSettings = doc.CreateElement("appSettings");
    eConf.AppendChild(eSettings);
    XmlElement eAdd = doc.CreateElement("add");
    eSettings.AppendChild(eAdd);
    eAdd.SetAttribute("key", "email");
    eAdd.SetAttribute("value", "nospam@mysite.com");
    eAdd = doc.CreateElement("add");
    eSettings.AppendChild(eAdd);
    eAdd.SetAttribute("key", "email2");
    eAdd.SetAttribute("value", "nospam@othersite.com");
    return doc;
}
It’s quite obvious that this C# method creates the XML above, is it not? Well, it does, as long as I did not forget to append that element over there after creating it. And appended it to the correct parent node. Which had to be created and appended to… which node?
Improving the code with helper methods
This is exactly the situation in which I usually come up with a little helper class:
public static class XmlHelper
{
    public static XmlDocument CreateXmlDocument(Encoding enc)
    {
        XmlDocument doc = new XmlDocument();
        doc.AppendChild(doc.CreateXmlDeclaration("1.0", enc.BodyName, "yes"));
        return doc;
    }

    public static XmlElement CreateElement(XmlDocument parent, string name)
    {
        XmlElement e = parent.CreateElement(name);
        parent.AppendChild(e);
        return e;
    }

    public static XmlElement CreateElement(XmlElement parent, string name)
    {
        XmlElement e = parent.OwnerDocument.CreateElement(name);
        parent.AppendChild(e);
        return e;
    }
}
Tiny little methods, not doing very much, thus not worth very much? On the contrary. (In reality I would add some parameter checks, but this is not my point right now.) Have a look at the polished-up code:
private static XmlDocument CreateAppConfig_Helper()
{
    XmlDocument doc = XmlHelper.CreateXmlDocument(Encoding.UTF8);
    XmlElement eConf = XmlHelper.CreateElement(doc, "configuration");
    XmlElement eSettings = XmlHelper.CreateElement(eConf, "appSettings");
    XmlElement eAdd = XmlHelper.CreateElement(eSettings, "add");
    eAdd.SetAttribute("key", "email");
    eAdd.SetAttribute("value", "nospam@mysite.com");
    eAdd = XmlHelper.CreateElement(eSettings, "add");
    eAdd.SetAttribute("key", "email2");
    eAdd.SetAttribute("value", "nospam@othersite.com");
    return doc;
}
To me this looks way better. It’s shorter, more readable, and the code states what I want to do, not how to do it.
Improving the code with extension methods
OK, plain helper methods were yesterday, today we have extension methods. But carefully, or you’ll step into the same trap as I did!
I started doing it naively by adding the “this” keyword to all first parameters and changing the calls accordingly. It compiled smoothly, ran smoothly… and produced an empty document. Why? Because the CreateElement(this XmlDocument) extension collides with the existing instance method XmlDocument.CreateElement(), and an instance method always wins over an extension method with the same signature. My extension was never called, so the root element was created but never appended. And neither IntelliSense, nor the compiler, nor the runtime bothered to tell me.
Lesson learned: Be careful with extension method names!
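The trap can be reproduced in a few lines. This is only a sketch: the class names NaiveXmlExtensions and PrecedenceDemo are made up for the illustration, and the extension is the naive one described above, not something anyone should keep.

```csharp
using System;
using System.Xml;

// Hypothetical class name; reproduces the naive extension described above.
public static class NaiveXmlExtensions
{
    // Same name and signature as the instance method XmlDocument.CreateElement(string).
    public static XmlElement CreateElement(this XmlDocument parent, string name)
    {
        XmlElement e = parent.CreateElement(name); // binds to the instance method, no recursion
        parent.AppendChild(e);
        return e;
    }
}

public static class PrecedenceDemo
{
    public static void Main()
    {
        XmlDocument doc = new XmlDocument();

        // Looks like a call to the extension, but the instance method is chosen:
        // the element is created and never appended.
        doc.CreateElement("configuration");
        Console.WriteLine(doc.DocumentElement == null); // True

        // Only the explicit static call reaches the extension method.
        NaiveXmlExtensions.CreateElement(doc, "configuration");
        Console.WriteLine(doc.DocumentElement.Name);    // configuration
    }
}
```

Renaming the extension (as done further down with NewElement) removes the ambiguity for the reader as well as for the compiler.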
Anyway, all this would change is two calls to CreateElement(), which is not worth the effort. Actually, the thing that bothers me most about the above code is all those temporary variables. But the (only) reason I need them is that I have to make several calls on them. Enter Fluent Interfaces!
Improving the code with fluent interfaces
The very idea of fluent interfaces is that a method that would otherwise be declared void instead returns a reference that can instantly be used for the next call. So, what do you think about this little beauty:
private static XmlDocument CreateAppConfig_Fluent()
{
    return XmlHelper.CreateXmlDocument(Encoding.UTF8)
        .NewElement("configuration")
            .NewElement("appSettings")
                .NewElement("add")
                    .SetAttrib("key", "email") // name collision
                    .SetAttrib("value", "nospam@mysite.com")
                .Parent() // type safe version of ParentNode
                .NewElement("add")
                    .SetAttrib("key", "email2")
                    .SetAttrib("value", "nospam@othersite.com")
        .OwnerDocument;
}
No more unnecessary temporaries, and way more readable as the code indentation even reflects the XML structure.
Of course I had to provide respective extension methods to allow this method to be written:
- I changed the CreateElement() extension methods to NewElement() to avoid the name collision. I had another name clash with the SetAttribute() method, which I needed to provide because…
- every method had to return a value (to pass the reference along).
- In order not to “break the chain” I had to add methods to “navigate” in cases where the returned value was not the one to work on afterwards, e.g. I had to provide the Parent() method.
- And finally I needed the correct return type, i.e. the most derived type, in order to apply the specific extension methods. This was why I could not use XmlNode.ParentNode for “navigation”. It has the wrong type of XmlNode rather than the required XmlElement.
Here’s the respective extension class:
public static class XmlHelper
{
    public static XmlElement NewElement(this XmlDocument parent, string name)
    {
        XmlElement e = parent.CreateElement(name);
        parent.AppendChild(e);
        return e;
    }

    public static XmlElement NewElement(this XmlElement parent, string name)
    {
        XmlElement e = parent.OwnerDocument.CreateElement(name);
        parent.AppendChild(e);
        return e;
    }

    public static XmlElement SetAttrib(this XmlElement element, string name, string value)
    {
        element.SetAttribute(name, value);
        return element;
    }

    public static XmlElement Parent(this XmlElement element)
    {
        return (XmlElement)element.ParentNode;
    }
}
See how every method but one both extends XmlElement and returns an XmlElement? This could probably be extended to XmlAttribute and other related classes. As long as each returned type is one that the next method in the chain extends, the fluent interface idea should work.
But I never liked fluent interfaces…
The term “fluent interface” was coined by Martin Fowler in his blog in December 2005; more information on the term may be found at Wikipedia. He did not, however, exactly invent the idea; he merely took a C++ idiom a little further. I remember having returned *this in C++ in the 90s, and I certainly did not invent it then.
However, I also grew very wary of this idiom, and with C# I became an advocate against “call chains” (where every return value is instantly used to call the next method) in general, and fluent interfaces in particular. For just one simple reason: NullReferenceException. If one appeared, who could tell which particular call was the culprit, and why? With single calls, each in its own source line, you do know from the call stack and debug information which call failed. And running in a debugger you instantly have the temporary variables to tell you the current state of affairs.
Thus I concluded that fluent interfaces lack robustness. Then came LINQ.
This example is taken from an early introduction by none other than Don Box and Anders Hejlsberg:
IEnumerable<string> query = from s in names where s.Length == 5 orderby s select s.ToUpper();
translates to:
IEnumerable<string> query = names.Where(s => s.Length == 5).OrderBy(s => s).Select(s => s.ToUpper());
If that is not what I call a “call chain”, I don’t know what is. And it certainly is a very nice application of fluent interfaces. Thus it should be subject to the concerns I had. It is not. And not just because I wouldn’t dare to contradict Don or Anders (oh boy, I would have a field day if I could do that).
The reason I have no problem with LINQ is that it does not only employ fluent interface programming style, it is also backed up by functional programming concepts. And functional programming demands that methods do not return NULL values, but null objects. (Ironically we had that back in the old C++ days, as C++ references are not supposed to be null in the first place, contrary to pointers.)
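To make this a bit more tangible, here is a small sketch. The names array is sample data picked for the illustration, and LinqChainDemo is a made-up class name. It checks that the query expression and the method chain yield the same sequence, and that a filter matching nothing produces an empty sequence rather than null, so the rest of the chain can run without a NullReferenceException.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqChainDemo
{
    public static void Main()
    {
        // Sample data, chosen only for this illustration.
        string[] names = { "Burke", "Connor", "Frank", "Everett",
                           "Albert", "George", "Harris", "David" };

        IEnumerable<string> query = from s in names
                                    where s.Length == 5
                                    orderby s
                                    select s.ToUpper();

        IEnumerable<string> chain = names.Where(s => s.Length == 5)
                                         .OrderBy(s => s)
                                         .Select(s => s.ToUpper());

        Console.WriteLine(query.SequenceEqual(chain)); // True

        // No name has 50 characters: Where() yields an empty sequence, not null,
        // so OrderBy() and Select() still have something to be called on.
        IEnumerable<string> none = names.Where(s => s.Length == 50)
                                        .OrderBy(s => s)
                                        .Select(s => s.ToUpper());

        Console.WriteLine(none.Count()); // 0
    }
}
```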
Apart from this theoretical justification, extension methods are static methods that can check the “this” reference and throw ArgumentExceptions that tell exactly what call was wrong. That is not possible with instance methods, where the runtime throws a NullReferenceException before the method body ever runs.
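A minimal sketch of such a check, assuming a hypothetical extension called NewChecked() (a guarded variant of NewElement(), invented for this example):

```csharp
using System;
using System.Xml;

// Hypothetical class and method names, used only for this illustration.
public static class CheckedXmlExtensions
{
    public static XmlElement NewChecked(this XmlElement parent, string name)
    {
        // An extension method is a static method, so "this" may legally be null
        // and can be validated like any other argument.
        if (parent == null)
            throw new ArgumentNullException("parent", "NewChecked() was called on a null element.");

        XmlElement e = parent.OwnerDocument.CreateElement(name);
        parent.AppendChild(e);
        return e;
    }
}

public static class NullCheckDemo
{
    public static void Main()
    {
        XmlElement missing = null;

        try { missing.NewChecked("add"); }
        catch (ArgumentNullException ex) { Console.WriteLine(ex.ParamName); } // parent

        // The instance method never gets the chance to explain anything.
        try { missing.SetAttribute("key", "email"); }
        catch (NullReferenceException) { Console.WriteLine("NullReferenceException"); }
    }
}
```

The exception from NewChecked() names the offending call and parameter; the one from SetAttribute() only says that something, somewhere in the chain, was null.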
So, if you spice fluent interfaces up with some functional pepper, they become a very digestible meal.
Wrap up
Fluent interfaces provide a compelling concept that has been used in isolated spots, such as with StringBuilder. With LINQ it is also backed up with functional programming concepts and in turn becomes a more broadly usable concept. While LINQ itself is a great example of the employment of fluent interfaces, extension methods can be used to implement fluent interfaces in more general cases.
Does that mean I need extension methods to implement fluent interfaces? It does not! When I had already written much of this post I stumbled upon “XmlDocument fluent interface”. Mark’s approach was a little different. Rather than extending the existing XmlElement class, he returned his own tool class. This way he could avoid the naming collisions and restrict the callable methods to just the fluent interface and nothing else. And he does not use extension methods.
My example shouldn’t be taken too literally, as it was only meant to highlight the concept, not to provide a robust implementation. It should at least have shown that fluent interfaces are a feasible thing to do with extension methods. In many cases, all it takes is “return x;”. Without extension methods, retrofitting existing classes such as XmlElement this way would not have been possible.
I find it particularly amusing that the C# Programming Guide says (section “Extension Methods”) “Therefore, the principle of encapsulation is not really being violated”. Not really! Just a tiny bit pregnant.
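To illustrate the general idea of a dedicated tool class, here is a rough sketch. It is not Mark’s code; XmlBuilder and its members Element(), Attribute(), End() and Document are names invented for this example.

```csharp
using System;
using System.Xml;

// Hypothetical wrapper class: every method returns the builder itself,
// so only the fluent methods are callable in the chain.
public sealed class XmlBuilder
{
    private readonly XmlDocument doc = new XmlDocument();
    private XmlNode current;

    public XmlBuilder() { current = doc; }

    // Creates a child of the current node and descends into it.
    public XmlBuilder Element(string name)
    {
        XmlElement e = doc.CreateElement(name);
        current.AppendChild(e);
        current = e;
        return this;
    }

    public XmlBuilder Attribute(string name, string value)
    {
        XmlElement e = current as XmlElement;
        if (e == null)
            throw new InvalidOperationException("Attribute() requires an open element.");
        e.SetAttribute(name, value);
        return this;
    }

    // Closes the current element and moves back up to its parent.
    public XmlBuilder End()
    {
        if (current == doc)
            throw new InvalidOperationException("End() was called without an open element.");
        current = current.ParentNode;
        return this;
    }

    public XmlDocument Document { get { return doc; } }
}

public static class BuilderDemo
{
    public static void Main()
    {
        XmlDocument doc = new XmlBuilder()
            .Element("configuration")
                .Element("appSettings")
                    .Element("add").Attribute("key", "email").Attribute("value", "nospam@mysite.com").End()
                    .Element("add").Attribute("key", "email2").Attribute("value", "nospam@othersite.com").End()
            .Document;

        Console.WriteLine(doc.SelectNodes("/configuration/appSettings/add").Count); // 2
    }
}
```

Because the builder tracks the current node itself, no cast like the one in Parent() is needed, and misuse can be reported with a meaningful exception.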