AJ's blog

February 10, 2008

Extensively using extension methods

Filed under: .NET, .NET Framework, C#, LINQ, Software Developers, Software Development — ajdotnet @ 3:00 pm

What was meant as a single post now seems to become another little series: I have started looking at the new C# language features (introduced with/for LINQ) in isolation, trying to assess whether they can live up to being features in their own right.

I have covered partial methods (strictly speaking not a LINQ related feature). Anastasios has written a blog post about anonymous types, so I’ll cover that aspect simply by referral.

Now for extension methods…

If you need a recap on extension methods, those links might suit you:

Now let’s look at a few characteristics and consequences; nothing special, most of it should be obvious anyway.

Basic usage

Example: Check if an array is empty…

public static class ArrayExtensions
{
    public static bool IsEmpty(this Array array)
    {
        return (array==null) || (array.Length==0);
    }
}

Now look at this code using the extension method:

int[] ia = new int[] { 1, 2, 3 };
if (ia.IsEmpty())
    Console.WriteLine("ia is empty");

The thing to note is that a property IsEmpty would be so much nicer. Alas, extension methods — at least in the current implementation — are limited to plain instance methods.

Let’s stress that:

public static class ArrayExtensions
{
    public static int[] CreateNumbers(this Array array, int start, int end)
    {
        return new int[] { start, end }; // if you find that peculiar, you’re right 😉
    }
}

This is a factory method, a method that creates an instance. It would be so nice to call it on the type, e.g.

int[] ib = (int[]).CreateNumbers(1, 5);

And yet, this would require static extension methods. Something that has been discussed in a forum thread, well, rebuffed would be a better description… 😦 . I also could find uses for “extension const values” but there’s no sense in starting that discussion.

Another piece of code calling into extension methods:

int[] ic = null;
if (ic.IsEmpty())
    Console.WriteLine("ic is empty");
if (ArrayExtensions.IsEmpty(ic))
    Console.WriteLine("ic is empty");

There are a few things to point out:

  • Look at the second line in the above code. This code runs nicely, no NullReferenceException! Despite the fact that extension methods are called via instances, the “this” argument may still be null and the extension method should make sure it can handle that (thus the array==null check above).
  • Look again at the second line. Partly a consequence of the previous point, there is no way to know whether IsEmpty() is a call to an ordinary instance method (which would certainly throw an exception) or to an extension method (which would not). Thus you also cannot decide whether this is an erroneous code fragment or not.
  • Now look at the second “if” statement. It still is possible to call extension methods like you would with ordinary static helper methods.

What I take from these findings is that instance methods and extension methods may look the same in the calling code, yet there is a semantic difference (i.e. different behavior at runtime) one should be aware of. And the fact that this difference is not apparent in the calling code somewhat blurs the code if you look at it from a reviewer’s point of view.
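To make the difference tangible, here is a minimal sketch that puts both kinds of call side by side on a null reference. IsEmpty is the method from above; NullReceiverDemo and InstanceCallThrows are hypothetical names introduced only for this illustration.

```csharp
using System;

public static class ArrayExtensions
{
    // Same method as above: tolerates a null receiver.
    public static bool IsEmpty(this Array array)
    {
        return (array == null) || (array.Length == 0);
    }
}

public static class NullReceiverDemo
{
    // Hypothetical helper: reports whether touching a real instance
    // member (Length) on the given reference throws.
    public static bool InstanceCallThrows(Array array)
    {
        try
        {
            int unused = array.Length; // real instance member: dereferences the reference
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        int[] ic = null;
        Console.WriteLine(ic.IsEmpty());           // extension method: null is just an argument
        Console.WriteLine(InstanceCallThrows(ic)); // instance member: the exception is caught
    }
}
```

Both calls look alike at the call site, but only the instance member access dereferences the null reference.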

Ambiguity

Let’s count some numbers:

public static class ArrayExtensions
{
    public static int GetSafeCount(this Array array)
    {
        // always check null value!
        return (array == null) ? 0 : array.Length;
    }
}

public static class CollectionExtensions
{
    public static int GetSafeCount<T>(this ICollection<T> col)
    {
        // always check null value!
        return (col == null) ? 0 : col.Count;
    }
}

Suppose you import the first class, then you write the following code:

if (ia.GetSafeCount() == 0)
    Console.WriteLine("ia is empty");

After you import the second class, the compiler will greet you with the following message:

D:\…\BasicTests.cs(68,17): error CS0121: The call is ambiguous between the following methods or properties: 'Extensions.ArrayExtensions.GetSafeCount(System.Array)' and 'Extensions.CollectionExtensions.GetSafeCount<int>(System.Collections.Generic.ICollection<int>)'

The bad news is: just by importing another namespace you add an ambiguity. This is nothing new; if you happened to code against Control with using System.Web.UI; and you added using System.Windows.Forms; you would run into the same problem. There are two notable differences, though. One, that ambiguity is on class level, not on method level. Two, you could circumvent it with a using WebControl = System.Web.UI.Control; directive. This is not possible with extension methods.
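What remains possible is the plain static call syntax shown earlier. The following is a minimal sketch of that workaround, with both extension classes in scope. AmbiguityDemo is a hypothetical name, and for brevity both classes live in the global namespace here rather than in the Extensions namespace.

```csharp
using System;
using System.Collections.Generic;

public static class ArrayExtensions
{
    public static int GetSafeCount(this Array array)
    {
        return (array == null) ? 0 : array.Length;
    }
}

public static class CollectionExtensions
{
    public static int GetSafeCount<T>(this ICollection<T> col)
    {
        return (col == null) ? 0 : col.Count;
    }
}

public static class AmbiguityDemo
{
    public static void Main()
    {
        int[] ia = new int[] { 1, 2, 3 };

        // ia.GetSafeCount() would be ambiguous here (CS0121), because both
        // extension classes are in scope. Naming the class picks one explicitly.
        int viaArray = ArrayExtensions.GetSafeCount(ia);
        int viaCollection = CollectionExtensions.GetSafeCount<int>(ia);

        Console.WriteLine(viaArray);
        Console.WriteLine(viaCollection);
    }
}
```

The price is obvious: the call no longer looks like an instance method call, which was the whole point of the feature.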

The good part is that the compiler will whine until you have made your intentions clear. Or will it? Look at this:

public class Test1
{
    public void Foo()
    {
        this.Test();
    }
}

public class Test2 : Test1
{
}

public static class TestExtensions
{
    public static void Test(this Test1 test)
    {
        Console.WriteLine("Test 1 called!");
    }
}

One class derived from another, an extension method for the base class. This is valid code and of course it calls the extension method:

Test2 t2 = null;
t2.Test();

Now import another extension method:

public static void Test(this Test2 test)
{
    Console.WriteLine("Test 2 called!");
}

Guess what? For the call to Test() in the above example IntelliSense will show both methods, without any way to distinguish them. And the compiler will not complain; it silently binds the call to the new extension method, because the parameter type Test2 is a better match for the static type of t2. Thus, by simply adding a using directive and recompiling, you have changed the semantics of your code radically, calling a different method and perhaps not even noticing it.
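Here is a small sketch of how the more specific overload wins. It deviates from the example above in two ways that are my own simplifications: both overloads sit in one class, and they return their message instead of printing it. BindingDemo is a hypothetical name. As far as I can tell, the choice depends on the declared type of the variable, not on the object it refers to.

```csharp
using System;

public class Test1 { }
public class Test2 : Test1 { }

public static class TestExtensions
{
    public static string Test(this Test1 test) { return "Test 1 called!"; }
    public static string Test(this Test2 test) { return "Test 2 called!"; }
}

public static class BindingDemo
{
    public static void Main()
    {
        Test2 t2 = new Test2();
        Test1 asBase = t2; // same object, but declared as Test1

        Console.WriteLine(t2.Test());     // binds to Test(this Test2)
        Console.WriteLine(asBase.Test()); // binds to Test(this Test1)
    }
}
```

Unlike a virtual method, the extension method does not follow the object: the same instance yields a different call depending on the type of the variable it is accessed through.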

BTW: In case you wonder what the Foo() method in Test1 is for: I just wanted to stress the fact that a class can call extension methods aimed at itself. All it needs to do is invoke them via the this reference.

Importing extension methods

I already talked a little about importing extension methods. But there’s something that should be made very clear:

You don’t import some extension methods. You don’t import some classes holding extension methods. You import all extension methods in all classes in a particular namespace.

Lemma 1: It is not possible to import single extension methods.
Lemma 2: It is not possible to import a single class carrying extension methods. In other words, the following using directive does not import the extension methods in that particular class:

// introducing only one particular class’s extensions won’t work
using ImportExtensions = Extensions.ArrayExtensions;
// we have to introduce all extension methods in all classes in that namespace
using Extensions;

Lemma 3: It is not possible to import a namespace without importing the extension methods as well.

As a consequence it may be wise to keep extension classes in separate namespaces. More to the point, don’t put extension methods for common classes (such as Object, String, or Control) in commonly used namespaces. Considering that the class containing the extension methods matters less than usual, it may even be a good idea to put extension methods for certain purposes in distinct namespaces, i.e. use the last namespace part the way you would normally use a class.
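One way such a layout could look is sketched below. The namespace names Extensions.Arrays, Extensions.Collections and Demo are assumptions made for this illustration, as is the CountOf helper. Because only one of the two namespaces is imported, the GetSafeCount call that was ambiguous earlier compiles without complaint.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical layout: one namespace per extension purpose.
namespace Extensions.Arrays
{
    public static class ArrayExtensions
    {
        public static int GetSafeCount(this Array array)
        {
            return (array == null) ? 0 : array.Length;
        }
    }
}

namespace Extensions.Collections
{
    public static class CollectionExtensions
    {
        public static int GetSafeCount<T>(this ICollection<T> col)
        {
            return (col == null) ? 0 : col.Count;
        }
    }
}

namespace Demo
{
    using Extensions.Arrays; // only the array extensions are in scope here

    public static class Program
    {
        // Hypothetical helper so the call site is easy to exercise.
        public static int CountOf(int[] values)
        {
            return values.GetSafeCount(); // unambiguous: resolves to ArrayExtensions
        }

        public static void Main()
        {
            Console.WriteLine(CountOf(new int[] { 1, 2, 3 }));
        }
    }
}
```

The last namespace part effectively takes over the role the class name would normally play when choosing what to import.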

Reflection

The obvious part regarding reflection is that the class being extended does not know anything about the extension methods. Calling typeof(MyClass).GetMethods() will always return the same result, always excluding the extension methods. So it serves one well to think of extension methods as a pure compiler feature that never makes it to IL.

This is however only partly true and only part of the truth…

First the truth: The calling IL code is identical for extension method calls and simple static method calls:

L_006b: call bool Extensions.ArrayExtensions::IsEmpty(class [mscorlib]System.Array)

L_0085: call bool Extensions.ArrayExtensions::IsEmptyNonExt(class [mscorlib]System.Array)

Now the untrue part: There actually is some difference, otherwise Reflector would not be able to decompile the following fragment:

if (ic.IsEmpty())
    Console.WriteLine("ic is empty");
if (ic.IsEmpty())
    Console.WriteLine("ic is empty");
if (ArrayExtensions.IsEmptyNonExt(ic))
    Console.WriteLine("ic is empty");

So there has to be some information about extension methods in the IL code; however not at the calling side, for otherwise Reflector would have presented the second call correctly as the plain static call that was in the original code.

The answer lies in the missing part: The extension method itself is marked as such, using an attribute. This is the IL (as Reflector shows it):

.method public hidebysig static bool IsEmpty(class [mscorlib]System.Array array) cil managed 
{ 
    .custom instance void [System.Core]System.Runtime.CompilerServices.ExtensionAttribute::.ctor() 
    .custom instance void [mscorlib]System.Diagnostics.DebuggerStepThroughAttribute::.ctor() 
    .maxstack 2

I put the DebuggerStepThroughAttribute in for a reason. The same method presented by Reflector as C# code looks like this:

[DebuggerStepThrough] 
public static bool IsEmpty(this Array array)

As you can see, Reflector silently replaces the ExtensionAttribute with the this keyword.

There is another interesting thing about the attribute. If you look it up in the documentation it states the usual “This API supports the .NET Framework infrastructure and is not intended…”, but it also tells us that it targets not only methods, but classes and assemblies likewise. Consequently my test class contains the attribute (again, only visible in IL view in Reflector):

.class public abstract auto ansi sealed beforefieldinit ArrayExtensions 
    extends [mscorlib]System.Object 
{ 
    .custom instance void [System.Core]System.Runtime.CompilerServices.ExtensionAttribute::.ctor()

As well as the assembly (and hey! this time in C# view):

// Assembly ConsoleApplication1, Version 1.0.0.0    

[assembly: Extension]

Best guess is that this is for faster lookup of extension methods (or rather ruling their presence out).
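Since the marker is an ordinary attribute, it can also be queried via reflection. The following is a minimal sketch of that idea; ExtensionFinder and FindExtensionMethodNames are hypothetical names, and IsEmptyNonExt is assumed to be a plain static twin of IsEmpty, as its use above suggests.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class ArrayExtensions
{
    public static bool IsEmpty(this Array array)
    {
        return (array == null) || (array.Length == 0);
    }

    // Plain static helper without the 'this' modifier, for comparison.
    public static bool IsEmptyNonExt(Array array)
    {
        return (array == null) || (array.Length == 0);
    }
}

public static class ExtensionFinder
{
    // Hypothetical helper: names of the public static methods declared on
    // 'type' that carry the ExtensionAttribute.
    public static List<string> FindExtensionMethodNames(Type type)
    {
        var names = new List<string>();
        foreach (MethodInfo method in type.GetMethods(BindingFlags.Public | BindingFlags.Static))
        {
            if (method.IsDefined(typeof(ExtensionAttribute), false))
                names.Add(method.Name);
        }
        return names;
    }

    public static void Main()
    {
        foreach (string name in FindExtensionMethodNames(typeof(ArrayExtensions)))
            Console.WriteLine(name);

        // The containing class carries the attribute as well.
        Console.WriteLine(typeof(ArrayExtensions).IsDefined(typeof(ExtensionAttribute), false));
    }
}
```

Note that the lookup starts at the class holding the extension methods, not at the extended type; the extended type still knows nothing about them.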

LINQ

I guess I cannot avoid talking about the relationship between LINQ and extension methods. Have a look at this code:

int[] ia = new int[] { 1, 2, 3 };
var result = from num in ia where num < 2 select num;

In a sense LINQ is all about translating query keywords into “lower level” language features. Reflector presents the relevant part of this code as follows:

CS$<>9__CachedAnonymousMethodDelegate1 = delegate (int num) { return num < 2; };
IEnumerable<int> result =
ia.Where<int>(CS$<>9__CachedAnonymousMethodDelegate1);

As you can see, the where keyword is translated to a call to the extension method Where (declared in System.Linq.Enumerable).
(Note: If you look at that method in Reflector it appears to be empty, just like all extension methods in that class. I’ll leave that riddle to someone else…)
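To illustrate the translation without Reflector, here is a small sketch that writes the same filter three ways: as a query expression, as an extension method call, and as a plain static call to Enumerable.Where. LinqDemo and its method names are hypothetical, and a lambda stands in for the cached anonymous delegate shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqDemo
{
    // Query syntax, as in the example above.
    public static List<int> ViaQuerySyntax(int[] source)
    {
        return (from num in source where num < 2 select num).ToList();
    }

    // The same filter written directly as an extension method call.
    public static List<int> ViaExtensionMethod(int[] source)
    {
        return source.Where(num => num < 2).ToList();
    }

    // And once more as the plain static call it ends up as in IL.
    public static List<int> ViaStaticCall(int[] source)
    {
        return Enumerable.Where(source, num => num < 2).ToList();
    }

    public static void Main()
    {
        int[] ia = new int[] { 1, 2, 3 };
        Console.WriteLine(ViaQuerySyntax(ia).SequenceEqual(ViaExtensionMethod(ia)));
        Console.WriteLine(ViaExtensionMethod(ia).SequenceEqual(ViaStaticCall(ia)));
    }
}
```

All three forms should produce the same sequence; the query keywords are merely the most readable way of writing the call.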

Verdict

Extension methods are no doubt an interesting new language feature. They have their pitfalls, but which feature does not? My main concern is that the IDE doesn’t tell me which extension method I’m calling. Tooltips as well as IntelliSense tell me a method is an extension method, but not in which namespace/class it resides.

On the other hand it’s quite obvious that extension methods are not there as a feature in their own right but as a means to support LINQ. They actually do support LINQ very well, but they do not go beyond that. Had it been otherwise, there would be means to address at least some of the areas in which extension methods currently fall short:

  • Extension properties
  • Static extension methods
  • Better ways for controlling the import of extension methods (e.g. import classes rather than namespaces)

OK, that’s it. Seems I’m not capable of writing short posts…

That’s all for now folks,
AJ.NET


6 Comments »

  1. One question, besides all this technobabble:

    Don’t extension methods violate OO-principles? I mean, a member looks like a member, but isn’t? It can’t see the internals of the type. It behaves different. I would long for better aggregation or delegation mechanisms.

    And what is the beef of calling an extension using this? Doesn’t that mean both must reside in the same compilation unit? Why use an extension in that case?

    I understand that LINQ needs extension methods. For anything else somebody has to make a very good point to me before I use them (I tried it in one project. It had the word disaster written all over it). If you try to factor your code better with them, forget it.

    About the same verdict as for type inference using var: extremely useful for LINQ (which would be unusable without var). Can’t see any usecase for non-LINQ-use.

    Comment by Dan — February 26, 2008 @ 5:19 pm

  2. @Dan: This post was all about technobabble (the high-level view is still under construction)… 😉

    1. Regarding OO-principles: One could also argue that extension methods actually support the information hiding principle (one of the tenets of OO) as they allow keeping the data type restricted to the necessary core. Anyway, my opinion is that they’ve simply got nothing to do with OO. They are an orthogonal means like properties, events, delegates.

    2. extension methods have to be called via an instance, and ‘this’ is just one special case. However, that’s got nothing to do with the compilation unit, rather extension methods can reside in any referenced assembly.

    3. I guess extension methods are just like any other new language feature. We had to learn how to use virtual methods, templates, interfaces, now we have to learn how to use extension methods.

    4. ‘var’ is about type inference and an integral feature of functional languages. Have a look at http://msdn.microsoft.com/msdnmag/issues/08/LA/FSharpIntro/default.aspx for some additional use cases.

    HIH,
    AJ.NET

    Comment by ajdotnet — February 26, 2008 @ 10:27 pm

  3. RE 2:
    How do you compile

    this.MyExtensionMethod

    if MyExtensionMethod is in a different compilation unit than the class itself?

    Comment by Dan — February 27, 2008 @ 11:16 am

  4. @Dan: You can use extension methods on ‘this’ just like any other extension method called via any other instance:
    – set a reference to the assembly that contains the extension method
    – add the necessary using directive
    – use it

    There is really no difference between ‘this’ and a variable here and you could always substitute the ‘this’ reference:
    MyClass x= this;
    x.MyExtensionMethod

    HIH,
    AJ.NET

    Comment by ajdotnet — February 27, 2008 @ 9:09 pm

  5. Hi,

    Do you know if static extension method will be included in C# 4?

    Comment by Vitaly — March 6, 2008 @ 9:30 pm

  6. @Vitaly: I have no idea. I suggest keeping an eye on the C# team member blogs (see http://msdn2.microsoft.com/en-us/vcsharp/default.aspx ).

    Comment by ajdotnet — March 7, 2008 @ 8:24 am

