AJ's blog

September 30, 2007

Workflow Instance State Management

Filed under: .NET, .NET Framework, C#, Software Architecture, Software Development, WF — ajdotnet @ 4:34 pm

Note: This is part of a series:
1. Talking WF (introduction)
more to come (communication, hosting, error management).

Let’s skip the “how do I reference the necessary assemblies” stuff and assume the boilerplate work of setting up the WF engine and starting workflows has been done. (There’s plenty of documentation about this available elsewhere.) So let’s assume we have a page that starts workflow instances, and the workflow is in place and running. Now, when you start thinking about a workflow you’ll sooner or … well, even sooner, ask questions like “How does this piece of information get to my workflow instance?” and “How does that piece get out?”. In short: data exchange.

Let’s make that more tangible with an example: Let’s assume I’m the project manager and I just got a notice that the guy across the room wants to go on vacation right when I planned to go live. Damn his insolence! How do I cancel his request?
I start up Vacations@SDX, select the page that gives me a list of open vacation requests, and click the “cancel vacation” button for the one in question. The page refreshes and the state of the vacation request becomes “canceled for undue insolence”. (Oh boy, if any of that happened, my colleagues would kill me. But nastiness is so much fun 👿 )

There are two noteworthy things that happened here:

  1. I saw a list of open vacation requests. And each request should be represented by a running workflow instance. Did I see a list of workflow instances? Workflow instance state management.
  2. I clicked a button and somehow the workflow took over and did something. As you know, this happens asynchronously, nevertheless I saw the correct state after the processing instantly. To make that work, the web application has to have a means to know when the state has changed. Workflow communication.

Workflow instance state management

The workflow instance has state. It knows about data somehow coming in, knows what activities have been executed and so forth. Therefore it would be possible to query all workflows, load them, and ask about certain details, e.g. read some properties. Wait, the workflow instance might process something asynchronously… . How do you synchronize? What if the workflow just terminated because you loaded it and it somehow finished? Do you really want to load thousands of workflow instances for every query? (OK, that‘s probably beyond the example application, but still…).

Once you‘ve thought along that line you‘ll realize that you need to maintain the business state (the vacation request in our example) outside of the workflow instance, i.e. in a database table. But now you have to face the problem of keeping the data in this table and the related workflow instance in sync. Here you have two feasible options:

  1. Keep the workflow instance oblivious of that table, let the web application read and write into the table (kind of a cache).
  2. Let the workflow maintain its state in the table directly, let your web application read the table.

Some people might vote for option 1, and with good reason. It keeps the business logic in one place, the web application, and you don’t have to deal with technically concurrent database access in semantically single “transactions”. It also clearly limits the workflow instance to workflow stuff, which promotes separation of concerns.

All well and good. But I go for number 2. For two reasons: Number 1 would imply more communication and synchronization between workflow instance and web application (we’ll look into that). And the main reason: If my workflow is long running, it might change its state (say to “timeout because no one cared”) at a time when no user is online and the web application is fast asleep. With number 1 some other piece of code would have to do what the web application already does.

The pattern we came to appreciate goes like this: 

  • Mirror the workflow state in a table (containing the state and the workflow instance ID)
  • Let the workflow maintain that state
  • Let the web application query that state
  • Provide cleanup, e.g. remove the respective row when the workflow finishes.
  • Ensure consistency (e.g. put in validation code that verifies that the information within the table is consistent with the workflow state, and provide some diagnostic tool that checks whether the workflow instances still exist)

This way it should be far easier to provide overview lists. And to interact with workflow instances from your web application you would need a table with workflow IDs anyway.
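To make the pattern a bit more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the names VacationRequestState, IVacationStateStore and InMemoryVacationStateStore are made up, and a dictionary stands in for the real database table. The workflow would call Save and Remove, the web application would only call GetOpenRequests, and FindOrphans hints at the diagnostic check.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row of the state table: business state plus workflow instance ID.
public sealed class VacationRequestState
{
    public Guid WorkflowInstanceId { get; set; }
    public string Employee { get; set; }
    public string State { get; set; }
}

// Hypothetical abstraction over the table. The workflow writes, the web application reads.
public interface IVacationStateStore
{
    void Save(VacationRequestState row);            // called by the workflow
    void Remove(Guid workflowInstanceId);           // cleanup when the workflow finishes
    IList<VacationRequestState> GetOpenRequests();  // called by the web application
}

// In-memory stand-in for the database table, for illustration only.
public sealed class InMemoryVacationStateStore : IVacationStateStore
{
    private readonly object _sync = new object();
    private readonly Dictionary<Guid, VacationRequestState> _rows =
        new Dictionary<Guid, VacationRequestState>();

    public void Save(VacationRequestState row)
    {
        if (row == null) throw new ArgumentNullException("row");
        lock (_sync) { _rows[row.WorkflowInstanceId] = row; }
    }

    public void Remove(Guid workflowInstanceId)
    {
        lock (_sync) { _rows.Remove(workflowInstanceId); }
    }

    public IList<VacationRequestState> GetOpenRequests()
    {
        lock (_sync) { return _rows.Values.ToList(); }
    }

    // Consistency check: IDs of rows whose workflow instance no longer exists.
    public IList<Guid> FindOrphans(ICollection<Guid> liveWorkflowIds)
    {
        lock (_sync)
        {
            return _rows.Keys.Where(id => !liveWorkflowIds.Contains(id)).ToList();
        }
    }
}
```

In a real application the store would of course talk to the database, and the list of live workflow IDs would have to come from the persistence store of the WF runtime; both are left out here.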

The next post will look into the workflow communication topic, the actual data exchange with the workflow instance.

That’s all for now folks,
AJ.NET


September 22, 2007

Talking WF…

Filed under: .NET, .NET Framework, ASP.NET, C#, Software Architecture, Software Development, WF — ajdotnet @ 5:11 pm

WF (for the record: Windows Workflow Foundation) is great. If you need it, you really need it. If you have the engine and one workflow running, adding other workflows is no sweat at all. If your workflows are persisted, you won‘t have to care for shutdowns. If you carefully designed your workflow you can use the graphical view for discussions with the user. If you set it up correctly, even errors can be handled by workflows.

Well, WF is great. But there certainly are a lot of “if”s here…

I was at the PDC 05 where Microsoft introduced the WWF (later to be renamed to WF due to some concerned Pandas 😉 ) as THE big surprise. But I had to wait until earlier this year to get my “production hands” on it. Curiously that happened more or less simultaneously in more than one project. And curiously, despite the different nature of the task at hand they turned out to be very similar. Similar in terms of the problems, similar in terms of the pitfalls, similar in terms of the solutions. And also very similar in the demand for knowledge of things that are not WF specific but are far from common knowledge for the average developer… (and with that I mean the average knowledge areas, not developers of average knowledge… did that make sense? I have no intention to disqualify someone for not knowing all the stuff that‘s following!)

Curiously^3 while there is so much information available about WF—hosting, data exchange, and what else—it still did not answer our questions. I rather got the impression that my problems were largely ignored. Or if they were mentioned it stopped right there. “Here is something you will have to think about!” Period. “You should use XY.” Period. “That will be a problem.” Period.

So I had to rely on my BizTalk background to fill the gaps (and my knowledge about multithreading even goes back to OS/2, sic! Am I in danger of becoming a dinosaur?). In the following posts I will dig into one topic or another and some of the “best practices” (if they can be called that) we came up with, which might help the next developer.

But for now let’s just lay out the example application:

At SDX people work for various customers, usually on-site. If I would like to go on vacation I might have something to say about the matter myself. So does the customer. So does my account manager. And the back office. And… you get the idea. To manage that you need an informal workflow at least; once the head count of your company goes up you need it formalized. And this is what our example application (let’s call it Vacations@SDX) does:

  • Show me how many vacation days I have left.
  • Let me request a new vacation.
  • Let all people concerned with my vacation have their say.
  • If everyone involved acknowledged the vacation write the information to the payroll application.

Of course since most people are mostly out of office, the application has to be a web application, the workflow long running, and notifications done by email. And the payroll application could be offline. And most importantly: DON‘T MESS WITH MY VACATION! Losing my request somewhere along the way simply WON‘T DO!

Nothing special here at all. Some kind of document review, order processing, or whatever workflow would have similar demands. Actually this is a very simple real world example. And yet, if done correctly it is a far from simple undertaking. Certainly it is not as simple as drawing a workflow and hooking up the basics as described in the available documentation.

On the list of things to talk about is workflow instance state, workflow communication, hosting, and error handling. This will be fairly basic stuff from an “architectural” point of view, but I will assume that you know or read up about the WF basics. I‘m not going to repeat the dozen lines of code it takes to host WF in a console application.

If that sounds interesting for you, stay tuned.

That’s all for now folks,
AJ.NET

September 12, 2007

IDisposable reloaded

Filed under: .NET, .NET Framework, C#, Software Development — ajdotnet @ 6:01 pm

One of my last posts was about IDisposable. (Well regarded by the way, thanks for the feedback 🙂 ) I should have thought that covered the topic, but actually there are a few things I missed. And some of them certainly should be mentioned in “The Last Article about IDisposable”.

Reference: [FDG] Framework Design Guidelines, Cwalina, Abrams 

To IDispose or not to IDispose?

A friend recently asked me “I hear you, but when should I actually implement IDisposable?”. Curiously my own answer didn’t convince me then. Now I’ll try to do better…

From the disposable pattern point of view there are only two relevant kinds of resources:

  1. unmanaged resources (i.e. not managed by the CLR)
  2. managed, disposable resources (i.e. classes implementing IDisposable)

Regarding unmanaged resources: When are you actually dealing with unmanaged resources? Well, think COM interop or P/Invoke and Win32 handles. In these cases I would have a look at the SafeHandle class. This class does all and more of what any of us could think of. Once the resource is properly wrapped, I deal only with the wrapper, which is a managed disposable resource. Done.

That was easy, wasn’t it? Although unmanaged resources are the canonical justification for the IDisposable pattern (and finalizers for that matter), most developers rarely if ever deal with them directly. Database connection? DbConnection is a managed class. File handle? FileStream likewise.

Now for managed resources. When should you implement IDisposable? [FDG] gives two guidelines that in my opinion provide a nearly sufficient answer, thus I added a third one:

  • DO implement IDisposable on types containing instances of disposable types. In other words “aggregation” is the keyword.
  • CONSIDER implementing IDisposable if the type is likely to have subtypes that might contain disposable types.
  • My own addition to that: CONSIDER implementing IDisposable if you lay some kind of infrastructure … (see my last post for an example).

The first guideline is probably the most common case. And it has a pitfall. It says “instances of disposable types”, not “fields of disposable types”. Thus “Does any of the data members of my class have a type that implements IDisposable?” is actually the wrong question to ask! All the data member declaration tells you is the type of the data member, not the type of the instance it holds. Therefore the correct question would be “May any of the instances the data members of my class reference implement IDisposable?”. So take a hard look at your members of type object, of any interface type, or of base classes that would fall under the second guideline.
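A small sketch of that pitfall, under the assumption that the class owns whatever it is handed (PayloadHolder is a made-up name): the field is declared as object, so the declaration gives no hint, yet the instance may well be disposable.

```csharp
using System;

// Hypothetical class: the field is declared as 'object', which says nothing
// about whether the instance it references is disposable.
public sealed class PayloadHolder : IDisposable
{
    private object _payload;

    public PayloadHolder(object payload)
    {
        // Assumption: the holder takes ownership of the payload.
        _payload = payload;
    }

    public void Dispose()
    {
        // Ask the instance, not the declared type of the field.
        IDisposable disposable = _payload as IDisposable;
        if (disposable != null)
        {
            disposable.Dispose();
        }
        _payload = null;
    }
}
```

Handing it a MemoryStream will dispose the stream; handing it a string simply does nothing. Whether the holder really owns the payload is a design decision the sketch takes for granted.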

To finalize or not to finalize?

Most if not all articles about IDisposable present the pattern the way I did: Have the IDisposable.Dispose method call a virtual method Dispose(bool), have the finalizer call the same method. But [FDG] takes a different approach:

  1. The “Basic Dispose Pattern” only takes the IDisposable interface into account and effectively addresses managed resources. It does not include the finalizer!
  2. The “Finalizable Types” pattern acts as extension to the former, it adds the finalizer and deals with unmanaged resources.

The rationale for the basic dispose pattern? Well, why call the Dispose(bool) method from a finalizer if no unmanaged resource has to be freed? (And remember that you are not allowed to touch managed resources in the finalizer, as they might already be finalized.) Why pay for the finalizer and add the runtime effort (maintaining the object in the finalization queue) that comes with it in the first place?

At first I had some severe concerns (basically about the increased complexity and the implied risk of missing finalizers) and was inclined to contradict people like Herb Sutter 😳 . But actually it really simplifies the picture. The two kinds of resources mentioned above actually match the guidelines:

  • Unmanaged resources: SafeHandle implies (and goes beyond) “Finalizable Type“.
  • Managed resources: “Basic Dispose Pattern” and no finalizer!

The implication is interesting: No need for you to ever write a finalizer!

But Wait! What about these cases:

  • [FDG] also talks about “resources that need to be freed explicitly and that do not have finalizers”.
    In other words: unmanaged resources (already covered) or managed resources that are probably not implemented correctly (which rather needs a bugfix or workaround).
  • My initial concern: Having a class that implements IDisposable and at the same time omits the finalizer would be something unexpected and a surprise for most developers. So what if a derived class suddenly needed the finalizer and the developer wrongly assumed it was already there?
    The reason to need a finalizer would have to be an unmanaged resource. This calls for a wrapper class (e.g. SafeHandle) which in turn contradicts the use case.
  • What about classes that already implement a finalizer (and don’t have to)?
    Don’t worry. If it does the right thing it does nothing. So why bother?

Because it is rather unexpected, I’d like to emphasise the corollary: If you rely on SafeHandle or some similar wrapper class, you’ll never need to provide a finalizer! (This appears to contradict most IDisposable advice, including IDisposable Interface.)
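For illustration, this is roughly what the finalizer-free variant looks like for a class that only holds managed disposable resources. It is a sketch along the lines of the “Basic Dispose Pattern”, not a quote from [FDG]; ReportWriter and its members are made-up names.

```csharp
using System;
using System.IO;

// Basic Dispose Pattern: managed resources only, no finalizer.
public class ReportWriter : IDisposable
{
    private readonly MemoryStream _buffer = new MemoryStream(); // managed, disposable
    private bool _disposed;

    public void Write(byte value)
    {
        if (_disposed) throw new ObjectDisposedException(GetType().Name);
        _buffer.WriteByte(value);
    }

    public void Dispose()
    {
        Dispose(true);
        // Harmless here; it only matters if a derived class adds a finalizer.
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed) return;   // calling Dispose more than once is fine
        if (disposing)
        {
            _buffer.Dispose();   // free managed resources
        }
        _disposed = true;
    }
}
```

Note what is missing: there is no ~ReportWriter(), so the object never has to be tracked for finalization. If an unmanaged handle ever entered the picture, it would go into a SafeHandle-derived wrapper held as just another disposable field.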

Exceptions, exceptions, 5 cent each!

When .NET was still fresh there was a guideline not to throw exceptions in finalizers, because they would kill the finalizer thread and prevent other finalizers from being called. Early versions also didn’t guarantee that finalizers would be called during shutdown. Today the situation is as follows:

  • Objects will be finalized during shutdown by default (since .NET 1.0)
  • Exceptions will kill the process rather than the thread (since .NET 2.0). This really improves the situation, as the exception will not stay unnoticed (as before).
  • The guideline is that exceptions in finalizers should be limited to critical ones intended to shut down the application. The guideline for Dispose is similar. ([FDG])
  • C# finalizers of derived classes will ensure that the finalizer of the base class is called even in the presence of an exception (see Finalize and Dispose Explained).

The last point is interesting. If you open the code of a class that overrides the finalizer in Reflector and choose C# format it will look like this:

~DerivedDisposableObject()
{
    this.Dispose(false);
}

The very same code in VB format however has a slightly different outfit:

Protected Overrides Sub Finalize() 
    Try 
        Me.Dispose(False) 
    Finally 
        MyBase.Finalize 
    End Try 
End Sub

All very well, the compiler does the work for us (as long as you are a decent C# developer 😀 ). But there still is one gap to fill: What about overriding the Dispose Method? Don’t the same issues apply here? They do. And does the compiler handle that as well? No, it doesn’t. Are we lost then? No, we aren’t.

First of all, I’m telling you how to do it right 😉 . Secondly, I convinced Microsoft to check whether you did follow my advice. There is a rule in FxCop and Code Analysis in Visual Studio respectively (CA2215:DisposeMethodsShouldCallBaseClassDispose). Take the following code snippet:

protected override void Dispose(bool disposing)
{
    if (disposing)
    {
        // free managed resources
    }
    
    // call base
    base.Dispose(disposing);
}

If you enable Code Analysis (I really wish this would be the default) it will generate the following warning:

MSBUILD : warning : CA2215 : Microsoft.Usage : Change DerivedObject.Dispose(Boolean):Void so that it calls its base class Dispose method for all code paths. Place the call in a finally block to ensure that resources are cleaned up when an exception is raised.

As it states, the correct version would be:

protected override void Dispose(bool disposing)
{
    try
    {
        if (disposing)
        {
            // free managed resources
        }
    }
    finally
    {
        // call base
        base.Dispose(disposing);
    }
}

Actually this little gem can be found in the documentation (e.g. in Implementing a Dispose Method), although rarely and never really advertised.
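To see what the finally block buys you, here is a small sketch with made-up classes (BaseResource, FaultyResource). The derived class deliberately throws during cleanup, which is a simulated failure and not a recommendation; the base class cleanup still runs.

```csharp
using System;

public class BaseResource : IDisposable
{
    public bool BaseDisposed { get; private set; }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        BaseDisposed = true; // stands in for the base class cleanup
    }
}

public class FaultyResource : BaseResource
{
    protected override void Dispose(bool disposing)
    {
        try
        {
            if (disposing)
            {
                // Simulated failure while freeing a managed resource.
                throw new InvalidOperationException("cleanup failed");
            }
        }
        finally
        {
            // Runs on every code path, with or without an exception.
            base.Dispose(disposing);
        }
    }
}
```

Calling Dispose on a FaultyResource still lets the InvalidOperationException escape to the caller, but BaseDisposed is true afterwards. Without the try/finally the base class would never have been told to clean up.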

To null or not to null?

“Shouldn’t I set the data members to null?”. A question I got more than once in various incarnations.

So, just for the record: There is no need to set data members referencing disposable objects (or any object) to null. It does not help freeing resources and the garbage collector won’t work faster.
It might still be a good idea to set data members referencing disposable objects to null after disposing them. Just as a safety net. 

Other stuff…

Right about the same time I wrote my initial post, MS decided the topic was worth addressing. There was an article in MSDN Magazine in the CLR Inside Out column. I have to say, *ahem*, it’s not nearly as complete as my coverage. But it mentions security and I got some hints on SafeHandle and try/finally in Dispose from it (the latter employed but not described, sic!).

Well, I hope THAT covers the topic. Or did I still miss something?

That’s all for now folks,
AJ.NET


September 4, 2007

partial methods for partial developers

Filed under: .NET, .NET Framework, C#, Design Time, LINQ, Software Developers — ajdotnet @ 8:19 pm

The evolution of C# has anything but stopped. One addition with .NET 2.0 was partial classes. They were probably meant to ease the use of code generators (keeping generated code separate from hand-written code), but the common developer is not limited to “consuming” them; he may also actively employ them. E.g. they might serve the purpose of separating production code from debug code. With .NET 3.5 there has already been much talk about the language additions around LINQ, such as … well, just have a look at The LINQ Project or ScottGu’s Blog. And of course, these features are meant to be actively employed by the common developer.

Now we have a relatively new addition: Partial methods. (In case you haven’t heard, see In Case You Haven’t Heard; for the VB.NET camp, see Partial Methods.) With this addition I’m not so sure it’s meant for me… .

For the record: A partial method is a method declared in a partial class (hence the name). It may or may not be implemented in the same class (usually in another file). If it is implemented, the compiler will emit the usual code. If it is not implemented, the compiler will emit nothing, not even metadata about that method. And it will also wipe out any calls to the method.

Ergo: Partial methods are purely a compiler feature. You need the code of the declaring class and it has to be prepared with declarations to support partial method implementation.
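A sketch of the “it simply vanishes” part, with made-up names (Customer, OnLoaded). The partial method is declared but never implemented, so the call in Load is removed by the compiler, including the evaluation of its argument, and reflection finds no trace of the method.

```csharp
using System;
using System.Reflection;

public partial class Customer
{
    public int Evaluations { get; private set; }

    // Declaration only; no implementing part exists anywhere in this sketch.
    partial void OnLoaded(int token);

    private int NextToken()
    {
        Evaluations++;
        return Evaluations;
    }

    public void Load()
    {
        // With no implementation, the compiler removes this call entirely,
        // including the evaluation of its argument.
        OnLoaded(NextToken());
    }

    public static bool HasOnLoadedMetadata()
    {
        // Looks for the private instance method in the compiled type.
        return typeof(Customer).GetMethod(
            "OnLoaded", BindingFlags.Instance | BindingFlags.NonPublic) != null;
    }
}
```

After any number of Load calls, Evaluations stays at 0 and HasOnLoadedMetadata returns false. Add a second partial class part that implements OnLoaded and both change, without touching the code above.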

The usual (probably the only relevant) use case for partial methods is light weight event handling in combination with code generation. Say, you parse a database or an XML file and generate data classes. Tens of classes, hundreds of properties, all fairly generic, boilerplate stuff. And also fairly common is the need to interfere with this method for validation or that property setter to update another one. So instead of manipulating the generated code or having hundreds of events and thousands of method calls during runtime (of which in most cases only a tiny fraction will actually do something) partial methods kick in. They are easier to declare and use than events (pay for what you need) and if not used they simply vanish (pay as you go).
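Here is what that use case might look like. This is a sketch, not the output of any real generator: the Employee class and the hook names OnNameChanging and OnNameChanged are assumptions, and both parts sit in one listing although they would normally live in separate files.

```csharp
using System;

// "Generated" part: what a hypothetical code generator might emit.
public partial class Employee
{
    private string _name;

    // Extension points; they cost nothing unless someone implements them.
    partial void OnNameChanging(string value);
    partial void OnNameChanged();

    public string Name
    {
        get { return _name; }
        set
        {
            OnNameChanging(value);
            _name = value;
            OnNameChanged(); // never implemented below, so this call vanishes
        }
    }
}

// Hand-written part: implements only the hook it actually needs.
public partial class Employee
{
    partial void OnNameChanging(string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            throw new ArgumentException("Name must not be empty.", "value");
        }
    }
}
```

Assigning an empty string to Name now throws and leaves the old value in place, while the unused OnNameChanged hook costs nothing at runtime. Note that the validation sits right inside the data class, which is exactly the layering concern raised further down.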

Any other relevant use case? None that I am aware of. In other words we have a language feature that is very efficient for a very limited number of use cases.

For whom are partial methods made?

Implementors of code generators are the ones to employ partial methods. Today this means Microsoft and LINQ related stuff, tomorrow Microsoft may decide to use partial methods in other event laden environments, such as ASP.NET and WinForms. In these cases the common developer only consumes partial methods. Let me re-phrase this: The common developer has to consume partial methods. Why? With partial methods the code generator will certainly not generate any other means of extension, such as events or virtual methods. The very purpose of partial methods is to do away with these heavy weight means, right?

The pro: If you follow the designer driven, RAD style, code based development style that ASP.NET or WinForms used for quite some time, the designer will eventually handle partial methods transparently. The only difference for the common developer is that he knows the stuff he is doing will be more efficient at runtime.

The con: If you like metadata driven applications (and already struggled with ASP.NET databinding because there is no way to attach metadata)… well, prepare for some further loss of information. If you need some event for any property setter (for tracing or tracking), if you need any kind of dynamic behaviour (e.g. to attach some kind of generic rules engine), … let’s just hope the developer of the code generator anticipated that use case (or prepare for some work). You have a clean layer separation and the entities should know nothing about the UI? Well, putting your code right into the data class will spoil that. But hey, you might use partial methods to implement real events. Manually.

So you and I, the common developers that is, will not gain very much from this use case. But we will become more dependent on the ability, the mercy, and the foresight of the developers of the code generators and designers.

Will we be able to employ partial methods (rather than only consuming them)? Let’s see… when did I last write a class that had to support a vast number of events? (That actually occasionally happens when I lay some infrastructure for the next project or work on framework code.) A class that was at the same time not intended to act as a base class? (OK, forget about the infrastructure!) But surely I wrote code by hand (because there was no code generator) that looks exactly like a candidate? Yes, but then I simply wrote the methods I needed, as I needed them; no need for a partial declaration.

So, unless I enter the camp of the not-so-common-code-generator-writing-developers (it happens, but only rarely), I can see no relevant use case that allows me to employ partial methods. (I really don’t count the example of increasing the readability of conditional compilation, as presented in the VB post above, as relevant for a new language feature.)

Again, for whom are partial methods made? In my opinion they are made for Microsoft. To help them writing code generators that generate more efficient code. Code that conforms with the constant shift from object oriented (inheritance and virtual methods), component oriented (interfaces, events), metadata driven (attributes) development to a more and more exclusively used code generation approach. Highly efficient, but really not meant for me.

Do I have a better solution for the problem partial methods solve? I don’t. Therefore I can’t blame Microsoft for putting them in. Do I have concerns about how that will affect my work? I certainly do. Therefore I do hope the developers employing them do it with utmost caution. And with the awareness that not everyone uses their tools in a point-and-click kind of fashion.

There already has been some concern about partial methods in the .NET community — and for other reasons than the ones I mentioned: Language complexity, naming issues, other features higher on the wish list, and so on. I recommend reading the comments of the post above if you want to keep up with that. Whether partial methods are a good idea or not, they are easily the single most controversial language feature in C# so far.

That’s all for now folks,
AJ.NET
