AJ's blog

July 26, 2009

Playing game…

We’ve defined the problem, we’ve established the rules, now we are ready to actually play the game. In other words: We will employ consistent and sufficient exception management in an ASP.NET application. No less!

Note: What follows is merely a sample implementation, just to drive the ball home, but it should turn my earlier musings into a little more practical advice. And since this is only meant to highlight the principle, not to provide a production ready solution, I’ll keep it simple.

Exception classes

First let’s try to classify our exceptions, depending on the supposed effect. This classification is sort of the driving force behind the strategy: who is responsible, or more to the point how to tell the responsible people…

A quick reminder on what our goal is:

  1. The user has to solve everything that results from his user interaction in due process. This includes invalid input, insufficient permissions, or other business conditions that interrupt the current functionality.
  2. The administrator is responsible for infrastructure issues. This includes every call into the „outside world“, such as databases, services, file system. Issues here are manifold, including:
    • unavailable database or service
    • errors during some operation, e.g. unexpected constraint violations
    • erroneous configuration
  3. Everything else: Bugs. Developer. You! 😉

So consequently:

  • If the user is responsible, we just tell him on the page and let him choose what to do.
  • If the administrator is responsible, we show a general error page, telling the user that the application is not available, and of course we write some event log entry.
  • If the developer is responsible, we do the same, but we tell the administrator explicitly to forward the event log information to the development team.

I’ll spare you the boilerplate exception implementation details; suffice it to say that we have a UserIsResponsibleException for “user exceptions” and an AdminIsResponsibleException for “admin exceptions”, respectively. Names that wouldn’t make it into my production code, but that’s what their intention is, anyway. Oh, and add a ConcurrencyException for respective issues; they may or may not be user exceptions, depending on the context.
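For reference, the boilerplate could look roughly like the following sketch. Deriving directly from Exception and offering exactly these two constructors are assumptions made for illustration; only the three class names come from the text above.

```csharp
using System;

// Sketch only: base class and constructors are assumptions.
public class UserIsResponsibleException : Exception
{
    public UserIsResponsibleException(string message)
        : base(message) { }

    public UserIsResponsibleException(string message, Exception inner)
        : base(message, inner) { }
}

public class AdminIsResponsibleException : Exception
{
    public AdminIsResponsibleException(string message)
        : base(message) { }

    public AdminIsResponsibleException(string message, Exception inner)
        : base(message, inner) { }
}

// Deliberately not derived from UserIsResponsibleException: whether a
// concurrency issue is a user issue depends on the calling context.
public class ConcurrencyException : Exception
{
    public ConcurrencyException(string message)
        : base(message) { }

    public ConcurrencyException(string message, Exception inner)
        : base(message, inner) { }
}
```

The constructor taking an inner exception matters later on, when lower layers wrap technology-specific exceptions without losing the original cause.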

Now let’s put the handlers in place. These reside in the UI layer, and from there we can work through the architectural layers towards the databases.

User error handlers

User exceptions have to be handled on the page level. Whenever you call into some business layer method, say from a button click event handler, you wrap the call in a try/catch for the respective exceptions. The exception is then presented to the user and eaten, meaning it is not rethrown. Something like this:

protected void btnConcurrentUpdate_Click(object sender, EventArgs e)
{
    try
    {
        BusinessLogic bl = new BusinessLogic();
        bl.UpdateDatabaseWithConcurrencyIssue("some content");
        lblConcurrentUpdate.Text = "did it.";
    }
    catch (UserIsResponsibleException ex)
    {
        // just give feedback and we’re done
        PresentError(ex);
    }
    catch (ConcurrencyException ex)
    {
        // just give feedback and we’re done
        PresentError(ex);
    }
}

The way of presenting the error message is something that has to be decided per application. You could write the message to some status area on the page, or you could send some JavaScript to pop up a message box. You could even provide a validator that checks for a recent user exception and works with the validation summary for the feedback.

Statistics:

  • Exception classes: Write once.
  • Presentation: Write once.

The try/catch code is simple enough, yet you have to do it again and again, and should the need for another user exception arise (e.g. some user errors can be handled on the page, some others require leaving the page), the logic is spread across our pages. Thus it makes sense to factor the actual handling into a separate method, in a helper class, or a page base class. Now, catching specific exceptions is still part of our job and that cannot be factored out. To solve this we have three options:

  1. Build the exception hierarchy accordingly and catch the base class for user exceptions.
  2. Catch all exceptions and throw the unwanted exceptions on.
  3. Live with the problem.

I decided against option 1 because I didn’t want the concurrency exception to be derived from some user exception. Option 3 is what we wanted to avoid in the first place. So my solution is number 2, in the name of ease of use and maintenance, but at the expense of language supported exception handling features:

protected void btnConcurrentUpdate_Click(object sender, EventArgs e)
{
    try
    {
        BusinessLogic bl = new BusinessLogic();
        bl.UpdateDatabaseWithConcurrencyIssue("some content");
        lblConcurrentUpdate.Text = "did it.";
    }
    catch (Exception ex)
    {
        // since C# does not support exception filters, we need to catch
        // all exceptions for generic error handling; throw those on, that
        // have not been handled…
        if (!HandleUserExceptions(ex))
            throw;
    }
}

This is boilerplate and unspecific enough to copy&paste it around. The generic handler may look like this:

bool HandleUserExceptions(Exception ex)
{
    if (ex is ConcurrencyException)
    {
        // just give feedback and we’re done
        PresentError(ex);
        return true;
    }
    
    if (ex is UserIsResponsibleException)
    {
        // just give feedback and we’re done
        PresentError(ex);
        return true;
    }
    
    // other errors throw on…
    return false;
}

Statistics:

  • Handler method: Write once.
  • Catching exceptions: Every relevant location, boilerplate copy&paste

Please note that exception filters might make this a little less coarse, but this is one of the rare occasions, where a CLR feature is not available to C# programmers. Anyway, we just factored all logic into a centralized method which achieves exactly what we asked for.
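For readers on newer compilers: since C# 6 the language does support exception filters through the when clause. Under that assumption, the catch-all can be narrowed as in the following sketch. The names UserErrorHandling, IsUserException, and Run, as well as the presentError delegate, are hypothetical stand-ins for the page code above, and the exception classes are reduced to the bare minimum.

```csharp
using System;

public class UserIsResponsibleException : Exception
{
    public UserIsResponsibleException(string message) : base(message) { }
}

public class ConcurrencyException : Exception
{
    public ConcurrencyException(string message) : base(message) { }
}

public static class UserErrorHandling
{
    // Hypothetical helper: decides whether the user is the one to tell.
    public static bool IsUserException(Exception ex)
    {
        return ex is UserIsResponsibleException || ex is ConcurrencyException;
    }

    // Runs the action. User exceptions are presented and swallowed;
    // everything else never enters the catch block and propagates untouched.
    public static void Run(Action action, Action<Exception> presentError)
    {
        try
        {
            action();
        }
        catch (Exception ex) when (IsUserException(ex))
        {
            presentError(ex);
        }
    }
}
```

The difference to catch-and-rethrow is that the filter is evaluated before the stack is unwound, so exceptions that are not user exceptions are not caught at all.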

But wait… . Wouldn’t the Page.Error event provide a better anchor? It’s already there and no need to add try/catch around every line of code. Look how nice:

protected void btnConcurrentUpdate_Click(object sender, EventArgs e)
{
    BusinessLogic bl = new BusinessLogic();
    bl.UpdateDatabaseWithConcurrencyIssue("some content");
    lblConcurrentUpdate.Text = "did it.";
}

protected override void OnError(EventArgs e)
{
    // raise events
    base.OnError(e);
    
    var ex = Server.GetLastError();
    
    // if we don’t clear the error, ASP.NET
    // will continue and call Application.Error
    if (HandleUserExceptions(ex))
        Server.ClearError();
}

Unfortunately at the time the error event is raised, the exception will already have disrupted the page life cycle and rendering has been broken. It may help in case you need a redirect, but not if the page should continue to work.

“Other” error handlers

So, the page handles user exceptions, what about admin and system exceptions? The page doesn’t care, so they wind up the chain and ASP.NET will eventually pass them to the application, via the Application.Error event, available in the global.asax. This is the “last chance exception handler” or LCEH for short.

One thing we did want to do was to write the exception into the event log. I’ll leave actually writing into the event log to you; just remember that creating an event log source requires respective privileges, e.g. running as admin. Just a minor deployment hurdle.

protected void Application_Error(object sender, EventArgs e)
{
    Exception ex = Server.GetLastError();
    
    // unpack the exception
    if (ex is HttpUnhandledException)
        ex = ex.InnerException;
    Session["exception"] = ex;
    
    // write detailed information to the event log…
    WriteInformationToEventLog(ex);
    
    // tell user something went wrong
    Server.Transfer("Error.aspx");
}

private void WriteInformationToEventLog(Exception ex)
{
    // an ID as reference for the user and the admin
    ex.Data["ErrorID"] = GenerateEasyToRememberErrorID();
    
    string whatToDoInformation;
    if (ex is AdminIsResponsibleException)
        whatToDoInformation = "Admin, you better take care of this!";
    else
        whatToDoInformation = "Admin, just send the information to the developer!";
    
    // put as much information as possible into the eventlog,
    // e.g. write all exception properties into one big exception report,
    // including the Exception.Data dictionary and inner exceptions.
    // also include information about the request, the user, etc.

    …
}

As you can see, there are some details to keep in mind. One, the exception has been wrapped by the ASP.NET runtime, and that wrapper class doesn’t tell you anything worthwhile. Two, passing information to the error page may be a little tricky in cases where the request has not yet (or not at all) established the session state. The easier approach is to let ASP.NET handle the redirection, yet at the loss of information. Three, be careful regarding the requests: depending on the IIS version you may get only .aspx requests, or any request including static resources. Thus, with this simplistic implementation a missing image may send you to the error page, depending on the IIS version and registered handlers.

I would also like to encourage you to spend some time on collecting information for the event log message. A simple ex.ToString() doesn’t reveal that much information if you look closely. You should invest some time to add information about the current request (URL, etc.), the user (name, permissions), and server state (app domain name, to detect recycling issues). And use reflection to get all information from the exception, especially including the Exception.Data dictionary. Also go recursively through the inner exceptions. The more information you provide, the better.
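As a starting point, the report could be assembled along the lines of the following sketch. ExceptionReport and Build are hypothetical names; the sketch walks the inner exceptions with a loop instead of recursion and covers only type, message, Exception.Data, and stack trace. Request, user, and server state as well as the reflection over additional properties are left out here.

```csharp
using System;
using System.Collections;
using System.Text;

public static class ExceptionReport
{
    // Builds a plain-text report for the exception and all of its
    // inner exceptions, outermost first.
    public static string Build(Exception ex)
    {
        StringBuilder report = new StringBuilder();
        int level = 0;

        for (Exception current = ex; current != null; current = current.InnerException)
        {
            report.AppendLine("[" + level + "] " + current.GetType().FullName);
            report.AppendLine("Message: " + current.Message);

            // Exception.Data carries the context added further down the call chain
            foreach (DictionaryEntry entry in current.Data)
                report.AppendLine("Data[" + entry.Key + "] = " + entry.Value);

            // StackTrace is null for exceptions that were never thrown
            report.AppendLine("StackTrace: " + current.StackTrace);
            level++;
        }

        return report.ToString();
    }
}
```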

I also hinted at an error ID. It is probably a good idea to give the user some ID to refer to when he talks to the admin. This way, the admin will know exactly what to look for in the event log (especially in multiple event logs of a web farm).
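How such an ID is generated is up to you. One possible sketch, assuming a timestamp plus a short random suffix is good enough: the class name ErrorIds, the format, and the alphabet are illustrative choices, and the result is not guaranteed to be unique.

```csharp
using System;
using System.Globalization;

public static class ErrorIds
{
    // Letters and digits that are hard to confuse when read over the phone
    private const string Alphabet = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789";
    private static readonly Random Rng = new Random();

    // Produces something like "20090726-1405-K7QX"
    public static string Generate(DateTime timestamp)
    {
        char[] suffix = new char[4];

        // Random is not thread safe, and requests run concurrently
        lock (Rng)
        {
            for (int i = 0; i < suffix.Length; i++)
                suffix[i] = Alphabet[Rng.Next(Alphabet.Length)];
        }

        return timestamp.ToString("yyyyMMdd-HHmm", CultureInfo.InvariantCulture)
            + "-" + new string(suffix);
    }
}
```

The timestamp part tells the admin roughly where to look in the event log; the suffix distinguishes errors within the same minute.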

Finally, the exception class tells us whether we should include an “Admin, do your job!” or “Kindly forward this issue to your fellow developer, brownies welcome, too.” message.

And by the way: This is the one and only location in our application that writes to the event log!

Statistics:

  • Global error handler: Write once.

Done. We have the handlers in place. Handlers for user exceptions, admin issues, and any other exception that might be thrown. Now we need to take care that error causes are mapped to the respective exception class.

Business layer

The business logic is being called from the UI layer, and calls into the data access layer. Thus it has to check parameters coming in, as well as data handed back from the data source.

Data coming in has to comply with the contract of the business class, and I’d like to stress the fact that not all data is legal. Meaning the business logic is not expected to accept every crap the UI layer might want to pass. A search method that takes a filter class as parameter? No harm in defining this parameter as mandatory and requiring the UI to pass an instance, even if it is empty. And a violation is not the user’s problem, it’s the programmer’s fault! In these cases we throw an ArgumentException, and that’s it. The LCEH will take care of the rest, blaming the UI developer that is.

public IEnumerable<Customer> SearchCustomer(SearchCustomerFilter filter)
{
    // preconditions have to be met by the calling code
    Guard.AssertNotNull(filter, "filter");
    
    // input values have to be provided by the user
    if (string.IsNullOrEmpty(filter.Name))
        throw new UserIsResponsibleException("provide a filter value");
    
    DatabaseAccess db = new DatabaseAccess();
    return db.SearchCustomer(filter);
}
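The Guard class used above is not shown in this post. A minimal sketch of what it might look like, assuming it does nothing more than throw the standard argument exception:

```csharp
using System;

public static class Guard
{
    // Throws if the calling code violated the contract by passing null.
    // ArgumentNullException derives from ArgumentException, so the LCEH
    // will treat it as a developer issue.
    public static void AssertNotNull(object value, string parameterName)
    {
        if (value == null)
            throw new ArgumentNullException(parameterName);
    }
}
```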

Any issue with the content of the data, i.e. the information the user provides, is a user issue. Usually this includes data validations and if you follow the rule book you’ll have to repeat everything the UI already achieved via validators. Whether you do that or just provide additional checks, e.g. dependencies between fields, is something I leave to you. Anyway, throw a user exception if the validation fails.

Calling the data access layer may also result in thrown exceptions. Usually there is no need to catch them at all. Period.

Finally the data you get from the database may ask for some interruption. A customer database may tell you that this particular customer doesn’t pay his dues, thus no business with him. In this case you throw a user exception in order to interrupt the current business operation.
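Such an interruption could be sketched as follows. The Customer class with its HasUnpaidDues flag and the OrderLogic class are hypothetical and only serve the example.

```csharp
using System;

public class UserIsResponsibleException : Exception
{
    public UserIsResponsibleException(string message) : base(message) { }
}

// Hypothetical entity as it might come back from the data access layer
public class Customer
{
    public string Name;
    public bool HasUnpaidDues;
}

public class OrderLogic
{
    public void PlaceOrder(Customer customer)
    {
        // precondition: has to be met by the calling code
        if (customer == null)
            throw new ArgumentNullException("customer");

        // business condition: interrupt the operation and tell the user
        if (customer.HasUnpaidDues)
            throw new UserIsResponsibleException(
                "Customer '" + customer.Name + "' has unpaid dues; no new orders are accepted.");

        // continue with the actual order processing here
    }
}
```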

And that’s it.

Statistics:

  • Business validations and interruptions: As the business logic demands…

“Uhh, that’s it? And I used to spend so much time on error handlers in my business logic…”. Nice, isn’t it? Just business logic, no unrelated technical demand.

Data access layer

The data access layer is where many exceptions come from. If you call a data source (web service, database, file system, whatever), the call may fail because the service is unavailable. Or it may fail because the called system reports an error (constraint violation, sharing violation, …). It may time out. Some exceptions won’t require any special handling and you can leave them alone. Some need to be promoted to admin or user exceptions, or they may require special handling for some reason. You should translate those exceptions to respective application exceptions. The reason is that you don’t want to bleed classes specific to a data source technology (e.g. LINQ 2 SQL exceptions) into the upper layers, introducing undue dependencies.

Luckily all this tends to be as boilerplate as can be. Call/catch/translate. With always the same catches. So, again, we can provide a “once and for all” (at least per data source technology) exception handler, like this one for the ADO.NET Entity Framework:

public IEnumerable<Customer> SearchCustomer(SearchCustomerFilter filter)
{
    try
    {
        using (DatabaseContainer db = new DatabaseContainer())
        {
            //…
        }
    }
    catch (DataException ex)
    {
        TranslateException(ex);
        // any exception not translated is thrown on…
        throw;
    }
}

And the respective translation method:

static void TranslateException(Exception ex)
{
    // database not available, command execution errors, mapping errors, …
    if (ex is EntityException)
        throw new AdminIsResponsibleException("…", ex);
    
    // translate EF concurrency exception to application concurrency exception
    if (ex is OptimisticConcurrencyException)
        throw new ConcurrencyException("…", ex);
    
    // constraint violations, etc.
    // this may be caused by inconsistent data
    if (ex is UpdateException)
        throw new AdminIsResponsibleException("…", ex);
}

Statistics:

  • Translation method: Write once.
  • Catching exceptions: Every relevant location, boilerplate copy&paste

Conclusion:

Done. Let’s sum up the statistics:

  • Write once (at least partly reusable in other applications):
    • Application exception classes
    • Last chance exception handler (LCEH)
    • Handler method for user errors, including presenting the error
    • Exception translation method in DAL
  • Boilerplate copy&paste
    • Catching exceptions in the UI, call handler
    • Catching exceptions in the DAL, call translation
  • Application code that still needs a brain 😉
    • Business validations and interruptions: As the business logic demands…

And this is all we have to do with regard to exception handling in our application. I’ll let that speak for itself… .

This post concludes this little series about the state of affairs regarding error handling in an application. I do not claim to have presented something particularly new or surprising. Rather the opposite, this or some similar strategy should be considered best practice. The reason to even start this series was the fact that I regularly encounter code that does exception handling … questionably. And I also regularly meet people who look at exception handling only in the scope of the current code fragment, not in light of a broader approach that spans the application.

If you need further motivation to employ proper error handling in the first place, look here.
A little more on error management in long running, asynchronous scenarios can be found here.

That’s all for now folks,
AJ.NET

July 19, 2009

The Rules of the Game

With the last post I tried to establish the problem: The fact that exception handling needs more than try/catch. That it needs a proper exception management strategy, which should be part of the application architecture. And I paid special attention to the paradoxical demand that exception management should work when everything else fails.

Now, I cannot provide THE strategy, for there is no such thing. But I may present one feasible approach that has served me well, quite a few times. I’ll simplify quite a bit, but it is nonetheless a complete approach that may serve as starting point for more complex demands.

OK, let’s play…

To define the playing field: The following is about classical n-Tier ASP.NET web applications, and I have decided to represent user induced issues, like business validation errors, as exceptions.

Some Groundwork

Exceptions are used for different error conditions, but also for non-erroneous situations, which leads to a surprising variety:

  • Exceptions being thrown as result to infrastructure issues. E.g. IOException, SecurityException. They may be caused by invalid configurations, by unavailable infrastructure resources, or for other reasons.
  • Exceptions representing errors reported by external services, like database constraint violations or SOAP exceptions.
  • Exceptions being thrown because the caller did not meet the expectations of the called code, e.g. a violation of a contract. (Usually ArgumentExceptions).
  • Exceptions being thrown because of a developer’s lapse, e.g. NullReferenceException.
  • Exceptions being thrown to sidestep the current flow of control, e.g. HttpResponse.Redirect() throws a ThreadAbortException to prevent an ASP.NET page from further unnecessary processing.
  • Business validation errors. The business logic decides that – despite proper input validation – something is not quite as it should be. Not exactly an error in a technical sense, but the processing simply cannot go on.
  • User concurrency issues. Like some user wants to update some data that has been changed by someone else in the meantime. And “last one wins” may not be good enough.

Please note that while Basic Rule Number 1 for exception handling (no abuse of exceptions for non-exceptional conditions) is as valid as ever, there are certainly exceptions (no pun intended) to that rule. For example, aborting the request upon redirection certainly makes sense; (ab)using an exception to do so may violate said rule, yet it is far more efficient and convenient than any other alternative. The same may be true for business validation errors. Not exceptional, yet exceptions may be the most convenient way to accommodate them.

This is a (perhaps surprising) variety and may even be incomplete – actually that variety is a major contributor to the complexity of exception handling. And yet, it is only half of the equation. Exceptions have cause and effect, and I found that tackling exception handling from the opposite angle, the effect that is, has certain advantages.

But before we come to that, there’s a question that has to be answered…

What is the purpose, the intention of throwing exceptions? (Seriously!)

This is a question one should ask at least once, even though it may seem a bit superfluous. For the answer does not include some of the usually mentioned suspects: Preventing data inconsistencies? Let’s simplify a bit and state that this is the job of transactions. Gracefully shutting down? An error happened, what do I care how the application goes down.

Actually it’s far simpler: An error happened! Something which the developer didn’t anticipate or couldn’t cope with. The primary purpose of exceptions is to communicate that situation. And exception handling in turn is about notifying everyone involved in the appropriate way, telling them what they are supposed to do now, either to compensate for the issue or to prevent it from happening again. In other words: It’s about responsibilities emerging out of this error.

And this is what my strategy revolves around: Responsibilities.

An Exception Handling Strategy

Back to cause and effect, and let’s start with the latter. More to the point, the desired effect, driven by what is actually asked for in case of an error (rather than what could be done). This will lead to a strategy that is simpler, easier to understand, and easier to implement.

As I mentioned ‘responsibilities’ the first question is kind of obvious:

Question #1: Who should be responsible for what?

This is no esoteric question aiming at pieces of architecture or code. Rather it’s asking which person has to shoulder the work (or the blame, if you like):

  1. The Administrator: He has to solve problems caused by infrastructure issues (such as invalid configuration entries, unavailable databases, and so on).
  2. The User himself: Let him deal with errors caused by invalid input, other users working simultaneously with the same data, or in some other way inherent to the intended usage of the product. RTFM!
  3. The developer (YOU!): Anything else falls back on the developer’s desk.

Obviously the last point is the inconvenient part: Every exception defaults to that category, unless the developer explicitly “tells” someone else that he is responsible. Let’s stress that: An error caused by an invalid configuration entry is the developer’s problem – unless he explicitly tells the administrator that he should take care of this.

So, now we are back where we started right at the beginning, at the developer’s desk. The difference is that the developer now has some specific goal: Lay the blame on someone else’s doorstep!

This is actually a very important point. Making someone else responsible is in the developer’s own interest. It helps him avoid unnecessary and annoying future work. Quite motivating if you ask me.

This makes the next question obvious:

Question #2: How does the developer tell the user or the administrator that he is to take responsibility?

The answer is simple: Provide the necessary feedback, e.g. a message box or an event log entry; in more detail:

  • In case of a user error you tell him with a message box, a popup, within some message area or on a special “you did wrong!” page.
    You don’t tell the admin, because he doesn’t care for typos. Neither do you as developer.
  • In case of an infrastructure issue you show a standard message to the user, telling him to call the admin and come back later. You don’t want to include details, because they might include security relevant information, such as the always cited database connection string.
    Additionally you have to give the admin all information he needs, usually within the event log.

In both cases it’s important that you provide not only the information that an error has occurred (that much is obvious, isn’t it?). You need to include information on how to solve it. Otherwise it will eventually again be up to you, the developer, to solve the issue. That’s why it makes sense not to stop at “couldn’t save data” or “data source not available”. Rather invest some work – in your own self-interest – in providing “Customer data could not be saved because it is locked by user ‘XYZ’” or “Database not available, connection string:=…”.

  • Remember the last case: In case of technical issues for which the developer is responsible, you still have to tell the user and the admin something. The user should again get some standard error page (not the ASP.NET error page!), the admin should get an event log entry that tells him to notify the developer. This entry should include as much information as possible for a post mortem diagnosis.

So, essentially this paragraph revolved around presenting an exception, at the UI level where the information leaves the area of your application code. It’s quite simple to implement that, because you can rely on centralized features: ASP.NET provides a global error handler that can be used to provide standard event log entries. Presenting user errors on a page should be boilerplate as well, the only distinction being whether the current page and data is still valid (just show a message box) or not (redirect to some business error page). Nice, easy, and no rocket science.

Of course in order to do this we need a means of distinguishing those three cases. For this, all you need is respective exception classes, two actually. One BlameTheUserException and another BlameTheAdminException. You don’t need a BlameTheDeveloperException, because he is already to blame for every other exception, no need to stress that ;-).

Question #3: How to put exceptions to work…?

If the developer’s goal is to blame someone else, then this goal can guide him answering the more detailed questions, such as, where in his code should he be catching exceptions (where not)? When should he throw exceptions (when not)? What about logging? And so on. Everything else follows as consequence…

First about throwing exceptions. Yes, not only the system throws exceptions; you may do the same, and no need to be shy about it. Sprinkle your methods with guard calls throwing argument exceptions; they imply coding errors on the caller’s side. If the code allows certain conditions that are logically impossible, guard yourself against that impossibility. Throwing exceptions in exceptional cases is no problem at all. Actually it is by far the preferred way to react this harshly, rather than obscuring erroneous conditions because of falsely understood modesty. Fail early, fail often!

Handling exceptions, as in solving the problem once and for all, is of course possible. If it is possible – which is in my experience rarely the case. (It had to be mentioned, though.)

Enriching exception information is a more frequent task. Whenever it makes sense for the post mortem diagnosis, you should add context information to the exception. Adding the actual SQL statement and the connection string to a database error, adding the user name and the accessed resource to security exceptions. This is valuable, if not necessary, to diagnose the root cause of the problem. It’s not even necessary to wrap the exception to do that, since the exception class has a dictionary Data for this very purpose.
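Enriching without wrapping can be sketched like this. DatabaseCalls.Execute and the "Sql" key are hypothetical; the point is merely that the context goes into Exception.Data and the very same exception continues on its way.

```csharp
using System;

public static class DatabaseCalls
{
    // Executes the command; on failure the SQL statement is attached
    // to the exception before it is rethrown unchanged.
    public static void Execute(string sql, Action command)
    {
        try
        {
            command();
        }
        catch (Exception ex)
        {
            // keys and values should be simple, serializable types
            ex.Data["Sql"] = sql;

            // plain "throw;" keeps the exception object and its stack trace
            throw;
        }
    }
}
```

Whatever ends up in the Data dictionary is only useful if the global error handler actually writes it to the event log.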

Promoting exceptions: Once your code is able to decide that a certain error is an administration or user problem, you should wrap it in the respective class, thus “promoting” it.
This sentence implies the fact that not every location is suited to make that decision. The database layer should probably never decide whether a particular error is the consequence of some erroneous user action; this usually depends on the business context in which it is called, and should be decided there. Infrastructure issues on the other hand happen very low in the call chain and couldn’t even be decided further up.
The real work here lies in actually putting in the necessary handlers. However they tend to gather in classes that deal with the outside world, the database, configuration, some service, or whatever. They also tend to be very boilerplate.
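A promoting handler in a class that deals with the file system might look like the following sketch. TemplateStore and its Load method are made up for the example, and BlameTheAdminException is reduced to the one constructor needed here.

```csharp
using System;
using System.IO;

public class BlameTheAdminException : Exception
{
    public BlameTheAdminException(string message, Exception inner)
        : base(message, inner) { }
}

public static class TemplateStore
{
    // Reads a file the application depends on. A missing or unreadable
    // file is an infrastructure issue, hence promoted to an admin exception.
    public static string Load(string path)
    {
        try
        {
            return File.ReadAllText(path);
        }
        catch (IOException ex)
        {
            // covers FileNotFoundException and DirectoryNotFoundException
            throw new BlameTheAdminException(
                "Template file '" + path + "' could not be read. Check that it exists.", ex);
        }
        catch (UnauthorizedAccessException ex)
        {
            throw new BlameTheAdminException(
                "Template file '" + path + "' is not accessible. Check the permissions.", ex);
        }
    }
}
```

Note that the message tells the admin what to check, and the original exception travels along as inner exception for the event log.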

That’s it. … Wait! … That’s not possible!

Question #4: Can that really be all?

Actually this is it. No “every method should…”. No “layer specific exceptions…”. No “log it just in case…”. Actually whenever I tell people how to implement an exception handling strategy in an existing application, the majority of the work is removing exception handlers. Sometimes life can be so easy.

In case it escaped you: If the developer doesn’t have to do something to employ proper exception management, he can’t do anything wrong. We don’t have to rely on him to properly handle his own bugs. We just solved a paradox!

Going further, I even consider it an issue by itself if you did too much exception handling!

For example “shouldn’t we at least log the error, just in case?” No, you shouldn’t. The global error handler is responsible for doing that, so there is no need to log it. Worse, if the calling code decided to handle the situation and carry on, there is by definition no application issue at all. And yet the event log would show one, waking the admins for no reason, or flooding the event log and obscuring the one important entry.

So what? That’s a strategy?

Yes it is. Check the demands of the initial questions if you like; they’re all accounted for. Take some file that doesn’t exist. Either you did nothing to anticipate that situation, then an IOException will ensue. Neither user nor admin is taking care of this, so it’s back on your desk. Or you did handle it and threw a BlameTheAdminException.

Of course I simplified quite a bit, but not at the expense of the concept, only in terms of features. For example you may need to have different reactions for the user, some errors calling for a message box, others calling for a redirect. Similarly some unavailable service may allow retrying rather than requiring aborting the whole application. Or you may need additional exception classes to handle them in code. Anyway, nothing of this affects the strategy as such.

Once you’ve established such a strategy, every decision you make on the code level becomes easier, and more confident. And what’s more, in my experience it streamlines error handling, makes it more robust, and frequently simplifies the code by removing unnecessary exception handling. Because quite often if someone asks for exception handling, the answer is “already taken care of, nothing to do”.

Getting back to baseball, every player on the field now knows exactly what he is supposed to do. Some do all the work, most do nothing most of the time… .

PS: The ending anecdote: http://leedumond.com/blog/the-greatest-exception-handling-wtf-of-all-time/

That’s all for now folks,
AJ.NET

July 12, 2009

Playing Ball With Exceptions

Ever come across an ASP.NET error page because of a missing configuration entry? Been faced with the question “shouldn’t we at least catch the exception and log it? Just to make sure?” or some statement like “you should catch every exception and throw a new one”? Stumbled over the notorious catch-all or a class whose methods all do exception handling a bit differently? Well, those are symptoms of a missing error management strategy.

It’s actually quite frustrating how often that happens. What’s more, many developers don’t even notice that there is something amiss; they’re quite content dealing with exceptions on a method by method basis, without taking the bigger picture into account. Or they are very well aware that there is something amiss, but they don’t get it. They know they have to do something, so they set out to do a bit here, a bit there, but not with a clear goal in mind, nothing consistent, only aiming to treat the symptoms and quite regularly overdoing it.

I really don’t remember what first brought the analogy with baseball, of all things, to my mind. Perhaps it was the notion that ‘try’, ‘catch’ and ‘throw’ could be outcries during some ball game, combined with my lack of understanding of baseball (a bit stretching, but that’s how mind twists work). Perhaps it was the idea that many developers do exception handling like they would play baseball if they had never heard of the game.

Stretching or not, that analogy is something I can work with. (And my apologies to everyone who loves and understands that game for everything that follows! ;-))

A Quick Evaluation of … Baseball.

Baseball, like all games, has rules. Rules that put the various players into roles they have to fulfill. Most importantly, rules you have to learn if you want to play along. Some are fairly basic “mechanics” of the game: The pitcher throws in the general direction of the batter and the catcher does what his title implies (or so Wikipedia tells me).
Some rules are obvious by looking at the game: If the batter hits that ball, he drops the bat and runs, while those guys on the field are trying to catch the ball.
Some rules are rather convoluted and you may have to study some booklet to grasp them; like what the hell is an inning and how do they determine who is winning?

That’s about as much as I understand about baseball. But then, if I wanted to play baseball, I would know that I had a lot to learn before even thinking of entering the field. What rules apply to pitching? What roles do those guys on the field have, and what are their responsibilities? How do you score points? When do you move on to the next round? When does the game end? Under what conditions do the rules change? What is considered foul play, and when does it lead to aborting the game?

And this is surprisingly similar to the situation with exception handling… Except that many who play that game play it as if there were no need for rules at all, and they certainly haven’t asked for them (or been told about them, which is telling in itself).

Back to Exceptions

The mechanics of exceptions revolve around try/catch/throw and the implied syntax, exception hierarchies, and understanding the runtime behavior. Knowing that much can be expected from every developer and is usually not an issue at all. It is, however, only the mechanics, and merely a starting point for proper exception handling.

Just as the current situation in a game plays a role in baseball (e.g. the batter’s decision on whether to play safely or to take some risk), the location and situation of a particular code fragment play a role in the decision on what to do with regard to exceptions. Dealing with them only in the scope of the current code fragment is simply too shortsighted. One has to ask about the overall application strategy for exception handling and how the respective fragment fits into that strategy.

The caveat is: You have to have such a strategy. And defining that strategy should be part of coming up with an application architecture. Yes, exception handling is an architectural topic, not a coding issue.

Exceptions vs. Errors vs. Complexity

As I said, exception handling is an aspect of static and dynamic application architecture, just like security, transaction handling, state management, whatever. But there’s one thing that sets exception handling apart, and that’s dealing with the unexpected, the contradiction of having to deal with what you didn’t think of in the first place.

Exceptions come into play under three different conditions:

  • First, as a means to break the current control flow, say to abort a request upon redirection to another page. This is intended behavior in a regular situation, not even an error.
  • Second, in cases that disrupt the logically correct operation, but where you actually planned for that disruption. Say you rely on a file to exist but you know that someone might screw up, so you handle that case, despite the fact that during regular operations it shouldn’t happen at all. This is an error, but an intended behavior in an expected irregular situation.
  • The third condition is the worst of all: Something unintended happens, something you cannot do anything about. It may arise as an ArgumentException (telling you that the developer screwed up) or as a TypeLoadException (telling you that the environment is on quicksand). In any case this is a purely unintended error, and the only thing you can assume about the state of your application afterwards is that it is not safe to assume anything.
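
To make the three conditions a little more tangible, here is a minimal C# sketch. It is an illustration under assumptions, not taken from any real application: `RedirectException`, `ExceptionConditions`, and the method names, URLs, and fallback values are all made up for this example.

```csharp
using System;
using System.IO;

// Hypothetical exception used purely to break the control flow (condition 1).
public sealed class RedirectException : Exception
{
    public string TargetUrl { get; }

    public RedirectException(string targetUrl) : base("Redirect to " + targetUrl)
    {
        TargetUrl = targetUrl;
    }
}

public static class ExceptionConditions
{
    // Condition 1: intended behavior in a regular situation, not an error.
    public static string HandleRequest(bool loggedIn)
    {
        try
        {
            if (!loggedIn)
                throw new RedirectException("/login"); // abort the request
            return "page content";
        }
        catch (RedirectException ex)
        {
            return "redirect:" + ex.TargetUrl;
        }
    }

    // Condition 2: an error, but one we planned for.
    // A missing file (in an existing directory) falls back to defaults.
    public static string LoadSettings(string path)
    {
        try
        {
            return File.ReadAllText(path);
        }
        catch (FileNotFoundException)
        {
            return "defaults";
        }
    }

    // Condition 3: unintended. A null argument is a developer bug, and
    // nothing here tries to "repair" it; the exception simply propagates.
    public static int ParsePageSize(string value)
    {
        if (value == null)
            throw new ArgumentNullException(nameof(value));
        return int.Parse(value);
    }
}
```

Note that only the first two conditions have a local catch. The third deliberately has none, because at that point the code fragment has nothing meaningful to contribute.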

An application architecture should provide rules on how to handle these conditions. Add questions like how to deal with the consequences, or deal breakers like parallelism, asynchronicity, or loose coupling, as well as reasonable demands regarding data consistency and error compensation… And this becomes quite a complex task.

The paradoxical situation, however, is: Complexity is the very source of errors. Adding complexity is like pouring gasoline onto the fire. As developers we already have to deal with the complexity of the problem domain. An exception management strategy should kick in if we fail in that regard. And if a developer is already doing something wrong, how can we rely on him to do something else (that isn’t even in his line of thinking) right? We can’t!

So a good exception handling strategy needs to resolve that contradiction: on one side, the need to deal with different exception conditions; on the other, making things easy for the developers while not relying on them to do things correctly. Not a small task, indeed. And perhaps this is the very reason that exception handling is usually dealt with so poorly.

Stay Tuned…

OK, this was a lengthy introduction and the actual story is even longer. However, I thought it prudent to point out the bigger problem and generally raise awareness of the topic. Of course I also enjoyed playing with that baseball analogy… ;-). Anyway, in the interest of readers not interested in such musings, I split this one into two distinct posts: this one to explain the problem and make you curious, the next one to provide a solution. Stay tuned if I caught your attention.

PS: Special thanks to Karl for the permission to use his pictures…

PPS: There’s actually more to say about baseball than just baseball. Have a look here, German readers may find a more thorough explanation here. And no, it’s got nothing to do with software development 😉

That’s all for now folks,
AJ.NET


July 8, 2009

new CloudApp()

Filed under: Azure, SDX, Software Architecture — ajdotnet @ 7:46 pm

Microsoft hasn’t even released Azure yet, still they felt compelled to start a little developer competition. And SDX is participating…

During the last 4 weeks a team of my colleagues has put together a working application, simulating an actual business scenario. Starting from zero, coming up with a meaningful idea for a cloud solution, and putting together a working application is quite some achievement. Congrats, guys.

Now “business scenario” sounds a little drab, and banking, especially related to credits, may not exactly cause people to applaud enthusiastically these days. But frankly, since the winner is decided by a community vote and we never thought about that application as particularly “sexy”, we never aimed for that goal. Our aim was to demo the potential of the cloud. Nothing less.

And that’s what the application does. It takes big amounts of data (storage), runs advanced statistical calculations (Monte Carlo simulations) (compute), and delivers the results. Since this usage comes in peaks, we may use several worker roles to scale out the calculations, cutting back when they are not needed (dynamic provisioning). Exactly what the cloud is about.

And what I really do like is that it doesn’t stop there. Service Bus may be leveraged to integrate this application with “local” solutions, calling back when the calculation has finished. SQL Data Services (once they become available with reporting) may be used for longtime data mining and trend analysis. One can even think of producing information needed to calculate a single credit contract and using Live Mesh to distribute it among the sales people.

So, check it out: http://ratingsimulator.cloudapp.net/  

That’s all for now folks,
AJ.NET
