Note: This is part of a series:
1. Talking WF (introduction)
more to come (communication, hosting, error management).
Let’s skip the “how do I reference the necessary assemblies” stuff and assume the boilerplate work of setting up the WF engine and starting workflows has been done. (There’s plenty of documentation about this available elsewhere.) So let’s assume we have a page that starts workflow instances, and the workflow is in place and running. Now, when you start thinking about workflows you’ll sooner or … well, even sooner, ask questions like “How does this piece of information get to my workflow instance?” and “How does that piece get out?”. In short: data exchange.
Let’s make that more tangible with an example: Let’s assume I’m the project manager and I just got a notice that the guy across the room wants to go on vacation right when I planned the go-live. Damn his insolence! How do I cancel his request?
I start up Vacations@SDX, select the page that gives me a list of open vacation requests, and click the “cancel vacation” button for the one in question. The page refreshes and the state of the vacation request becomes “canceled for undue insolence”. (Oh boy, if any of that happened, my colleagues would kill me. But nastiness is so much fun.)
Two things happened here that are noteworthy:
- I saw a list of open vacation requests. And each request should be represented by a running workflow instance. Did I see a list of workflow instances? Workflow instance state management.
- I clicked a button and somehow the workflow took over and did something. As you know, this happens asynchronously; nevertheless, I saw the correct state as soon as the processing was done. To make that work, the web application needs a way to know when the state has changed. Workflow communication.
Workflow instance state management
The workflow instance has state. It knows about data somehow coming in, knows what activities have been executed, and so forth. Therefore it would be possible to query all workflows, load them, and ask about certain details, e.g. read some properties. But wait, the workflow instance might be processing something asynchronously. How do you synchronize? What if the workflow finishes and terminates just as you load it? Do you really want to load thousands of workflow instances for every query? (OK, that’s probably beyond the example application, but still…)
Once you’ve thought along those lines, you’ll realize that you need to maintain the business state (the vacation request in our example) outside of the workflow instance, i.e. in a database table. But now you have to face the problem of keeping the data in this table and the related workflow instance in sync. Here you have two feasible options:
- Keep the workflow instance oblivious of that table, let the web application read and write into the table (kind of a cache).
- Let the workflow maintain its state in the table directly, let your web application read the table.
Some people might vote for option 1, and with good reason. It keeps the business logic in one place, the web application, and you don’t have to deal with technically concurrent database access within what is semantically a single “transaction”. It also clearly limits the workflow instance to workflow stuff, which promotes separation of concerns.
All well and good. But I go for number 2. For two reasons: Number 1 would imply more communication and synchronization between workflow instance and web application (we’ll look into that). And the main reason: If my workflow is long running, it might change its state (say to “timeout because no one cared”) at a time when no user is online and the web application is fast asleep. With number 1 some other piece of code would have to do what the web application already does.
The pattern we came to appreciate goes like this:
- Mirror the workflow state in a table (containing the state and the workflow instance ID)
- Let the workflow maintain that state
- Let the web application query that state
- Provide cleanup, e.g. remove the respective row when the workflow finishes.
- Ensure consistency (e.g. put in validation code that verifies that the information within the table is consistent with the workflow state, provide some diagnostic tool that checks whether the workflow instances still exist).
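To make the pattern a bit more concrete, here is a minimal sketch of the mirror table. It is not WF code: it uses Python with an in-memory SQLite database just to show who writes and who reads. The table name, the column names, the state strings, and all function names are assumptions made up for this illustration.

```python
import sqlite3
import uuid


def create_schema(conn):
    """Create the mirror table: one row per running workflow instance."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vacation_request_state ("
        " workflow_instance_id TEXT PRIMARY KEY,"
        " state TEXT NOT NULL)"
    )


def set_state(conn, instance_id, state):
    """Workflow side: insert or overwrite the mirrored state of one instance."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO vacation_request_state"
            " (workflow_instance_id, state) VALUES (?, ?)",
            (instance_id, state),
        )


def remove_state(conn, instance_id):
    """Workflow side: cleanup when the workflow instance finishes."""
    with conn:
        conn.execute(
            "DELETE FROM vacation_request_state WHERE workflow_instance_id = ?",
            (instance_id,),
        )


def list_requests(conn, state):
    """Web application side: read-only overview query, no workflow is loaded."""
    rows = conn.execute(
        "SELECT workflow_instance_id FROM vacation_request_state"
        " WHERE state = ? ORDER BY workflow_instance_id",
        (state,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
create_schema(conn)

first = str(uuid.uuid4())   # stand-ins for real workflow instance IDs
second = str(uuid.uuid4())

set_state(conn, first, "open")       # both workflows start
set_state(conn, second, "open")
set_state(conn, second, "canceled")  # the workflow handles the cancel request

open_requests = list_requests(conn, "open")
canceled_requests = list_requests(conn, "canceled")

remove_state(conn, second)           # the canceled workflow has finished
```

Note that only the workflow side calls `set_state` and `remove_state`; the web application gets by with `list_requests`. That is the whole point of option 2: a state change such as a timeout ends up in the table even if no web request is around to trigger it.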
This way it should be far easier to provide overview lists. And to interact with workflow instances from your web application you would need a table with workflow IDs anyway.
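The diagnostic tool mentioned in the list above can start out very small. The following sketch compares the workflow IDs in the mirror table with the IDs of the instances that actually still exist. How you obtain the second list depends on your persistence setup, so both inputs are plain lists here, and the function name and the report keys are made up for this illustration.

```python
def find_inconsistencies(mirrored_ids, live_ids):
    """Compare the IDs in the mirror table with the existing workflow instances."""
    mirrored = set(mirrored_ids)
    live = set(live_ids)
    return {
        # Rows whose workflow instance is gone: the cleanup was missed.
        "orphaned_rows": sorted(mirrored - live),
        # Instances without a row: they would never show up in an overview list.
        "unmirrored_instances": sorted(live - mirrored),
    }


report = find_inconsistencies(
    mirrored_ids=["wf-1", "wf-2", "wf-3"],
    live_ids=["wf-2", "wf-3", "wf-4"],
)
print(report)
```

Both result lists should be empty in a healthy system; anything else points to a missed cleanup or a workflow that never wrote its state.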
The next post will look into the workflow communication topic, the actual data exchange with the workflow instance.