AJ's blog

July 26, 2008

LINQ 2 SQL — Less is More (More or Less)

Filed under: .NET, .NET Framework, C#, LINQ, Software Architecture, Software Development — ajdotnet @ 2:21 pm

Some time ago I promised some additional thoughts about how LINQ 2 SQL affects (web) application architectures. Well, here we go…

There are two defining characteristics of LINQ 2 SQL that may affect the architecture of a web application:

  1. The DataContext class, being “the main entry point for the LINQ to SQL framework” (as the documentation states).
  2. The way code generation works.

While the first point affects what features of LINQ 2 SQL we are able to use in web applications, the second one affects the application architecture.

Note: if you need to catch up with LINQ 2 SQL, have a look at my earlier post.

DataContext Effects

In this post I’ll focus on the DataContext class generated by LINQ 2 SQL.

The DataContext class is not only an anchor to access the tables, views, and stored procedures. It also offers more interesting features like change tracking, maintaining object identities, optimistic concurrency, and deferred loading. Consequently this is what most people I talked to attributed to LINQ 2 SQL: The possibility of reading some data, manipulating it, and call SubmitChanges, while LINQ 2 SQL handles all the gory details of mapping the data, efficiently talking to the database (including load on demand), dealing with concurrent changes, and so on.

This read/manipulate/submit approach is what the documentation describes as being “designed to last for one ‘unit of work’.” And in case the consequence escaped you: This implies a stateful programming model. The very DataContext instance that was used to read the data has to be available until every potential change or deferred access has been settled.

This generally does not agree with web applications, as read/manipulate/submit regularly spans several requests: The data is read, presented to the user, and then the request dies, and in a stateless scenario the DataContext dies with it. The next request comes in, say an update request, but there is no DataContext available that carries information about the data that was read before. In order to work with the laid-out programming model, one would have to create a new DataContext, reread the data, and then apply the changes and submit them. Optimistic locking is the first victim, because it depends on information from the previous request. And deferred loading degenerates to “optional loading”, because either you need the data (no benefit in delaying then) or you don’t.
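To see why optimistic locking is the first victim, here is a minimal in-memory sketch (no database, no LINQ 2 SQL involved; all names like Customer and Db are hypothetical). Each “request” reads a fresh copy, just like a newly created DataContext materializing an entity:

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-in for one table row with a version column.
class Customer
{
    public int ID;
    public string Name;
    public int Version;
}

class Db
{
    readonly Dictionary<int, Customer> rows = new Dictionary<int, Customer>();

    public Db()
    {
        rows[1] = new Customer { ID = 1, Name = "old", Version = 1 };
    }

    // Like a fresh DataContext materializing an entity: a copy, no tracking.
    public Customer Read(int id)
    {
        Customer c = rows[id];
        return new Customer { ID = c.ID, Name = c.Name, Version = c.Version };
    }

    public void Write(int id, string name)
    {
        rows[id].Name = name;
        rows[id].Version++;
    }
}

class Demo
{
    static void Main()
    {
        Db db = new Db();

        // Request 1: read and render; then the request (and any
        // DataContext) dies. Only what we put on the page survives.
        Customer shown = db.Read(1);            // sees Version == 1

        // Meanwhile someone else changes the row.
        db.Write(1, "changed elsewhere");       // Version is now 2

        // Request 2: a naive stateless update rereads first. The reread
        // sees version 2 and cannot tell that the user decided based on
        // version 1; the concurrent change would be silently overwritten
        // unless version 1 was round-tripped, e.g. in a hidden field.
        Customer reread = db.Read(1);
        Console.WriteLine(shown.Version);                   // 1
        Console.WriteLine(reread.Version);                  // 2
        Console.WriteLine(shown.Version == reread.Version); // False
    }
}
```

The point: whatever mechanism carries the version from the first request to the second (hidden field, view state, session) has to be built by hand; a new DataContext cannot reconstruct it.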

How about maintaining the DataContext instance across requests? No way either; since it is not serializable, you can’t put it into the session. (And that’s just a technical reason, whether one should do that even if it were possible is a different matter altogether.)

To make a long story short, let me rephrase and emphasize the first sentence:
For web applications the generated DataContext class is only an anchor to access the tables, views, and stored procedures.

BTW: Services are frequently bound to be stateless, too, so the same reasoning applies to them as well. Additionally they face another problem: They hand out entities and receive fresh ones from the outside world via serialization, which detaches them from the DataContext. And the DataContext is kind of unforgiving about that.

Less is More

Now we have concluded that the stateful approach doesn’t work. Reverting to a stateless employment of LINQ 2 SQL however leaves out a lot and raises the question whether there is anything left at all. Well, there certainly is. We still have code generation for entity classes (based on tables and views) and for stored procedures and functions.

Many web applications are really quite simple in that they are mainly “browsers over a (one) data store”. The complexity of these kinds of applications lies more in sophisticated presentation than in the business logic. In this scenario, and to really benefit from the productivity of LINQ 2 SQL, I have had good experience using LINQ 2 SQL with just two simple tweaks:

  • Limited number of result set types (i.e. the set of columns):
    Rather than having my SELECTs (e.g. in stored procedures) produce different projections (i.e. tailored to the particular stored procedure’s task), I let them produce the same set of columns if they work on the same table. This way I will get a Customer entity and respective instances rather than a CustomerWithAddress, CustomerWithRating, CustomerWithName, CustomerWithAllInfo, … . And if I get that customer, I can always be sure it’s not some half-filled entity.
  • Explicit result set types:
    It’s quite common to work on JOINed data. LINQ 2 SQL handles this quite well and will figure out the entity type even for stored procedures, as an “auto-generated type”. In order to make this type explicit in the database, I would define a view containing the JOIN and have the stored procedure work against this view. And of course I use the view to generate a respective entity with LINQ 2 SQL.
  • I also refrain from having stored procedures return more than one result set or varying result set types depending on some condition. (I don’t consider this to be a tweak, because I would do that anyway.)
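The payoff of the first tweak can be sketched in memory (no database; all names like CustomerAccess and GetCustomersByRating are hypothetical stand-ins for generated stored procedure wrappers): every access path returns the same, fully populated Customer entity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Customer entity as LINQ 2 SQL would generate it from the
// Customer table: always the full column set, never a tailored projection.
class Customer
{
    public int ID;
    public string Name;
    public string Address;
    public int Rating;
}

static class CustomerAccess
{
    static readonly List<Customer> table = new List<Customer>
    {
        new Customer { ID = 1, Name = "Miller", Address = "Main St 1", Rating = 3 },
        new Customer { ID = 2, Name = "Smith",  Address = "Elm St 9",  Rating = 5 },
    };

    // Two different "stored procedure wrappers" with different WHERE
    // clauses, but the SAME column set, hence the same entity type.
    public static List<Customer> GetCustomersByRating(int minRating)
    {
        return table.Where(c => c.Rating >= minRating).ToList();
    }

    public static List<Customer> GetCustomersByName(string name)
    {
        return table.Where(c => c.Name == name).ToList();
    }
}

class Demo
{
    static void Main()
    {
        // However the Customer was fetched, every property is populated;
        // there is no CustomerWithName half-entity floating around.
        Customer c = CustomerAccess.GetCustomersByName("Smith").Single();
        Console.WriteLine(c.Address); // Elm St 9
        Console.WriteLine(c.Rating);  // 5
    }
}
```

With tailored projections there would be one class per SELECT instead, and callers would have to know which fields happen to be filled.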

Of course the database may need some tuning, as some of these tweaks may hurt performance. But optimization when optimization is due… . And while I like the stored procedure approach, these tweaks will also work with free statements in code.

These tweaks are not meant as rules but as guidelines. Feel free to break them at leisure 😉 ; their intention is merely to support the LINQ 2 SQL code generation. However… I experienced that the define-your-entities-and-the-operations-on-top approach led to more OO-like thinking than the traditional define-your-operations-and-let-them-return-the-result-at-hand approach.

So, after dragging and dropping the tables, views, and stored procedures, LINQ 2 SQL will happily generate the respective classes and methods. Getting type-safe data and DB access was never easier.

If you are in the “use CRUD statements to access the database” camp, the usual round trip for an update would be: read the data that is to be changed, change it, submit changes. Like this:

using (MyDataContext dc = new MyDataContext())
{
    var customer = (from c in dc.Customer where c.ID == id select c).First();
    customer.Name = "new name";
    dc.SubmitChanges();
}

Compared to ADO.NET this still eases DB access a lot.

If you follow the “use stored procedures to access the database” approach it gets even simpler:

using (MyDataContext dc = new MyDataContext())
{
    dc.ChangeCustomerName(id, name);
}

This is kind of “anti-sophisticated” but quite efficient. And it allows you to handle optimistic concurrency in stored procedures if you maintain the version field on the client and provide it during the update call. I found this to be much more concise and better suited to the way web applications work.
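What such a stored procedure would boil down to is an UPDATE with the version in the WHERE clause, with the affected row count telling the client whether it lost a race. An in-memory sketch of that logic (all names hypothetical; the real thing would live in T-SQL and be called via the generated wrapper):

```csharp
using System;
using System.Collections.Generic;

// Simulates a stored procedure doing:
//   UPDATE Customer SET Name = @name, Version = Version + 1
//   WHERE ID = @id AND Version = @version
// and returning @@ROWCOUNT.
class CustomerTable
{
    class Row { public string Name; public int Version; }

    readonly Dictionary<int, Row> rows = new Dictionary<int, Row>();

    public CustomerTable()
    {
        rows[1] = new Row { Name = "old", Version = 7 };
    }

    public int GetVersion(int id) { return rows[id].Version; }

    // What dc.ChangeCustomerName(id, name, version) would end up doing.
    public int ChangeCustomerName(int id, string name, int version)
    {
        Row r;
        if (!rows.TryGetValue(id, out r) || r.Version != version)
            return 0; // no row matched: concurrent change (or wrong id)
        r.Name = name;
        r.Version++;
        return 1;
    }
}

class Demo
{
    static void Main()
    {
        CustomerTable table = new CustomerTable();
        int versionFromHiddenField = table.GetVersion(1); // sent to the client earlier

        // Someone else updates first: the row version moves on.
        table.ChangeCustomerName(1, "someone else", versionFromHiddenField);

        // Our update with the stale version affects zero rows, so the UI
        // can report a concurrency conflict instead of losing the change.
        int affected = table.ChangeCustomerName(1, "my change", versionFromHiddenField);
        Console.WriteLine(affected); // 0
    }
}
```

No change tracking, no attached entities; the only state the web application has to carry between requests is one version value.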

One word of warning (or rather a disclaimer if you like): This works for me. And it works for a certain kind of web applications. And it only works because I’m willing to trade some academic architectural demands for the KISS principle, productivity, and maintainability. So please don’t follow this advice blindly.

Oh, by the way: I used this approach in a project using an Oracle database as well. Writing a simple code generator that read the system tables and generated wrappers around stored procedures was no rocket science; the only caveat was dealing with REF CURSORs.

The next post will take a closer look at LINQ 2 SQL code generation and how it affects the architecture.

That’s all for now folks,


July 13, 2008

Working for SDX (a.k.a. My latest project…)

Filed under: Miscellaneous, SDX, Software Developers — ajdotnet @ 4:04 pm

Advertisement warning!

I’ve said it in an earlier post: I like working for SDX. In fact my work has become even more challenging due to a changed job description (with the unfortunate side effect that I have less time to spend on this blog). Anyway, we are looking for people (don’t look at the date of the post; if you are reading this in a year the statement will still be true). Well, here’s a short description of my latest project to give you an idea on some of the work we do for our customers…

I just completed a pilot study for one of our major customers, running about 3/4 of a year. Starting point was a system consisting of a non-trivial database and two VB6/ActiveX/DHTML applications. Initially created a decade ago as a replacement for a simple MS Access application, it had grown organically into a multi-purpose application and a very vital, central data pool for “quite a few” stakeholders.

Since Microsoft has decommissioned VB6, the customer decided to do a VB6-to-.NET migration. My job as the responsible architect was designing the target architecture and coming up with a migration path that supported parallel operation of the old and new worlds during the anticipated two years the migration would take.

Of course in a decade the world has changed, and so have the business and the way software has to be built to honor that: globalization, the European Union, changed customer perceptions, constantly changing demands, the way software is built, maintained, and operated.
So the job quickly grew to include a new domain model, adjusted business processes, and better business alignment. So far the demands.

The agreed-on strategy we came up with is an ecosystem of applications. One core application implements the main parts of a new domain data model, ensures consistency, and — essential in any distributed system and surprisingly difficult to achieve at times — provides stable identifiers. Circled around this core application is a number of other, mutually independent applications. (The possibility to do this of course relies on the business requirements being mutually independent.)

This approach is driven by

  • the separation of the monolith into as many disparate applications as possible (pardon the exaggeration, but it is a fair number),
  • defined ways of interaction between these applications (including off-the-shelf-software),
  • and providing a sound basis to support building the custom-made family members.

This strategy provides the flexibility to build each application according to the demands of specific stakeholders, even allowing for contradicting business demands of different departments. The set of applications can change in the future to address changed needs. Future technology changes can be applied in small steps (lessening the chance that we will build the next Moloch). Maintenance should get better, because each application is less complex (simpler architecture, fewer business functions). And it conforms with Roger Sessions’ work on the complexity of software systems.

Of course, a small but rigid set of rules and governance has to be implemented to ensure peaceful coexistence and avoid “chaotic growth” of our application family in the future. Especially the growth of interdependencies between the applications has to be watched carefully.

Next steps will be applying for the budget with top management and then hopefully building that system. Of course with my and SDX’s participation 😉 .

Now, if that sounds like a challenging and interesting project; if you happen to be willing to work in the Rhein/Main area, Germany; if you could envision yourself working for the same company as I do 😉 … (not necessarily in that project!). Well then take a look at http://www.sdx-ag.de/jobs.aspx (German language skills mandatory!).

That’s all for now folks,

July 6, 2008

Rants and rails

Something is going on out there…

While I still work on LINQ 2 SQL, some people started an open discussion, rant, or rail (depending on one’s view and taken offense) about the upcoming Entity Framework.

In all frankness, an “ADO .NET Entity Framework Vote of No Confidence” (VoNC from now on, see here) is certainly capable of triggering some very strong reactions.

What I disagree with is the way in which it’s presented. The language of “Vote of No Confidence” is intended to start a fight and is long hand for “You suck”. […]

I’m not upset that people are giving feedback. I want them to give feedback. But here’s the deal – do it in a grown up way. My way or the highway doesn’t sit well with anyone on the receiving end. There’s a WIDE path between being constructively critical and downright offensive.

(Josh Holmes, Ranting and Raving or Getting Results?)

Also quite funny: While the VoNC has probably been popularized by Mary-Jo Foley, she instantly got her … feedback … from one of the VoNC’s authors:

The signatories list represents leading developers, architects, and practice leads within the Microsoft community. The representation of the signatories as testers is misleading and inaccurate.

We would appreciate an immediate retraction and correction of this misrepresentation.

(Scott Bellware, MVP, see here)

Seems people got a little testy. Anyway, Scott also points out “the history of the feedback and relationship between the community of experts and the Entity Framework team going back to April of 2007”, which raises the question why they still had to publish the VoNC in the first place, despite that relationship.

Given all that, Microsoft’s response was a fine example of “cooling things down” and trying to focus on the facts. Well, essentially Tim more or less tells us that all the VoNC asked for will be kind of there in EF v2.

But already the VoNC is experiencing strong headwind. Most of it is aimed at the, well, call it appearance, of the VoNC; a nice listing can be found here. There is even “An Open Letter to the ALT.NET Community” (see here).

My opinion?

Thought I would spare you? Silly you 😉

What instantly caught my attention, if just because it was the first point made in the VoNC, was the statement that “entities are more significantly behavioral objects from the perspective of entity-oriented applications” and the claim that this is a “key architectural enabler and distinction”.

Having some experience with EJB, I was vividly reminded of entity beans by those statements. I expressed my opinion on the notion of “intelligent data objects” (for lack of a better term) more than half a decade ago:

My opinion is that the support of statefull components (i.e. entity beans) in J2EE is one of its biggest weaknesses. Not because it’s technically bad but it leads to wrong architectural design decisions […]
(microsoft.public.dotnet.general, 4/9/2001, see here)

The reasons for this opinion in 2001 (scalability, reliability, etc.) are still valid. Additionally there are a bunch of other reasons: Trends to lightweight architectures have prospered in Java and even affected .NET (back then, DataSets were the means of choice, today entity classes are very common). The Service notion put emphasis on data shape and operations on that data. Business process engines move data from system to system, shaping and reshaping it along the way. In the distributed, sometimes disconnected world that our enterprise environments have become I could build cases against behavioral objects around security, maintenance, outsourcing scenarios, versioning, testing (which is ironically so favored in the VoNC), software complexity, future safety, and so on.

The VoNC seems to imply that if you don’t have “behavioral objects from the perspective of entity-oriented applications”, then you are stuck with “data objects from the perspective of data storage and data storage technologies” (as I read it more or less simple mappings from database tables to objects). Well, correct me if I’m wrong, but data can be reshaped without putting behavior in it. That’s exactly what I expect from an ORM tool like EF.

I’ll stop before I get carried away. Much to my own relief, I’m not the only one having these reservations, though: Ian noted the similarity with EJB as well and goes deeper into this topic. Ward makes a good point on the issues of “Persistence Ignorance” which I share and thus won’t reiterate here.

Where are we?

As I see it, the ongoing discussions are circling around the issue of whether entities should follow the data-centric pattern of value objects, or whether they should be complemented with behavior to become business objects. That is largely a matter of taste.

On the other hand, the discussion is somewhat misleading. The topic has got nothing to do with domain models, which are in themselves agnostic of those two approaches. And the VoNC’s reference to the anemic domain model (i.e. a domain model based on value objects) only muddies the water. The very verdict that this is an anti-pattern, and a horrific one at that, is due to a judgment by Martin Fowler, justified only by the fact that it violates OO principles (which is obvious) and that people think (yes, they do, according to Martin) that value objects are real objects. No wonder the VoNC calls this a “degraded architecture”.

So, two opposing forces, standing in muddied water, and Microsoft caught between them. The PDC is going to be interesting… 😀

BTW: Please note that I never said Microsoft should not do what the VoNC asks for. There are bright people among the signatories and they should not be ignored. Microsoft as a tool vendor should provide a tool that can be used in any architecture.

That’s all for now folks,
