AJ's blog

November 15, 2013

ASP.NET MVC I18n – Part 1: Basics

Filed under: .NET, .NET Framework, ASP.NET MVC, HTML5, Internationalization — ajdotnet @ 9:15 pm

When I started the series, I didn’t even plan this particular post. But as I drafted the upcoming posts, I realized that we cannot avoid taking a look at what we are actually dealing with. One should know what one is talking about. Thus, here’s the necessary theoretical background…

The one thing we need to be aware of is that we are dealing with two different contexts: the server side, defined by the .NET Framework, and the client side, defined by HTML et al. These contexts differ in the terms they use, in their customs, and in their technical scope.

Server side…

.NET maintains the necessary information via the CultureInfo class:

“The CultureInfo class specifies a unique name for each culture, based on RFC 4646. The name is a combination of an ISO 639 two-letter lowercase culture code associated with a language and an ISO 3166 two-letter uppercase subculture code associated with a country or region.”

In short, we are talking about "en-US", "de-DE", and so on (ignoring special cases). A distinction is made between neutral cultures (associated only with the first part, e.g. "en" and "de") and specific cultures, which are associated with a country or region. Neutral cultures are nevertheless maintained in CultureInfo instances, including information beyond the language. They generally rely on the "major representative" of that language, i.e. Germany for German (sorry Austrians ;-)), and the United Kingdom of Great Britain and Northern Ireland … no wait … that former colony of theirs ;-), for English.
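To make the neutral/specific distinction a bit more tangible, here is a minimal sketch. It is written in TypeScript and does not use .NET at all; parseCultureName and CultureName are names made up for this illustration, and only the plain "ll" and "ll-RR" forms are handled (the special cases stay ignored, just as above).

```typescript
// Hypothetical helper, not part of .NET or any library: takes a culture
// name of the simple "ll" or "ll-RR" shape and splits it into its parts.
interface CultureName {
  language: string;       // ISO 639 two-letter code, lowercase
  region: string | null;  // ISO 3166 two-letter code, uppercase, or null
  isNeutral: boolean;     // true when the name carries no region
}

function parseCultureName(name: string): CultureName | null {
  // Anything beyond the two simple forms is rejected (assumption of this sketch).
  if (!/^[a-z]{2}(-[A-Z]{2})?$/.test(name)) {
    return null;
  }
  const language = name.slice(0, 2);
  const region = name.length > 2 ? name.slice(3) : null;
  return { language, region, isNeutral: region === null };
}

console.log(parseCultureName("de"));    // neutral culture: language only
console.log(parseCultureName("de-DE")); // specific culture: language and region
```

Note that this only looks at the name. What .NET actually stores for a neutral culture (the defaults borrowed from the "major representative") is not modeled here.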

Note that CultureInfo deals with all regional aspects: it acts as language selector, provides date, time, and number formats, and even addresses the calendar.

Regarding localized content (like strings for labels), .NET uses a system of resources and satellite assemblies, which are accessed via the ResourceManager class, either directly or through generated code. (I will assume that this is basic .NET knowledge and not go into further details about it.)

All in all, a comprehensive and consistent system.

Client side…

HTML traditionally only addresses languages (not date or number formats):

“The lang attribute’s value is a language code that identifies a natural language spoken, written, or otherwise used for the communication of information among people.”

http://www.w3.org/TR/REC-html40/struct/dirlang.html#h-8.1.1 

HTML is also far more open with regard to how a language is identified. This includes, but goes beyond, what .NET supports:

“Here are some sample language codes:

  • "en": English
  • "en-US": the U.S. version of English.
  • "en-cockney": the Cockney version of English.
  • "i-navajo": the Navajo language spoken by some Native Americans.
  • "x-klingon": The primary tag "x" indicates an experimental language tag”

http://www.w3.org/TR/REC-html40/struct/dirlang.html#h-8.1.1 

However, the focus of HTML is also limited to languages, to the point of actively ignoring any other localization demand:

“The golden rule when creating language tags is to keep the tag as short as possible. Avoid region, script or other subtags except where they add useful distinguishing information. For instance, use ja for Japanese and not ja-JP, unless there is a particular reason that you need to say that this is Japanese as spoken in Japan, rather than elsewhere.”

http://www.w3.org/International/articles/language-tags/

And, indeed, you’ll find that most localized HTML or CSS code you may come across (in samples and documentation) uses two-letter language codes.

Alas, with HTML5 and the new input controls, the focus on "language" is no longer sufficient. A date picker changes not only the weekday names, but also the date format and the first day of the week. The way the W3C addresses this, however, seems a bit helpless and leaves the problem to the browser vendors:

“Browsers are encouraged to use user interfaces that present dates, times, and numbers according to the conventions of either the locale implied by the input element’s language or the user’s preferred locale.”

http://www.w3.org/TR/html5/forms.html#input-impl-notes

Well, a little further down they are refreshingly honest:

“There’s still a risk that the user would end up arriving a month late, of course, but there’s only so much that can be done about such cultural differences…”
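How one ends up "a month late" is easy to show. The following is a small sketch in TypeScript using the standard Intl.DateTimeFormat API; formatFor is a helper name made up for this illustration, and the outputs in the comments assume a runtime with full locale data (current browsers, recent Node.js versions).

```typescript
// 2 March 2013, pinned to UTC so the result does not depend on the
// time zone of the machine running the code.
const date = new Date(Date.UTC(2013, 2, 2));

// Hypothetical helper: formats the date in the default numeric style
// of the given locale.
function formatFor(locale: string): string {
  return new Intl.DateTimeFormat(locale, { timeZone: "UTC" }).format(date);
}

console.log(formatFor("en-US")); // 3/2/2013   (month first)
console.log(formatFor("en-GB")); // 02/03/2013 (day first)
console.log(formatFor("de-DE")); // 2.3.2013   (day first)
```

Same date, same language in the first two cases, and still the month and day swap places. A bare "en" cannot tell you which of the two formats is meant.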

BTW: Whenever I wrote HTML above, that also included XML, CSS, and HTTP.

Regarding localized content, HTML only allows denoting the language via the lang attribute. You can mix different languages in one document, but HTML itself does not do anything further. CSS selectors such as :lang(), on the other hand, can be used to attach styles depending on the language.

 

Consequences?

For an LOB application, using "language" in the limited sense of HTML is far too shortsighted. Thus I will use regions whenever possible: specific cultures on the server, and tags with both language code and country or region on the client. This may seem odd in HTML or CSS, but so what?

And the next post will contain some code, promise.

That’s all for now folks,
AJ.NET
