[ADC-Ext 1.0.5] Locale Extension

Locked
darkKlor
Senior Member
Posts: 100
Joined: 30 Dec 2008, 14:59

Post by darkKlor » 11 Nov 2009, 14:21

With the move to UTF-8, internationalisation was a clear goal of the ADC spec. It is now easy for any language to be used within a hub, and there are many clients which support localisation. However, a gap remains when it comes to messages sent from the hub and its bots. For example, a hub may be an English-language hub while a user is a native Arabic speaker with a GUI in Arabic. The hub nevertheless has to assume that all users are native English speakers, so every message it sends and every user command (UCMD) it provides will be in English.

UCMD in particular is an issue, because it modifies the user's GUI. This may result in a situation where an English command which should be displayed left-to-right is instead displayed right-to-left on the Arabic user's client, because that is the default for labels in their client. There are other, more subtle things, like a non-English client containing both English and non-English labels in the GUI, which result in inconsistency.

To alleviate this issue, it is desirable for the client to notify the hub of the user's locale, so that messages may be modified for the culture of the user if the hub is able to (e.g. if it holds a translation for the particular language).

The suggested method is for each client to add an LC parameter to the INF message with a value being one of the BCP47 language-country pairs, e.g. en-US - English (United States), de-CH - German (Switzerland), fr-FR - French (France).

Examples include:
INF LCen-US
INF LCde-CH
INF LCfr-FR
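
As a minimal sketch of how a hub might read this field, the Python below pulls the LC value out of an INF line. It assumes the simplified form shown in the examples above (a command word followed by space-separated fields, each starting with a two-letter name); the function name `extract_lc` is hypothetical, and positional parameters and escaping are ignored.

```python
def extract_lc(inf_line):
    """Return the value of the LC field in an INF line, or None if absent."""
    fields = inf_line.split(" ")
    # fields[0] is the command word; every later field is a two-letter
    # name immediately followed by its value (assumption, see lead-in).
    for field in fields[1:]:
        if field.startswith("LC"):
            return field[2:]
    return None


print(extract_lc("INF LCen-US"))
print(extract_lc("INF NIalice LCde-CH"))
```

A client that sends no LC field yields `None`, which the hub can treat as "use the hub's default locale".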

The hub may also include an LC parameter in its INF message to indicate its default locale, i.e. the locale which will be used if the client's locale is not supported. If the hub does not directly support the client's locale, it should first attempt to fall back to the same language group (e.g. the hub supports en-US but not en-AU, so it falls back to en-US) and, if that is not available, fall back to the hub's own locale.
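
The fallback order described above could be sketched roughly as follows. This is an illustration only: `choose_locale`, the list of supported locales, and the way the language part is compared are assumptions, not something the proposal prescribes.

```python
def language_of(locale):
    """Return the lowercase language part of a locale such as 'en-US'."""
    return locale.replace("_", "-").split("-")[0].lower()


def choose_locale(client_lc, supported, hub_default):
    """Pick a locale: exact match, then same language group, then hub default."""
    if not client_lc:
        return hub_default
    if client_lc in supported:
        return client_lc
    # No exact match: look for any supported locale in the same language.
    wanted = language_of(client_lc)
    for candidate in supported:
        if language_of(candidate) == wanted:
            return candidate
    return hub_default


supported = ["en-US", "de-CH"]
print(choose_locale("en-AU", supported, "en-US"))  # same language group
print(choose_locale("fr-FR", supported, "en-US"))  # hub default
```

When several supported locales share the client's language, this sketch simply takes the first one in the list; a real hub might want a smarter preference order.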

Notes:
- the standards suggest the language be in lowercase, and the country be in uppercase.
- a dash '-' and underscore '_' are both acceptable separators.
- the country code may be more than two characters.
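
A small normalisation step following these notes might look like the sketch below. The name `normalise_lc` is hypothetical, the dash is chosen arbitrarily as the canonical separator, and anything after the country part is dropped for simplicity.

```python
def normalise_lc(value):
    """Normalise an LC value to 'language-COUNTRY' (or just 'language')."""
    # Accept either separator, then split into language and country.
    parts = value.replace("_", "-").split("-")
    language = parts[0].lower()
    if len(parts) == 1 or not parts[1]:
        return language
    # The country code may be longer than two characters, so only the
    # case is normalised; no length check is applied.
    return language + "-" + parts[1].upper()


print(normalise_lc("EN_us"))
print(normalise_lc("de"))
```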

Reference:
BCP47: http://tools.ietf.org/html/bcp47

Sulan
Junior Member
Posts: 16
Joined: 19 Jan 2009, 20:33

Re: Locale Extension

Post by Sulan » 11 Nov 2009, 14:51

I think this is a great idea.

Pretorian
Site Admin
Posts: 214
Joined: 21 Jul 2009, 10:21

Re: Locale Extension

Post by Pretorian » 11 Nov 2009, 18:46

The following is what I intend to write in the main spec.
The language that is desired or is primarily used. Specified as language code and possibly country code, separated by either dash (-) or underscore (_). Language code shall be in lower case and country code in upper case. Implementations should fall back to appropriate country code and language code if they do not support the specified country code and/or language code.

poy
Member
Posts: 78
Joined: 26 Nov 2008, 17:04

Re: Locale Extension

Post by poy » 12 Nov 2009, 13:50

Pretorian wrote: The following is what I intend to write in the main spec.
The language that is desired or is primarily used. Specified as language code and possibly country code, separated by either dash (-) or underscore (_). Language code shall be in lower case and country code in upper case. Implementations should fall back to appropriate country code and language code if they do not support the specified country code and/or language code.
there can be more information than just language and sub-language (i've seen the part after the dash called "sub-language", not "country", so far) in a locale, e.g. sometimes there are locales with a dot and info after the dot, or with an @ and info after the @:
az_AZ@latin
az_AZ@cyrillic
es_ES@modern

also the special case of the "C" code (no locale chosen; use default) should be watched out for.
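
to illustrate the extra pieces mentioned above, here is a rough Python sketch that splits such a locale name into language, territory, the part after the dot, and the part after the @. the function name and the returned dictionary keys are made up for this example, and "C" is reported as None so the caller can fall back to its default.

```python
def split_locale(name):
    """Split 'language_TERRITORY.codeset@modifier' into its parts."""
    if name == "C":
        return None  # no locale chosen; the caller should use its default
    base, _, modifier = name.partition("@")
    base, _, codeset = base.partition(".")
    language, _, territory = base.partition("_")
    return {
        "language": language,
        "territory": territory or None,
        "codeset": codeset or None,
        "modifier": modifier or None,
    }


print(split_locale("az_AZ@latin"))
print(split_locale("C"))
```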

to sum up i believe it would be pretentious to try to re-define here in a P2P protocol what makes up a locale identifier; instead, we should reference an external publication on the matter and indicate that locale codes have to be correct with regards to it.

as an aside, i believe a hub is made up of people who are able to communicate with each other and therefore all use a language they understand. i fail to see how useful it can be to mix several languages within the same hub...

darkKlor
Senior Member
Posts: 100
Joined: 30 Dec 2008, 14:59

Re: Locale Extension

Post by darkKlor » 12 Nov 2009, 15:24

poy wrote:as an aside, i believe a hub is made up of people who are able to communicate with each other and therefore all use a language they understand. i fail to see how useful it can be to mix several languages within the same hub...
I'm going to assume you did not read this....
darkklor wrote: UCMD in particular is an issue, because it modifies the user's GUI. This may result in a situation where an English command which should be displayed left-to-right is instead displayed right-to-left on the Arabic user's client, because that is the default for labels in their client. There are other, more subtle things, like a non-English client containing both English and non-English labels in the GUI, which result in inconsistency.
poy wrote:to sum up i believe it would be pretentious to try to re-define here in a P2P protocol what makes up a locale identifier
and....
darkklor wrote: a value being one of the BCP47 language-country pairs
+
darkklor wrote:Reference: BCP47: http://tools.ietf.org/html/bcp47
:? :? :?

poy
Member
Posts: 78
Joined: 26 Nov 2008, 17:04

Re: Locale Extension

Post by poy » 12 Nov 2009, 17:17

no i guess i must have only read Pretorian's summary! :oops:

the RFC looks like the one used in HTTP. i quickly searched for "affinity" and "q=" in it but didn't find a match, so i guess these are specific to HTTP's implementation. perhaps we could use them too? a user could then have several LC fields with different affinities...
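
for reference, this is roughly how the HTTP-style "q=" weighting works, sketched in Python. it parses a value in the Accept-Language syntax and returns the tags ordered by preference; how (or whether) this would map onto LC fields in ADC is not defined anywhere in this thread, and the function name is made up.

```python
def rank_languages(value):
    """Order tags from an Accept-Language style value by their q weight."""
    ranked = []
    for item in value.split(","):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip()
        if not tag:
            continue
        weight = 1.0  # a tag without q= counts as fully preferred
        params = params.strip()
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0  # treat an unreadable weight as least preferred
        ranked.append((tag, weight))
    # sorted() is stable, so equal weights keep their original order.
    ranked = sorted(ranked, key=lambda pair: pair[1], reverse=True)
    return [tag for tag, _ in ranked]


print(rank_languages("fr;q=0.5, en-US, de;q=0.8"))
```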

Pretorian
Site Admin
Posts: 214
Joined: 21 Jul 2009, 10:21

Re: Locale Extension

Post by Pretorian » 12 Nov 2009, 18:08

poy: The standard is HUGE for something this simple. Keeping it simpler also means that it can be put in (whatever future) BASE. The simple scheme I proposed means implementers don't need to pull in huge libraries just to support this, but can do very basic string parsing.

darkKlor
Senior Member
Posts: 100
Joined: 30 Dec 2008, 14:59

Re: Locale Extension

Post by darkKlor » 12 Nov 2009, 22:46

In the interests of not pointing at Wikipedia, I didn't link this before, but yes, it is used in HTTP/HTML/XML et al.: http://en.wikipedia.org/wiki/BCP_47. The leading alternative is the newer and bigger Common Locale Data Repository from the Unicode Consortium: http://en.wikipedia.org/wiki/CLDR and http://www.unicode.org/reports/tr35/#La ... Locale_IDs. I found them both via this page: http://en.wikipedia.org/wiki/Locale.

BCP47 and CLDR are largely the same in the language-country part, but differ more once you get into script and region. Of course it's desirable to be more supportive, but you need to limit it somewhere, and for our purposes I think the language-country bit is a good start.

I would be surprised if there were no libraries out there to support this (especially since I've already googled a couple :P).

Pretorian
Site Admin
Posts: 214
Joined: 21 Jul 2009, 10:21

Re: Locale Extension

Post by Pretorian » 05 Sep 2010, 09:32

Added to ADC-Ext 1.0.5.
