Replacement of charsetalias.properties file.


Jay-145
Dear Developers,

Could you tell me how the charset alias works now?

Let me give you some background on why I want to know. Previously, I
wrote a small utility for Mac OS X that tinkers with the
charsetalias.properties file to handle the many "big5-HKSCS" encoded
web sites that incorrectly declare themselves as "big5". It simply
changes the target encoding of the big5 alias to big5-HKSCS. Because
big5-HKSCS is a superset of big5, this displays genuine big5 sites
correctly as well as the mislabeled big5-HKSCS sites. I have just
downloaded Firefox 4.0 beta 7 and discovered that the
charsetalias.properties file is no longer there. Could you tell me how
charset handling works in Gecko 2.0?
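
To make the tweak concrete, here is a minimal sketch of that kind of alias rewrite. It is not the utility's actual code. It assumes the file holds simple `alias=charset` lines (for example `big5=Big5`), and the function name `retarget_alias` is made up for illustration.

```python
def retarget_alias(properties_text, alias, new_target):
    """Return the properties text with one alias pointed at a new charset.

    Assumption: each entry is a plain `alias=charset` line; comment lines
    start with '#' or '!'. All other lines are passed through unchanged.
    """
    out = []
    for line in properties_text.splitlines():
        stripped = line.strip()
        is_entry = (
            stripped
            and not stripped.startswith(("#", "!"))
            and "=" in stripped
        )
        if is_entry:
            key = stripped.partition("=")[0].strip()
            # Alias names are compared case-insensitively.
            if key.lower() == alias.lower():
                line = f"{key}={new_target}"
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical file content, for illustration only.
    sample = "# charset aliases\nbig5=Big5\nbig5-hkscs=Big5-HKSCS\n"
    print(retarget_alias(sample, "big5", "Big5-HKSCS"), end="")
```

Only the entry whose key matches the alias is rewritten, so the existing big5-hkscs entry is left alone.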

My little utility can be downloaded from here:
http://www.macupdate.com/info.php/id/19216/i-speak-cantonese
_______________________________________________
dev-i18n mailing list
[hidden email]
https://lists.mozilla.org/listinfo/dev-i18n

Re: Replacement of charsetalias.properties file.

Axel Hecht
On 14.11.10 15:29, Jay wrote:

> Dear Developers,
>
> Could you tell me how the charset alias works now?
>
> Let me give you some background on why I want to know. Previously, I
> wrote a small utility for Mac OS X that tinkers with the
> charsetalias.properties file to handle the many "big5-HKSCS" encoded
> web sites that incorrectly declare themselves as "big5". It simply
> changes the target encoding of the big5 alias to big5-HKSCS. Because
> big5-HKSCS is a superset of big5, this displays genuine big5 sites
> correctly as well as the mislabeled big5-HKSCS sites. I have just
> downloaded Firefox 4.0 beta 7 and discovered that the
> charsetalias.properties file is no longer there. Could you tell me how
> charset handling works in Gecko 2.0?
>
> My little utility can be downloaded from here:
> http://www.macupdate.com/info.php/id/19216/i-speak-cantonese

That moved into the compiled code,
https://bugzilla.mozilla.org/show_bug.cgi?id=563536.

No idea if there's anything left that allows you to tweak it.

That said, is there a bug filed on what you're trying to fix? Add-ons
like yours sound like something we shouldn't need.

Axel

Re: Replacement of charsetalias.properties file.

Jean-Marc Desperrier-4
Axel Hecht wrote:
> That moved into the compiled code,
> https://bugzilla.mozilla.org/show_bug.cgi?id=563536.
>
> No idea if there's anything left that allows you to tweak it.
>
> That said, is there a bug filed on what you're trying to fix? Add-ons
> like yours sound like something we shouldn't need.

It's not a great idea to have hard-coded those identifiers :-(

It's not just a matter of something that Mozilla got wrong and that just
needs to be fixed. There will be some identifiers that almost nobody
uses, which it makes no sense to include by default, and there will be a
few cases where things have been done wrong, so it's useful to be able
to override the default, even if the default is correct.

big5 is a case that can't be solved satisfactorily with a hard-coded
solution. http://en.wikipedia.org/wiki/Big5 lists at least 10 different
extensions to big5, any of which could actually have been used when a
page is tagged as big5.

Of those, HKSCS is probably the only one that's a real standard and
currently in wide use. But what if someone wants a warning when some of
the characters are not truly big5 and use the HKSCS extension? (A
Taiwanese user, for example, for whom HKSCS is not a standard.) And what
if they are handling some old content that's truly incompatible with
HKSCS, where they'd love to have big5 interpreted as something other
than HKSCS?
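
As a rough illustration of the "warn me when it isn't plain big5" case, the sketch below tries a strict big5 decode first and only then falls back to HKSCS. It uses Python's own `big5` and `big5hkscs` codecs, whose tables are an assumption here and need not match Gecko's; the function name `classify_big5` is hypothetical.

```python
def classify_big5(data: bytes) -> str:
    """Report which decoder accepts the bytes: plain big5 first, then HKSCS.

    Assumption: Python's "big5" and "big5hkscs" codecs stand in for the
    browser's decoders. Their mapping tables may differ from Gecko's.
    """
    try:
        data.decode("big5")
        return "big5"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("big5hkscs")
        # Decodable only with the HKSCS extension: a caller could warn here.
        return "big5-hkscs"
    except UnicodeDecodeError:
        return "unknown"


if __name__ == "__main__":
    page = "中文".encode("big5")
    print(classify_big5(page))
```

A configurable alias table would let the user decide what happens in the "big5-hkscs" branch, instead of having that choice compiled in.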

Having the identifiers defined in a resource file that can be
overwritten to change them makes the situation significantly easier.

Re: Replacement of charsetalias.properties file.

Jean-Marc Desperrier-2
On 19/11/2010 15:07, Jean-Marc Desperrier wrote:
> Of those, HKSCS is probably the only one that's a real standard and
> currently in wide use. But what if someone wants a warning when some of
> the characters are not truly big5 and use the HKSCS extension? [...]
>
> Having the identifiers defined in a resource file that can be
> overwritten to change them makes the situation significantly easier.

I've found another case that's a lot more pertinent: proprietary
extensions to the Japanese Shift-JIS (S-JIS) encoding for emoji
characters.

There currently appear to be 3 such extensions: DoCoMo, KDDI and
SoftBank. Basically, every Japanese cell network has its own proprietary
way of encoding emoji, documented in
http://www.unicode.org/Public/UNIDATA/EmojiSources.txt

Since those emoji characters are now encoded in the Unicode standard,
those extensions can't simply be ignored.