Stanimir Stamenkov wrote:
> How are (should be) file names inside XPI package encoded, are they
> encoded strictly using UTF-8 like in JAR archives, or using no specific
> (but US-ASCII base) encoding as with ZIP files?
Using the ZipWriter component should give you what you need. The path strings
are UNIX-like. As long as you do not extract the XPI to the file system,
you do not need to worry about that. If you extract parts to the file
system, the paths of the extracted files should be ASCII and should not
make any assumptions about case sensitivity, because the target file
system may have mixed case sensitivity within a single path.
The boot file system on my Mac is case-insensitive, as this is the
default. But all external volumes are configured to use the
case-sensitive journaled HFS+ file system. So if I mount such a volume on
my Mac, the path parts above the mount point are case-insensitive, but
the path parts below the mount point are case-sensitive.
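A minimal sketch of how one could probe this per directory, in Python. The
function name is_case_sensitive is made up for illustration, and the probe
only reports what the file system does at that one directory, so each mount
point would have to be checked separately:

```python
import os
import tempfile

def is_case_sensitive(directory):
    """Return True if `directory` appears to treat file names case-sensitively.

    Hypothetical helper: creates a temporary probe file with a lower-case
    prefix and checks whether the upper-cased name resolves to a file too.
    """
    with tempfile.NamedTemporaryFile(prefix="probe", dir=directory) as probe:
        folder, name = os.path.split(probe.name)
        # On a case-insensitive file system the upper-cased name also exists.
        return not os.path.exists(os.path.join(folder, name.upper()))

if __name__ == "__main__":
    print(is_case_sensitive(tempfile.gettempdir()))
```

The result for the boot volume says nothing about a volume mounted below
it, which is exactly the mixed situation described above.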
So be aware of such mixed situations, where some parts of the path are
case-sensitive and others are not. The different parts of the path may
also support different character encodings, because a mounted NFS volume
may belong to an operating system that uses ISO-8859-1 rather than UTF-8
as the character encoding of its file system. This is also the case in my
world, where old Linux machines use ISO-8859-1.
So the real world is much more complex and less predictable than developers
tend to assume.
So a good software design knows about all of this and therefore does not
use any characters outside ASCII and does not make any assumptions
about case sensitivity, to prevent unexpected trouble in an unknown
world, because the real world is always an unknown world. Good design
knows that the real world is unknown and cannot be investigated completely
enough to become a truly known world.
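As a rough illustration of that rule, here is a small Python sketch that
lists the entry names in an archive which are not pure ASCII or which
differ from another entry only by case. The function name find_risky_names
and the sample entry names are invented for the example; it relies only on
the standard zipfile module and says nothing about what Mozilla itself
checks:

```python
import io
import zipfile

def find_risky_names(archive):
    """Return (non_ascii, collisions) for the entry names in a ZIP/XPI file."""
    non_ascii = []   # names containing characters outside ASCII
    collisions = []  # pairs of names that differ only by case
    seen = {}        # lower-cased name -> first spelling encountered
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if not name.isascii():
                non_ascii.append(name)
            first = seen.setdefault(name.lower(), name)
            if first != name:
                collisions.append((first, name))
    return non_ascii, collisions

# Build a small in-memory archive with made-up entry names.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as zf:
    zf.writestr("chrome/Overlay.js", "")
    zf.writestr("chrome/overlay.js", "")
    zf.writestr("locale/café.dtd", "")

non_ascii, collisions = find_risky_names(sample)
print(non_ascii)    # ['locale/café.dtd']
print(collisions)   # [('chrome/Overlay.js', 'chrome/overlay.js')]
```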
dev-tech-xpinstall mailing list
https://lists.mozilla.org/listinfo/dev-tech-xpinstall
Fri, 29 Aug 2008 11:25:28 +0200, /Georg Maaß/:
> Stanimir Stamenkov wrote:
>> How are (should be) file names inside XPI package encoded, are they
>> encoded strictly using UTF-8 like in JAR archives, or using no
>> specific (but US-ASCII base) encoding as with ZIP files?
> So a good software design knows about all of this and therefore does not
> use any characters outside ASCII and does not make any assumptions
> about case sensitivity, to prevent unexpected trouble in an unknown
> world, because the real world is always an unknown world.
My question was not about the issues that arise when extracting the files
to a file system, but about how to interpret/decode the file names when
they are read directly from the archive. As far as I know, the JAR
specification adds the requirement that path names inside JAR archives
are UTF-8 encoded, at the very least because Java identifiers (class
names) are Unicode. This way the path names inside the archive are
truly platform independent. So my question still remains: is XPI
based on the JAR specification, or just on the original ZIP
specification, where path names have no specified encoding and therefore
only ASCII can be assumed safe?
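For reference, later revisions of the generic ZIP format define a
per-entry flag (general purpose bit 11, 0x800) that declares the entry
name to be UTF-8. The Python sketch below only reports whether that flag
is set for each entry. The helper name name_encodings and the sample entry
names are made up, and whether XPI tools set or honor the flag is not
something this thread establishes:

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: entry name is declared UTF-8

def name_encodings(archive):
    """Map each entry name to the encoding the archive declares for it."""
    with zipfile.ZipFile(archive) as zf:
        return {
            info.filename: "utf-8" if info.flag_bits & UTF8_FLAG else "unspecified"
            for info in zf.infolist()
        }

# Build a small in-memory archive; Python's writer sets the flag only
# for names that cannot be encoded as ASCII.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as zf:
    zf.writestr("install.rdf", "")
    zf.writestr("locale/café.dtd", "")

print(name_encodings(sample))
```

When the flag is absent, Python's zipfile decodes the name as CP437, so a
reader that wants to be safe across tools can still only rely on ASCII
for such entries.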
As far as I'm aware, *nix file systems have no file name encoding
either, and it's up to the software reading the names to decode them
properly. I've seen graphical file managers such as GNOME's always
encode/decode file names using UTF-8, but when the same files are listed
in a console whose default encoding is ISO-8859-1, the file
names look garbled. I get the same garbled result when listing the
files from within Java if the default console encoding is not UTF-8.
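The garbling is easy to reproduce. A small Python sketch with a made-up
file name, assuming the writer used UTF-8 and the reader assumes
ISO-8859-1:

```python
name = "café.txt"

raw = name.encode("utf-8")           # bytes as written by a UTF-8 program
garbled = raw.decode("iso-8859-1")   # the same bytes read as ISO-8859-1
print(garbled)                       # cafÃ©.txt

# The bytes themselves are intact; decoding them as UTF-8 restores the name.
restored = garbled.encode("iso-8859-1").decode("utf-8")
print(restored == name)              # True
```

The bytes on disk never change; only the encoding the reader assumes does.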