Unicode: What character encoding does SM use?


Davi Leal-2
Hi,

About the "JS string type implementation", I read in js/src/jsstr.h that
"A JS string is a counted array of unicode characters". According to the
ECMA-262 specification, the Unicode character encoding must be UTF-16.
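
To make the UTF-16 point concrete, here is a minimal sketch of how a single code point maps onto 16-bit units: code points up to U+FFFF take one unit, and anything above takes a surrogate pair. The name utf16_encode is hypothetical and is not part of the SpiderMonkey API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a SpiderMonkey function.
 * Encodes one Unicode code point as UTF-16 code units.
 * Returns the number of units written to out (1 or 2), or 0 if cp is
 * not a valid scalar value (a surrogate, or above U+10FFFF). */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {
        /* Basic Multilingual Plane: a single 16-bit unit. */
        out[0] = (uint16_t)cp;
        return 1;
    }
    /* Supplementary plane: split the 20-bit offset into a surrogate pair. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));
    return 2;
}
```

A "counted array" of such units therefore has a length in 16-bit units, which can be larger than the number of characters a user sees.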

We have the JavaScript SpiderMonkey C engine js-1.5-rc4a.tar.gz embedded in
our application. What character encoding is our SpiderMonkey using?


Below is the call stack we traced through the SpiderMonkey source code. At the
end we get a "text/html; charset=iso-8859-1" string. I would like to know
where that iso-8859-1 charset is set:

 1-------------------------------------------
  JS_CallFunctionValue
  (
     JSContext *cx = 0x8170478,
     JSObject *obj = 0x82de078,
     jsval fval    = 137224320,
     uintN argc    = 0,
     jsval *argv   = 0x8231c80,
     jsval *rval   = 0xbefffa54
  )
 2-------------------------------------------
  js_InternalCall
  (
     JSContext *cx = 0x8170478,
     JSObject *obj = 0x82de078,
     jsval fval    = 137224320,
     argc          = 0,
     argv          = 0x8231c80,
     rval          = 0xbefffa54
  )
 3-------------------------------------------
  js_Invoke
  (
     JSContext *cx = 0x8170478,
     argc          = 0,
     JSINVOKE_INTERNAL
  )
 4-------------------------------------------
  js_Interpret
  (
     JSContext *cx = 0x8170478,
     &v            = -2147483647
  )
 5-------------------------------------------
  js_Invoke
  (
     JSContext *cx = 0x8170478,
     argc          = 1,
     0
  )
 6-------------------------------------------
  JSStackFrame *fp, frame;
  frame.argv = sp - argc;  // <----- Here is where we first see the value

  native
  (
     JSContext *cx = 0x8170478,
     JSObject *frame.thisp = 0x82de078,
     argc          = 1,
     jsval *frame.argv = 0x828f4c8,
     &frame.rval   = -2147483647
  )
 7-------------------------------------------
  JS_GetStringBytes
  (
     JS_ValueToString
     (
        JSContext *cx = 0x8170478,
        argv[0]       = 0x828f4c8
     )
  )




The JS_GetStringBytes function returns the string
  "text/html; charset=iso-8859-1"
Where does SpiderMonkey set that value?

Where does it get the "text/html" part from, and where does it get the
"charset=iso-8859-1" part from? Does it come from the default system locale
and encoding? Is the JSContext or the JSRuntime related to it?

Note that the application that embeds SpiderMonkey does not call the
setlocale() function, so I think the default POSIX locale is in effect.
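
The locale assumption can be checked from inside the embedding application. This is a minimal sketch using only the standard C library; the helper name ctype_locale_is_default is hypothetical. A C program starts in the "C" locale unless it calls setlocale() itself, and some platforms report that locale as "POSIX".

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: returns nonzero if the LC_CTYPE locale is still
 * the startup default. Passing NULL to setlocale() only queries the
 * current locale; it does not change it. */
int ctype_locale_is_default(void)
{
    const char *name = setlocale(LC_CTYPE, NULL);
    assert(name != NULL);
    return strcmp(name, "C") == 0 || strcmp(name, "POSIX") == 0;
}
```

If this returns nonzero just before the JS_GetStringBytes call, the process locale was never changed, so it is unlikely to be the source of the charset.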

Which SpiderMonkey functions handle decoding and encoding JS strings to and
from other character encodings?
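
For comparison, this is a minimal sketch of the simplest possible conversion from 16-bit units to a C string: keep the low byte of each unit. It is an assumption on my part, to be checked against jsstr.c in the js-1.5 tree, that JS_GetStringBytes behaves roughly like this rather than consulting the locale. The helper name narrow_to_bytes is hypothetical and is not a SpiderMonkey function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a SpiderMonkey function.
 * Narrows 16-bit code units to bytes by keeping only the low 8 bits.
 * For units up to 0xFF the result is the ISO-8859-1 encoding; anything
 * above is silently mangled. Writes a terminating NUL, so dst must have
 * room for len + 1 bytes. Returns the number of units that did not fit
 * in a single byte. */
size_t narrow_to_bytes(const uint16_t *src, size_t len, char *dst)
{
    size_t lossy = 0;
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < len; i++) {
        if (src[i] > 0xFF)
            lossy++;
        dst[i] = (char)(unsigned char)(src[i] & 0xFF);
    }
    dst[len] = '\0';
    return lossy;
}
```

With a conversion of this kind the bytes of an ASCII string such as "text/html; charset=iso-8859-1" are identical to its 16-bit units, so the text itself would have to come from the script or the embedding rather than from the conversion step.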

Regards,
David

_______________________________________________
mozilla-jseng mailing list
http://mail.mozilla.org/listinfo/mozilla-jseng