MathML-in-HTML5


Roger B. Sidje
I am currently driving an effort to enable MathML-in-HTML (in addition to the
MathML-in-XHTML that we already support). I have a patch that serves the dual
purpose of showing where things are going and raising the issues to ponder.

Here is a
[screenshot] https://bugzilla.mozilla.org/attachment.cgi?id=239771
which is a _live_ rendering of this testcase:
[mathml-in-html] https://bugzilla.mozilla.org/attachment.cgi?id=239769

Those interested in following this up can see bug 353926:
https://bugzilla.mozilla.org/show_bug.cgi?id=353926

Quick background:
=================

At the Firefox engineering meeting in Mountain View (December 2005), I pleaded
that we enable MathML in HTML5 to advance the cause of MathML, which is so far
locked in an XHTML/XML world that does not seem to be going anywhere in terms
of display content as opposed to data (witness the WHATWG effort --
http://www.whatwg.org). Those to whom I spoke included dbaron, hixie and
sicking, and they welcomed the suggestion, asking for a broader discussion.
Hixie raised the caveat that MathML elements should still remain in the MathML
namespace. He e-mailed me a while ago about a discussion on this matter in the
WHATWG mailing list, which can be seen here:
http://listserver.dreamhost.com/pipermail/whatwg-whatwg.org/2006-June/thread.html

That discussion is however too broad and involves tangential issues such as
inventing another syntax, etc. My original take was simply to enable
MathML+HTML, in the same vein as we have MathML+XHTML. I think MathML is
suffering from having to fight the battle for adoption of XHTML as well. As a
niche technology, it does not have the means to wage that fight. What it
simply needs is MathML-in-HTML. W3C failed to recognise that it could retrofit
MathML in HTML -- see this archived post for some insight:
http://groups.google.com/group/netscape.public.mozilla.mathml/msg/4d58c35217afcb54?dmode=source
But HTML5, being shepherded by WHATWG, could provide the right framework for
this to happen now.

I have finally been able to code this up (while keeping MathML elements in the
MathML namespace). I have attached the patch I have so far to bug 353926.

Design & Technical issues:
==========================

How does MathML-in-HTML5 work?

We support MathML-in-HTML5 when these two conditions are met:

  1. The DOCTYPE of the document says so. If yes, we enable
     MathML entities (TODO) and flag mMayHaveMathML in the HTML content sink.

  2. And either a) OR b) is met:

     a) <html> has the MathML namespace as the value of an attribute with a
        prefix, e.g., <html xmlns:m="http://www.w3.org/1998/Math/MathML">.

        In this case, we cache the prefix "m" in mMathMLNameSpacePrefix,
        and we intercept all <m:tag> in the document and create
        MathML content nodes for them.

     b) MathML fragments are in the document as
        <math xmlns="http://www.w3.org/1998/Math/MathML">
          ...
        </math>

        In this case, we intercept all non-HTML elements inside the <math> tag
        and create MathML content nodes for them.

Issues:
  1. Tag soup: we understand that we are exposing ourselves to this.

  2. a) What about CSS matching rules? From the Style System point of view,
        the document is still HTML, but <m:math> is in the MathML namespace. We
        might have to special case MathML-in-HTML5 in the Style System as well.

     b) The second option raises an issue with HTML-in-MathML, e.g.,
        <math xmlns="http://www.w3.org/1998/Math/MathML">
          <b>bold</b>
        </math>
        We don't intercept the <b> in this case. Hence, even though it is
        HTML-in-MathML without an explicit XHTML namespace for <b>, the HTML
        sink will give <b> an HTML content node. This is not really XHTML
        friendly.
        On the other hand, we don't want to be an XML parser either... These
        are conflicting objectives. We need to decide what to do. We may agree
        to only support tags with prefixes as in a), or also keep b) knowing
        that it has this XHTML unfriendly behavior.
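
The two conditions above can be sketched as follows. This is only an
illustrative Python model, not the patch in bug 353926: the function name
node_kind, the KNOWN_HTML_TAGS table and the parameter names are assumptions
made for the example (may_have_mathml and mathml_prefix stand in for
mMayHaveMathML and mMathMLNameSpacePrefix).

```python
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical stand-in for the sink's table of known HTML tags.
KNOWN_HTML_TAGS = {"html", "body", "p", "b", "i", "span", "div"}


def node_kind(tag, attrs, may_have_mathml, mathml_prefix, inside_math):
    """Return "mathml" or "html" for a start tag, following conditions 1 and 2."""
    if not may_have_mathml:
        return "html"                      # condition 1 (DOCTYPE) not met
    if mathml_prefix and tag.startswith(mathml_prefix + ":"):
        return "mathml"                    # option a): prefixed <m:tag>
    if tag == "math" and attrs.get("xmlns") == MATHML_NS:
        return "mathml"                    # option b): the <math> fragment root
    if inside_math and tag not in KNOWN_HTML_TAGS:
        return "mathml"                    # option b): non-HTML tag inside <math>
    return "html"                          # e.g. <b> inside <math> stays HTML


print(node_kind("m:mfrac", {}, True, "m", False))   # mathml
print(node_kind("mfrac", {}, True, None, True))     # mathml
print(node_kind("b", {}, True, None, True))         # html
```

The last call shows issue 2b: under option b) a <b> inside <math> still gets
an HTML content node.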
---
RBS

_______________________________________________
dev-tech-layout mailing list
[hidden email]
https://lists.mozilla.org/listinfo/dev-tech-layout

Re: MathML-in-HTML5

Ian Hickson
On Sat, 23 Sep 2006 [hidden email] wrote:
>
> Hixie raised the caveat that MathML elements should still remain in the
> MathML namespace.

I meant in the DOM, I didn't mean in the markup. I don't think we should
have any namespace declarations or namespace prefixes in text/html; I
would just have the HTML parser always support the MathML elements, in
the same way that it supports any random unknown element today, except
that when it sees a MathML element it puts it into the MathML namespace in
the DOM rather than the XHTML namespace.

I really don't think we want to introduce namespace prefixes or namespace
declarations into tag soup. I think that would be a big mistake.
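
A minimal sketch of that idea, in Python for brevity: the parser keeps
accepting unknown elements as it does today, and only the namespace chosen for
the DOM node changes. The MATHML_ELEMENTS set here is a hypothetical
three-name subset and namespace_for is an invented helper; lowercasing the tag
name assumes the usual case-insensitive handling of text/html.

```python
MATHML_NS = "http://www.w3.org/1998/Math/MathML"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical subset, for illustration only.
MATHML_ELEMENTS = frozenset({"math", "mrow", "mfrac"})


def namespace_for(tag_name):
    """Pick the DOM namespace for a tag seen in text/html markup."""
    if tag_name.lower() in MATHML_ELEMENTS:
        return MATHML_NS
    return XHTML_NS          # everything else, including unknown tags


for name in ("mfrac", "MROW", "p", "foo"):
    print(name, "->", namespace_for(name))
```

No xmlns attribute or prefix appears in the markup; the namespace is purely a
property of the resulting DOM node.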

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: MathML-in-HTML5

L. David Baron
On Saturday 2006-09-23 21:06 +0000, Ian Hickson wrote:

> On Sat, 23 Sep 2006 [hidden email] wrote:
> >
> > Hixie raised the caveat that MathML elements should still remain in the
> > MathML namespace.
>
> I meant in the DOM, I didn't mean in the markup. I don't think we should
> have any namespace declarations or namespace prefixes in text/html; I
> would just have the HTML parser always support the MathML elements, in
> the same way that it supports any random unknown element today, except
> that when it sees a MathML element it puts it into the MathML namespace in
> the DOM rather than the XHTML namespace.
>
> I really don't think we want to introduce namespace prefixes or namespace
> declarations into tag soup. I think that would be a big mistake.
Agreed.

-David

--
L. David Baron                                <URL: http://dbaron.org/ >
           Technical Lead, Layout & CSS, Mozilla Corporation

RE: MathML-in-HTML5

Paul Topping
In reply to this post by Roger B. Sidje
If MathML is considered a subset of HTML5, then no namespace declaration
would be necessary. However, if MathML is going to work in HTML that
isn't declared as HTML5 (not clear to me from this thread), then the
document would be poorly specified without it, IMHO.

At the risk of inciting an anti-Microsoft backlash, I should remind some
on the list that IE has covered this territory before. They already have
a mechanism for declaring XML islands in HTML that seems to work just
fine. Of course, Mozilla won't be interested in duplicating IE's way of
associating a plugin as the renderer of the namespace in the document.
IMHO, it doesn't belong there anyway. It is better (i.e., more secure) to
keep such associations out of the content.

Paul Topping
Design Science, Inc.
www.dessci.com/mathplayer

RE: MathML-in-HTML5

Paul Topping
In reply to this post by Roger B. Sidje
And, I should have added that without a namespace declaration there
would be no way to differentiate different versions of MathML. While
most MathML instances are now MathML 2.0, the MathML 3.0 effort is just
now starting up.

RE: MathML-in-HTML5

Ian Hickson
On Sat, 23 Sep 2006, Paul Topping wrote:
>
> If MathML is considered a subset of HTML5, then no namespace declaration
> would be necessary. However, if MathML is going to work in HTML that
> isn't declared as HTML5 (not clear to me from this thread), then the
> document would be poorly specified without it, IMHO.

As far as HTML5 UAs are concerned, declaring HTML as HTML5 consists of
labelling it as text/html. It isn't clear to me what you would consider
HTML that isn't declared as HTML5. With the exception of quirks which are
required for compatibility with de facto standards that disagree with de
jure standards, HTML has no practical versioning story -- all features
work in all documents, regardless of the official "version" of HTML used.


> At the risk of inciting an anti-Microsoft backlash, I should remind some
> on the list that IE has covered this territory before. They already have
> a mechanism for declaring XML islands in HTML that seems to work just
> fine.

XML data islands don't form part of the parent DOM (they are "islands", as
opposed to part of the document). I'm not sure how wrapping <xml> tags
around the MathML content would help. :-)


> And, I should have added that without a namespace declaration there
> would be no way to differentiate different versions of MathML. While
> most MathML instances are now MathML 2.0, the MathML 3.0 effort is just
> now starting up.

Why would you need to distinguish them? MathML2 is a superset of MathML1,
and (for all intents and purposes) any compliant MathML2 UA can process
any compliant MathML1 content. I would assume that this would continue to
be the case; if not, then this is IMHO a problem with MathML3.

Note that the namespace declaration can't currently distinguish between
MathML1 and MathML2; I don't see any reason why MathML3 would change this.

Re: MathML-in-HTML5

Boris Zbarsky
In reply to this post by Roger B. Sidje
Ian Hickson wrote:
> I meant in the DOM, I didn't mean in the markup. I don't think we should
> have any namespace declarations or namespace prefixes in text/html; I
> would just have the HTML parser always support the MathML elements

I assume we have data that shows there would be no collisions with random
user-defined tag names in random pages?  Including intranets?

> I really don't think we want to introduce namespace prefixes or namespace
> declarations into tag soup. I think that would be a big mistake.

I agree with this, for what it's worth....  But perhaps we do want a way to
explicitly flag tag-soup documents as "this document uses MathML".

-Boris

Re: MathML-in-HTML5

Ian Hickson
On Sun, 24 Sep 2006, Boris Zbarsky wrote:
>
> Ian Hickson wrote:
> > I meant in the DOM, I didn't mean in the markup. I don't think we should
> > have any namespace declarations or namespace prefixes in text/html; I would
> > just have the HTML parser always support the MathML elements
>
> I assume we have data that shows there would be no collisions with random
> user-defined tag names in random pages?  Including intranets?

Nope. Just blind faith. (Well, we have some evidence for the Web at large,
but nothing substantial, only a billion pages or so. I'm working on a more
substantial survey but that still won't cover the intranets.)

We didn't check that <canvas> wouldn't cause clashes, either.


> > I really don't think we want to introduce namespace prefixes or
> > namespace declarations into tag soup. I think that would be a big
> > mistake.
>
> I agree with this, for what it's worth....  But perhaps we do want a way
> to explicitly flag tag-soup documents as "this document uses MathML".

I don't see why. We don't want a flag for when people can use the storage
APIs. Or when they can use <img> elements. Or whatever.

Re: MathML-in-HTML5

Boris Zbarsky
In reply to this post by Boris Zbarsky
Ian Hickson wrote:
> We didn't check that <canvas> wouldn't cause clashes, either.

I see.  I had assumed that we in fact had.

> I don't see why. We don't want a flag for when people can use the storage
> APIs. Or when they can use <img> elements. Or whatever.

True, because those are very unlikely to collide with random stuff the pages are
doing (e.g. the storage APIs are using fairly long names that are unlikely to
collide with page-defined functions and variables).

If we think MathML has a similarly low risk of collision, great.

-Boris

Re: MathML-in-HTML5

David Carlisle
In reply to this post by Ian Hickson

Ian
> XML data islands don't form part of the parent DOM (they are "islands", as
> opposed to part of the document). I'm not sure how wrapping <xml> tags
> around the MathML content would help. :-)

The syntax Paul was referring to here wasn't the <xml> convention, but
the ability in IE to have (explicitly prefixed) XML elements within an
HTML document with rendering controlled by an external component,
but _without_ any other flag at that point in the markup, such as
<xml> or <object> etc.

In the IE implementation you need to have an <object> in the head
pointing at the particular rendering component, which is fairly horrible,
and you also need to declare the namespace using (a variant of) an
early working draft namespace syntax using a PI, but as Paul said, those
parts needn't be copied. An example of a document using this syntax is
shown here:

http://www.dessci.com/en/products/mathplayer/author/creatingpages.htm#AnatomyMathPlayerWebPage

By using a different classid you can do the same thing to include
(explicitly prefixed) svg into an html document and have it rendered by
Adobe's svg viewer, and in principle any other vocabularies (although I
don't personally know of any other implementations of this, except
techexplorer, which is again for MathML).

Having math more or less added directly to html would be nice in many
ways, but I'm not sure how well it scales. If you think people might want
to have html+svg+chemml+... then perhaps having an api that allows
processing to be attached to namespaced elements would be more general.
On the other hand, that was part of the reason for having namespaces (and
for that matter, xml itself): that people could serve all sorts of
different xml vocabularies and have clients do whatever is necessary. I
suspect part of the reason for "html5" is a feeling that that never
happened and isn't going to be mainstream any time soon, and that a
solution that directly addresses the fixed html vocabulary, with perhaps
two specific extensions such as svg and mathml, will in practice cover the
vast majority of browser needs, and other vocabularies can be transformed
to html+.. before being served.

David

Re: MathML-in-HTML5

Ian Hickson
On Mon, 25 Sep 2006, David Carlisle wrote:
>
> The syntax Paul was referring to here wasn't the <xml> convention, but
> the ability in IE to have (explicitly prefixed) XML elements within an
> HTML document with rendering controlled by an external component, but
> _without_ any other flag at that point in the markup, such as
> <xml> or <object> etc.

Oh, well, as noted earlier, the idea of namespace prefixes in HTML isn't
one that I personally am particularly fond of.


> I suspect part of the reason for "html5" is a feeling that that never
> happened and isn't going to be mainstream any time soon, and that a
> solution that directly addresses the fixed html vocabulary, with perhaps
> two specific extensions such as svg and mathml will in practice cover
> the vast majority of browser needs, and other vocabularies can be
> transformed to html+.. before being served.

I think that's pretty much exactly correct, yes.

Re: MathML-in-HTML5

Ian Hickson
In reply to this post by Boris Zbarsky
On Sun, 24 Sep 2006, Boris Zbarsky wrote:

>
> Ian Hickson wrote:
> > We didn't check that <canvas> wouldn't cause clashes, either.
>
> I see.  I had assumed that we in fact had.
>
> > I don't see why. We don't want a flag for when people can use the storage
> > APIs. Or when they can use <img> elements. Or whatever.
>
> True, because those are very unlikely to collide with random stuff the pages
> are doing (e.g. the storage APIs are using fairly long names that are unlikely
> to collide with page-defined functions and variables).
>
> If we think MathML has a similarly low risk of collision, great.

I don't know about "we".

What I would be proposing for HTML5 is just the following list of
elements:

   math, mrow, mfrac, msqrt, mroot, mstyle, merror, mpadded, mphantom,
   mfenced, menclose, msub, msup, msubsup, munder, mover, munderover,
   mmultiscripts, mtable, mlabeledtr, mtr, mtd, maction

...and of those only <math> came up in the top 1000 elements in my
search of elements on about one billion pages.

According to that same research, <math> is, on the Web, less frequent than
the following elements: <m>, <e>, <rem>, <tab>, <yr>, <prohibits>, <your>,
<lable>, <text-spez>, etc. It was present on less than 0.002% of the pages
the research covered. (To give an idea of scale, <h8> is used on more than
0.003%, so if we avoid <math> because of this, we should probably
introduce <h7> and <h8> into HTML, since we're saying that's an important
enough level to worry about.)

Now, of course, it could be that those 0.002% of pages are all hugely
important and that we'll break the Web in adding this feature. We can't
know until we've tried.

Re: MathML-in-HTML5

Boris Zbarsky
In reply to this post by Boris Zbarsky
Ian Hickson wrote:
> According to that same research, <math> is, on the Web, less frequent than
> the following elements: <m>, <e>, <rem>, <tab>, <yr>, <prohibits>, <your>,
> <lable>, <text-spez>, etc. It was present on less than 0.002% of the pages
> the research covered. (To give an idea of scale, <h8> is used on more than
> 0.003%, so if we avoid <math> because of this, we should probably
> introduce <h7> and <h8> into HTML, since we're saying that's an important
> enough level to worry about.)

The last statement doesn't follow, for what it's worth.  There's a difference
between "introduce support for tags that currently do nothing in all UAs" and
"don't introduce support for a tag that other UAs do nothing for because it will
make us behave differently from those UAs".

I do agree that it sounds like this won't be too big an issue.

-Boris

Re: MathML-in-HTML5

Roger B. Sidje
In reply to this post by Ian Hickson
 > What I would be proposing for HTML5 is just the following list of
 > elements:
 >
 >    math, mrow, mfrac, msqrt, mroot, mstyle, merror, mpadded, mphantom,
 >    mfenced, menclose, msub, msup, msubsup, munder, mover, munderover,
 >    mmultiscripts, mtable, mlabeledtr, mtr, mtd, maction

I don't like mlabeledtr very much (I have already expressed my views
about it to folks of the MathML WG), and would hope that they will take
my suggestion for <mtr label="..."> in MathML3. The former is
unnecessarily bloated and doesn't degrade gracefully at all with
renderers that don't support it (not to mention that it is hard to fit
in Gecko's existing table code).

However, your list misses some key tags, in particular leaf tags such as
<mspace/> -- which is sometimes quite useful. Also, <mprescripts/> and
<none/> are needed in <mmultiscripts> (albeit it can be argued that
<none/> is the same as <mrow></mrow> or an empty <mspace/>, but the
differentiation is worthwhile).

In general, I would prefer the list to at least include all the tags
that we already support, and which existing webpages have come to depend
on. This effectively boils down to your list above, excluding
<mlabeledtr>, and including <mspace/>, <mprescripts/>, <none/> and
<mi>, <mn>, <ms>, <mtext>, <mo>. In particular, <mo> is a vital tag as
it is at the heart of those stretchy MathML characters.
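
Spelled out as set arithmetic (a Python sketch; the variable names are made up
for the illustration, the tag names are the ones listed above):

```python
# The list quoted at the top of this message.
proposed = set("""
    math mrow mfrac msqrt mroot mstyle merror mpadded mphantom
    mfenced menclose msub msup msubsup munder mover munderover
    mmultiscripts mtable mlabeledtr mtr mtd maction
""".split())

# Drop <mlabeledtr>; add the leaf and token tags named above.
dropped = {"mlabeledtr"}
added = {"mspace", "mprescripts", "none", "mi", "mn", "ms", "mtext", "mo"}

preferred = (proposed - dropped) | added

print(len(proposed), len(preferred))
print(" ".join(sorted(preferred)))
```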

Implementation-wise, as this inclusion of MathML-in-HTML5 marks the
beginning of tag soup, it may be that the HTML parser would have to have
some knowledge of leaf tags, so that for example, a stray <mspace>
doesn't become the root of an entire HTML tree... which is later fed to
the hapless MathML engine. (The patch I attached in bug 353926 ignored
the issue.)
---
RBS

Re: MathML-in-HTML5

Ian Hickson
On Wed, 27 Sep 2006, Roger B. Sidje wrote:
>
> I don't like mlabeledtr very much (I have already expressed my views
> about it to folks of the MathML WG), and would hope that they will take
> my suggestion for <mtr label="..."> in MathML3. The former is
> unnecessarily bloated and doesn't degrade gracefully at all with
> renderers that don't support it (not to mention that it is hard to fit
> in Gecko's existing table code).

I'm happy to drop/add any tag to this list. Just give me the list you
want.


> However, your list misses some key tags, in particular leaf tags such as
> <mspace/> -- which is sometimes quite useful. Also, <mprescripts/> and
> <none/> are needed in <mmultiscripts> (albeit it can be argued that
> <none/> is the same as <mrow></mrow> or an empty <mspace/>, but the
> differentiation is worthwhile).

I missed anything that wasn't in the table I happened upon in the spec. I
didn't look very closely for the exact table I wanted.

Tell me what tags you want to have and we'll make that the list. You're
the expert. :-)


> Implementation-wise, as this inclusion of MathML-in-HTML5 marks the
> beginning of tag soup, it may be that the HTML parser would have to have
> some knowledge of leaf tags, so that for example, a stray <mspace>
> doesn't become the root of an entire HTML tree... which is later fed to
> the hapless MathML engine. (The patch I attached in bug 353926 ignored
> the issue.)

Don't worry, these tags auto-close when a parent tag is closed.

   <foo><bar><baz></foo><quux>

...results in this DOM:

   <foo>
     <bar>
       <baz>
   <quux>

For leaf nodes with following siblings, people will have to use end tags,
as in:

   <foo><bar></bar><baz></baz></foo><quux></quux>

If we want to start adding actual leaf tags, I'd rather do this in a
second stage, after we have a proof of concept. (I've so far avoided
adding any new tags to the HTML5 parser spec, but eventually there will be
a bunch we have to add.)

We can go from non-empty to empty much more easily than from empty to
non-empty.
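
The auto-closing rule can be modelled with a stack of open elements. The
following Python sketch is a toy, not the HTML5 parsing algorithm: it only
understands bare <tag> and </tag> tokens, and build_tree and outline are names
invented here.

```python
import re


def build_tree(markup):
    """Turn a string of <tag> / </tag> tokens into nested (name, children) nodes."""
    root = ("#root", [])
    stack = [root]                              # currently open elements
    for closing, name in re.findall(r"<(/?)(\w+)>", markup):
        if not closing:
            node = (name, [])
            stack[-1][1].append(node)           # child of the innermost open element
            stack.append(node)
            continue
        # End tag: find the nearest open element with that name and close it
        # together with everything still open inside it.
        for i in range(len(stack) - 1, 0, -1):
            if stack[i][0] == name:
                del stack[i:]
                break
        # An end tag with no matching open element is simply ignored here.
    return root[1]


def outline(nodes, depth=0):
    """Render the tree as indented lines, two spaces per level."""
    lines = []
    for name, children in nodes:
        lines.append("  " * depth + "<" + name + ">")
        lines.extend(outline(children, depth + 1))
    return lines


print("\n".join(outline(build_tree("<foo><bar><baz></foo><quux>"))))
# <foo>
#   <bar>
#     <baz>
# <quux>
```

With explicit end tags, as in <foo><bar></bar><baz></baz></foo><quux></quux>,
the same function makes <bar> and <baz> siblings under <foo>.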

Re: MathML-in-HTML5

Roger B. Sidje
On 27/09/2006 10:16 AM, Ian Hickson wrote:

> I'm happy to drop/add any tag to this list. Just give me the list you
> want.

OK.

> For leaf nodes with following siblings, people will have to use end tags,
> as in:
>
>    <foo><bar></bar><baz></baz></foo><quux></quux>
>
> If we want to start adding actual leaf tags, I'd rather do this in a
> second stage, after we have a proof of concept. (I've so far avoided
> adding any new tags to the HTML5 parser spec, but eventually there will be
> a bunch we have to add.)

OK, I see.

The other issue is those 2000 entities that MathML has. You said that
you are not a big fan of a namespace thingy on the root <html> element.

Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting all
W3C entities _by default_? We have a proof-of-concept of that in View
Selection Source, BTW. It will display any entity it can.
http://lxr.mozilla.org/mozilla/source/content/base/public/nsIDocumentEncoder.idl#125
As VSS has stood the test of time without major complaints, perhaps
<!DOCTYPE html> could assume that too? If that is agreed, we are all clear.

The other remaining issue might be with style matching because <math>
will then be internally in the MathML namespace whereas the HTML
document is in the none namespace (at present), but we will see how it
goes from there.
---
RBS


Re: MathML-in-HTML5

Ian Hickson
On Wed, 27 Sep 2006, Roger B. Sidje wrote:
>
> The other issue is those 2000 entities that MathML has.

Yeah... Do we really need those? Some of them seem reasonable to add, but
2000 seems like too many for the mnemonic advantage to beat just using
Unicode codepoints...

The problem with adding entities is that a LOT of people do things like

   href="/u?aa=foo&ab=foo&ac=foo&ad=foo"

...which today works, but would break if MathML entities were introduced
(since &ac is a MathML entity).
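
A rough sketch of the breakage, assuming a lenient decoder that accepts entity names without a trailing semicolon. The `decode` function and the two small tables are hypothetical stand-ins for a real entity list; the character values are looked up in Python's `html.entities.html5` table rather than typed by hand.

```python
import re
from html.entities import html5

# Tiny illustrative tables; real entity lists are far larger.
LEGACY = {"amp": "&", "lt": "<", "gt": ">", "quot": '"'}
WITH_MATHML = dict(LEGACY, ac=html5["ac;"], image=html5["image;"])


def decode(text, table):
    """Replace &name or &name; whenever name is in the table (semicolon optional)."""

    def repl(match):
        name = match.group(1)
        # Unknown names are left exactly as written.
        return table.get(name, match.group(0))

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);?", repl, text)


url = "/u?aa=foo&ab=foo&ac=foo&ad=foo"
print(decode(url, LEGACY))       # no name matches, so the URL survives
print(decode(url, WITH_MATHML))  # "&ac" is now swallowed as an entity
```

With only the legacy table the query string is untouched; once `ac` is a known name, the `&ac` separator is replaced by a single character and the `ac` parameter disappears from the URL.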


> Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting all
> W3C entities _by default_?

Don't do anything based on the DOCTYPE. HTML5 is anything sent as
text/html.


> The other remaining issue might be with style matching because <math>
> will then be internally in the MathML namespace whereas the HTML
> document is in the none namespace (at present), but we will see how it
> goes from there.

I don't see why this would cause any problems.

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: MathML-in-HTML5

Roger B. Sidje
On 27/09/2006 11:23 AM, Ian Hickson wrote:
>
> The problem with adding entities is that a LOT of people do things like
>
>    href="/u?aa=foo&ab=foo&ac=foo&ad=foo"
>
> ...which today works, but would break if MathML entities were introduced
> (since &ac is a MathML entity).
>

That list is so big that trying to hand-pick some and leaving some out
would need another committee...

>>Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting all
>>W3C entities _by default_?
>
>
> Don't do anything based on the DOCTYPE. HTML5 is anything sent as
> text/html.

I thought the DOCTYPE was trustworthy -- based on this excerpt from the
HTML5 spec:

"HTML documents that use the new features described in this
specification must start with the string <!DOCTYPE html> and, if they
are served over the wire (e.g. by HTTP) must be labelled with the
text/html MIME type."

If so, it would have meant fewer conflicts with agreed entities in HTML5.

BTW, for my own information, do you intend HTML5 to be transitional,
almost-standards, or strict? If it is HTML5 (or XHTML5) served as
text/html but put in the XHTML namespace at some later stage (as the
HTML5 spec implies), it had better be strict, no? And that would be driven
by the DOCTYPE detection code. Catch my drift? Or is tag soup going to be
in the XHTML namespace?

If it is strict then maybe entities could be required to have a
semi-colon -- which will then avoid the ambiguities you mentioned above.
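
As a rough illustration of the semicolon idea, here is a strict decoder that only recognises `&name;`. The function name `decode_strict` and the three-entry table are assumptions made for the example; the character values come from Python's `html.entities.html5` table.

```python
import re
from html.entities import html5

# Small illustrative table; "ac" stands in for the MathML additions.
TABLE = {"amp": "&", "copy": html5["copy;"], "ac": html5["ac;"]}


def decode_strict(text, table=TABLE):
    """Replace &name; only when the terminating semicolon is present."""

    def repl(match):
        # Unknown names are left exactly as written.
        return table.get(match.group(1), match.group(0))

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, text)


print(decode_strict("/u?ab=foo&ac=foo"))  # unchanged: no semicolon after "ac"
print(decode_strict("a &ac; b"))          # decoded: the semicolon is present
print(decode_strict("&copy 2006"))        # unchanged: the semicolon is missing
```

The first call shows the URL ambiguity going away. The last call shows the cost: existing pages that write entities without the semicolon would no longer be decoded.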

Not that I have a position on this (at least not yet). I am just bringing
in some food for thought, to accommodate the realistic issues of MathML.
---
RBS

Re: MathML-in-HTML5

Ian Hickson
On Wed, 27 Sep 2006, Roger B. Sidje wrote:

> On 27/09/2006 11:23 AM, Ian Hickson wrote:
> >
> > The problem with adding entities is that a LOT of people do things
> > like
> >
> >    href="/u?aa=foo&ab=foo&ac=foo&ad=foo"
> >
> > ...which today works, but would break if MathML entities were
> > introduced (since &ac is a MathML entity).
>
> That list is so big that trying to hand-pick some and leaving some out
> would need another committee...

Not really... I say we just add ApplyFunction, InvisibleComma, and
InvisibleTimes (but not their short aliases).


> > > Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting
> > > all W3C entities _by default_?
> >
> > Don't do anything based on the DOCTYPE. HTML5 is anything sent as
> > text/html.
>
> I thought the DOCTYPE was trustworthy -- based on this excerpt from the
> HTML5 spec:
>
> "HTML documents that use the new features described in this
> specification must start with the string <!DOCTYPE html> and, if they
> are served over the wire (e.g. by HTTP) must be labelled with the
> text/html MIME type."

That's an authoring conformance requirement, and has no bearing on
implementations.


> BTW, for my own information, do you intend HTML5 to be transitional,
> almost-standards, or strict?

HTML5 documents starting with <!DOCTYPE HTML> must be in standards mode.
Documents with other DOCTYPEs or no DOCTYPE at all may be in another mode,
as already described in the spec. In due course I may specify quirks mode
and then there'll just be the spec, and no other modes.


> If it is HTML5 (or XHTML5) served as text/html but put in the XHTML
> namespace at some later stage (as the HTML5 spec implies), it had better
> be strict, no? And that would be driven by the DOCTYPE detection code.
> Catch my drift? Or is tag soup going to be in the XHTML namespace?

Not sure what you mean by that. All HTML DOM nodes are (per HTML5) in the
XHTML namespace, irrespective of the standards/quirks thing.


> If it is strict then maybe entities could be required to have a
> semi-colon -- which will then avoid the ambiguities you mentioned above.

That would break back-compat.

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: MathML-in-HTML5

boards (Bugzilla)
In reply to this post by Ian Hickson
On Tuesday 26 September 2006 08:23 pm, Ian Hickson wrote:

> On Wed, 27 Sep 2006, Roger B. Sidje wrote:
> > The other issue is those 2000 entities that MathML has.
>
> Yeah... Do we really need those? Some of them seem reasonable to add,
> but 2000 seems like too many for the mnemonic advantage to beat just
> using Unicode codepoints...
>
> The problem with adding entities is that a LOT of people do things
> like
>
>    href="/u?aa=foo&ab=foo&ac=foo&ad=foo"
>
> ...which today works, but would break if MathML entities were
> introduced (since &ac is a MathML entity).
>
Oh, I've seen this problem before: when people linked to Image
Shack, part of the URL contained "&image=foo".  Of course, it looked
odd seeing the "&image" part become the imaginary-part character (kinda
looks like a dragon), but the URL still worked.  This is why I
encourage using the semicolon instead...
--
Matt Sicker
