Wikipedia:Rendering math

From Wikipedia, the free encyclopedia
(Redirected from Wikipedia:MathML)

This essay compares different encodings and presentations of mathematical formulae. The three principal encodings are the <math> tag, raw wiki (or HTML) code, and "texhtml" templates. The <math> and "texhtml" encodings may be presented differently to registered users, depending on user preferences and personal styles.
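
As an illustrative sketch (the equation is an arbitrary example chosen here, not taken from any article), the same formula might be written in each of the three encodings as follows:

```wikitext
<!-- <math> tag: LaTeX source -->
<math>x^2 + y^2 = r^2</math>

<!-- Raw wiki/HTML code: italics and superscripts written by hand -->
''x''<sup>2</sup> + ''y''<sup>2</sup> = ''r''<sup>2</sup>

<!-- "texhtml" template: the same markup wrapped in {{math}} -->
{{math|''x''<sup>2</sup> + ''y''<sup>2</sup> {{=}} ''r''<sup>2</sup>}}
```

The {{=}} in the last line is needed because a bare equals sign inside a template parameter would otherwise be read as a parameter assignment.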

Comparison of markup encodings

Encoding Advantages Disadvantages
<math>LaTeX</math>
  Advantages:
  • Well-known and standardized.
  • Portable to/from scientific papers.
  • Clearly distinguishes semantics from appearance.
  • Flexible and can handle all formulae.
  • Easy for other software to process.
  Disadvantages:
  • Requires knowledge of the LaTeX markup language.
  • Lacks Unicode support: cannot use Unicode for mathematical operators, [1] or for Cyrillic and other scripts. [2]
  • Unable to place wikilinks on parts of formulae.
Raw wiki or HTML code
  Disadvantages:
  • Encodes appearance rather than semantics.
  • Difficult to process in software.
  • Unable to handle square roots, vertical fractions, and other common formula types.
  • Proper use (italic variables, protected spaces around operators) requires care and is a frequent source of mistakes.
"texhtml" templates, such as {{math}}
  Advantages:
  • Encourages standardized notation.
  • Distinguishes semantics from appearance.
  • More polished appearance than wiki/HTML code.
  • Recognizable (although not easily processed) by software.
  Disadvantages:
  • Proliferation of templates.
  • Vulnerable to changes and edit wars.
  • Produces HTML nesting errors when the surrounding text is italicized (e.g. hatnotes, reference titles, or the {{unsolved}} template), making some other solution necessary in those contexts.
  • Not able to handle some complex formulas.
  • The equals (=) and pipe (|) symbols require special care (the {{=}} and {{!}} templates or other workarounds must be used).
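
The last point can be sketched in wikitext. The formulas are arbitrary examples; the explicit `1=` form is a commonly used workaround and is included here on that assumption rather than from the table above:

```wikitext
<!-- Breaks: the text before "=" is read as a parameter name -->
{{math|''x'' = 1}}

<!-- Works: {{=}} expands to a literal equals sign -->
{{math|''x'' {{=}} 1}}

<!-- Works: naming the first parameter explicitly -->
{{math|1=''x'' = 1}}

<!-- Works: {{!}} expands to a literal pipe -->
{{math|{{!}}''x''{{!}} ≥ 0}}
```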

Comparison of presentations

Encoding Presentation Advantages Disadvantages
<math> SVG with hidden MathML
(Wikipedia's default)
  Advantages:
  • More polished than the PNG below.
  Disadvantages:
  • Images do not match article text in font size.
  • Images do not usually have proper baseline alignment for inline math.
  • Images do not change color when part of a link.
  • Copy-paste as text duplicates the formula's source code; copy-paste as an image is not supported in many programs, e.g. Word.
PNG
  Advantages:
  • Robust.
  • Little overhead in a browser.
Native MathML
  Advantages:
  • Robust and standard-compliant.
  • Little overhead in Firefox.
  • Supported by screen readers.
MathJax
  Advantages:
  • Works in all browsers.
  • Pretty fonts.
  • Working copy-paste.
  Disadvantages:
  • High server load for font assets.
  • Slow rendering.[3]
  • No longer available as a reader preference. See T99369#1484437 for how to manually inject a script (without using WMF servers).
KaTeX
  Advantages:
  • Faster than MathJax.
  • Same advantages as MathJax above.
  • Has \operatorname*.
  Disadvantages:
  • Support for commands still incomplete.
  • Has never been available as a user preference. See T99369#6970581 for how to use it anyway.
Raw wiki or HTML code
  Advantages:
  • Avoids switching font families in running text.
  • Minimal overhead.
  • Appearance does not depend on user (account) preferences.
  Disadvantages:
  • Does not distinguish a formula from the running text.
  • The default sans-serif may render certain characters indistinguishable, such as 1, I and l.
  • In articles mixing raw wiki with <math> formulae, the appearance of the same variable in the two types of formula does not match (serif vs sans-serif).
{{math}} ('texhtml' class)
  Advantages:
  • Distinguishes a formula from the running text.
  • Close match to the appearance of inline <math>.
  Disadvantages:
  • Mixing of font families (sans-serif for English, serif for math) in running text can be jarring.
  • Reverts to the appearance of raw wiki code on systems that don't support font changes (e.g. the Wikipedia Android app).
  • Not an exact match to <math> formulas in the same article:
    • Uses the Times font, not Computer Modern as in TeX; text generally looks thinner.
    • Occasional kerning problems due to the mixture of italic and roman type (e.g. f(x); x.M). See {{italics correction}}.
  • Cannot be used within most citation templates.
Specific templates
{{mvar}}: x
  Advantages:
  • A shorthand for variables like {{math|''x''}}.
  • Clean semantics.
  Disadvantages:
  • Cannot be used for vectors. (Must use {{math|'''v'''}} to get a bold v, and {{math|{{vec|''v''}}}} to get v⃗.)
{{sqrt}}: √2
  Advantages:
  • Clean semantics.
  Disadvantages:
  • The vinculum is slightly interrupted.
  • Does not render well when itself nested inside {{math}} or similar templates. ({{math|{{sqrt|2}}}} yields: √2.)
  • Discouraged by MOS:RADICAL.
{{radic}}: ∛2
  Advantages:
  • Clean semantics.
  Disadvantages:
  • Same as for {{sqrt}} above. ({{math|{{radic|2|3}}}} yields: ∛2.)
  • Discouraged by MOS:RADICAL.
{{sfrac}}: 1/2
  Advantages:
  • Clean semantics.
  • Encouraged by MOS:FRAC for math and science articles only.
  Disadvantages:
  • Occupies too much vertical space in running text. Embedded sub- and superscripts cause vertical misalignment. (yadda {{math|{{sfrac|1|2}}}} yadda yields: yadda 1/2 yadda.)
{{frac}}: 1⁄2
  Advantages:
  • Clean semantics.
  Disadvantages:
  • MOS:FRAC discourages this for math and science articles only; for other topics this is often preferred.
Interval templates:
  Advantages:
  • Semantically distinguishes intervals from other types of formulae.
  Disadvantages:
  • Because the interval endpoints are coded as a single parameter, the semantics is a bit obscure.
Bra–ket notation:
  Advantages:
  • Semantically distinguishes bra–ket notation from other types of formula.
  • Avoids complex HTML coding for the angular bracket characters (⟨, coded as &#x27E8; or {{langle}}; ⟩, coded as &#x27E9; or {{rangle}}) and the vertical bar (|, coded as &#124;), and prevents incorrect use of less-than/greater-than signs in their place.
  Disadvantages:
  • Angular brackets may not render on all browsers.
{{vec}}: A⃗
  Advantages:
  • Clean semantics.
  Disadvantages:
  • The arrow is not centered over italicized letters.
  • The arrow is too high over x-height letters. ({{math|{{vec|''v''}}}} yields: v⃗.)
{{intmath}}: ∫ (with limits 0 and +∞)
  Advantages:
  • Clean semantics.
  Disadvantages:
  • Name differs from math/LaTeX coding conventions.
  • Clean semantics.
  • Produces bad spacing when combined with fraction templates.

Pros of HTML

  1. Formulas in HTML behave more like regular text. In-line HTML formulae always align properly with the rest of the HTML text and, to some degree, can be copied-and-pasted (this is not a problem if TeX is rendered using MathJax, and the alignment should not be a problem for PNG rendering now that bug 32694 is fixed).
  2. The formula's background and font size match the rest of HTML contents (this can be fixed on TeX formulas by using the commands \pagecolor and \definecolor) and the appearance respects CSS and browser settings while the typeface is conveniently altered to help you identify formulae.
  3. Pages using HTML code for formulae will load faster and they will create less clutter on your hard disk.
  4. Formulae typeset with HTML code will be accessible to client-side script links (a.k.a. scriptlets).
  5. The display of a formula entered using mathematical templates can be conveniently altered by modifying the templates involved; this modification will affect all relevant formulae without any manual intervention.
  6. The HTML code, if entered diligently, will contain all semantic information to transform the equation back to TeX or any other code as needed. It can even contain differences TeX does not normally catch, e.g. {{math|''i''}} for the imaginary unit and {{math|<var>i</var>}} for an arbitrary index variable.
  7. Unlike generated bitmaps, HTML is not sensitive to dots per inch variances between viewing platforms.
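
Point 6 can be illustrated with a short wikitext sketch. The two formulas are arbitrary examples chosen here; only the contrast between plain italics and the var tag comes from the text above:

```wikitext
<!-- Imaginary unit: plain italic i -->
{{math|''e''<sup>''i''π</sup>}}

<!-- Index variable: i marked up with <var> -->
{{math|''a''<sub><var>i</var></sub>}}
```

Both render an italic i, but only the second records in the markup that i is a variable.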

Pros of TeX

  1. TeX is semantically more precise than HTML.
    1. In TeX, "x" means "mathematical variable x", whereas in HTML "x" is generic and somewhat ambiguous.
    2. On the other hand, if you encode the same formula as "{{math|<var>x</var>}}", adding the var tag doesn't affect the visual result x and provides the additional semantic description that x is a variable. This requires diligence and more typing that could make the formula harder to understand as you type it, and provides no help to most readers, but could be worth considering if no other rendering options are available.
  2. One consequence of point 1 is that TeX code can be transformed into HTML, but not vice versa.[4] This means that on the server side we can always transform a formula, based on its complexity and location within the text, user preferences, type of browser, etc. Therefore, where possible, all the benefits of HTML can be retained, together with the benefits of TeX. It is true that the current situation is not ideal, but that is not a good reason to drop information or contents. It is more a reason to help improve the situation.
  3. Another consequence of point 1 is that TeX can be converted to MathML (e.g. by MathJax) for browsers which support it, thus keeping its semantics and allowing the rendering to be better suited for the reader’s graphic device.
  4. TeX is the preferred text formatting language of most professional mathematicians, scientists, and engineers. It is easier to persuade them to contribute if they can write in TeX.
  5. TeX has been specifically designed for typesetting formulae, so input is easier and more natural if you are accustomed to it, and output is more aesthetically pleasing if you focus on a single formula rather than on the whole containing page.
  6. Once a formula is done correctly in TeX, it will render reliably, whereas the success of HTML formulae is somewhat dependent on browsers or versions of browsers. Another aspect of this dependency is fonts: the serif font used for rendering formulae is browser-dependent and may be missing some important glyphs. While the browser is generally capable of substituting a matching glyph from a different font family, this need not be the case for combined glyphs (compare "" and " ").
  7. When writing in TeX, editors need not worry about whether this or that version of this or that browser supports this or that HTML entity. The burden of these decisions is put on the software. This does not hold for HTML formulae, which can easily end up being rendered wrongly or differently from the editor’s intentions on a different browser.[5]
  8. TeX formulae, by default, render larger and are usually more readable than HTML formulae and are not dependent on client-side browser resources, such as fonts, and so the results are more reliably WYSIWYG.
  9. While TeX does not assist you in finding HTML codes or Unicode values (which you can obtain by viewing the HTML source in your browser), copying and pasting from a TeX PNG image in Wikipedia into simple text will return the LaTeX source.
[4] Unless your wikitext follows the style of point 1.2.
[5] The entity support problem is not limited to mathematical formulae though; it can be easily solved by using the corresponding characters instead of entities, as the character repertoire links do, except for cases where the corresponding glyphs are visually indiscernible (e.g. &ndash; for ‘–’ and &minus; for ‘−’).
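
As a hedged illustration of point 5, a formula combining a vertical fraction and a square root (the quadratic formula, chosen here only as an example) is a single line of LaTeX inside a <math> tag:

```wikitext
<math>x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}</math>
```

Square roots and vertical fractions are among the formula types that the first comparison table lists as beyond the reach of raw wiki or HTML code.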

In some cases it may be the best choice to use neither TeX nor the HTML substitutes, but instead Unicode or the simple ASCII symbols of a standard keyboard.
