Welcome to UnicodeMathML

UnicodeMath is a linear representation of math that often resembles math notation and is easy to enter. For example, a/b is UnicodeMath for the stacked fraction with a over b. It works well in Microsoft desktop apps such as Word, PowerPoint, Outlook, and OneNote, but it hasn't been widely available elsewhere. This open-source applet implements UnicodeMath on the web.

Entering equations

You can enter equations in five ways:

  1. Enter UnicodeMath into the input (upper-left) window. The corresponding 2D built-up math displays in the output (upper-right) window and the MathML for it displays below the output window.
  2. Enter Nemeth braille, [La]TeX, or MathML into the input window. If the input window starts with a Unicode braille character (U+2800..U+28FF), Nemeth ASCII braille input is enabled. If the input starts with $, $$, \(, or \[, LaTeX input is enabled. If it starts with <math, MathML input is enabled.
  3. Enter UnicodeMath directly into the output window. This option builds up what you enter automatically, similarly to entry in the Microsoft Office apps. This option is a work in progress.
  4. Click on the Dictate button or type Alt+d, wait for the bell, and dictate the equation in English. You need to have Internet access, and you need to enunciate clearly. This option is also a work in progress, but if you get it to work, it’s the fastest entry method except for:
  5. Paste MathML into the input or output window.
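
The format-detection rules in item 2 can be sketched in Python as follows. This is only an illustration of the rules as stated above, not the applet's own code. The function name is hypothetical, and the sketch assumes that leading whitespace is not skipped.

```python
def detect_input_format(text):
    """Guess the input format from the first characters of the input."""
    if not text:
        return "UnicodeMath"
    # Unicode braille patterns occupy U+2800..U+28FF.
    if 0x2800 <= ord(text[0]) <= 0x28FF:
        return "braille"
    # "$$" also starts with "$", so a single check covers both delimiters.
    if text.startswith(("$", "\\(", "\\[")):
        return "LaTeX"
    if text.startswith("<math"):
        return "MathML"
    # Anything else is treated as UnicodeMath (item 1).
    return "UnicodeMath"


print(detect_input_format("$\\alpha$"))  # LaTeX
print(detect_input_format("a/b"))        # UnicodeMath
```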

See and/or hear UnicodeMath in action

Click on the Demo button or type Alt+p in the input window to see UnicodeMath in action! Hit the space bar to pause the demo and hit it again to continue the demo. The arrow keys → and ← move to the next/previous equation, respectively. Escape and Alt+p stop the demo. One of the equations has the UnicodeMath 1/2𝜋 ∫_0^2𝜋 ⅆ𝜃/(𝑎+𝑏 sin⁡𝜃)=1/√(𝑎²−𝑏²), which builds up to

\[\frac{1}{2\pi}\int_0^{2\pi}\frac{d\theta}{a+b\sin\theta}=\frac{1}{\sqrt{a^2-b^2}}\]

To speak the equations, press the space bar to pause the demo, type Alt+s to speak the current equation, and then press the right arrow key to advance to the next equation. Alternatively, type Alt+Enter to enter the current Examples equation (and advance the Examples equation ID), and type Alt+s to speak the equation. In these ways, you can cycle through the equations, speaking each one.

You can click on an example in the Examples gallery to enter it. The following control words also enter the UnicodeMath for selected examples (handy for quick entry on smaller screens):

Control Word   UnicodeMath
\absvalue      |𝑥|=Ⓒ("if "𝑥>=&0,&𝑥@"if "𝑥<&0,&-𝑥)
\Faraday       𝛁⨯𝐄=−𝜕𝐁/𝜕𝑡
\Fourier       𝑓̂(𝜉)=∫_-∞^∞ 𝑓(𝑥)ⅇ^-2𝜋ⅈ𝑥𝜉 ⅆ𝑥
\integral      1/2𝜋 ∫_0^2𝜋 ⅆ𝜃/(𝑎+𝑏 sin⁡𝜃)=1/√(𝑎²−𝑏²)
\integralG     ∫_-∞^∞ 𝑒^-𝑥² ⅆ𝑥=√𝜋
\limit         lim_(𝑛→∞) (1+1/𝑛)^𝑛=𝑒
\plasma        𝑍(𝛾+𝑖𝜔−𝑖𝜈)=𝑖/√𝜋 ∫_−∞^∞ 𝑒^(−(𝜔−𝜔′)^2 /(Δ𝜔)^2)/(𝛾+𝑖(𝜔′−𝜈)) ⅆ𝜔′
\quadratic     𝑥=(−𝑏±√(𝑏²−4𝑎𝑐))/2𝑎
\SHO           𝑥̈+2𝛾𝑥̇+𝜔²𝑥=0
\waveeq        𝑖ℏ 𝜕𝜓(𝑥,𝑡)/𝜕𝑡 =[−ℏ²/2𝑚 𝜕²/𝜕𝑥²+𝑉(𝑥,𝑡)]𝜓(𝑥,𝑡)

Entering symbols

You can enter a symbol by clicking on the symbol in one of the symbol galleries below the input window. But it’s faster to type the symbol’s LaTeX control word such as \alpha for α. After typing two letters, you get a math autocomplete dropdown with possible matches. This lets you enter the selected symbol (the one highlighted in blue) quickly by typing Enter or Tab.

For example, if you type \al, you see

Typing the Enter or Tab key inserts 𝛼. If you want a different symbol in the dropdown, you can click on it, or you can use the up/down (↑↓) arrow keys to select the symbol you want and type the Enter or Tab key to enter it.

The math autocomplete menu helps you discover a LaTeX control word, and it speeds entry especially for long control words such as those in the dropdown

The symbol dictionary includes some control-word aliases, such as \union for \cup (∪), since you might not guess \cup is the LaTeX control word for the union operator ∪.
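
The autocomplete lookup described in this section amounts to prefix matching against the symbol dictionary. A minimal sketch follows, assuming a tiny hypothetical dictionary (the real one is far larger) and a hypothetical function name. It offers matches only after two letters have been typed, and it includes the \union alias alongside \cup.

```python
# Hypothetical miniature dictionary for illustration only.
CONTROL_WORDS = {
    "aleph": "ℵ",
    "alpha": "α",
    "beta": "β",
    "cup": "∪",
    "union": "∪",  # alias for \cup
}


def autocomplete(typed, min_letters=2):
    """Return the control words that start with what was typed after the backslash."""
    if not typed.startswith("\\"):
        return []
    prefix = typed[1:]
    if len(prefix) < min_letters:
        return []  # the dropdown appears only after two letters
    return sorted(word for word in CONTROL_WORDS if word.startswith(prefix))


print(autocomplete("\\al"))  # ['aleph', 'alpha']
```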

Entering math alphanumerics

Unicode has many math-styled characters, such as the math fraktur H (ℌ). They can be entered by selecting the letter(s) in the input or output windows and clicking on the 𝔄𝔅ℭ button or other math-style button. You can also enter a character in the Math styles text box and click on the desired math style button.

Or you can enter the control words for the desired characters. The math-style control words consist of a math-style prefix followed by the unstyled character. For example, the prefix "mbf" (math boldface) defines the bold math style and the control word \mbfH gives a bold H, that is, 𝐇. The math-style prefixes are defined in the following table.

Math Style               Prefix      Math Style          Prefix
normal                   mup         bold                mbf
italic                   mit         bold-italic         mbfit
double-struck            Bbb         bold-fraktur        mbffrak
script                   mscr        bold-script         mbfscr
fraktur                  mfrak       sans-serif          msans
bold-sans-serif          mbfsans     sans-serif-italic   mitsans
sans-serif-bold-italic   mbfitsans   monospace           mtt
chancery                 mchan       roundhand           mrhnd
isolated                 misol       initial             minit
tailed                   mtail       looped              mloop
stretched                mstrc

Here roundhand and chancery are two script styles, and isolated, initial, tailed, looped, and stretched are Arabic math styles. Currently the Arabic math styles require the XITS Math font and the chancery and roundhand variants require the STIX Two Math font.
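
To illustrate how a prefix plus an unstyled character selects a styled character, here is a minimal sketch for the bold style only. It relies on the fact that the bold letters and digits sit in contiguous runs of the Unicode Mathematical Alphanumeric Symbols block. The function names are hypothetical, and other styles (some of which have gaps filled from the Letterlike Symbols block) would need their own tables.

```python
import re

# Start of each bold run in the Mathematical Alphanumeric Symbols block.
BOLD_UPPER_A = 0x1D400  # 𝐀
BOLD_LOWER_A = 0x1D41A  # 𝐚
BOLD_DIGIT_0 = 0x1D7CE  # 𝟎


def to_bold(ch):
    """Map an ASCII letter or digit to its bold math counterpart."""
    if "A" <= ch <= "Z":
        return chr(BOLD_UPPER_A + ord(ch) - ord("A"))
    if "a" <= ch <= "z":
        return chr(BOLD_LOWER_A + ord(ch) - ord("a"))
    if "0" <= ch <= "9":
        return chr(BOLD_DIGIT_0 + ord(ch) - ord("0"))
    raise ValueError(f"no bold math form for {ch!r}")


def expand_mbf(control_word):
    """Expand a control word made of the mbf prefix plus one character."""
    match = re.fullmatch(r"\\mbf([A-Za-z0-9])", control_word)
    if match is None:
        raise ValueError(f"not an mbf control word: {control_word!r}")
    return to_bold(match.group(1))


print(expand_mbf("\\mbfH"))  # 𝐇 (U+1D407)
```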

Character code points

Below the input window, there’s a Unicode codepoint window that displays the codepoints of the input symbols above the symbols. This is particularly useful for comparing two strings that appear to be identical but differ in one or more characters. Both the input and output windows support the Alt+x symbol entry method popular in Microsoft Word, OneNote, and Notepad. (It should be supported in all editors 😊.) For example, type 222b Alt+x to insert ∫.
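
Both ideas in this section are easy to sketch. The first function lists code points so that look-alike strings can be compared. The second mimics Alt+x by replacing a trailing hexadecimal code with its character. The function names are hypothetical, and the rule that the code is the last two to six hex digits is an assumption of this sketch, not a description of how the applet or Word chooses the digits.

```python
import re


def codepoints(text):
    """List the code point of each character, e.g. for comparing look-alike strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


def alt_x(text):
    """Replace a trailing hex code (assumed 2 to 6 digits) with its character."""
    match = re.search(r"[0-9A-Fa-f]{2,6}$", text)
    if match is None:
        return text
    cp = int(match.group(), 16)
    # Leave the text alone for values that are not valid characters.
    if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        return text
    return text[:match.start()] + chr(cp)


print(alt_x("222b"))                   # ∫
print(codepoints("a"), codepoints("𝑎"))  # ['U+0061'] ['U+1D44E']
```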

Speech, braille, LaTeX, dictation

In addition to generating MathML, you can click on buttons or enter a hot key to speak the equation, braille it, convert it to LaTeX, or dictate it.

The results for speech, braille and LaTeX are displayed below the input window. Dictation results are shown in the input, output, and MathML windows. Dictation hint: wait for the start beep (else the first word(s) might be missing) and enunciate clearly.

Math display

The math is rendered in the output window either natively or by MathJax according to a setting (click on the ⚙︎ to change it). MathJax’s typography resembles LaTeX’s. The native rendering is good although not yet as good as LaTeX. But an advantage of the native renderer is that you can edit built-up equations directly in the output window and copy all or part of an equation. If the selection is an insertion point, the whole equation is copied. The only editing feature in the MathJax mode is Ctrl+c, which copies the MathML for the whole equation to the clipboard.

Navigating the app

A mouse or touchpad provides one way to move between and inside the various facilities. Another way is to use the Tab key. Since the app has myriad default Tab stops, users need a Tab hierarchy. The top of the hierarchy has the menu stops Help, Demo, Speak, Braille, TeX, Dictate, and About, followed by the Input and Output windows, Settings, History, math styles, and the symbol galleries. The galleries appear in alphabetical order, Accents, Arrows, Binary, etc. The Tab key navigates these stops in the forward direction, while Shift+Tab navigates in the backward direction. The Enter key activates the current stop's facility. In an activated facility, the left and right arrow keys move between the facility's options. The Enter key then runs the option. For an active symbol gallery, the Enter key inserts the current symbol. For most settings, the Enter key toggles the current option. For menu stops, the Enter key sends the associated hot key. Each change is accompanied by explanatory speech.

UnicodeMath editing

When you type UnicodeMath into the input window, various conversions occur in the input window:

It's easier to type −> to get → than \rightarrow, although with math autocomplete you only need to type \ri<tab> to get →. Similarly, +− is easy for getting ±. Many of these operator pairs are listed in the following table.

Pair   Symbol   Pair   Symbol
+-     ±        -+     ∓
/=     ≠        /~     ≁
<=     ≤        >=     ≥
~=     ≅        ~~     ≈
::     ∷        :=     ≔
<<     ≪        >>     ≫
+−     ±        −+     ∓
−>     →        <−     ←
!!     ‼        ...    …
≯=     ≱        ≮=     ≰
⊀=     ⪱        ⊁=     ⪲
⊄=     ⊈        ⊅=     ⊉
/<     ≮        />     ≯

The combination <− gives ←. If you want to enter an expression like 𝑎 < −𝑏, put a space between the < and the −. These conversions aren't needed in the input window, but they make the input more readable. They also help in creating good-looking UnicodeMath expressions for use in plain-text scenarios.
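
The operator-pair conversions can be sketched as a table lookup over the input string. The dictionary below holds the pairs listed in the table above; the function name is hypothetical. The sketch makes a single left-to-right pass, so it does not reproduce chained conversions that would occur as you type one character at a time (for example, /< followed by =).

```python
# Operator pairs taken from the table above; "−" is U+2212, "-" is the ASCII hyphen.
OPERATOR_PAIRS = {
    "+-": "±", "-+": "∓", "/=": "≠", "/~": "≁",
    "<=": "≤", ">=": "≥", "~=": "≅", "~~": "≈",
    "::": "∷", ":=": "≔", "<<": "≪", ">>": "≫",
    "+−": "±", "−+": "∓", "−>": "→", "<−": "←",
    "!!": "‼", "...": "…", "≯=": "≱", "≮=": "≰",
    "⊀=": "⪱", "⊁=": "⪲", "⊄=": "⊈", "⊅=": "⊉",
    "/<": "≮", "/>": "≯",
}


def convert_pairs(text):
    """Replace each operator pair with its Unicode operator in one pass."""
    out = []
    i = 0
    while i < len(text):
        for size in (3, 2):  # try the three-character "..." before the pairs
            chunk = text[i:i + size]
            if chunk in OPERATOR_PAIRS:
                out.append(OPERATOR_PAIRS[chunk])
                i += len(chunk)
                break
        else:
            out.append(text[i])  # no pair starts here; keep the character
            i += 1
    return "".join(out)


print(convert_pairs("𝑎/=𝑏"))    # 𝑎≠𝑏
print(convert_pairs("𝑎 < −𝑏"))  # unchanged: the space separates < and −
```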

LaTeX and MathML editing

When you type LaTeX or MathML into the input window, control words for Unicode symbols are autocorrected to the symbols, and various operator pairs are converted to Unicode operators. For example, '$\alpha/=\beta' → '$𝛼≠𝛽'.

To facilitate entry, for LaTeX typing a { also inserts the closing }, and for MathML typing an opening tag also inserts the closing tag. Type Ctrl+→ to bypass a tag.

Editing hot keys

Hot key Function
Ctrl+b Toggle the bold attribute. For example, select 𝑎 (U+1D44E), type Ctrl+b and get 𝒂 (U+1D482) as you can verify in the codepoint window.
Ctrl+c Copy the selected text to the clipboard.
Alt+h Display the help page.
Ctrl+i Toggle the italic attribute. If applied to a math italic character, this changes the character to the UnicodeMath way of representing ordinary text, i.e., it is put inside quotes: select 𝑎, type Ctrl+i → “a”.
Alt+m Toggle between displaying 1) UnicodeMath in the input window and MathML below the output window, and 2) MathML in the input window and UnicodeMath below the output window.
Ctrl+v Paste plain text from the clipboard. If the text starts with <math, <m:math, or <mml:math, the text is treated as MathML and builds up.
Ctrl+x Copy the selected text to the clipboard, then delete the selected text.
Ctrl+y Redo
Ctrl+z Undo
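
The Ctrl+b example (𝑎 at U+1D44E becoming 𝒂 at U+1D482) reflects a fixed offset of 0x34 between the math italic and math bold-italic Latin letters in Unicode. The sketch below toggles between those two styles only. The function name is hypothetical, and the applet's own handling of other styles and characters is not shown here.

```python
ITALIC_FIRST, ITALIC_LAST = 0x1D434, 0x1D467            # 𝐴 .. 𝑧
BOLD_ITALIC_FIRST, BOLD_ITALIC_LAST = 0x1D468, 0x1D49B  # 𝑨 .. 𝒛
OFFSET = BOLD_ITALIC_FIRST - ITALIC_FIRST               # 0x34
ITALIC_H = 0x210E       # italic h lives in the Letterlike Symbols block
BOLD_ITALIC_H = 0x1D489


def toggle_bold(ch):
    """Toggle one character between math italic and math bold italic."""
    cp = ord(ch)
    if cp == ITALIC_H:
        return chr(BOLD_ITALIC_H)
    if cp == BOLD_ITALIC_H:
        return chr(ITALIC_H)
    if ITALIC_FIRST <= cp <= ITALIC_LAST:
        return chr(cp + OFFSET)
    if BOLD_ITALIC_FIRST <= cp <= BOLD_ITALIC_LAST:
        return chr(cp - OFFSET)
    return ch  # characters outside these ranges are left alone in this sketch


print(toggle_bold("𝑎"))  # 𝒂 (U+1D482)
```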

Symbol galleries

Unicode has almost all math symbols in use today. The symbol galleries located at the bottom of the web page contain the most common math symbols. You can enter a symbol in a gallery by clicking on it or by typing its control word as described in the Entering symbols section above.

Hovering over a symbol displays information about the symbol, specifically the Unicode code point, name, and block, as well as a LaTeX control word for entering the symbol and the symbol's math class. The symbol's Unicode category is defined in Table 4-4 of the Unicode Standard and the symbol's math class is defined in the comments of MathClass.txt, a file for Unicode Technical Report #25: Unicode Support for Mathematics. For example, hovering a script K (𝒦) displays

Here the category "Lu" stands for upper-case letter and the math class "A" stands for alphabetic.

Output window editing

You can enter equations and edit the built-up display in the output window as shown in this video

This "in-place" editing mimics the math editing experience in desktop Microsoft Word, Outlook, PowerPoint, and OneNote, and in the Windows Calculator. The hot keys listed above work here too, as do the symbol galleries and the math autocomplete menus. The copy hot key, Ctrl+c, copies the MathML for the selected content into the plain-text copy slot, rather than copying the underlying plain text. This enables you to paste built-up math equations into Word and other apps that interpret "plain-text" MathML as MathML rather than as plain text. Note: math autobuildup works with native MathML rendering; if MathJax is active, only Ctrl+c works.

The implementation uses JavaScript to manipulate the MathML in the browser DOM.

Intents

UnicodeMathML generates Presentation MathML 4. A key addition in MathML 4 is the intent attribute, which allows authors to disambiguate math notation and control math speech.

For example, does |𝑥| mean the absolute value of 𝑥 or the cardinality of 𝑥? Absolute value is assumed by default since absolute value is more common than cardinality. The default MathML for |x| is

<mrow intent="absolute-value(𝑥)">
  <mo>|</mo><mi>𝑥</mi><mo>|</mo></mrow>.

To specify cardinality, enter \card(x) (or ⓒ(x)). These inputs produce the MathML

<mrow intent="cardinality(𝑥)">
  <mo>|</mo><mi>𝑥</mi><mo>|</mo></mrow>.

If you enter an absolute value or cardinality containing more than one symbol as in |a+b|, the MathML intent contains an argument reference $a. For |a+b|, the MathML is

<mrow intent="absolute-value($a)">
  <mo>|</mo>
    <mrow arg="a">
      <mi>𝑎</mi><mo>+</mo><mi>𝑏</mi></mrow>
  <mo>|</mo></mrow>

A matrix enclosed in vertical bars is treated as a determinant. For example, the UnicodeMath |■(a&b@c&d)| builds up to

\[\begin{vmatrix} a & b \\ c & d \end{vmatrix}\]

which has the MathML

<mrow intent="determinant($a)">
  <mo>|</mo>
    <mtable arg="a">
      <mtr>
        <mtd><mi>𝑎</mi></mtd><mtd><mi>𝑏</mi></mtd></mtr>
      <mtr><mtd><mi>𝑐</mi></mtd><mtd><mi>𝑑</mi></mtd></mtr></mtable>
  <mo>|</mo></mrow>.

The program infers intent attributes for absolute value and determinant, so only cardinality needs to be input without vertical bars. Note that the ambiguous expression |𝑎|𝑏+𝑐|𝑑| is assumed to be (|𝑎|)𝑏+𝑐(|𝑑|). If you want |𝑎(|𝑏+𝑐|)𝑑|, enter |(𝑎|𝑏+𝑐|𝑑)| and the parentheses will be removed.
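
The two intent shapes shown above (the symbol written directly into the intent for a single symbol, and an argument reference $a for longer content) can be sketched with Python's standard xml.etree.ElementTree. The helper names are hypothetical, and the sketch covers only the absolute-value and cardinality cases, not determinants or the applet's actual implementation.

```python
import xml.etree.ElementTree as ET


def leaf(tag, text):
    """Build a token element such as <mi>𝑥</mi>."""
    element = ET.Element(tag)
    element.text = text
    return element


def bars_with_intent(name, children):
    """Wrap MathML elements in vertical bars and attach an intent attribute."""
    outer = ET.Element("mrow")
    ET.SubElement(outer, "mo").text = "|"
    if len(children) == 1:
        # Single symbol: the symbol itself appears in the intent.
        outer.set("intent", f"{name}({children[0].text})")
        outer.append(children[0])
    else:
        # Longer content: wrap it in an mrow and refer to it by arg="a".
        outer.set("intent", f"{name}($a)")
        inner = ET.SubElement(outer, "mrow", {"arg": "a"})
        inner.extend(children)
    ET.SubElement(outer, "mo").text = "|"
    return ET.tostring(outer, encoding="unicode")


print(bars_with_intent("cardinality", [leaf("mi", "𝑥")]))
print(bars_with_intent("absolute-value",
                       [leaf("mi", "𝑎"), leaf("mo", "+"), leaf("mi", "𝑏")]))
```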

As we see here, some intent attribute values are implied by the input notations of LaTeX and UnicodeMath. Others are implied by context. Still others must be declared explicitly by the content author, by a math-knowledgeable copy editor, or maybe eventually by AI.

Author intents

Since most content authors don’t know MathML, we need a way to allow them to enter intents easily. To this end, UnicodeMathML has an output-window context-menu option that lets you tag entities with intents. For example, right-clicking on the 𝐸 in 𝐸 = 𝑚𝑐² brings up an input box in which you can type “energy”, or whatever you want, followed by the Enter key. If you type in “energy”, the resulting MathML is

<mrow>
  <mi intent="energy">𝐸</mi>
  <mo>=</mo>
  <mrow>
    <mi>𝑚</mi>
    <msup><mi>𝑐</mi>
    <mn>2</mn></msup></mrow></mrow>

Typing Alt+s speaks this as "energy equals m c squared".
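
Tagging an element with an author intent comes down to setting one attribute on the chosen element. The sketch below does this with Python's standard xml.etree.ElementTree on the MathML for 𝐸 = 𝑚𝑐². The function name is hypothetical, and finding the element by its text stands in for the right-click selection in the app.

```python
import xml.etree.ElementTree as ET

MATHML = (
    "<mrow><mi>𝐸</mi><mo>=</mo>"
    "<mrow><mi>𝑚</mi><msup><mi>𝑐</mi><mn>2</mn></msup></mrow></mrow>"
)


def tag_intent(root, symbol, intent):
    """Set the intent attribute on the first <mi> whose text is symbol."""
    for element in root.iter("mi"):
        if element.text == symbol:
            element.set("intent", intent)
            return True
    return False  # symbol not found; nothing was changed


root = ET.fromstring(MATHML)
tag_intent(root, "𝐸", "energy")
result = ET.tostring(root, encoding="unicode")
print(result)
```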

TeX macros

You can use [La]TeX macros with [La]TeX input. Simple examples are:

Macro                  Use     Result
\def\f{x_1+...+x_n}    \f      𝑥₁+⋯+𝑥_𝑛
\def\g#1#2{#1+#2}      \g ab   𝑎 + 𝑏

The last equation in the Examples gallery is LaTeX that defines a macro and then uses it:

\[\def\g#1#2{#1f(#2)}\g\relax{x}=\int_{-\infty}^\infty \g\hat\xi\,e^{2 \pi i \xi x} \,d\xi\]

This displays as

\[f(x)=\int_{-\infty}^{\infty}\hat{f}(\xi)\,e^{2\pi i\xi x}\,d\xi\]

LaTeX \newcommand syntax is also supported.
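
For readers unfamiliar with \def, the following sketch shows how the simple macros above expand. It is a string-based illustration with hypothetical function names, not the applet's macro processor. It assumes an argument is a braced group, a control word, or a single character, and it appends a space to control-word arguments so they stay separate from the letters that follow.

```python
import re


def parse_def(definition):
    r"""Split '\def\name#1#2{body}' into (name, argument count, body)."""
    match = re.fullmatch(r"\\def\\([A-Za-z]+)((?:#\d)*)\{(.*)\}", definition)
    if match is None:
        raise ValueError(f"unsupported definition: {definition!r}")
    name, params, body = match.groups()
    return name, len(params) // 2, body


def read_arg(text, pos):
    """Read one argument at pos; return (argument, position after it)."""
    while pos < len(text) and text[pos] == " ":
        pos += 1
    if pos >= len(text):
        raise ValueError("missing macro argument")
    if text[pos] == "{":  # braced group, possibly nested
        depth = 0
        for end in range(pos, len(text)):
            if text[end] == "{":
                depth += 1
            elif text[end] == "}":
                depth -= 1
                if depth == 0:
                    return text[pos + 1:end], end + 1
        raise ValueError("unbalanced braces")
    word = re.match(r"\\[A-Za-z]+", text[pos:])
    if word:  # control word; the added space keeps it apart from what follows
        return word.group() + " ", pos + word.end()
    return text[pos], pos + 1  # single character


def expand(text, macro):
    """Replace every use of the macro in text with its expanded body."""
    name, nargs, body = macro
    token = "\\" + name
    out, pos = [], 0
    while pos < len(text):
        after = pos + len(token)
        if text.startswith(token, pos) and not text[after:after + 1].isalpha():
            pos = after
            args = []
            for _ in range(nargs):
                arg, pos = read_arg(text, pos)
                args.append(arg)
            # Substitute #1, #2, ... in a single pass over the body.
            out.append(re.sub(r"#(\d)", lambda m: args[int(m.group(1)) - 1], body))
        else:
            out.append(text[pos])
            pos += 1
    return "".join(out)


print(expand("\\g ab", parse_def("\\def\\g#1#2{#1+#2}")))  # a+b
```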

UnicodeMath selection attributes

Technical stuff: When you edit the output window, the resulting MathML includes attributes that represent the state of the user selection. These attributes have been added partly because they are needed to make editing accessible. The attribute "selanchor" defines the selection "anchor" end (the nonmoving end) and "selfocus" defines the selection active end, i.e., the end that moves with Shift+→. The attribute values define the offsets for the selection setBaseAndExtent method. If the selection is an insertion point (a degenerate selection), only selanchor is included since the anchor and focus ends coincide.

Corresponding constructs have been added to UnicodeMath to represent the selection state. They are needed for the multilevel undo facility, which saves back states by caching the back-state UnicodeMath strings. The enclosure Ⓐ(offset) defines the position of the selection anchor and the enclosure Ⓕ(offset) defines the position of the selection focus. If no offset appears, 0 is assumed. To increase readability, these enclosures are not included in the UnicodeMath displayed in the input window. Nondegenerate selections have the focus enclosure as well, as in the UnicodeMath "Ⓐ()Ⓕ(1)⬚" for the selected "⬚".

A negative offset is used if the selection construct refers to a text node. The absolute value of a negative offset gives the offset into a string. For example, <mi selanchor="-1">sin</mi> sets the anchor to the "i" in "sin". Positive attribute values give the index of a child element. So, <mi selanchor="1">sin</mi> places the anchor immediately following "sin".
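
The sign convention for these attribute values can be sketched as a small decoder. The function name and the returned tuple are hypothetical; the sketch only interprets the attribute as described above and does not touch a browser selection.

```python
import xml.etree.ElementTree as ET


def describe_anchor(mathml):
    """Interpret selanchor: negative means text offset, otherwise child index."""
    element = ET.fromstring(mathml)
    value = element.get("selanchor")
    if value is None:
        return None  # this element does not hold the selection anchor
    offset = int(value)
    if offset < 0:
        return ("text", -offset)  # offset into the element's string
    return ("child", offset)      # index among the element's child nodes


print(describe_anchor('<mi selanchor="-1">sin</mi>'))  # ('text', 1)
print(describe_anchor('<mi selanchor="1">sin</mi>'))   # ('child', 1)
```

In the first case the anchor sits at offset 1 in "sin", that is, at the "i". In the second it follows the element's only child, the text "sin".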