Saturday, 20 June 2009

LaTeX and ASCIIMathML to Content MathML and Maxima syntax

One of the slightly experimental new features included in SnuggleTeX 1.1.0 is the ability to attempt to convert a limited but hopefully useful subset of math mode LaTeX into "more semantic" formats, such as Content MathML and Maxima input syntax. (This release of SnuggleTeX also includes some equally experimental support for trying to do the same thing with the raw output produced by ASCIIMathML.)

The context of this work was the JISC MathAssess project, where we looked at the feasibility of allowing students of "foundation level" mathematics to input mathematics into computer-aided assessment software using "lax" input syntaxes such as LaTeX or ASCIIMathML. This idea was considered as a possible alternative to using Excel-like formats, or requiring students to learn the syntax for Computer Algebra Systems such as Maxima, Maple or Mathematica.

In general, this "up-conversion" approach - going from "low semantics" such as LaTeX to "higher semantics" such as Content MathML - is not possible. Why not? Well, you don't actually have to look far to see why! Consider the mathematical symbol e. In some contexts, this might represent the exponential constant 2.718... but it might also represent the identity element in a group, or some physical quantity. So context is clearly important! Another very trivial example is to compare the written mathematical expressions f(x+2) and a(x+2). Someone who has studied any mathematics will probably read the first of these as the function f applied to x+2, and the second as the product of a and x+2. So, again, the underlying context is vitally important, but it can sometimes be inferred by assuming and following certain conventions. (This is, however, complicated by the fact that mathematical notations are localised, so notations common in the UK are not necessarily common anywhere else!)
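To make the f(x+2) versus a(x+2) point concrete, here is a toy sketch in Python. It is not how SnuggleTeX does it (that is all XSLT); the set of "assumed function names" and the interpret helper are made up purely for illustration, and the output is written in a Maxima-like style.

```python
# Toy illustration only: this is NOT SnuggleTeX's implementation.
# Hypothetical convention: these letters are read as function names
# when they are immediately followed by a bracketed expression.
ASSUMED_FUNCTION_NAMES = {"f", "g", "h"}


def interpret(head: str, argument: str) -> str:
    """Return a Maxima-style reading of the written form head(argument)."""
    if head in ASSUMED_FUNCTION_NAMES:
        # Convention says this is function application.
        return f"{head}({argument})"
    # Otherwise assume implicit multiplication.
    return f"{head}*({argument})"


print(interpret("f", "x+2"))  # f(x+2)
print(interpret("a", "x+2"))  # a*(x+2)
```

The same written shape gives two different readings, and the only thing deciding between them is the assumed convention. Change the set of names and the answer changes too.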

The approach we take is to look at only a very restricted subset of symbols and constructs, interpreted using conventions that are considered common, sensible and familiar in the UK. In fact, this subset covers a pretty reasonable spectrum of the mathematical contexts that we're aiming at. From this base, it is possible to convert the simple, display-oriented Presentation MathML we expect to get from SnuggleTeX and ASCIIMathML into a more semantic Presentation MathML representation that renders the same way, before converting this to Content MathML and then finally into other formats such as Maxima.
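As a rough sketch of what the last step of that chain (Content MathML to Maxima) involves, here is a small Python example. Again, this is only an illustration and not the SnuggleTeX code: the INFIX table and the to_maxima function are hypothetical, only a handful of Content MathML elements are handled, and the input is assumed to be namespace-free to keep things short.

```python
import xml.etree.ElementTree as ET

# Hypothetical operator table covering only a few Content MathML elements.
INFIX = {"plus": "+", "minus": "-", "times": "*", "divide": "/", "power": "^"}


def to_maxima(node: ET.Element) -> str:
    """Convert a tiny, namespace-free subset of Content MathML to Maxima syntax."""
    if node.tag in ("ci", "cn"):
        # Identifiers and numbers are copied through as-is.
        return (node.text or "").strip()
    if node.tag == "exponentiale":
        # Maxima writes the exponential constant as %e.
        return "%e"
    if node.tag == "apply":
        operator, *operands = list(node)
        args = [to_maxima(child) for child in operands]
        if operator.tag in INFIX and len(args) >= 2:
            # Always bracket, so operator precedence never matters here.
            return "(" + INFIX[operator.tag].join(args) + ")"
        if operator.tag == "ci":
            # An identifier in operator position is a function application.
            return f"{to_maxima(operator)}({','.join(args)})"
        raise ValueError(f"unsupported operator: {operator.tag}")
    raise ValueError(f"unsupported element: {node.tag}")


content_mathml = (
    "<apply><times/><ci>a</ci>"
    "<apply><plus/><ci>x</ci><cn>2</cn></apply></apply>"
)
print(to_maxima(ET.fromstring(content_mathml)))  # (a*(x+2))
```

Once the meaning has been pinned down in Content MathML, producing Maxima input is a fairly mechanical tree walk. The hard part is the earlier step of deciding what the presentation markup was supposed to mean.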

More details on the mechanics of this process can be found in the SnuggleTeX documentation under Semantic Up-Conversion. Techy folks interested in the actual implementation might want to know that it's all done using XSLT 2.0, which is well suited to these types of conversions and is an absolute joy to use. You're welcome to rip off our XSLT and perhaps use it as a basis for similar processes, if useful. It's all in the "full" ZIP distribution of SnuggleTeX. Feel free to ask if you want more information...

SnuggleTeX 1.1.0 Released

We are pleased to announce the release of SnuggleTeX 1.1.0, which is now available to download, read about and mess around with online.

So what's new in this release? Well, the most significant new feature is (still somewhat experimental) support for "up-converting" certain LaTeX math inputs from the default Presentation MathML 2.0 output to more semantic formats, such as Content MathML 2.0 and Maxima input syntax. This release also includes support for doing the same thing to the raw Presentation MathML generated by ASCIIMathML. This work was undertaken as part of my involvement with the JISC MathAssess project but is something that other people might find useful in other settings. I talk about this a bit more in a later blog posting...

As well as this, there have been a number of other enhancements, including a new Maven-based modular project structure, better distribution bundles (a "basic" ZIP if you just want the core functionality, "full" if you want everything), extra utilities and helpers, a simple way of generating web pages, better documentation and various bug fixes. I've also added some new demos for you to play around with - try the "Simple Math Input Demo" to get you started! For full details of what's new, check out the Release Notes.