
Make MMLbase support trees
Closed, ResolvedPublic

Description

Goal: Extend the class \MediaWiki\Extension\Math\WikiTexVC\MMLnodes\MMLbase to support tree structures.

The subclasses of MMLbase should follow the MathML Core spec, and only the part of the spec that is actually used should be implemented. For example, the attribute dir should not be accepted since it cannot be generated by WikiTexVC.

In MathML, there are two types of elements: token elements, which have text content but no child elements, and all other elements, which have children but no text of their own. Therefore, I suggest adding a new abstract base class, MMLleaf, to be used for all MathML elements with text content.

MMLbase and its subclasses ensure that, instead of dealing with an arbitrary string, one can rely on the structure being a valid MathML tree according to the MathML Core spec. This serves two purposes: 1) validation that only standard-conforming MathML Core is generated, and 2) subtree matching, which becomes possible due to a normalized representation of the expression.

Suggested implementation path:

Step 0 (optional) rename classes ✅

As we will be working a lot with the MMLm* classes, I would prefer omitting the MML prefix in the class name. Especially since the classes are in a dedicated namespace and mainly start with m, it makes the readability better. However, this is just a question of taste. Moreover, now is a historic moment where almost all code reviews are complete, and this refactoring would not cause merge conflicts.

  • Agreement on MML prefix achieved

--> resolution: no we will not rename the classes and keep the MML prefix (for now)

Step 1a: Mark leaves as such ✅

A first commit can be a new class MMLleaf with text as an additional (optional) constructor argument. The class should extend MMLbase, and the following classes should be changed to be subclasses of that class:

  • MMLmi
  • MMLmo
  • MMLmn
  • MMLms (if ever used)
  • MMLmtext
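The leaf idea can be sketched as follows. This is only an illustration under assumptions: the class names SketchBase, SketchLeaf and SketchMi are hypothetical stand-ins, and the argument order (texclass, attributes, text) is modeled on the constructor calls that appear later in this task, not on the final code.

```php
<?php
// Hypothetical stand-in for MMLbase: element name, texclass and attributes only.
abstract class SketchBase {
	public function __construct(
		private string $name,
		private string $texclass = '',
		private array $attributes = []
	) {
	}

	public function getName(): string {
		return $this->name;
	}

	public function getAttributes(): array {
		return $this->attributes;
	}
}

// Hypothetical stand-in for MMLleaf: a token element that carries text, never children.
abstract class SketchLeaf extends SketchBase {
	public function __construct(
		string $name,
		string $texclass = '',
		array $attributes = [],
		private string $text = ''
	) {
		parent::__construct( $name, $texclass, $attributes );
	}

	public function getText(): string {
		return $this->text;
	}
}

// One concrete token element; mo, mn, ms and mtext would follow the same shape.
class SketchMi extends SketchLeaf {
	public function __construct( string $texclass = '', array $attributes = [], string $text = '' ) {
		parent::__construct( 'mi', $texclass, $attributes, $text );
	}
}

$mi = new SketchMi( '', [ 'mathvariant' => 'bold' ], 'x' );
echo $mi->getName(), ': ', $mi->getText(), "\n";
```

Keeping the text in a dedicated leaf class means non-leaf elements have no way to carry stray text at all.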

In the same commit, absolutely minimal unit tests should be added in tests/phpunit/unit/WikiTexVC/MMLNodes for

  • MMLleaf
  • MMLmi
  • MMLmo
  • MMLmn
  • MMLms (if ever used)
  • MMLmtext

The tests should only test if the constructor works and if they return the correct value for name(). The overall idea would be to phase out the encapsulate start etc. Thus these should not be changed or tested.

Step 1b: Deprecate non-core MathML elements ✅

We should deprecate all classes not in the official list. According to my analysis this is

  • menclose (most uses should already be removed per T377167)
  • double check if more classes are not in core.

Step 1c: Implement an MML converter service skeleton ✅

Ultimately, MMLbase should be stringable and have an interface to add links based on a list of other MMLbase nodes. Possible method signatures read

public function __toString(): string
public function annotateSubtrees( MMLbase ...$subtrees ) (in a later step)

However, as there are various ways to implement it, I think it would be better to transform it into a service so that dependencies can be injected the usual way.

Thus, I suggest building an MMLvisitor service that fulfills these two functions. In the first step, one can follow the manual and the example of the input check service.

The first commit aims to extend ServiceWiring.php, extension.json, and src/Math.php for the new Math.MathMLTreeVisitor service and new MediaWiki\Extension\Math\WikiTexVC\MMLnodes\VisitorFactory and abstract baseVisitor and respective tests.
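A minimal sketch of the visitor idea, independent of MediaWiki service wiring. All names (SketchVisitor, SketchNode, SketchVisitorFactory, SketchNameCollector) are hypothetical; only the accept( $visitor ) call shape is taken from the test code quoted later in this task.

```php
<?php
// Hypothetical visitor contract: one entry point per visited node.
interface SketchVisitor {
	public function visit( SketchNode $node ): void;
}

// Hypothetical node that hands itself to whatever visitor it is given.
class SketchNode {
	public function __construct( private string $name, private string $text = '' ) {
	}

	public function getName(): string {
		return $this->name;
	}

	public function getText(): string {
		return $this->text;
	}

	public function accept( SketchVisitor $visitor ): void {
		$visitor->visit( $this );
	}
}

// Trivial visitor that only records the element names it has seen.
class SketchNameCollector implements SketchVisitor {
	/** @var string[] */
	private array $names = [];

	public function visit( SketchNode $node ): void {
		$this->names[] = $node->getName();
	}

	public function getNames(): array {
		return $this->names;
	}
}

// Factory that a service container could return, so that nodes never build visitors themselves.
class SketchVisitorFactory {
	public function createVisitor(): SketchNameCollector {
		return new SketchNameCollector();
	}
}

$factory = new SketchVisitorFactory();
$visitor = $factory->createVisitor();
( new SketchNode( 'mi', 'x' ) )->accept( $visitor );
( new SketchNode( 'mo', '+' ) )->accept( $visitor );
echo implode( ',', $visitor->getNames() ), "\n";
```

Routing visitor creation through a factory is what allows dependencies to be injected in the usual way instead of being hard-coded in MMLbase.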

Step 2 Implement MMLDomVisitor ✅

The second step is to implement the __toString functionality by converting MMLbase elements to DOMElements, attaching them to a DOMDocument, and using saveXML to get the XML representation.

Here we start with supporting only MMLleaf elements and MMLbase elements with no children.

In a nutshell

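$dom = new DOMDocument();
$element = $dom->createElement( $leaf->getName() );
// The text node carries the leaf's text content, not its element name.
$element->appendChild( $dom->createTextNode( $leaf->getText() ) );
$dom->appendChild( $element );
// Keep the serialized result instead of discarding it.
$html = $dom->saveHTML();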

Step 3a Implement tree structure ✅

  • Familiarize yourself with the constructors of TexNode, TexArray, DQ and FQ
    • the function of the protected variable args is similar to the concept of children. However, for MMLbase the children should always be of type MMLbase never strings (they are wrapped into MMLleaf elements)
    • The TexNode constructors ensure that only valid trees can be built. For MMLbase this is currently not the case and we had bugs in the past where we had fractions and subscripts with more than two arguments.
    • By default, there is no way to add or remove arguments. An exception is TexArray which has push, pop, unshift... This architecture has proven to be successful in the past. The fact that the args variable is protected and not private allows the tree to be modified. However, implementing only specific modifications safeguards against situations where you end up with invalid trees.

After having reviewed this, we might want to discuss the following idea to represent children in the MML tree:

  • The MMLbase node
    • gets a protected array of type MMLbase that stores the children
    • the constructor of MMLbase is extended by a variable length argument of children
    • a method to get the children is added
  • The non-leaf MML nodes (this can also be a second commit if the change would get too large)
    • get additional constructor arguments according to the Core spec
  • Tests are added to check the above functionality
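The children proposal above could look roughly like this. It is a sketch under assumptions: SketchBase, SketchLeaf, SketchFrac and SketchRow are hypothetical names, and attributes are left out to keep the focus on how constructor signatures restrict the shape of the tree.

```php
<?php
// Hypothetical stand-in for MMLbase with a protected list of typed children.
class SketchBase {
	/** @var SketchBase[] */
	protected array $children;
	private string $name;

	public function __construct( string $name, SketchBase ...$children ) {
		$this->name = $name;
		$this->children = $children;
	}

	public function getName(): string {
		return $this->name;
	}

	/** @return SketchBase[] */
	public function getChildren(): array {
		return $this->children;
	}
}

// Leaves wrap strings, so children of inner nodes are never raw strings.
class SketchLeaf extends SketchBase {
	private string $text;

	public function __construct( string $name, string $text = '' ) {
		parent::__construct( $name );
		$this->text = $text;
	}

	public function getText(): string {
		return $this->text;
	}
}

// The fixed signature means a fraction node always holds exactly two children.
class SketchFrac extends SketchBase {
	public function __construct( SketchBase $numerator, SketchBase $denominator ) {
		parent::__construct( 'mfrac', $numerator, $denominator );
	}
}

// A row accepts any number of children, but each one must be a node.
class SketchRow extends SketchBase {
	public function __construct( SketchBase ...$children ) {
		parent::__construct( 'mrow', ...$children );
	}
}

$frac = new SketchFrac( new SketchLeaf( 'mn', '1' ), new SketchLeaf( 'mn', '2' ) );
$row = new SketchRow( $frac, new SketchLeaf( 'mo', '+' ), new SketchLeaf( 'mi', 'x' ) );
echo $row->getName(), ' has ', count( $row->getChildren() ), " children\n";
```

Because every child parameter is typed, passing a plain string where a node is expected raises a TypeError at construction time.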

Step 3b Implement __toString for leaves ✅

Use the DomVisitor implemented in Step 2 for the __toString method of the MMLleaf.

Check that results match the respective encapsulate (not encapsulate raw) functions of the leaf nodes.

Compare the performance of __toString and encapsulateRaw and document the results, for example in the doc folder.

Step 4.a Extend the DOMVisitor to support MMLbase elements ✅

In Step 2, the DomVisitor only supported MMLleaf elements. Now, the DomVisitor should be extended to handle arbitrary MMLbase elements.

Additional logic is needed to add children, and more testing is needed.
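The additional logic for children is essentially a recursion over the tree. The following sketch assumes a hypothetical SketchNode with public properties and a hypothetical SketchDomVisitor; it is not the MMLDomVisitor from the extension, only an illustration of the traversal using PHP's DOM extension.

```php
<?php
// Hypothetical node type: leaves carry text, inner nodes carry children.
class SketchNode {
	/** @var SketchNode[] */
	public array $children;

	public function __construct(
		public string $name,
		public array $attributes = [],
		public string $text = '',
		SketchNode ...$children
	) {
		$this->children = $children;
	}
}

// Hypothetical visitor that mirrors the node tree into a DOMDocument.
class SketchDomVisitor {
	private DOMDocument $dom;

	public function __construct() {
		$this->dom = new DOMDocument();
	}

	public function visit( SketchNode $node, ?DOMNode $parent = null ): void {
		$element = $this->dom->createElement( $node->name );
		foreach ( $node->attributes as $key => $value ) {
			$element->setAttribute( $key, $value );
		}
		if ( $node->text !== '' ) {
			// Text nodes are escaped by the serializer, so no manual escaping is needed.
			$element->appendChild( $this->dom->createTextNode( $node->text ) );
		}
		// The root element is attached to the document, everything else to its parent.
		( $parent ?? $this->dom )->appendChild( $element );
		foreach ( $node->children as $child ) {
			$this->visit( $child, $element );
		}
	}

	public function getXML(): string {
		return $this->dom->saveXML( $this->dom->documentElement );
	}
}

$tree = new SketchNode( 'msup', [], '',
	new SketchNode( 'mi', [], 'x' ),
	new SketchNode( 'mn', [], '2' )
);
$visitor = new SketchDomVisitor();
$visitor->visit( $tree );
echo $visitor->getXML(), "\n"; // <msup><mi>x</mi><mn>2</mn></msup>
```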

Step 4.b Replace usages for encapsulate with __toString ✅

Review usages of the encapsulate function

Replace inappropriate use with other methods (extra commit):

  • checkAndParseColor generates an empty mstyle element. I wonder if that method is actively used. --> Replace with encapsulate raw
  • macro mrow should become merror
  • Replace remaining usages of encapsulate by adding the argument to the constructor and casting to a string
  • remove encapsulate method and tests.

Step 4.c Replace usages for encapsulateRaw when called on MMLleaf nodes with __toString for leaf nodes ✅

The following MMLleaf nodes use the encapsulateRaw method

Usages in All Places  (58 usages found)
    Method call  (58 usages found)
        mw  (58 usages found)
            BaseMethods.php  (10 usages found)
                BaseMethods  (10 usages found)
                    checkAndParseColor  (1 usage found)
                        229 $innerRow .= $mi->encapsulateRaw( $char );
                    checkAndParseDelimiter  (1 usage found)
                        190 return $mo->encapsulateRaw( $resDelimiter[0] );
                    checkAndParseMathCharacter  (1 usage found)
                        204 return $mi->encapsulateRaw( $enc );
                    checkAndParseOperator  (1 usage found)
                        72 return $mmlMo->encapsulateRaw( $input );
                    parseIdentifier  (1 usage found)
                        166 $text = $mi->encapsulateRaw( $uc );
                    parseOperator  (2 usages found)
                        123 $text = $mo->encapsulateRaw( $uc . "̸" );
                        125 $text = $mo->encapsulateRaw( $uc );
                    parseOperatorDict  (3 usages found)
                        101 return $mmlMo->encapsulateRaw( "<" );
                        104 return $mmlMo->encapsulateRaw( ">" );
                        112 return $mmlMo->encapsulateRaw( $input );
            BaseParsing.php  (35 usages found)
                BaseParsing  (35 usages found)
                    accent  (1 usage found)
                        82 $mo->encapsulateRaw( $entity )
                    array  (2 usages found)
                        99 $output .= $moOpen->encapsulateRaw( $resDelimiter[0] );
                        116 $output .= $moClose->encapsulateRaw( $resDelimiter[0] );
                    customLetters  (2 usages found)
                        193 return $mrow->encapsulateRaw( $mo->encapsulateRaw( $char ) );
                        197 return $mrow->encapsulateRaw( $mi->encapsulateRaw( $char ) );
                    dots  (1 usage found)
                        224 return $mo->encapsulateRaw( "…" );
                    genFrac  (2 usages found)
                        294 $output .= $mrowOpen->encapsulateRaw( $moL->encapsulateRaw( $left ) );
                        301 $output .= $mrowClose->encapsulateRaw( $moR->encapsulateRaw( $right ) );
                    hBox  (5 usages found)
                        1157 return $mmlMrow->encapsulateRaw( $mo->encapsulateRaw( MMLutil::uc2xNotation( $input ) ) );
                        1161 return $mmlMrow->encapsulateRaw( $mtext->encapsulateRaw( "\mbox" ) );
                        1168 return $mmlMrow->encapsulateRaw( $mstyle->encapsulateRaw( $mtext->encapsulateRaw( $inner ) ) );
                        1173 return $mmlMrow->encapsulateRaw( $mtext->encapsulateRaw( $inner ) );
                        1190 return $mmlMrow->encapsulateRaw( $mtext->encapsulateRaw( $inner ) );
                    hskip  (1 usage found)
                        358 return $mspace->encapsulateRaw( "" );
                    macro  (10 usages found)
                        394 return $mtext->encapsulateRaw( ' ' );
                        429 $mmlMi->encapsulateRaw( "lim" ) . $mo->encapsulateRaw( "―" ) ) );
                        437 $mi->encapsulateRaw( "lim" ) .
                        438 $mo->encapsulateRaw( "→" ) )
                        485 return $mstyle->encapsulateRaw( $mspace->getEmpty() ) . $mo->encapsulateRaw( "⟹" ) .
                        491 return $mstyle->encapsulateRaw( $mspace->getEmpty() ) . $mo->encapsulateRaw( "⟺" ) .
                        496 return $mo->encapsulateRaw( "—" );
                        516 return $mtext->encapsulateRaw( " " ) .
                        520 $mo->encapsulateRaw( "⟵" ) ) ) .
                        523 $mo->encapsulateRaw( "⟶" )
                    makeBig  (1 usage found)
                        917 return $mrowOuter->encapsulateRaw( $mrow->encapsulateRaw( $mo->encapsulateRaw( $argPrep ) ) );
                    matrix  (2 usages found)
                        592 $mmlMoOpen = $mmlMoOpen->encapsulateRaw( $open ?? '' );
                        600 $mmlMoClose = $mmlMoClose->encapsulateRaw( $close );
                    namedFn  (1 usage found)
                        933 return $mi->encapsulateRaw( ltrim( $name, '\\' ) ) . $applyFct;
                    namedOp  (1 usage found)
                        615 return $mi->encapsulateRaw( $id ?? ltrim( $name, '\\' ) ) . $applyFct;
                    oint  (3 usages found)
                        672 return $mStyle->encapsulateRaw( $mo->encapsulateRaw( MMLutil::uc2xNotation( $uc ) ) );
                        675 return $mo->encapsulateRaw( MMLutil::uc2xNotation( $uc ) );
                        683 $mmlText->encapsulateRaw( MMLutil::uc2xNotation( $uc ) )
                    underOver  (1 usage found)
                        799 $mo->encapsulateRaw( $inner )
                    xArrow  (2 usages found)
                        1284 $mstyle->encapsulateRaw( $moArrow->encapsulateRaw( $char ) ) .
                        1298 $mstyle->encapsulateRaw( $moArrow->encapsulateRaw( $char ) ) .
            Box.php  (1 usage found)
                Box  (1 usage found)
                    renderMML  (1 usage found)
                        64 $mtext->encapsulateRaw( $arg )
            ChemWord.php  (2 usages found)
                ChemWord  (2 usages found)
                    renderMML  (2 usages found)
                        51 $mtextLeft->encapsulateRaw( $this->getLeft()->renderMML( [], $state ) )
                        52 . $mtextRight->encapsulateRaw( $right ) ) );
            Fun1.php  (1 usage found)
                Fun1  (1 usage found)
                    createMover  (1 usage found)
                        65 $mo->encapsulateRaw( $inner )
            Literal.php  (4 usages found)
                Literal  (4 usages found)
                    createVlineElement  (1 usage found)
                        209 $mStyle->encapsulateRaw( $mo->encapsulateRaw( "|" ) ) ) );
                    renderMML  (3 usages found)
                        75 return $mn->encapsulateRaw( $this->changeUnicodeFontInput( $this->arg, $state ) );
                        91 return $mi->encapsulateRaw( $operatorContent["foundOC"] );
                        143 return $mi->encapsulateRaw( $this->changeUnicodeFontInput( $input, $state ) ); // $this->arg
            Lr.php  (2 usages found)
                Lr  (2 usages found)
                    renderMML  (2 usages found)
                        72 $left = $moLeft->encapsulateRaw( $this->right );
                        78 $right = $moRight->encapsulateRaw( $this->right );
            MMLParsingUtil.php  (2 usages found)
                MMLParsingUtil  (2 usages found)
                    createNot  (1 usage found)
                        109 return $mmlMrow->encapsulateRaw( $mpadded->encapsulateRaw( $mtext->encapsulateRaw( "⧸" ) ) );
                    renderApplyFunction  (1 usage found)
                        21 return $mo->encapsulateRaw( "⁡" );
            TexArray.php  (1 usage found)
                TexArray  (1 usage found)
                    addDerivativesContext  (1 usage found)
                        381 $mml = $msup->encapsulateRaw( $mml . $moDeriv->encapsulateRaw( $derInfo ) );

Replace patterns like

				$mmlMo = new MMLmo();
				return $mmlMo->encapsulateRaw( "&lt;" );

by return (string)( new MMLmo( '', [], '<' ) );
If the argument is a variable $x rather than a literal, use html_entity_decode to convert it to plain text. The above example would then become
return (string)( new MMLmo( '', [], html_entity_decode( $x ) ) );
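As an illustration of the decoding step, the snippet below shows what html_entity_decode returns for a named entity and for numeric character references. The chosen code points and the explicit flags are only examples and not taken from the extension.

```php
<?php
// Decode entity notation once, so that node text holds plain Unicode characters.
$flags = ENT_QUOTES | ENT_HTML5;
$lt = html_entity_decode( '&lt;', $flags, 'UTF-8' );
$arrow = html_entity_decode( '&#x27F9;', $flags, 'UTF-8' );
$doubleStruckA = html_entity_decode( '&#x1D538;', $flags, 'UTF-8' );

// strlen() counts bytes, which makes the UTF-8 width of each character visible.
printf( "%s takes %d byte(s)\n", $lt, strlen( $lt ) );
printf( "%s takes %d byte(s)\n", $arrow, strlen( $arrow ) );
printf( "%s takes %d byte(s)\n", $doubleStruckA, strlen( $doubleStruckA ) );
```

Characters outside the Basic Multilingual Plane, such as U+1D538, occupy four bytes in UTF-8.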

Step 4d: Replace usages of MMLbase::getEmpty. ✅

We have to replace MMLbase::getEmpty. This is only called for MMLmspace or MMLmrow and returns the empty element <mspace/>. With our current implementation, we get the full tag without inner text: <mspace></mspace>. To fix this, we change MMLDomVisitor::getHTML from $this->dom->saveHTML( $this->dom->documentElement ) to $this->dom->saveXML( $this->dom->documentElement ). saveXML self-closes empty elements by default (the LIBXML_NOEMPTYTAG option would instead expand them to <mspace></mspace>), while the rest of the output should stay the same.
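The serializer behavior can be checked in isolation with a few lines of plain PHP DOM code. This is a standalone sketch and does not use MMLDomVisitor; per the PHP documentation, LIBXML_NOEMPTYTAG expands empty tags rather than collapsing them.

```php
<?php
$dom = new DOMDocument();
$mspace = $dom->createElement( 'mspace' );
$dom->appendChild( $mspace );

// The HTML serializer writes an explicit end tag for the unknown element mspace.
$asHtml = trim( $dom->saveHTML( $mspace ) );

// The XML serializer collapses an empty element to the self-closing form.
$asXml = $dom->saveXML( $mspace );

// LIBXML_NOEMPTYTAG asks the XML serializer to expand empty elements again.
$asXmlExpanded = $dom->saveXML( $mspace, LIBXML_NOEMPTYTAG );

echo $asHtml, "\n";
echo $asXml, "\n"; // <mspace/>
echo $asXmlExpanded, "\n"; // <mspace></mspace>
```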

Step 5 Replace remaining usages of encapsulateRaw ✅

Replace the remaining usages of encapsulateRaw class by class and make one commit for each class (including test classes).

Try to do the string conversion as late as possible in each function.

For private methods try to change the return type from string to MMLbase.

We might want to create a new function in TexNode that will eventually replace renderMML, which we might call toMMLTree, and has return type MMLbase instead of string.

Step 5.a Make MMLDomVisitor temporarily accept strings as children

Currently, we have TexNode, which calls BaseMethods::checkAndParse, which then calls BaseParsing::..., which in turn calls TexNode again. Changing one of them to use only MMLbase breaks the others.
We can parse XML strings directly to DOM with appendXML, so that we don't have to rewrite everything at once.
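A sketch of how a string child can be mixed with node children through DOMDocumentFragment::appendXML. The element names are arbitrary examples; how MMLDomVisitor decides between node and string children is not shown here.

```php
<?php
$dom = new DOMDocument();
$mrow = $dom->createElement( 'mrow' );
$dom->appendChild( $mrow );

// A child that already exists as a proper DOM node.
$mi = $dom->createElement( 'mi' );
$mi->appendChild( $dom->createTextNode( 'x' ) );
$mrow->appendChild( $mi );

// A legacy child that is still an XML string: parse it into a fragment first.
$fragment = $dom->createDocumentFragment();
if ( $fragment->appendXML( '<mo>+</mo><mn>1</mn>' ) ) {
	// Appending a fragment moves all of its top-level nodes into the parent.
	$mrow->appendChild( $fragment );
}

$xml = $dom->saveXML( $mrow );
echo $xml, "\n"; // <mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow>
```

appendXML returns false when the string is not well-formed XML, so the return value is worth checking during the transition.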

Step 5.b Make BaseParsing functions return MMLbase ✅

Now that the DOMVisitor also accepts strings, we can rewrite all functions of BaseParsing.php to only return MMLbase. All calls of renderMML() can be directly parsed as children of MMLbase.

Note that MMLbase does not yet represent the full structure of the DOM, as some parts of the tree are still packed into strings.

Step 5.c Make BaseMethods functions return MMLbase ✅

Replace remaining usages of encapsulateRaw and the string casting with MMLbase.

Step 5.d Make TexNode support MMLbase

We might want to create a new function in TexNode that will eventually replace TexNode::renderMML, which we might call toMMLTree and which has return type MMLbase instead of string.

Step 5.e Replace remaining usages of encapsulateRaw

The class MMLParsingUtil (MMLParsingUtil.php) should be the last class with encapsulateRaw.

Step 5.f Remove encapsulateRaw

Remove encapsulateRaw, getStart, getEnd, getEmpty and the string support of MMLDomVisitor

Step x Implement subtree matching

  1. to be continued

Details

Other Assignee
FrederikHennecke1
Related Changes in Gerrit:
Repo                        Branch   Lines +/-
mediawiki/extensions/Math   master   +278 -13
mediawiki/extensions/Math   master   +34 -119
mediawiki/extensions/Math   master   +39 -42
mediawiki/extensions/Math   master   +62 -59
mediawiki/extensions/Math   master   +102 -81
mediawiki/extensions/Math   master   +67 -56
mediawiki/extensions/Math   master   +44 -89
mediawiki/extensions/Math   master   +50 -43
mediawiki/extensions/Math   master   +135 -27
mediawiki/extensions/Math   master   +54 -54
mediawiki/extensions/Math   master   +46 -79
mediawiki/extensions/Math   master   +182 -242
mediawiki/extensions/Math   master   +172 -153
mediawiki/extensions/Math   master   +33 -10
mediawiki/extensions/Math   master   +196 -62
mediawiki/extensions/Math   master   +31 -52
mediawiki/extensions/Math   master   +58 -86
mediawiki/extensions/Math   master   +56 -59
mediawiki/extensions/Math   master   +0 -18
mediawiki/extensions/Math   master   +162 -36
mediawiki/extensions/Math   master   +245 -13
mediawiki/extensions/Math   master   +4 -3
mediawiki/extensions/Math   master   +57 -4
mediawiki/extensions/Math   master   +979 -23
mediawiki/extensions/Math   master   +0 -15
mediawiki/extensions/Math   master   +164 -6
mediawiki/extensions/Math   master   +109 -0
mediawiki/extensions/Math   master   +1 -0
mediawiki/extensions/Math   master   +155 -12
mediawiki/extensions/Math   master   +2 -2

Event Timeline


Change #1133235 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step3a

https://gerrit.wikimedia.org/r/1133235

Change #1133235 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step3a

https://gerrit.wikimedia.org/r/1133235

I noticed some performance issues with the new MMLDomVisitor implementation.
I merely tested the old encapsulate function against the new __toString implementation. I also wrote several other classes to test different XML/HTML implementations.
Here are the times for 10,000 repetitions:

Performance: Legacy 0.030082941055298s
Performance: DOM 0.11080002784729s
Performance: HTML 0.049198141098022s
Performance: xmlWriter 0.075751066207886s
Performance: simpleXML 0.13390803337097s

The HTML class uses the MediaWiki Html::element function, just like the legacy encapsulate function.

The performance decrease comes from the overhead of calling setVisitorfactory and getVisitorFactory. But the decrease shouldn't be as bad once we rewrite everything to use the new __toString function, as the visitor is only created by the root node.

I propose to rewrite the MMLDomVisitor to use the MediaWiki Html class instead.
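A comparison of this kind can be reproduced outside the extension with a small harness. The sketch below is not the benchmark used for the numbers above: the benchmark() helper and the three closures are hypothetical, and string concatenation stands in for the legacy encapsulate approach.

```php
<?php
// Hypothetical helper: run a callable repeatedly and return elapsed seconds.
function benchmark( callable $fn, int $repetitions ): float {
	$start = hrtime( true );
	for ( $i = 0; $i < $repetitions; $i++ ) {
		$fn();
	}
	return ( hrtime( true ) - $start ) / 1e9;
}

// Stand-in for the legacy approach: plain string concatenation with escaping.
$legacy = static function (): string {
	return '<mo mathvariant="bold">' . htmlspecialchars( 'x' ) . '</mo>';
};

// DOM approach: build the element and serialize it, creating the document each time.
$viaDom = static function (): string {
	$dom = new DOMDocument();
	$mo = $dom->createElement( 'mo' );
	$mo->setAttribute( 'mathvariant', 'bold' );
	$mo->appendChild( $dom->createTextNode( 'x' ) );
	$dom->appendChild( $mo );
	return $dom->saveXML( $mo );
};

// XMLWriter approach: stream the element into an in-memory buffer.
$viaXmlWriter = static function (): string {
	$writer = new XMLWriter();
	$writer->openMemory();
	$writer->startElement( 'mo' );
	$writer->writeAttribute( 'mathvariant', 'bold' );
	$writer->text( 'x' );
	$writer->endElement();
	return $writer->outputMemory();
};

$repetitions = 10000;
printf( "Legacy:    %.6fs\n", benchmark( $legacy, $repetitions ) );
printf( "DOM:       %.6fs\n", benchmark( $viaDom, $repetitions ) );
printf( "XMLWriter: %.6fs\n", benchmark( $viaXmlWriter, $repetitions ) );
```

Absolute timings depend on the machine and PHP build, so only the relative order within one run is meaningful.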

Change #1137286 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Make MMLbase support trees: Improve Visitor implementation

https://gerrit.wikimedia.org/r/1137286

I was able to reproduce the values locally

Performance: Legacy 0.017924785614014s
Performance: DOM 0.063331127166748s
Performance: HTML 0.055146932601929s
Performance: xmlWriter 0.050059080123901s
Performance: simpleXML 0.069455862045288s

I was using a profiler, and it seems that (35%) of the time is used in mocking. I'm trying to rewrite the test so that only the actual methods are tested.

With 100 000 iterations

Screenshot 2025-04-18 at 10.55.23.png (444×1 px, 94 KB)

I see only a minor difference between DOM and HTML. Given that our final goal is to work on the DOM for the complex matching, I suggest continuing with DOM for now, postponing the final evaluation to a later stage, and testing with real workloads then.

I did rewrite the tests to run without mocks in this style

		$this->expectNotToPerformAssertions();
		$visitor = new MMLDomVisitor();
		$mo2 = new MMLmo( "", [ 'mathvariant' => 'bold' ], 'x' );
		for ( $i = 0; $i < self::REPETITIONS; $i++ ) {
			$mo2->accept( $visitor );
			$visitor->getHTML();
		}

I reverted to 10 000 iterations and obtained the following results

Screenshot 2025-04-18 at 11.21.15.png (444×1 px, 108 KB)

Profiling reveals that 97.7% of the time is spent in \DOMDocument::saveHTML. Now, I'm wondering if the previous tests actually called \MediaWiki\Extension\Math\WikiTexVC\MMLnodes\MMLDomVisitor::getHTML

Change #1137431 had a related patch set uploaded (by Physikerwelt; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] WIP: DOM performance test physikerwelt

https://gerrit.wikimedia.org/r/1137431

Ah the constructor of the tree must be within the loop.

		$mo2 = new MMLmo( "", [ 'mathvariant' => 'bold' ], 'x' );
		for ( $i = 0; $i < self::REPETITIONS; $i++ ) {
			$visitor = new MMLDomVisitor();
			$mo2->accept( $visitor );
			$visitor->getHTML();
		}

leads to the following results (100 000 iterations)

Screenshot 2025-04-18 at 11.41.06.png (444×1 px, 108 KB)

Looking into the profiling results:

Screenshot 2025-04-18 at 11.54.05.png (972×2 px, 640 KB)

It seems to suggest that the performance differences might level out if run on real data. Thus, I would like to support my previous statement that we should continue with the DOMVisitor.

Ok, then we can keep supporting the DOMVisitor. We can still change the visitor later on if needed.

Change #1138410 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 4.a Extend the DOMVistor to support MMLbase elemet

https://gerrit.wikimedia.org/r/1138410

Change #1138410 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 4.a Extend the DOMVistor to support MMLbase elemet

https://gerrit.wikimedia.org/r/1138410

Change #1140532 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 4.b Replace usages for encapsulate with __toString

https://gerrit.wikimedia.org/r/1140532

Change #1140532 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 4.b Replace usages for encapsulate with __toString

https://gerrit.wikimedia.org/r/1140532

Change #1140793 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 4.b, fix for MMLmerror

https://gerrit.wikimedia.org/r/1140793

Change #1140793 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 4.b, fix for MMLmerror

https://gerrit.wikimedia.org/r/1140793

Change #1141059 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 4.b, remove encapsulate method and tests

https://gerrit.wikimedia.org/r/1141059

Change #1137431 abandoned by Physikerwelt:

[mediawiki/extensions/Math@master] WIP: DOM performance test physikerwelt

Reason:

just performance test

https://gerrit.wikimedia.org/r/1137431

Change #1141059 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 4.b, remove encapsulate method and tests

https://gerrit.wikimedia.org/r/1141059

For Task 4.c, I have the following points:

  1. Keep
if ( $node instanceof MMLleaf ) {
	$textNode = $this->dom->createTextNode( $node->getText() );
	$element->appendChild( $textNode );
	return;
}

instead of

if ( $node instanceof MMLleaf ) {
	$element->nodeValue = $node->getText();
	return;
}

The text node already escapes every special character, as seen in MMLDomVisitorTest. The $element->nodeValue code gives me an error: "unterminated entity reference".

  2. Use return (string)( new MMLmo( '', [], '<' ) ); instead of
$mmlMo = new MMLmo();
return $mmlMo->encapsulateRaw( "&lt;" );

Maybe we could also split this into two commits: we first use return (string)( new MMLmo( '', [], html_entity_decode( "&lt;" ) ) ); and translate the special character in a second commit, to make the commits more easily readable.
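The escaping behavior of text nodes can be illustrated with plain PHP DOM code. The helper serializeMo() is hypothetical and only wraps the createTextNode pattern from point 1; the last call shows what happens if entity notation is passed without decoding it first.

```php
<?php
// Hypothetical helper: wrap a string in an <mo> element via a text node and serialize it.
function serializeMo( string $text ): string {
	$dom = new DOMDocument();
	$mo = $dom->createElement( 'mo' );
	// createTextNode treats the input as literal characters, not as markup.
	$mo->appendChild( $dom->createTextNode( $text ) );
	$dom->appendChild( $mo );
	return $dom->saveXML( $mo );
}

echo serializeMo( '<' ), "\n"; // <mo>&lt;</mo>
echo serializeMo( 'a & b' ), "\n"; // <mo>a &amp; b</mo>

// Entity notation that was not decoded first ends up double-escaped.
echo serializeMo( '&lt;' ), "\n"; // <mo>&amp;lt;</mo>
```

This is why strings that still contain entities need to go through html_entity_decode before they are handed to a leaf node.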

The $element->nodeValue code gives me an Error: "unterminated entity reference".

For which input? Maybe you omitted the last ;?

Anyhow. It seems that we agree that we want to end up with up to 4-byte-long Unicode characters in the source code. Since modern IDEs support this, I don't think editing that source code is a problem, but we need to test this, as most code does not have 4-byte chars.

I would not worry too much about how many commits to make. Maybe just start with something that you can achieve in one programming session.

I think we should start with the simple cases like

return (string)( new MMLmo( '', [], '<' ) );

and if the thing to be escaped comes from variable $x

$str = html_entity_decode( $x );
return (string)( new MMLmo( '', [], $str ) );

One extra commit can be to rewrite the JSON file and the utility functions that read from that file. This could also be the first step.

Physikerwelt updated the task description. (Show Details)

Change #1143973 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 4.c Replace usages for encapsulateRaw when called on MMLleaf nodes with __toString for leaf nodes. Changes in BaseMethods.php

https://gerrit.wikimedia.org/r/1143973

Change #1143973 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Replace MMLleaf::encapsulateRaw nodes with __toString in BaseMethods.php

https://gerrit.wikimedia.org/r/1143973

I just noticed that one step is missing: we have to replace MMLbase::getEmpty. This is only called for MMLmspace or MMLmrow and returns the empty element <mspace/>. With our current implementation, we get the full tag without inner text: <mspace></mspace>. Either we change the tests, or we change MMLDomVisitor::getHTML from $this->dom->saveHTML( $this->dom->documentElement ) to $this->dom->saveXML( $this->dom->documentElement ). saveXML self-closes empty elements by default (the LIBXML_NOEMPTYTAG option would instead expand them to <mspace></mspace>), while the rest of the output should stay the same.

Change #1148989 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 4c

https://gerrit.wikimedia.org/r/1148989

Change #1148989 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 4c

https://gerrit.wikimedia.org/r/1148989

Change #1149448 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 4d

https://gerrit.wikimedia.org/r/1149448

Change #1149448 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 4d

https://gerrit.wikimedia.org/r/1149448

I propose the following steps for Step 5:

Step 5 Replace remaining usages of encapsulateRaw

Step 5.a Make MMLDomVisitor temporarily accept strings as children

Currently, we have TexNode, which calls BaseMethods::checkAndParse, which then calls BaseParsing::..., which in turn calls TexNode again. Changing one of them to use only MMLbase breaks the others.
We can parse XML strings directly to DOM with appendXML, so that we don't have to rewrite everything at once.

Step 5.b Make BaseParsing functions return MMLbase

Now that the DOMVisitor also accepts strings, we can rewrite all functions of BaseParsing.php to only return MMLbase. All calls of renderMML() can be directly parsed as children of MMLbase.

Step 5.c Make BaseMethods functions return MMLbase

Replace remaining usages of encapsulateRaw and the string casting with MMLbase.

Step 5.d Make TexNode support MMLbase

We might want to create a new function in TexNode that will eventually replace TexNode::renderMML, which we might call toMMLTree and which has return type MMLbase instead of string.

Step 5.e Replace remaining usages of encapsulateRaw

The class MMLParsingUtil (MMLParsingUtil.php) should be the last class with encapsulateRaw.

Step 5.f Remove encapsulateRaw

Remove encapsulateRaw, getStart, getEnd, getEmpty and the string support of MMLDomVisitor

Change #1149696 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] [WIP]Make MMLbase support trees: Step 5.a

https://gerrit.wikimedia.org/r/1149696

Change #1149743 had a related patch set uploaded (by Physikerwelt; author: Physikerwelt):

[mediawiki/extensions/Math@master] Support tree generation for literal numbers

https://gerrit.wikimedia.org/r/1149743

Change #1149696 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 5.a

https://gerrit.wikimedia.org/r/1149696

Change #1149743 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Support tree generation for literal numbers

https://gerrit.wikimedia.org/r/1149743

Change #1152373 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 5.b

https://gerrit.wikimedia.org/r/1152373

Change #1152373 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 5.b

https://gerrit.wikimedia.org/r/1152373

Change #1154275 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 5.b

https://gerrit.wikimedia.org/r/1154275

Change #1154275 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 5.b

https://gerrit.wikimedia.org/r/1154275

Change #1154412 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 5.b

https://gerrit.wikimedia.org/r/1154412

Change #1154412 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 5.b

https://gerrit.wikimedia.org/r/1154412

Change #1158443 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 5.c

https://gerrit.wikimedia.org/r/1158443

Change #1158443 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 5.c

https://gerrit.wikimedia.org/r/1158443

Change #1165906 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Add array wrapper for MMLbase

https://gerrit.wikimedia.org/r/1165906

Change #1165906 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Add array wrapper for MMLbase

https://gerrit.wikimedia.org/r/1165906

Change #1166524 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 5.d

https://gerrit.wikimedia.org/r/1166524

Change #1166524 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 5.d

https://gerrit.wikimedia.org/r/1166524

Change #1167875 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 5.d

https://gerrit.wikimedia.org/r/1167875

Change #1167875 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 5.d

https://gerrit.wikimedia.org/r/1167875

Change #1168223 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 5.d

https://gerrit.wikimedia.org/r/1168223

Change #1168223 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Make MMLbase support trees: Step 5.d

https://gerrit.wikimedia.org/r/1168223

Change #1170446 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Changed usage of TexNode::renderMML to TexNode::toMMLTree in BaseParsing.php

https://gerrit.wikimedia.org/r/1170446

Change #1170446 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Replace TexNode::renderMML with TexNode::toMMLTree in BaseParsing.php

https://gerrit.wikimedia.org/r/1170446

I just updated the progress marks...
@FrederikHennecke1 is 5d complete?

Change #1171231 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Replace TexNode::renderMML with TexNode::toMMLTree in WikiTex Nodes and in tests

https://gerrit.wikimedia.org/r/1171231

I just updated the progress marks...
@FrederikHennecke1 is 5d complete?

It's almost complete. Just uploaded the last patch for 5d.

Change #1171231 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Replace TexNode::renderMML with TexNode::toMMLTree in WikiTex Nodes and in tests

https://gerrit.wikimedia.org/r/1171231

Change #1172868 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Replaced remaining usages of MMLbase::encapsulateRaw

https://gerrit.wikimedia.org/r/1172868

Change #1172868 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Replace remaining usages of MMLbase::encapsulateRaw

https://gerrit.wikimedia.org/r/1172868

Change #1175223 had a related patch set uploaded (by FrederikHennecke1; author: FrederikHennecke1):

[mediawiki/extensions/Math@master] Remove encapsulateRaw and renderMML

https://gerrit.wikimedia.org/r/1175223

Change #1175223 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Remove encapsulateRaw and renderMML

https://gerrit.wikimedia.org/r/1175223

Physikerwelt updated the task description. (Show Details)
Physikerwelt updated Other Assignee, added: FrederikHennecke1.

Is there a rationale for not having a MMLleaf::setText method, or is it an oversight? This forbids post-processing of the produced tree, for example.

It was not needed up to now, I think. Maybe there would be a performance gain if we used a readonly class (https://en.wikipedia.org/wiki/Immutable_object?wprov=sfti1#PHP). So you can introduce it if needed, but it is a bit nicer to avoid having that.

Change #1137286 abandoned by Physikerwelt:

[mediawiki/extensions/Math@master] Make MMLbase support trees: Improve Visitor implementation

https://gerrit.wikimedia.org/r/1137286