Implement Math Accessibility Features necessary for Intent attribute usage
Open, Needs Triage · Public

Assigned To

None

Authored By

	Stegmujo
	Jan 16 2023, 4:45 PM

Description

This task consists of four bigger sections. The tasks are ordered by their order of implementation.
For readability, some of the bigger tasks contain a written overall description.
This task is for discussion, clarification and estimation of efforts.

(I) For MathML in Math extension MediaWiki/Wikipedia:

https://phabricator.wikimedia.org/T310211 : Make 'only MathML rendering' interface available on MediaWiki/MediaWiki as a setting (without SVG rendering) Estimate 2
https://phabricator.wikimedia.org/T327386 : Create a Full Coverage Test for TexVC(PHP)
https://phabricator.wikimedia.org/T329620 : Fix Chem support, to be clarified: partly covered atm when test MMLGenerationTexUtilTest.php is active
https://phabricator.wikimedia.org/T327388 : Generate and Update reference MathML (from Mathoid/LaTeXML) for the current tests
https://phabricator.wikimedia.org/T327391 : Fix TexVC parsetree related issues
https://phabricator.wikimedia.org/T331998 : Fix further issues detected during visual comparisons
https://phabricator.wikimedia.org/T327393 : Implement a comparison algorithm which can compare different MathML outputs for automated testing
https://phabricator.wikimedia.org/T328752 : Implement a comparison algorithm which can compare different MathML outputs for scientific results (to be clarified)
https://phabricator.wikimedia.org/T327392 : Implement full coverage MathML parsing and mappings

Mini:

Change LocalChecker so that it sets $this->valid = false and reports a warning or error instead of throwing an exception (done)
Some class-based unit tests for the base MML classes

(II) For Wikidata / Wikipedia Semantics Import:
AnnoMathTeX is used for formula annotation.
Formula concepts are integrated into the LaTeX of a formula in the Wikitext source code: <math display="block" qid=Q35875>E=m\,c^2</math>
Each of the symbols (E, m, c) can point to another Wikidata concept (by the ‘defining formula’ property (P2534)).
The in-depth info for an annotated formula (as in the example above) can be fetched by retrieving a wiki special page which holds all the annotations. (Edit: This is an example of what the special page looks like in a wiki ??) The Wikidata item the qid points to holds MathML which contains the intent annotations.

In a nutshell: the qid added to a formula on a wiki page determines the intents; these are located as annotations of the MathML on the Wikidata page the qid points to.
The feature for annotating Wikidata pages of formulas with intents has yet to be developed.
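
To make the qid notation concrete, here is a minimal Python sketch that pulls (qid, TeX) pairs out of Wikitext. It is only an illustration of the annotation format described above, not how the Math extension parses tags; the function name extract_annotated_formulas and the regular expressions are assumptions.

```python
import re

# Opening <math ...> tag with its attribute string, then the TeX body up to </math>.
MATH_TAG = re.compile(r"<math\b([^>]*)>(.*?)</math>", re.DOTALL)
# qid=Q123, qid="Q123" or qid='Q123' inside the attribute string.
QID_ATTR = re.compile(r"""\bqid\s*=\s*["']?(Q\d+)["']?""")


def extract_annotated_formulas(wikitext):
    """Return (qid, tex) pairs for every <math> tag that carries a qid."""
    results = []
    for match in MATH_TAG.finditer(wikitext):
        qid = QID_ATTR.search(match.group(1))
        if qid:
            results.append((qid.group(1), match.group(2).strip()))
    return results


source = '<math display="block" qid=Q35875>E=m\\,c^2</math> and <math>x</math>'
print(extract_annotated_formulas(source))
```

Formulas without a qid are skipped, which matches the idea that only annotated formulas are treated as the non-default case.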

Create a list of the already available annotated formula with wikidata qids, which can be used for our tests. (optional)
Develop the feature for annotating Wikidata items with intents (is there a validation necessary here? or some type of GUI based forms?) (maybe a follow-up project for the far future)
Annotate the Wikidata items to which the list points with intent annotations within Wikidata / Wikibase. (This can be done with AnnoMathTeX, however I don't think it relates to this project)

I think in section (II) there is no work to be done.

(III) Math extension composing the final MathML for Screenreaders :

Intents for formulas are not generated by the Math extension itself. It gets the annotations from Wikidata, see (II).
If formulas have qid semantic annotations with annotated intents in the corresponding Wikidata items, the formulas are considered a non-default configuration.
For these formulas the intent attributes are read from a wiki page (Special:Pages) which can be created by the Math extension for each Wikidata item.

Find a way to annotate Wikidata Items with Intent in Wikipedia page
Implement the Intent Grammar (similar to the MathML-Nodes currently), so MathML by TexVC(PHP) can contain Intent elements
For formulas with qids-semantic-annotations, fetch the SpecialPage, check for intent annotations
The MathML is generated for each formula with TexVC; if there are intents in the special page, they are parsed (does this include validation of the annotated attributes?) and added to the final MathML which is delivered to screen readers (also here: what happens if there are multiple nested qid annotations for a formula?). This requires aligning the special page content to the MathML.
Build a minimal test which checks the generation of all necessary elements in MathML for intent
Build a minimal test which checks the intent generation for the currently existing non-default intent items from Wikidata (these are 5-20 formula, these are the list items mentioned in II)
It should have 100% test coverage for the intent grammar.

An alternative plan has been in a subtask since 21.06.23: instead of composing MathML with intents in the Math extension, a browser extension written in JavaScript shall do this.

(IV) Checking the generated speech in Wikipages:

Speech to Text activation OR use Screenreader capabilities OR find another recent and suitable program which can create intents
Define a set of cases and check the speech output, before and after our implementations
https://phabricator.wikimedia.org/T327394 : Evaluate Speech Output for MathML with Intent attributes (to be clarified, but I think this would be enough for the current scope)

I think this has to be spelled out elsewhere. Maybe it would be a good project to do in collaboration with a developer of screen reader software.

Details

	Subject	Repo	Branch	Lines +/-
	Fix exceptions thrown by LocalChecker	mediawiki/extensions/Math	master	+13 -13


Related Objects

Status	Subtype	Assigned	Task
Open		None	T327098 Implement Math Accessibility Features necessary for Intent attribute usage
Open		None	T302628 Implement native MathML rendering in Math
Open		Physikerwelt	T310211 Deliver visible MathML to the browser
Resolved		Stegmujo	T331047 Implicitly generate Math elements in MathML from the TexVC parsetree
Resolved		mmartorana	T354136 Application Security Review Request: MathJax
Resolved		Physikerwelt	T360176 Replace -s with _s in i18n math mode strings
Resolved		Stegmujo	T311606 merge texvcinfo into texvcjs
Duplicate		Stegmujo	T311809 Delete texvcinfo content
Resolved		Stegmujo	T312528 Convert Grammar related javascript to php in texvcjs
Resolved		Stegmujo	T315223 Create testrunner for running the Wikipedia tests locally with json file
Resolved		Stegmujo	T315978 Reminder Final Checks in Implementation
Duplicate		None	T316422 PHP Coding conventions: 3 Backlsashes?
Declined		None	T316423 PHP Coding conventions: 3 Backslashes?
Open		None	T319521 Add Unicode test from tests/All.js and fix it in php
Declined		None	T319522 Add Ocaml checks to AllTest.php
Open		Stegmujo	T320964 Remove buildParserPHP regexes when phpeggy is fixed
Stalled		None	T321060 In TexVC PHP fix SyntaxError information
Resolved		None	T321262 Add More Types to TexVC PHP
Resolved		Stegmujo	T321599 Improve Perfomance of TexVC PHP in Math
Resolved		Physikerwelt	T333973 performance: consider removing curly
Resolved		Stegmujo	T323554 Change Validator to newDefaultValidator
Resolved		Physikerwelt	T332183 Fix Grammar for parsing operators with parentheses
Resolved		Stegmujo	T346731 fix escaping not correct for "<" or ">"
Resolved	BUG REPORT	Stegmujo	T347320 CI test error: already defined testUnderbrace function
Resolved		Stegmujo	T348936 Don't call trim(null)
Resolved		Stegmujo	T348975 Implement a configuration flag which can choose rendering for chemistry formulas
Resolved		Stegmujo	T348976 Add a cache purging action for url
Resolved		Stegmujo	T317026 Implement Interface for Running TexVC in Mathextension PHP
Resolved		Stegmujo	T312089 Enable generation of php-parsers in texvcjs
Duplicate		Stegmujo	T312529 Create MML interfaces for texvcjs
Duplicate		Stegmujo	T312530 Create only MathML setting in Math Extension
Declined		Stegmujo	T312762 Check technologies for LateX to MathML conversion
Declined		Stegmujo	T310372 Create batch rendering abstraction
Resolved		Stegmujo	T327386 Create a Full Coverage Test for TexVC(PHP)
Resolved		Stegmujo	T327388 Generate and Update Reference MathML for the current Tests
Resolved		None	T327391 Fix the TexVC(PHP) Parse tree related cases
Resolved		Stegmujo	T327392 Implement Parsing functionality and mappings for full MathML coverage
Resolved		Stegmujo	T327393 Implement a comparison algorithm for automated testing which can compare different MathML outputs
Resolved		Physikerwelt	T346584 Enable native MathML rendering mode in beta
Resolved		Physikerwelt	T346795 User Feedback to MathML implementation in Math extension
Duplicate		None	T348615 Implement User Feedback to Round One
Resolved		Stegmujo	T348791 native: \begin{align} should align left
Resolved		Stegmujo	T348793 native: \left( and friends should generate stretchy mo-elements
Resolved		Stegmujo	T348971 Create correct sized fences around smallmatrix
Resolved		Stegmujo	T349822 Remove mstyle around spacing elements
Resolved		Stegmujo	T349825 Fix incorrect child counts in MathML for some elements
Resolved		Stegmujo	T349906 Fix error value in columnspacing attribute in mtable
Resolved		Stegmujo	T350021 Add invisible apply character after functions
Resolved		Stegmujo	T350491 Add space in liminf and limsup
Resolved		Stegmujo	T350735 native: \tbinom{n}{k} is broken
Open		Stegmujo	T350736 native: align environement still not optimal
Open		Stegmujo	T350737 native:chem expression <chem>A ->[{}\atop\ce{+H2O}] B</chem> fail
Resolved		Stegmujo	T350738 native: distinction in bracket sizes ( \bigl( \Bigl( too small?
Resolved	BUG REPORT	Stegmujo	T352196 native: Rendering error in Letters written in Blackboard bold
Resolved	BUG REPORT	Stegmujo	T352536 native: Rendering for MathCal not working on Chrome
Open	BUG REPORT	None	T352608 native: Rendering error - Align is not left-aligned in chrome
Resolved	BUG REPORT	Physikerwelt	T352609 native: \operatorname effect does not propagate into \widetilde arguments
Resolved	BUG REPORT	Physikerwelt	T352697 native: Rendering error in powers and indices
Resolved	BUG REPORT	Physikerwelt	T352698 native: Rendering error in overline, the line is too short in Chrome
Resolved	BUG REPORT	Physikerwelt	T352699 native: Rendering error in \operatorname{erf}^{-1}
Resolved	BUG REPORT	Physikerwelt	T353340 native: Rendering error rm font does not propagate into child nodes
Resolved		Physikerwelt	T355999 native: Check debug console on Help:MathTestNative
Resolved		Physikerwelt	T357343 Incorrect mo in \sin \log_a
Resolved		Physikerwelt	T350787 Deploy native on a few pilot wikis
Resolved	BUG REPORT	Stegmujo	T351850 \nolimits renders incorrectly in native MathML
Resolved		Stegmujo	T351907 Closing bracket size too small
Invalid		None	T352735 Ensure that \bigl and friends are closed with corresponding right elements
Open		None	T327394 Evaluate Speech Output for MathML with Intent attributes
Open		None	T328752 Implement a comparison algorithm which can compare different MathML outputs for scientific results
Open		None	T329620 Enable Chem support in TexVC(PHP) for MathML generation
Resolved		Physikerwelt	T340023 Add additional statements from mhchemParser to texvc-php grammar
Open		Stegmujo	T340024 Create MathML translation for additional statements
Resolved	BUG REPORT	Daimona	T342100 Phan is stopping for unknown reason in automated checks
Open		None	T348744 Define and Implement behaviour for chem/math environments
Open		None	T348846 Fix appearance of some chem macros
Open		Stegmujo	T331998 Fix erroneous testcases detected by visual comparison
Open		Stegmujo	T340028 Check Browser Extension based Intent Mapping

Event Timeline

Stegmujo created this task.Jan 16 2023, 4:45 PM

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJan 16 2023, 4:45 PM

Stegmujo updated the task description. (Show Details)Jan 16 2023, 4:46 PM

Stegmujo updated the task description. (Show Details)Jan 16 2023, 4:48 PM

Stegmujo updated the task description. (Show Details)Jan 16 2023, 4:51 PM

Stegmujo updated the task description. (Show Details)

Stegmujo updated the task description. (Show Details)Jan 16 2023, 4:55 PM

Stegmujo updated the task description. (Show Details)

@Physikerwelt could you have a look at this, check whether everything is correct or something is missing, and edit where necessary?

Stegmujo updated the task description. (Show Details)Jan 16 2023, 5:07 PM

Physikerwelt added a project: Math.Jan 16 2023, 6:07 PM

done. added some comments. Overall I think it might be reasonable to break it down into some subtasks.

Physikerwelt added subtasks: T302628: Implement native MathML rendering in Math, T321599: Improve Perfomance of TexVC PHP in Math.Jan 16 2023, 6:27 PM

Thanks for adding the hints, here are some questions and remarks (to @Physikerwelt ) to refine the essence of the tasks:

section I:
- For generating MathML from Mathoid/LaTeXML for the tests with the MathSearch extension, I suppose I would write a maintenance script (or possibly a test) which renders MathML similar to here in testAlttext. Or is there an implementation I have overlooked (it might be on an earlier branch) which does this? In case you know, any hints are welcome.
- Does the comparison algorithm contain some type of similarity score etc.? How was performance measured earlier?
- "i don't understand that": The parse tree in TexVC(PHP) is currently in many cases not sufficient to create valid MathML. Example cases can be found in this test file in texvctreebugs. The ultimate solution is to refactor the grammar file so that the parse tree produced by TexVC(PHP) is correct for generating MathML.

section II:
- How can we use intents from the Wikidata items in this project (with the scope of the publication in mind) ? How can the intents be notated in these Wikidata items?
- How can a list with Wikipages which have annotated qids be found ?
- How can a list with Wikidata items which have intents notated be found ?

section III:
- If we read intent attributes which come from some kind of user-based annotation, is it required to validate the correctness of these attributes by methods in the math extension, before they get forwarded to the browser-users in the MathML?

section IV:
- how can we have working examples for speech synthesis from the output of our system, which is adding intent to MathML, for the upcoming publication ?

Aklapper added a project: Accessibility.Jan 17 2023, 1:46 PM

Stegmujo updated the task description. (Show Details)Jan 17 2023, 2:17 PM

Stegmujo updated the task description. (Show Details)Jan 17 2023, 2:28 PM

In T327098#8530955, @Stegmujo wrote:

Thanks for adding the hints, here are some questions and remarks (to @Physikerwelt ) to refine the essence of the tasks:

Thank you.

section I:

For generating MathML from Mathoid/LaTeXML for the tests with the MathSearch extension, I suppose I would write a maintenance script (or possibly a test) which renders MathML similar to here in testAlttext. Or is there an implementation I have overlooked (it might be on an earlier branch) which does this? In case you know, any hints are welcome.

I was guessing one can start with the UpdateMath maintenance script that you updated recently. You just need to put the formulae you want to test in your wiki and they will be found by the script. Thereafter you can get the MathML from the DB. So you probably don't need to implement a single line of code.

Does the comparison algorithm contain some type of similarity score etc.? How was performance measured earlier?

It was done based on images. Maybe this is too complicated. We can start with tree-edit distance as a similarity measure. But maybe we just want to figure out whether two outputs are the same or different at the moment?
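
A plain "same or different" check could look roughly like the following Python sketch. It is an assumption about what such a check might do, not the algorithm later implemented in T327393: both MathML strings are parsed and compared as trees, ignoring layout whitespace and attribute order. The names normalize and same_mathml are made up for this example, and both inputs are assumed to use the same namespace declaration (or none).

```python
import xml.etree.ElementTree as ET


def normalize(element):
    """Turn an element into a nested tuple that ignores layout whitespace.

    Text after a closing tag (the "tail") is dropped, on the assumption
    that it is only indentation in generated MathML.
    """
    text = (element.text or "").strip()
    attributes = tuple(sorted(element.attrib.items()))
    children = tuple(normalize(child) for child in element)
    return (element.tag, attributes, text, children)


def same_mathml(left, right):
    """True if both MathML strings describe the same element tree."""
    return normalize(ET.fromstring(left)) == normalize(ET.fromstring(right))


compact = "<math><mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow></math>"
indented = """<math>
  <mrow>
    <mi>x</mi> <mo>+</mo> <mn>1</mn>
  </mrow>
</math>"""
different = "<math><mrow><mi>x</mi><mo>-</mo><mn>1</mn></mrow></math>"

print(same_mathml(compact, indented))
print(same_mathml(compact, different))
```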

"i don't understand that": The parsetree in TexVC(PHP) is currently in many cases not enough to create valid MathML, example cases can be found in this test-file in texvctreebugs, the ultimate solution is to refactor the grammar file so that the parsetree by TexVC(PHP) is correct for generating MathML.

I still don't understand, let's discuss that in a f2f meeting.

section II:

How can we use intents from the Wikidata items in this project (with the scope of the publication in mind) ? How can the intents be notated in these Wikidata items?

We need to figure this out.

How can a list with Wikipages which have annotated qids be found ?

We could search for them, but why do we need this list?

How can a list with Wikidata items which have intents notated be found ?

The answer is the same as above.

section III:

If we read intent attributes which come from some kind of user-based annotation, is it required to validate the correctness of these attributes by methods in the math extension, before they get forwarded to the browser-users in the MathML?

Somehow. I don't see the practical implication.

section IV:

how can we have working examples for speech synthesis from the output of our system, which is adding intent to MathML, for the upcoming publication ?

We produce examples of valid MathML with intents and present them at the W3C MathWG meeting.

I was guessing one can start with the UpdateMath ...

Ok, this is a start. When this is set up and generated with the UpdateMath script, the only difficulty I see is to filter the correct wiki pages from the output (depending on the content of the database, there can be lots of items processed by this script). But this should be possible on a local MediaWiki instance which only has the most necessary wiki pages.

It was done based on images ....

Ok, I guess either text-based (i.e. tree-edit distance) or image comparison can be done; let's see what the compared outputs look like. If there are many differing artifacts within the tool-specific MathML notations, image comparison might make more sense.
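
For the text-based option, a cheap stand-in for a similarity score can be sketched in Python as below. Note that this is not tree-edit distance: it flattens each tree into a pre-order token list and lets difflib.SequenceMatcher from the standard library compute a ratio, which is much simpler but ignores nesting. The function names flatten and similarity are assumptions.

```python
import difflib
import xml.etree.ElementTree as ET


def flatten(element):
    """Pre-order list of tokens: one per element, tag plus stripped text."""
    tokens = [element.tag + ":" + (element.text or "").strip()]
    for child in element:
        tokens.extend(flatten(child))
    return tokens


def similarity(left, right):
    """Ratio in [0, 1]; 1.0 means the token sequences are identical."""
    a = flatten(ET.fromstring(left))
    b = flatten(ET.fromstring(right))
    return difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()


plus = "<math><mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow></math>"
minus = "<math><mrow><mi>x</mi><mo>-</mo><mn>1</mn></mrow></math>"

print(similarity(plus, plus))
print(similarity(plus, minus))
```

A score like this would only show how far two tool outputs are apart on a test set; whether a threshold on it is meaningful would have to be checked against the visual comparison.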

want to figure out if same or different at the moment?

To create an accurate estimation of the effort in the task. This will enable us to keep up with deadlines realistically. Also, to make effective planning ahead.

I still don't understand, let's discuss that in a f2f meeting.

Ok, agreed. I uploaded an HTML file here which has the MathML of the erroneous cases. Ahead of the f2f meeting, looking at the MathML for the sideset case will create some understanding.

MMLGenerationTest2-Output.html (6 KB)

We could search for them, but why do we need this list?

For both lists: to have practical example data for the feasibility of the complete system in the scope of the publication.

Somehow. I don't see the practical implication.

Validating intent attributes before they reach the user would basically have security implications (similar to the validation of LaTeX): no harmful script etc. could be annotated as an intent attribute and then forwarded to browsers and screen readers. Maybe there is already a way in Wikidata to mitigate such cases generically. Another reason would be to have 'valid' intent attributes processed by the Math extension / screen readers, which are definitely machine-readable.
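
To illustrate what such a validation could mean in practice, here is a minimal Python sketch of an allowlist check. It accepts only a simplified subset assumed for this example (names, numbers, $references and nested applications like point($1,$2)); it is not the W3C intent grammar and not the grammar planned for TexVC(PHP). All function names are hypothetical.

```python
import re

# One token: $reference, name, number, or one of ( ) ,
TOKEN = re.compile(
    r"\s*(\$[A-Za-z0-9_]+|[A-Za-z][A-Za-z0-9_.-]*|\d+(?:\.\d+)?|[(),])"
)
PUNCTUATION = {"(", ")", ","}


def tokenize(intent):
    """Split an intent string into tokens; raise ValueError on stray characters."""
    tokens, position = [], 0
    intent = intent.strip()
    while position < len(intent):
        match = TOKEN.match(intent, position)
        if not match:
            raise ValueError("unexpected character at offset %d" % position)
        tokens.append(match.group(1))
        position = match.end()
    return tokens


def parse_term(tokens, index):
    """Parse one term starting at tokens[index]; return the index after it."""
    if index >= len(tokens) or tokens[index] in PUNCTUATION:
        raise ValueError("expected a name, number or $reference")
    index += 1
    # Optional application: a parenthesised, comma-separated argument list.
    if index < len(tokens) and tokens[index] == "(":
        index += 1
        while True:
            index = parse_term(tokens, index)
            if index < len(tokens) and tokens[index] == ",":
                index += 1
                continue
            break
        if index >= len(tokens) or tokens[index] != ")":
            raise ValueError("missing closing parenthesis")
        index += 1
    return index


def is_valid_intent(intent):
    """True if the whole string is exactly one well-formed term."""
    try:
        tokens = tokenize(intent)
        return parse_term(tokens, 0) == len(tokens)
    except ValueError:
        return False


print(is_valid_intent("point($1,$2)"))
print(is_valid_intent("<script>alert(1)</script>"))
```

Anything outside the allowlisted token set is rejected before parsing, so markup or script fragments never reach the generated MathML in this sketch.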

How can we use intents from the Wikidata items in this project (with the scope of the publication in mind) ? How can the intents be notated in these Wikidata items?

I think this is a major thing to figure out to have a complete overview of future efforts. Any starting pointers? As a start for clarifying this, see the next comment.

We produce examples of valid MathML with intents and present them at the W3C MathWG meeting.

There might be a simple solution (i.e. a CLI tool producing spoken language as text) which already synthesizes speech with intents; this would be enough to have a proof of concept.
Maybe they know a tool which does that. I think for a publication it would be somewhat necessary to have a simple 'proof' that there is such a tool which already reads the generated output and improves speech synthesis with it.

MathCAT seems to be a suitable library for testing the speech generation from MathML with intent attributes. (Edit: Moved the initial evaluation of MathCAT here)

Stegmujo updated the task description. (Show Details)Jan 18 2023, 10:29 AM

Stegmujo updated the task description. (Show Details)

Clarification of the open question :

How can we use intents from the Wikidata items in this project (with the scope of the publication in mind) ? How can the intents be notated in these Wikidata items?

To have a foundation for discussions, here is an example (for clarification of the process) with ambiguities from W3C Accessibility gap analysis:
TeX:

(0 , 5 )

"The Point could be an open interval, gcd, cycle, or an ordered tuple, vector etc."

MathML (already with intent, which resolves the ambiguity to a coordinate point):
See also intent reference.

<mrow intent="point($1,$2)">
  <mo>(</mo>
  <mn arg="1">0</mn>
  <mo>,</mo>
  <mn arg="2">5</mn>
  <mo>)</mo>
</mrow>
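
How the $1 and $2 references tie back to the arg attributes can be shown with a small Python sketch. It is a simplified reading of the mechanism for illustration only (a real consumer such as a screen reader library does considerably more); resolve_intent is a hypothetical name.

```python
import re
import xml.etree.ElementTree as ET

# The point example, with the digits marked up as <mn> (MathML's number token).
MATHML = """
<mrow intent="point($1,$2)">
  <mo>(</mo>
  <mn arg="1">0</mn>
  <mo>,</mo>
  <mn arg="2">5</mn>
  <mo>)</mo>
</mrow>
"""


def resolve_intent(mathml):
    """Replace each $name in the root's intent with the text of the node whose arg is name."""
    root = ET.fromstring(mathml.strip())
    # Map arg name -> text content of the element carrying that arg attribute.
    args = {
        node.get("arg"): "".join(node.itertext()).strip()
        for node in root.iter()
        if node.get("arg") is not None
    }
    # Unknown references are left untouched instead of being dropped.
    return re.sub(
        r"\$(\w+)",
        lambda match: args.get(match.group(1), match.group(0)),
        root.get("intent", ""),
    )


print(resolve_intent(MATHML))
```

With the ambiguity resolved this way, the parenthesised pair reads as a point with two coordinates rather than as an interval or a gcd.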

Some selected Wikidata QID's for ambiguity resolution:
https://www.wikidata.org/wiki/Q44946
https://www.wikidata.org/wiki/Q3250736

As I understand, the flow of data (from formula notation to formulas being sent to the screen reader) is this:

The TeX (for the example point-formula) gets written to the source of a Wiki page
Since this TeX is ambiguous, an annotation with the clarifying qid (Q3250736) gets added with AnnoMathTeX by a user, probably with the 'Formula Annotation' dialogue; it is not completely clear how to get the coordinate QID here. For example, take the formula $ (x,y) $ on the Cartesian coordinate system wiki page. Just for testing I annotated the formula with another suggested QID here and then saved the page.
To be clarified: Where does the annotated Wikitext from the WMFLabs AnnoMathTeX appear (which Wikipedia URL)?
Edit after clarifications on AnnoMathTeX: Currently AnnoMathTeX does not generate the annotated Wikitext with the QID; this would have to be implemented with a Python script etc. which merges the annotations from one file into the Wikitext source.
The pop-ups (which are running on the Wikipedia Beta Cluster) can be used to annotate math items with qids, so the Wikidata items are connected. The open questions here are how to proceed with nested formulas and how to add annotations.
To go on here, the ideal case (from the AnnoMathTeX paper) is assumed: having a formula like this from AnnoMathTeX in the Wikitext source: <math display="block" qid=Q3250736> ( 0 , 5 ) </math>
To get from the associated Wikidata item to the intent format, I see these possibilities:
1. AnnoMathTeX directly annotates with intent as a source (since it has multiple sources). My difficulty with this: AnnoMathTeX annotates TeX formulas, while intents live within MathML.
2. The associated Wikidata item holds the intent information in some property. Another item, https://www.wikidata.org/wiki/Q204819 , has a property for 'defining formula'; the formula is in TeX. Intent is a MathML feature, so holding intent information would require having the MathML of the formula in Wikidata as well. There is an 'in defining formula' property which can hold annotations for each element in the TeX(!) formula. The 'in defining formula' property makes it possible to annotate 'atomic' elements (like E->Energy, M->Mass) but, as I see it, not the complete formula.
3. In the Math extension, TexVC (PHP), we implement a mapping from known QIDs to intents (i.e. as a JSON file). This somewhat removes the user-based annotation aspect for intent content, but might be sufficient for a working prototype.
4. Wikidata items hold MathML; the MathML can be edited and intent added. I guess this would be the simplest solution, but users could introduce mistakes while editing. To my knowledge, this is not implemented yet.

5. Looking forward to reading about another possibility for this step not mentioned here.
In an ideal case, MathML with intent is delivered for a formula from here. Then, since the mapped qid formula from Wikidata and the formulas in Wikitext can differ (for example, the point ( 0 , 1 ) maps to the Wikidata notation ( x , y )), the MathML from Wikidata has to be processed in the Math extension to obtain the final MathML for the users.
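
The possibility of a fixed QID-to-intent mapping kept as JSON could be prototyped along the lines of this Python sketch. The mapping content is invented for illustration (only the pairing of Q3250736 with the point example comes from this discussion), and add_intent is a hypothetical helper, not part of the Math extension.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical content of the mapping file; the intent template is an assumption.
QID_TO_INTENT = json.loads('{"Q3250736": "point($1,$2)"}')


def add_intent(mathml, qid, mapping=QID_TO_INTENT):
    """Set the intent attribute on the root element if the qid is known."""
    root = ET.fromstring(mathml)
    intent = mapping.get(qid)
    if intent is not None:
        root.set("intent", intent)
    return ET.tostring(root, encoding="unicode")


generated = (
    '<mrow><mo>(</mo><mn arg="1">0</mn><mo>,</mo>'
    '<mn arg="2">5</mn><mo>)</mo></mrow>'
)
print(add_intent(generated, "Q3250736"))
print(add_intent(generated, "Q1"))
```

This only covers the lookup; placing the arg attributes on the right child nodes of the generated MathML is the harder alignment problem and is left open here.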

On the basis of the example here, @Physikerwelt, how would you proceed in making intent available to browsers/screen readers? Do you consider the flow of data correct? Which possibility for resolution do you see for point 5 in the 'flow of data'?

Stegmujo updated the task description. (Show Details)Jan 19 2023, 10:55 AM

Stegmujo updated the task description. (Show Details)

Stegmujo updated the task description. (Show Details)Jan 19 2023, 11:07 AM

Stegmujo updated the task description. (Show Details)

Stegmujo updated the task description. (Show Details)Jan 19 2023, 11:11 AM

Stegmujo updated the task description. (Show Details)Jan 19 2023, 11:29 AM

Stegmujo updated the task description. (Show Details)Jan 19 2023, 11:51 AM

Stegmujo updated the task description. (Show Details)Jan 19 2023, 12:00 PM

Stegmujo updated the task description. (Show Details)Jan 19 2023, 12:05 PM

Stegmujo updated the task description. (Show Details)Jan 20 2023, 12:50 PM

→ It shows how to annotate terms within a formula with other Wikidata-Concepts. Not directly clear how to add intents.

Stegmujo updated the task description. (Show Details)Jan 30 2023, 12:08 PM

Stegmujo updated the task description. (Show Details)Feb 3 2023, 11:33 AM

Stegmujo updated the task description. (Show Details)Feb 6 2023, 2:44 PM

Stegmujo updated the task description. (Show Details)Feb 14 2023, 12:42 PM

Stegmujo updated the task description. (Show Details)

Stegmujo updated the task description. (Show Details)Feb 27 2023, 11:16 AM

Stegmujo updated the task description. (Show Details)Feb 27 2023, 11:28 AM

Change 892461 had a related patch set uploaded (by Stegmujo; author: Stegmujo):

[mediawiki/extensions/Math@master] Fix LocalChecker exception throwing

https://gerrit.wikimedia.org/r/892461

gerritbot added a project: Patch-For-Review.Feb 27 2023, 2:50 PM

Stegmujo updated the task description. (Show Details)Feb 27 2023, 3:31 PM

Change 892461 merged by jenkins-bot:

[mediawiki/extensions/Math@master] Fix exceptions thrown by LocalChecker

https://gerrit.wikimedia.org/r/892461

ReleaseTaggerBot added a project: MW-1.40-notes (1.40.0-wmf.26; 2023-03-06).Feb 28 2023, 3:00 PM

Maintenance_bot removed a project: Patch-For-Review.Feb 28 2023, 3:10 PM

https://www.gipp.com/wp-content/papercite-data/pdf/scharpf2018.pdf

Mapping a QID to a math element requires resolving the items within the TeX; see this paper.

Annotation would be necessary to reach this, like this in Wikitext:

<math qid=12345 > \sqrt{a1} </math> 

<math qid=12345 > \sqrt{a2} </math> 

<math qid=12345 > \sqrt{a3} </math> 

<math qid=12345 > \sqrt{a4} </math> 

<math qid=12345 > \sqrt{a5} </math> <!-- "<annotation>\w{Q11423}{m} \w{Q2111}{c}^2</annotation>" -->
<math> end </math>
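
As a rough illustration of resolving items within the TeX, the following Python sketch reads the annotation string from the comment above. It assumes, based only on that one example, that \w{QID}{symbol} means "symbol is linked to QID"; the macro semantics and the function name symbol_qids are assumptions.

```python
import re

# \w{Q123}{symbol}: first brace group is the QID, second the annotated symbol.
WIKIDATA_MACRO = re.compile(r"\\w\{(Q\d+)\}\{([^{}]*)\}")


def symbol_qids(annotation):
    """Return (symbol, qid) pairs in the order they appear in the annotation."""
    return [
        (match.group(2), match.group(1))
        for match in WIKIDATA_MACRO.finditer(annotation)
    ]


annotation = r"\w{Q11423}{m} \w{Q2111}{c}^2"
print(symbol_qids(annotation))
```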