Page MenuHomePhabricator

Mathoid v0.7.0 not accepting chem formula
Closed, ResolvedPublic

Description

The chem formula {\\displaystyle {\\ce {NaNO3\\ +NaNH2->3NaOH\\ +NH}}} produces the following error in Mathoid v0.7.0:

{"success":false,"error":{"message":"Expected [a-zA-Z] but \" \" found.","expected":[{"type":"class","parts":[["a","z"],["A","Z"]],"inverted":false,"ignoreCase":false}],"found":" ","location":{"start":{"offset":27,"line":1,"column":28},"end":{"offset":28,"line":1,"column":29}},"name":"SyntaxError"}

However, Mathoid v0.5.x processes it successfully:

{"success":true,"checked":"{\\displaystyle {\\ce {NaNO3\\ +NaNH2->3NaOH\\ +NH}}}","requiredPackages":["mhchem"],"identifiers":["N","a","N","O","N","a","N","H","N","a","O","H","N","H"],"endsWithDot":false}

Event Timeline

mobrovac triaged this task as High priority.Dec 22 2017, 3:16 PM
mobrovac created this task.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptDec 22 2017, 3:16 PM

That seem to be a problem with the grammar. In this particular case \ .
In my test suite there were not too many chem examples, because I did not find many in the en-wiki dumps.
https://www.mediawiki.org/wiki/Extension:Math/T1835557
A list of all formulae with \ce in it would help to find more regressions.

A list of all formulae with \ce in it would help to find more regressions.

I am currently doing a dump of all formulae and will post a list with the erroneous chem ones after its completion.

Mentioned in SAL (#wikimedia-operations) [2017-12-22T18:28:30Z] <mobrovac@tin> Started deploy [mathoid/deploy@7c5f8e2]: Better handling of chem rules (codfw only) - T183557

Mentioned in SAL (#wikimedia-operations) [2017-12-22T18:31:15Z] <mobrovac@tin> Finished deploy [mathoid/deploy@7c5f8e2]: Better handling of chem rules (codfw only) - T183557 (duration: 02m 45s)

Thank you, @Physikerwelt for the quick fix! I deployed it, and it seems the error rate has dropped, so yay :) I will nevertheless keep the task open until the whole dump is done to verify.

The error still seems to persist. Moreover, out of the ~31k erroneous requests, ~14k of them fails on the texvcinfo check step, which is a regression since the old Mathoid version was able to check them. These errors need to be inspected and fixed before we can proceed with deploying the new version into production.

The problem is that the old verification added spaces in different places. Those spaces were sometimes not wanted, but sometimes required. Thus people started to adapt to this problem and fixed the spacing manually for instance by adding negative spaces or extra big spaces. With the first part of the fix the manual space correction is allowed (except for \; which would fix additional 2.2k cases. However, for the problem with missing spaces I have no good idea how to implement that. We could add a second phase that if a command fails in ce mode, we run the check with the old ce rule. However, this might lead to unexpected behavior and might confuse the final users.

The problem is that the old verification added spaces in different places. Those spaces were sometimes not wanted, but sometimes required. Thus people started to adapt to this problem and fixed the spacing manually for instance by adding negative spaces or extra big spaces.

It sounds like the best way to go about this would be to fix these formulae manually? Maybe we should track down the pages using them and put a notice out for this?

With the first part of the fix the manual space correction is allowed (except for \; which would fix additional 2.2k cases.

I'm not sure I understand fully which fix you are referring to here. So texvc doesn't allow extra spaces right now but did in the previous version? Could we have it strip extra spaces? Is that even possible?

However, for the problem with missing spaces I have no good idea how to implement that. We could add a second phase that if a command fails in ce mode, we run the check with the old ce rule. However, this might lead to unexpected behavior and might confuse the final users.

But the old ce rule still wouldn't normalise the formula in such a way as to be compatible with the new texvc rules, so I'm not sure that would help us in the long run.

It sounds like the best way to go about this would be to fix these formulae manually? Maybe we should track down the pages using them and put a notice out for this?

Exactly. See my pull request that generates a warning for these problems. I think warnings are a very good mechanism to get rid of unhandy syntax.

With the first part of the fix the manual space correction is allowed (except for \; which would fix additional 2.2k cases.

I'm not sure I understand fully which fix you are referring to here. So texvc doesn't allow extra spaces right now but did in the previous version? Could we have it strip extra spaces? Is that even possible?

I replaced \; locally with a regular space for testing... However, doing that without side effects might end in a nightmare. Thus I think this is not a good approach.

But the old ce rule still wouldn't normalise the formula in such a way as to be compatible with the new texvc rules, so I'm not sure that would help us in the long run.

No, it would not help in the long run. But for the time being.

Mentioned in SAL (#wikimedia-operations) [2018-01-03T11:29:48Z] <mobrovac@tin> Started deploy [mathoid/deploy@91648aa]: Update to Mathoid v0.7.0 in codfw only for T183557

Mentioned in SAL (#wikimedia-operations) [2018-01-03T11:32:03Z] <mobrovac@tin> Finished deploy [mathoid/deploy@91648aa]: Update to Mathoid v0.7.0 in codfw only for T183557 (duration: 02m 15s)

I reran the script for the check errors after deploying the fix for texvcjs, however none of the formulae passed the check. This is rather weird considering that the texvcjs package's tests contain them. We will need to investigate what is going on here. Perhaps a misunderstanding between texvcjs and texvcinfo ?

Mentioned in SAL (#wikimedia-operations) [2018-01-03T12:12:20Z] <mobrovac@tin> Started deploy [mathoid/deploy@63b2ddc]: Bring back codfw in sync with eqiad - T183557

Mentioned in SAL (#wikimedia-operations) [2018-01-03T12:14:30Z] <mobrovac@tin> Finished deploy [mathoid/deploy@63b2ddc]: Bring back codfw in sync with eqiad - T183557 (duration: 02m 10s)

Mentioned in SAL (#wikimedia-operations) [2018-01-04T10:14:14Z] <mobrovac@tin> Started deploy [mathoid/deploy@7f664ff]: Update Mathoid in codfw to v0.7.0, take #2 - T183557

Mentioned in SAL (#wikimedia-operations) [2018-01-04T10:16:52Z] <mobrovac@tin> Finished deploy [mathoid/deploy@7f664ff]: Update Mathoid in codfw to v0.7.0, take #2 - T183557 (duration: 02m 38s)

mobrovac closed this task as Resolved.Jan 4 2018, 10:36 AM
mobrovac assigned this task to Physikerwelt.
mobrovac edited projects, added Services (done); removed Services (doing).

All of the formulae pass the check test now with texvcinfo v0.5.2 and texvcjs v0.3.6. Thank you @Physikerwelt for the fix!