Support shelling out to mathoid instead of requiring a full server/restbase setup
Closed, Resolved · Public

Description

T74240 (Remove PNG only rendering mode) proposes removing support for the texvc OCaml binary, which does not support the full feature set of mathoid, the renderer Wikimedia uses.

@Physikerwelt and I discussed alternatives to the traditional OCaml-based texvc for smaller wikis that don't want to or can't run a full-blown mathoid service with restbase. One of the ideas was to have a CLI version of mathoid that could be shelled out to. This would allow us to maintain one codebase for rendering math, and most of the shell-out code is already written for texvc.

Event Timeline

Legoktm created this task. Jan 12 2017, 11:24 PM
Restricted Application added a subscriber: Aklapper. Jan 12 2017, 11:24 PM

+1. It seems like a good way to go about it. That still presumes that people are able to install Mathoid with all of its dependencies, but that shouldn't be an issue in practice as they can install it under their user. They do need to have Node 4+, though.

Yep, and going forward I think Node is going to be easier to install while OCaml will still be as weird as ever.

Legoktm updated the task description. Jan 21 2017, 4:28 AM
Isarra added a subscriber: Isarra. Jan 26 2017, 7:22 PM

Please make this happen. Fancy math rendering is a pretty common want on third-party projects, but very few of these are large enough to warrant the whole complicated heap as is. Certainly relative to the rest of the current process, mathoid is pretty trivial to install (yaourt -S mediawiki-mathoid-git and you're done, on Arch; Debian may be a bit harder, but it's still doable).

As for how to implement this, we currently have "mathoid", which would be better named mathoid-server. We could make "mathoid" be just the library code with a CLI. Then we'd have a separate "mathoid-server" which depends upon mathoid, service-runner and whatnot. Does that seem reasonable?

@Legoktm mathoid was originally called mathoid-server. I have started to add a CLI to mathoid in https://gerrit.wikimedia.org/r/#/c/334420/1 , however this is just a rough draft so far. The issue I see is that there is a significant startup time. Thus it would be much better if mathoid processes could be reused for several formulae.

Physikerwelt moved this task from Next-up to Doing on the Math board. Feb 14 2017, 4:37 PM

@mobrovac It's quite simple to create a CLI that is super slow. However, I think that if we spend the effort, we should use our client-server infrastructure. I started my WIP change from https://github.com/wikimedia/mathoid/blob/master/test/utils/server.js
But I did not manage to create a promise that is fulfilled after the server has started. It might be connected to the delayed startup of mathjax https://github.com/wikimedia/mathoid/blob/master/test/features/math/simple.js#L181-L186, but I get the impression that it does not work at all.
My approach was to create a promise that forks a new process and sends it a message; the promise is resolved once the server calls the callback specified in the message sent from the client to the server process.
However, I get the feeling that I'm heading in completely the wrong direction.
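
For illustration, the fork-and-wait idea described here could look roughly like the following in Node.js. This is a minimal sketch under stated assumptions, not mathoid code: the inline child script, the `ready` and `stop` message types and the `startWorker` name are all hypothetical. Because a function cannot be passed across the IPC channel, the child signals readiness by sending a message back rather than by invoking a callback supplied by the parent.

```javascript
'use strict';
const { spawn } = require('child_process');

// Hypothetical stand-in for the server process. It pretends that startup
// takes a while, then reports readiness to the parent over IPC.
const childSource = `
  setTimeout(() => process.send({ type: 'ready' }), 50);
  process.on('message', (msg) => {
    if (msg && msg.type === 'stop') process.exit(0);
  });
`;

// Start the child and return a promise that resolves with the child handle
// once it has reported 'ready'. Rejects on spawn error, early exit or timeout.
function startWorker(timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', childSource], {
      // The fourth entry opens the IPC channel used by send()/'message'.
      stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
    });
    const timer = setTimeout(() => {
      child.kill();
      reject(new Error('worker did not report ready in time'));
    }, timeoutMs);
    const onMessage = (msg) => {
      if (msg && msg.type === 'ready') {
        clearTimeout(timer);
        child.removeListener('message', onMessage);
        resolve(child);
      }
    };
    child.on('message', onMessage);
    child.once('error', (err) => {
      clearTimeout(timer);
      reject(err);
    });
    // Has no effect if the promise was already resolved.
    child.once('exit', (code) => {
      clearTimeout(timer);
      reject(new Error(`worker exited before reporting ready (code ${code})`));
    });
  });
}
```

A caller would wait on `startWorker()`, hand work to the returned child, and finally send `{ type: 'stop' }` to let it exit.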

Pkra added a subscriber: Pkra. Feb 14 2017, 9:22 PM

But I did not manage to create a promise that is fulfilled after the server has started. It might be connected to the delayed startup of mathjax

Just a thought: Have you considered running an empty call to mathjax-node and using the callback to let you know that things are up and running?
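
A rough sketch of that suggestion, for illustration only: wrap a callback-style `typeset(options, callback)` function in a promise and treat the completion of one empty warm-up call as the readiness signal. The renderer below is a fake that merely delays its first call; its callback shape (a result object with an `errors` field) is an assumption loosely modeled on mathjax-node, and `makeFakeRenderer`, `typesetAsync` and `whenReady` are hypothetical names.

```javascript
'use strict';

// Hypothetical stand-in for a callback-style renderer. The first call is
// delayed by startupMs to mimic a slow startup; later calls return quickly.
function makeFakeRenderer(startupMs) {
  let started = false;
  return function typeset(options, callback) {
    const delay = started ? 0 : startupMs;
    started = true;
    setTimeout(() => callback({ input: options.math, errors: null }), delay);
  };
}

// Wrap one callback-style call in a promise. Rejects if the result
// carries errors, resolves with the result otherwise.
function typesetAsync(typeset, options) {
  return new Promise((resolve, reject) => {
    typeset(options, (result) => {
      if (result.errors) {
        reject(new Error(String(result.errors)));
      } else {
        resolve(result);
      }
    });
  });
}

// Readiness check: resolves to true once an empty warm-up call has
// called back, i.e. once the renderer has finished starting up.
function whenReady(typeset) {
  return typesetAsync(typeset, { math: '', format: 'TeX' }).then(() => true);
}
```

Whether an empty input is accepted without error by the real library would need to be checked; the fake above simply echoes it back.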

If a promise API on mathjax-node would help, please file an issue (I've been meaning to add one for a while but the v1.0 changes keep getting in the way).

That would actually be awesome to have, @Pkra, and would be in line with the upcoming MathJax 3.0 model too. Once we have that, it will be much easier to create a CLI version of Mathoid, as we would need less glue and fewer workarounds. Would you have time for it? What would be the ETA?

@mobrovac It's quite simple to create a CLI that is super slow. However, I think that if we spend the effort, we should use our client-server infrastructure. I started my WIP change from https://github.com/wikimedia/mathoid/blob/master/test/utils/server.js

We need to examine why the mathjax-node start-up time is so long in the first place. Since this is the main component used, whatever we do will still be slow.

But I did not manage to create a promise that is fulfilled after the server has started. It might be connected to the delayed startup of mathjax https://github.com/wikimedia/mathoid/blob/master/test/features/math/simple.js#L181-L186, but I get the impression that it does not work at all.

This should be easy to do once we have a Promise API in mathjax-node.

My approach was to create a promise that forks a new process and sends it a message; the promise is resolved once the server calls the callback specified in the message sent from the client to the server process.
However, I get the feeling that I'm heading in completely the wrong direction.

I am not sure I understand. Why do you want to spawn another process for the actual work? What would be the benefit of it? MW already needs to start a new process for Mathoid, so you get a fresh environment in which all the work can be done.

Pkra added a comment. Feb 20 2017, 8:18 AM

This should be easy to do once we have a Promise API in mathjax-node.

I've filed https://github.com/mathjax/MathJax-node/issues/297. I'll push out v1.0-beta this week and then it will be easier to release more quickly. (Fair warning: lots of deprecation in v1.0 but nothing that should affect mathoid.)

If somebody has already tried wrapping a promise around the existing callback API (cf. the issue) then I'm wondering what the problem was.

Happy to chat directly (since I don't know much about mathoid's inner workings and would need a better understanding to help).

This should be easy to do once we have a Promise API in mathjax-node.

Maybe I am misunderstanding something, but it seems that mathoid currently does not use MathJax-node; instead it uses mathoid-mathjax-node, whose latest version was published on 2016-08-18. See ticket T170760.

Change 370603 had a related patch set uploaded (by Physikerwelt; owner: Physikerwelt):
[mediawiki/services/mathoid@master] WIP: Update to MathJax-node 1.0

https://gerrit.wikimedia.org/r/370603

https://gerrit.wikimedia.org/r/370603 is the first attempt at a CLI prototype on the mathoid side. Data is exchanged via pipes or files, so no network access is required. Batch processing is supported, so that the same mechanism as in the restbase interface can be used and requests can be bundled. Bundling is needed so that the MathJax-node start-up time is paid only once per batch. What remains to be done is to change the Math extension to shell out to mathoid rather than to texvc.
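
To illustrate the batch-over-pipes idea in general terms, here is a minimal sketch. It does not reproduce the prototype's actual input or output format: the `id`, `tex`, `success`, `mml` and `error` field names, the `fakeRender` function and the `processBatch` and `runCli` helpers are all hypothetical. The point it shows is that one process renders the whole batch, so the renderer's start-up cost is paid once rather than once per formula.

```javascript
'use strict';

// Hypothetical renderer standing in for the real, slow-to-start
// TeX-to-MathML conversion step.
async function fakeRender(tex) {
  return `<math><mtext>${tex}</mtext></math>`;
}

// Render every request of a JSON batch with the same renderer and
// return the results as JSON text, keyed by request id. A failure in
// one formula is recorded for that id and does not abort the batch.
async function processBatch(jsonText, render) {
  const requests = JSON.parse(jsonText);
  const results = {};
  for (const req of requests) {
    try {
      results[req.id] = { success: true, mml: await render(req.tex) };
    } catch (err) {
      results[req.id] = { success: false, error: String(err.message || err) };
    }
  }
  return JSON.stringify(results);
}

// Read the whole batch from a readable stream and write the answer to
// a writable stream, so the caller only needs pipes, not a network port.
async function runCli(input, output, render) {
  let text = '';
  input.setEncoding('utf8');
  for await (const chunk of input) {
    text += chunk;
  }
  output.write((await processBatch(text, render)) + '\n');
}
```

A caller that shells out would invoke something like `runCli(process.stdin, process.stdout, fakeRender)` inside the child process and pipe the batch in from the parent.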

The last open question for me is how to store the PNG images. It would be much easier to store them in the DB as well. That way, admins of private wikis would not have to set up file permissions to use math. Another option is to drop support for PNG fallbacks on private wikis. A third option is to adapt the way texvc currently stores images.

Physikerwelt triaged this task as Normal priority.
Krinkle removed a subscriber: Krinkle. Aug 15 2017, 11:23 PM

Change 370603 merged by Mobrovac:
[mediawiki/services/mathoid@master] Update to mathoid-mathjax-node 0.7

https://gerrit.wikimedia.org/r/370603

Physikerwelt closed this task as Resolved. Feb 23 2018, 6:58 AM

Since https://gerrit.wikimedia.org/r/#/c/372100/ has been merged, the client side is now done as well.

Will that work for large wikis? It would mean you have to install mathoid on each server to be able to use it. But even then, will it work if Varnish magically decides which backend to use per request?