While looking for a more data/table-driven way to support language variants in Parsoid, @cscott explored the [[https://en.wikipedia.org/wiki/Finite_state_transducer|Finite State Transducer]] formalism, and specifically the [[https://en.wikipedia.org/wiki/Foma_(software)|Foma]]-based tools.
[[https://gerrit.wikimedia.org/r/#/c/423197/|This gerrit patch]] specifies language variant transformations for Serbian, Kurdish, and Pig Latin. Follow-up patches hook these transformers into a DOM pass that converts the text nodes in the DOM using the foma-based transformers.
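To make the shape of that DOM pass concrete, here is a rough sketch of the idea, not the code from the patches. It walks a parsed fragment and rewrites only text nodes, leaving tag names and attributes alone. Everything named here (`LATIN_TO_CYRILLIC`, `convert_text`, `convert_text_nodes`, `convert_html`) is hypothetical, and the transformer is just a five-letter character table standing in for a foma-compiled transducer; real Serbian conversion needs digraph handling and context-sensitive rules that a table like this cannot express.

```python
from xml.dom import minidom

# Hypothetical stand-in for a foma-compiled transducer: a plain
# per-character table covering only five letters. It is an assumption
# made for illustration and ignores digraphs and context.
LATIN_TO_CYRILLIC = str.maketrans(
    {"a": "а", "b": "б", "d": "д", "o": "о", "r": "р"}
)


def convert_text(text):
    """Apply the toy character table to one string."""
    return text.translate(LATIN_TO_CYRILLIC)


def convert_text_nodes(node, convert):
    """Recursively rewrite text nodes in place; element names and
    attributes are never passed to the converter."""
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            child.data = convert(child.data)
        else:
            convert_text_nodes(child, convert)


def convert_html(fragment):
    """Parse a well-formed fragment, convert its text, and serialize it."""
    doc = minidom.parseString(fragment)
    convert_text_nodes(doc.documentElement, convert_text)
    return doc.documentElement.toxml()


if __name__ == "__main__":
    print(convert_html("<p>dobar <b>dar</b></p>"))
```

Note that the `<b>` tag survives untouched even though `b` is in the table, since only text nodes reach the converter. Swapping in a real transducer would mean replacing `convert_text` and nothing else, which is roughly the separation the patches aim for.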
The larger goal is to make language variant code more readable and maintainable, and simpler to develop. Looking at just the foma files, it appears that some of these goals can be met.
This task is to continue the discussion that began on the gerrit patch. I'm going to paste a few comments here that capture the initial discussion there.