
Migrate IABot parsing code to Parsoid
Open, Medium, Public


There are 30+ bugs related to how IABot interprets syntax. IABot uses its own parsing code.

My idea is to migrate much of the parsing code to a reliable external parsing library. Parsoid is an obvious choice.

The ideal outcome for this task is the creation of an abstract wiki document object that IABot would interact with, rather than applying regular expressions directly to text.

Event Timeline

Have you started this effort and/or are you still working on this?

I believe that template parsing is a major issue with the bot (and with bots in general). I have written a very small template editor with a proper linter/parser that is round-trip safe: it preserves whitespace, unknown fields, etc., so that template_source == create_string(parse(template_source)), and it does not rely on regular expressions to extract data. In particular, it can handle nested templates, which IABot seems to struggle with a lot.
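To make the round-trip property concrete, here is a minimal sketch of what such a parser could look like. It is not the actual editor mentioned above, nor IABot code: the names `Template`, `Param`, `_find_close` and `_split_top` are hypothetical, and `parse` / `create_string` are only borrowed from the expression above. It assumes plain `{{...}}` templates and `[[...]]` links, and deliberately ignores `{{{...}}}` arguments, `<nowiki>`, and HTML comments.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Param:
    key: Optional[str]  # raw text before "=", or None for a positional parameter
    value: list         # parsed value: a list of str and Template nodes


@dataclass
class Template:
    name: str           # raw name, surrounding whitespace included
    params: List[Param] = field(default_factory=list)


def _find_close(text, start):
    """Return the index of the '}}' matching the '{{' at `start`, or -1."""
    depth, i = 0, start
    while i < len(text) - 1:
        pair = text[i:i + 2]
        if pair == "{{":
            depth, i = depth + 1, i + 2
        elif pair == "}}":
            depth, i = depth - 1, i + 2
            if depth == 0:
                return i - 2
        else:
            i += 1
    return -1


def _split_top(text, sep, maxsplit=-1):
    """Split on `sep` only outside nested {{...}} and [[...]] spans."""
    parts, depth, last, i = [], 0, 0, 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in ("{{", "[["):
            depth, i = depth + 1, i + 2
        elif pair in ("}}", "]]"):
            depth, i = max(depth - 1, 0), i + 2
        elif text[i] == sep and depth == 0 and maxsplit != 0:
            parts.append(text[last:i])
            last, maxsplit, i = i + 1, maxsplit - 1, i + 1
        else:
            i += 1
    parts.append(text[last:])
    return parts


def parse(text):
    """Parse wikitext into a list of plain strings and Template nodes."""
    nodes, i, last = [], 0, 0
    while (start := text.find("{{", i)) != -1:
        end = _find_close(text, start)
        if end == -1:            # unbalanced braces stay literal text
            i = start + 2
            continue
        if start > last:
            nodes.append(text[last:start])
        name, *raw_params = _split_top(text[start + 2:end], "|")
        params = []
        for raw in raw_params:
            pieces = _split_top(raw, "=", 1)
            if len(pieces) == 2:
                params.append(Param(pieces[0], parse(pieces[1])))
            else:
                params.append(Param(None, parse(raw)))
        nodes.append(Template(name, params))
        i = last = end + 2
    if last < len(text):
        nodes.append(text[last:])
    return nodes


def create_string(nodes):
    """Serialize nodes back to wikitext, reproducing the input exactly."""
    out = []
    for node in nodes:
        if isinstance(node, str):
            out.append(node)
            continue
        parts = [node.name]
        for p in node.params:
            value = create_string(p.value)
            parts.append(value if p.key is None else p.key + "=" + value)
        out.append("{{" + "|".join(parts) + "}}")
    return "".join(out)


source = "Text {{cite web | url = http://example.org |title={{lang|de|Titel}} }} end"
tree = parse(source)
assert create_string(tree) == source
```

Because names, keys, and values are stored as raw strings with their whitespace intact, an edit touches only the node that is changed and everything else serializes back byte for byte. The nested `{{lang|de|Titel}}` ends up as a `Template` node inside the value of the `title` parameter rather than being split at its inner pipes.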

It's currently written in Python, because I intend to use it with my Python-based bot to clean up some IABot issues, but it should be straightforward to port to PHP. Porting it might take less effort than integrating Parsoid. Do you have a good understanding of the object that IABot needs?

Harej triaged this task as Medium priority. Feb 4 2021, 2:11 AM