Currently, the lint tables get updated when we use Parsoid to parse page content via the Parsoid extension endpoints. RESTBase calls these endpoints when processing a cache pregeneration request coming from ChangeProp.
Since we want to remove Parsoid code from RESTBase, we need an alternative mechanism for updating the lint tables.
current proposal
What we want is very similar to what RefreshLinksJob does. So the most straightforward approach is to allow the Linter extension to hook into RefreshLinksJob and perform a Parsoid parse to update the linter tables. However, it should skip the parse if there is already up-to-date output in the ParserCache, to avoid duplicating the parses that we still trigger through RESTBase/ChangeProp, and the ones we will continue to do via the ParsoidCachePrewarmJob. It should also skip the additional parse if the canonical rendering of the page was already done with Parsoid.
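The skip conditions above can be sketched as a small decision function. This is only an illustration in Python, not MediaWiki code: `PageState`, its fields, and `needs_lint_parse` are hypothetical names, and how "up-to-date" is actually determined for ParserCache entries is left out.

```python
from dataclasses import dataclass


@dataclass
class PageState:
    """Hypothetical summary of what the hook would know about a page."""
    # True if the ParserCache already holds up-to-date Parsoid output.
    parsoid_cache_is_current: bool
    # True if the canonical rendering of the page was done with Parsoid.
    canonical_render_used_parsoid: bool


def needs_lint_parse(state: PageState) -> bool:
    """Decide whether the hook should run an extra Parsoid parse."""
    if state.parsoid_cache_is_current:
        # Some other path (e.g. a prewarm job) already did the parse.
        return False
    if state.canonical_render_used_parsoid:
        # The canonical parse was a Parsoid parse, so no second parse is needed.
        return False
    return True
```

Under these assumptions, the extra parse only runs when neither of the two existing paths has already produced current Parsoid output for the page.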
Idea: In the future, linter data should be added to ParserOutput, like any other metadata.
original proposal
We could update the lint tables from ParsoidCachePrewarmJob. However, these jobs are not scheduled when pages get invalidated due to template updates, since in that case, we don't want to update the parser cache proactively.
To solve this, we should generalize the job to be a generic "ParsoidUpdateJob", which will parse page content and then optionally update the parser cache, links tables, lint tables, etc.
The job would be scheduled...
- when a page is edited, with both parser cache and lint table update enabled
- when a page is invalidated due to a template change, with only link/lint table update enabled.
- when a page without a parser cache entry is visited, with both parser cache and lint table update enabled
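The scheduling rules listed above could be expressed as a mapping from trigger to job options. The following Python sketch is purely illustrative: `Trigger`, `UpdateOptions`, and `options_for` are hypothetical names, and only the parser cache and lint table flags are modeled, since the list does not spell out a links table setting for every case.

```python
from dataclasses import dataclass
from enum import Enum


class Trigger(Enum):
    """Hypothetical events that would schedule a ParsoidUpdateJob."""
    EDIT = "edit"
    TEMPLATE_CHANGE = "template_change"
    VIEW_WITHOUT_CACHE = "view_without_cache"


@dataclass(frozen=True)
class UpdateOptions:
    """Hypothetical flags carried by the job."""
    update_parser_cache: bool
    update_lint_tables: bool


def options_for(trigger: Trigger) -> UpdateOptions:
    """Return the updates the job should perform for a given trigger."""
    if trigger is Trigger.TEMPLATE_CHANGE:
        # Template invalidation: do not update the parser cache proactively.
        return UpdateOptions(update_parser_cache=False, update_lint_tables=True)
    # Edits and views of uncached pages update both.
    return UpdateOptions(update_parser_cache=True, update_lint_tables=True)
```

Keeping the options in one place like this makes the difference between the three cases explicit: only the template-change case leaves the parser cache untouched.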