In RunVariableGenerator and AFComputedVariable there are several places where a comment says that we can share a parse operation with the edit (see this search).
Back in 2009 (see rEABF48bfcc35ee9829cc15e9404c2cbe0d7bf86402b1), this was achieved by keeping a cache of Article objects and calling Article::prepareTextForEdit (i.e. what was used at the time to parse the content) on those objects. Since AbuseFilter will sometimes parse the page content *before* the edit is saved (e.g. if it needs to know what links are being added), this guarantees (or guaranteed) that the parsing didn't happen twice.
Today, 11 years later, things have changed a lot. Notably:
- We use WikiPage instead of Article
- The method is now called prepareContentForEdit
- We now have a dedicated PageUpdater service with the logic for preparing edits
- I'm unsure about this, but I guess that the cache for prepared edits now lives elsewhere, and perhaps it doesn't need cached WikiPage/Article objects
- I think (but I may be wrong) that parsing the content can now happen *after* the edit is saved
All of this means that the AF code is most probably out of date. I'm tagging CPT as stewards of AbuseFilter and maintainers of the ContentHandler/MCR/Revision backend code which is the vital point here. What I'd like to understand is:
- Is there any point in keeping a cache of WikiPage objects?
- Will the parse operation be shared with the edit even if we don't?
- What else can we do to ensure that the parse operation is shared, or anyway that we keep the code performant?
I should also say that a related change is to stop using WikiPage::prepareContentForEdit in favour of its MCR replacement. However, as per T242249, the method has been deprecated since 1.32 without a viable alternative (the recommended replacement is a private method).