Page MenuHomePhabricator

Improve CachingParser performance
Open, Needs TriagePublic

Description

This task is an umbrella for several performance improvements to the new parser.

TODO: Tweak caching: try to avoid stampedes, increase TTL; also, try to cache all the ASTs at the same time and maybe use WANCache.

Event Timeline

Daimona created this task.Oct 2 2019, 11:27 AM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptOct 2 2019, 11:27 AM

Change 540383 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Log deprecated vars in the cached phase in the new parser

https://gerrit.wikimedia.org/r/540383

Change 540145 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Replace array_map with foreach

https://gerrit.wikimedia.org/r/540145

Change 540385 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Remove AFPData::dup

https://gerrit.wikimedia.org/r/540385

Daimona claimed this task.Oct 2 2019, 12:32 PM
Daimona updated the task description. (Show Details)Oct 3 2019, 6:55 PM

Change 540145 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Replace array_map with foreach

https://gerrit.wikimedia.org/r/540145

Change 540385 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Remove AFPData::dup

https://gerrit.wikimedia.org/r/540385

XHGui is currently broken (T240066), but I think that the best we can do is cache all ASTs together. For a wiki as big as enwiki, this would mean one hit to memcached for every request, instead of 197. A possible implementation:

  • First of all, we should factor out a method like getAllFilters from AbuseFilterRunner::checkAllFilters
  • Then, we'd need a helper method to build a cache key from a list of local/global filters
  • In checkAllFilters, we first check if there's anything stored in that key.
    • If so, we retrieve the data (which would be a list of ASTs) and give the AST to the parser (as opposed to passing the bare content of the filter)
    • If the key is empty, we should set it at the end of checkAllFilters; we'd need a method to retrieve the AST from the parser, after the parsing is done.
  • Whenever a filter is modified, the key isn't valid anymore, so we have two options:
    • Invalidate the key altogether
    • Since, when saving a filter, we will also parse it to check syntax, we may selectively replace the AST for that filter and re-cache. Note, this would have to happen *after* the filter is saved, to avoid caching when the edit fails (maybe in a DeferredUpdate?)

For some of the steps above, we need new methods in the Parser class. The main problem is the presence of the old parser, which could force us to add some hacks.

Change 540383 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Log deprecated vars in the cached phase in the new parser

https://gerrit.wikimedia.org/r/540383

I will investigate whether it's feasible to cache all ASTs at the same time.