Looking at the flame graphs of a jobrunner, it appears that CirrusSearch jobs consume most of the jobrunner resources.
A few ideas to improve the situation:
- [] verify that `ContentHandler::getParserOutputForIndexing()` does not request HTML rendering on Wikidata
- [] disable the saneitizer for one week and assess the impact
- if the impact is large, consider lowering the number of parses by creating a dedicated profile for wikis like Commons and increasing `reindex_after_loops` from 8 to e.g. 16
- [] verify that running the jobs for both eqiad & codfw reuses the parser output (no double parse)
- [] consider using memcached (~6-hour TTL) to hold the indexed content so that subsequent ElasticaWrite jobs running for cloudelastic can reuse it
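
The caching idea in the last item could look roughly like the sketch below. It is only an illustration in Python, not CirrusSearch or MediaWiki code: `TTLCache`, `doc_for_indexing`, `expensive_build` and the `cirrus-doc:<page>:<rev>` key format are all hypothetical names, and an in-process dict stands in for memcached. Keying on the revision ID is an assumption made here so that an edit never gets a stale document.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-process stand-in for memcached with a fixed per-entry TTL (hypothetical)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> Optional[dict]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            # Entry outlived its TTL: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: dict) -> None:
        self._store[key] = (self.clock() + self.ttl, value)


def doc_for_indexing(cache: TTLCache, page_id: int, rev_id: int,
                     build_doc: Callable[[int, int], dict]) -> dict:
    """Return the document to index, building it only on a cache miss."""
    key = f"cirrus-doc:{page_id}:{rev_id}"  # hypothetical key format
    doc = cache.get(key)
    if doc is None:
        doc = build_doc(page_id, rev_id)  # the expensive parse happens here
        cache.set(key, doc)
    return doc


parse_count = 0


def expensive_build(page_id: int, rev_id: int) -> dict:
    """Placeholder for the parse + document build step; counts invocations."""
    global parse_count
    parse_count += 1
    return {"page_id": page_id, "rev_id": rev_id, "text": "..."}


cache = TTLCache(ttl_seconds=6 * 3600)  # ~6-hour TTL, as suggested above
for cluster in ("eqiad", "codfw", "cloudelastic"):
    # One write per cluster; only the first one should trigger a parse.
    doc_for_indexing(cache, 42, 1001, expensive_build)

print(parse_count)  # 1
```

With this shape, a delayed write for cloudelastic costs one cache lookup instead of a second parse, as long as it runs within the TTL.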
AC:
- reduce the impact of CirrusSearch jobs on jobrunners by X%