Page MenuHomePhabricator

Rework maintenance/listTaskCounts.php to use SuggestionsInfo
Open, In Progress, LowPublic

Description

maintenance/listTaskCounts.php, SpecialNewcomerTasksInfo.php and SuggestionsInfo.php do similar things but are not wired together.

We should rework listTaskCounts.php to use SuggestionsInfo.php and move a caching function into that class as well. Then listTaskCounts.php, which already runs as an hourly cron process, could also populate the cache for Special:NewcomerTasksInfo (see https://phabricator.wikimedia.org/T282985#7195032 for discussion)

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

Moving into current sprint as a lower priority item, but it's aligned with other work we want to do around monitoring, so IMHO it would make sense to do alongside those tasks.

Rough sketch of changes:

  • Pass WANObjectCache instance to SuggestionInfo in ServiceWiring, maintenance/listTaskCounts.php should instantiate SuggestionInfo
  • SuggestionsInfo::getInfo should check cache as well (via getWithSetCallback)
  • Normalize data for SpecialNewcomerTasksInfo & maintenance/listTaskCounts.php (make sure return value from getInfo is usable by both classes)
  • Both SpecialNewcomerTasksInfo & maintenance/listTaskCounts.php should get data via SuggestionsInfo::getInfo

Rough sketch of changes:

  • Pass WANObjectCache instance to SuggestionInfo in ServiceWiring, maintenance/listTaskCounts.php should instantiate SuggestionInfo
  • SuggestionsInfo::getInfo should check cache as well (via getWithSetCallback)
  • Normalize data for SpecialNewcomerTasksInfo & maintenance/listTaskCounts.php (make sure return value from getInfo is usable by both classes)
  • Both SpecialNewcomerTasksInfo & maintenance/listTaskCounts.php should get data via SuggestionsInfo::getInfo

Sounds good, maybe there should be a CachedSuggestionsInfo decorator, just like we have CachedDecorator for the TaskSuggester? Also, just to clarify,
maintenance/listTaskCounts.php should always bypass the cache when using SuggestionsInfo, but it should then set its results in the cache so that Special:NewcomerTasksInfo loads quickly.

Change 716586 had a related patch set uploaded (by MewOphaswongse; author: MewOphaswongse):

[mediawiki/extensions/GrowthExperiments@master] [WIP] Centralize logic for querying information about tasks & topics

https://gerrit.wikimedia.org/r/716586

Thanks for the clarifications @kostajh! I noticed that the task types for maintenance/listTaskCounts.php include an additional null task type and there's a FIXME to rework this area. Why is the extra null task type needed? I looked at OresTopicTrait.php but am still not sure why it's there.

// FIXME: Integrate with GrowthExperimentsSuggestionsInfo service, need a more robust
//   implementation of OresTopicTrait functionality first.

With my changes so far I see that this null task type is now showing up when SpecialNewcomerTasksInfo.php is using the cached information generated via maintenance/listTaskCounts.php

Screen Shot 2021-09-02 at 3.03.00 PM.png (560×1 px, 110 KB)

Thanks for the clarifications @kostajh! I noticed that the task types for maintenance/listTaskCounts.php include an additional null task type and there's a FIXME to rework this area. Why is the extra null task type needed? I looked at OresTopicTrait.php but am still not sure why it's there.

Good question... the null task type + handler were added in rEGRE52041f7b7895: Maintenance script for updating link recommendations with the purpose to provide a fake task type to select non-link-recommendation tasks. Then rEGRE8820def961c8: Make ORES topic search hack into a trait reworked some of this code into a trait. I admit I don't readily see why we need the null task type handling in listTaskCounts.php:

$nullTaskType = NullTaskTypeHandler::getNullTaskType( '_null' );
$this->replaceConfigurationLoader( $this->topicType === 'ores', [ $nullTaskType ] );

$growthServices = GrowthExperimentsServices::wrap( MediaWikiServices::getInstance() );

$allTaskTypes = array_keys( $growthServices->getNewcomerTasksConfigurationLoader()->getTaskTypes() );
$taskTypes = $this->getOption( 'tasktype', $allTaskTypes );
if ( array_diff( $taskTypes, $allTaskTypes ) ) {
	$this->fatalError( 'Invalid task types: ' . implode( ', ', array_diff( $taskTypes, $allTaskTypes ) ) );
}
$taskTypes = array_diff( $taskTypes, [ '_null' ] );

@Tgr could you clarify please? Removing the null task type and passing an empty array to $this->replaceConfigurationLoader() seems to work fine.

// FIXME: Integrate with GrowthExperimentsSuggestionsInfo service, need a more robust
//   implementation of OresTopicTrait functionality first.

With my changes so far I see that this null task type is now showing up when SpecialNewcomerTasksInfo.php is using the cached information generated via maintenance/listTaskCounts.php

Screen Shot 2021-09-02 at 3.03.00 PM.png (560×1 px, 110 KB)

Assuming we need the null task type in listTaskCounts, you'd probably need something like this in SuggestionsInfo::getInfo():

if ( $taskType === '_null' ) {
	continue;
}

The same code is in listTaskCounts::getStats()

The null task type is not needed (it's used in the refresh script to search for task candidates which are not tasks yet). The ability to use ORES topics instead of our custom ORES-based composite topics is needed, though.

The problem is OresTopicTrait is a bit of a hack that modifies the service container. That works in a maintenance script, not sure how well it would work on a special page. We might need to replace that with a nicer mechanism, such as letting TaskSuggesterFactory take a custom ConfigurationLoader as an optional parameter.

Urbanecm_WMF changed the task status from Open to In Progress.Jan 31 2022, 11:34 AM

Change 800291 had a related patch set uploaded (by MewOphaswongse; author: MewOphaswongse):

[mediawiki/extensions/GrowthExperiments@master] Use custom ConfigurationLoader instead of OresTopicTrait in listTaskCounts

https://gerrit.wikimedia.org/r/800291

Change 801386 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/GrowthExperiments@master] refreshLinkRecommendations: Use TopicDecorator and remove OresTopicTrait

https://gerrit.wikimedia.org/r/801386

Change 800291 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] Use custom ConfigurationLoader instead of OresTopicTrait in listTaskCounts

https://gerrit.wikimedia.org/r/800291

Change 801386 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] refreshLinkRecommendations: Use TopicDecorator and remove OresTopicTrait

https://gerrit.wikimedia.org/r/801386

Change 716586 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] Populate cache from listTaskCounts when querying growth topics

https://gerrit.wikimedia.org/r/716586