Page MenuHomePhabricator

Expectation (readQueryRows <=) 10000 by MediaWiki::main not met (actual: 20736): query-m: SELECT gemm_mentee_id AS `value` FROM `growthexperiments_mentor_mentee` WHERE gemm_mentor_id = N
Closed, ResolvedPublicPRODUCTION ERROR

Description

Error
normalized_message
Expectation (readQueryRows <=) 10000 by MediaWiki::main not met (actual: {actual}):
{query}
exception.trace
from /srv/mediawiki/php-1.38.0-wmf.3/includes/libs/rdbms/TransactionProfiler.php(444)
#0 /srv/mediawiki/php-1.38.0-wmf.3/includes/libs/rdbms/TransactionProfiler.php(272): Wikimedia\Rdbms\TransactionProfiler->reportExpectationViolated(string, Wikimedia\Rdbms\GeneralizedSql, integer)
#1 /srv/mediawiki/php-1.38.0-wmf.3/includes/libs/rdbms/database/Database.php(1516): Wikimedia\Rdbms\TransactionProfiler->recordQueryCompletion(Wikimedia\Rdbms\GeneralizedSql, double, boolean, integer)
#2 /srv/mediawiki/php-1.38.0-wmf.3/includes/libs/rdbms/database/Database.php(1398): Wikimedia\Rdbms\Database->executeQueryAttempt(string, string, boolean, string, integer)
#3 /srv/mediawiki/php-1.38.0-wmf.3/includes/libs/rdbms/database/Database.php(1323): Wikimedia\Rdbms\Database->executeQuery(string, string, integer)
#4 /srv/mediawiki/php-1.38.0-wmf.3/includes/libs/rdbms/database/Database.php(2012): Wikimedia\Rdbms\Database->query(string, string, integer)
#5 /srv/mediawiki/php-1.38.0-wmf.3/includes/libs/rdbms/database/Database.php(1874): Wikimedia\Rdbms\Database->select(string, array, array, string, array, array)
#6 /srv/mediawiki/php-1.38.0-wmf.3/includes/libs/rdbms/database/DBConnRef.php(68): Wikimedia\Rdbms\Database->selectFieldValues(string, string, array, string)
#7 /srv/mediawiki/php-1.38.0-wmf.3/includes/libs/rdbms/database/DBConnRef.php(317): Wikimedia\Rdbms\DBConnRef->__call(string, array)
#8 /srv/mediawiki/php-1.38.0-wmf.3/extensions/GrowthExperiments/includes/Mentorship/Store/DatabaseMentorStore.php(115): Wikimedia\Rdbms\DBConnRef->selectFieldValues(string, string, array, string)
#9 /srv/mediawiki/php-1.38.0-wmf.3/extensions/GrowthExperiments/includes/Mentorship/MentorFilterHooks.php(165): GrowthExperiments\Mentorship\Store\DatabaseMentorStore->getMenteesByMentor(User)
#10 /srv/mediawiki/php-1.38.0-wmf.3/extensions/GrowthExperiments/includes/Mentorship/MentorFilterHooks.php(128): GrowthExperiments\Mentorship\MentorFilterHooks->getUnstarredMenteeIds(User)
#11 /srv/mediawiki/php-1.38.0-wmf.3/includes/changes/ChangesListFilter.php(354): GrowthExperiments\Mentorship\MentorFilterHooks->GrowthExperiments\Mentorship\{closure}(EnhancedChangesList, RCCacheEntry)
#12 /srv/mediawiki/php-1.38.0-wmf.3/includes/changes/ChangesList.php(256): ChangesListFilter->applyCssClassIfNeeded(EnhancedChangesList, RCCacheEntry, array)
#13 /srv/mediawiki/php-1.38.0-wmf.3/includes/changes/ChangesList.php(229): ChangesList->getHTMLClassesForFilters(RCCacheEntry)
#14 /srv/mediawiki/php-1.38.0-wmf.3/includes/changes/EnhancedChangesList.php(618): ChangesList->getHTMLClasses(RCCacheEntry, boolean)
#15 /srv/mediawiki/php-1.38.0-wmf.3/includes/changes/EnhancedChangesList.php(814): EnhancedChangesList->recentChangesBlockLine(RCCacheEntry)
#16 /srv/mediawiki/php-1.38.0-wmf.3/includes/changes/EnhancedChangesList.php(100): EnhancedChangesList->recentChangesBlock()
#17 /srv/mediawiki/php-1.38.0-wmf.3/includes/specials/SpecialWatchlist.php(603): EnhancedChangesList->recentChangesLine(RecentChange, boolean, integer)
#18 /srv/mediawiki/php-1.38.0-wmf.3/includes/specialpage/ChangesListSpecialPage.php(1665): SpecialWatchlist->outputChangesList(Wikimedia\Rdbms\MysqliResultWrapper, FormOptions)
#19 /srv/mediawiki/php-1.38.0-wmf.3/includes/specialpage/ChangesListSpecialPage.php(660): ChangesListSpecialPage->webOutput(Wikimedia\Rdbms\MysqliResultWrapper, FormOptions)
#20 /srv/mediawiki/php-1.38.0-wmf.3/includes/specials/SpecialWatchlist.php(125): ChangesListSpecialPage->execute(NULL)
#21 /srv/mediawiki/php-1.38.0-wmf.3/includes/specialpage/SpecialPage.php(647): SpecialWatchlist->execute(NULL)
#22 /srv/mediawiki/php-1.38.0-wmf.3/includes/specialpage/SpecialPageFactory.php(1375): SpecialPage->run(NULL)
#23 /srv/mediawiki/php-1.38.0-wmf.3/includes/MediaWiki.php(314): MediaWiki\SpecialPage\SpecialPageFactory->executePath(string, RequestContext)
#24 /srv/mediawiki/php-1.38.0-wmf.3/includes/MediaWiki.php(925): MediaWiki->performRequest()
#25 /srv/mediawiki/php-1.38.0-wmf.3/includes/MediaWiki.php(559): MediaWiki->main()
#26 /srv/mediawiki/php-1.38.0-wmf.3/index.php(53): MediaWiki->run()
#27 /srv/mediawiki/php-1.38.0-wmf.3/index.php(46): wfIndexMain()
#28 /srv/mediawiki/w/index.php(3): require(string)
#29 {main}
Impact
Notes

Details

Request URL
https://ar.wikipedia.org/wiki/%D8%AE%D8%A7%D8%B5:%D9%82%D8%A7%D8%A6%D9%85%D8%A9_%D8%A7%D9%84%D9%85%D8%B1%D8%A7%D9%82%D8%A8%D8%A9?hidebots=*&hidepreviousrevisions=*&hidecategorization=*&hideWikibase=*&limit=*&days=*&enhanced=*&urlversion=*

Event Timeline

kostajh triaged this task as High priority.Oct 13 2021, 7:44 AM
kostajh created this task.

Unnormalized message (doesn’t look like it contains anything sensitive):

Expectation (readQueryRows <=) 10000 by MediaWiki::main not met (actual: 20736):
query-m:

SELECT gemm_mentee_id AS `value` FROM `growthexperiments_mentor_mentee` WHERE gemm_mentor_id = N

I guess there’s a user who’s entered as the mentor of 20k other users? (I tried looking into the table but only got errors that it doesn’t exist, so I must be missing something there.)

I tried looking into the table but only got errors that it doesn’t exist, so I must be missing something there.

It's on the external1 cluster.

Alright, thanks. There are indeed five mentors with >20k mentees and 7 more with >10k mentees.

This occurance of the issue is caused by @Tgr's RC filters to mentees, which (quite naturally) loads IDs of all the mentees to be able to filter by them.

I'm not sure how to fix this though. The "simplest" thing would be to suppress the warning (not sure how ATM), it doesn't look like a disruptive thing. Alternatively, I can rewrite DatabaseMentorStore to batch the queries to not exceed the 10k treshold.

Any suggestions?

Alternatively, I can rewrite DatabaseMentorStore to batch the queries to not exceed the 10k treshold.

That seems like a better solution, assuming 10k limit exists a constant somewhere that we could reference (I don't see it at first glance). We could sort those records by creation date so we are more likely to get user records that the mentor cares to see.

It also seems like the mentorship feature might need more thought around active/inactive mentees; conceptually I understand why the mentor has > 10,000 mentees, but in practice they would never be able to engage with so many people, so there might need to be an abstraction that allows us to getActiveMenteesByMentor that we could use here, and that function would get users who are making edits or have done some action in the last X days.

I suspect if reading more than 10K rows in a single query is a problem, doing it in multiple queries is going to be a problem as well in a user-facing request. I don't think batching has a performance advantage over a single query as long as it's a non-locking read. I guess this is one use case where having the mentor/mentee map in the main database cluster would have been convenient.

One possible approach would be to delete mentor-mentee relationships for mentees who don't meet the dashboard criteria (IIRC, more than one year old or more than two weeks old with no activity whatsoever) and rely on mentors being auto-assigned dynamically if needed?

Hello @Krinkle, we were discussing this issue internally, and we're not sure why there's the 10k limit. Does reading a lot of rows for recent changes filtering cause some issues? Or is it just a warning indicating "you're likely doing something incorrectly here"? If the latter, maybe it's wise to suppress the warning rather than to make any changes.

There's no cap on how many mentees a mentor can have, and it will grow more or less linearly over time, so regardless of what's the right limit, we *are* doing something incorrectly here.

One option, at least in the short term, would be to simply error out if the user has more than 10K mentees. The RCFilter interface doesn't have a way of returning errors from filters, but it could simply return an empty result set, and maybe show a notice about it on the mentor dashboard.

Urbanecm_WMF lowered the priority of this task from High to Medium.Nov 5 2021, 6:58 PM

Decreasing priority to Medium. This currently affects only arwiki, and AFAICS doesn't cause any regressions.

(I added a filter in Logstash to hide this on the Growth team dashboard, fyi.)

Change 739592 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/extensions/GrowthExperiments@master] MentorFiltersHooks: Make it possible to disable unstarred filters

https://gerrit.wikimedia.org/r/739592

Change 739592 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] MentorFiltersHooks: Make it possible to disable unstarred filters

https://gerrit.wikimedia.org/r/739592

Change 740525 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[operations/mediawiki-config@master] Growth: Disable filtering by unstarred mentees at enwiki, fawiki

https://gerrit.wikimedia.org/r/740525

Change 740525 merged by jenkins-bot:

[operations/mediawiki-config@master] Growth: Disable filtering by unstarred mentees at enwiki, fawiki

https://gerrit.wikimedia.org/r/740525

Mentioned in SAL (#wikimedia-operations) [2021-11-22T09:15:54Z] <urbanecm@deploy1002> Synchronized wmf-config/InitialiseSettings.php: 24b3a7769ca97e3ed951d77d911f41afae5e4136: Growth: Disable filtering by unstarred mentees at arwiki, enwiki, fawiki (T293182) (duration: 01m 04s)

I ran https://people.wikimedia.org/~urbanecm/growth-team/mentor-dashboard/M1-mentees-per-mentor.html to see how many mentees per mentor do our wikis have. It turns out that only three wikis (commit message mentions only two, sorry for that!) have at least one mentor with more than 10k mentees assigned: enwiki, fawiki and arwiki.

I uploaded patches that will disable filtering by unstarred mentees in recent changes for those three wikis. Unless we decide to backport https://gerrit.wikimedia.org/r/c/mediawiki/extensions/GrowthExperiments/+/739592/, it will take effect together with wmf.11.

Maybe worth using a more restrictive limit (like 5K or 7.5K) so we don't have to worry about another wiki reaching the 10K limit soon.

Maybe worth using a more restrictive limit (like 5K or 7.5K) so we don't have to worry about another wiki reaching the 10K limit soon.

I don't think that's needed. The numbers shown at https://people.wikimedia.org/~urbanecm/growth-team/mentor-dashboard/M1-mentees-per-mentor.html are accumulated since we deployed mentorship to those wikis for the first time. For viwiki, fawiki and arwiki, that's quite some time. I hope we'll be able to do T295075 soon (within next couple of months), which should remove the need for any limit. If this recurrs, it should be easy to notice and easy to workaround thanks to the patches that are already uploaded&merged

Urbanecm_WMF added a subscriber: Etonkovidova.

@Etonkovidova I'm not sure if moving to QA at this point is a good move, but...this will be fixed as of wmf.11 in production. I can backport it to wmf.9 too if needed.

@Etonkovidova I'm not sure if moving to QA at this point is a good move, but...this will be fixed as of wmf.11 in production. I can backport it to wmf.9 too if needed.

Checked - there are no errors on wmf.12, and since it was WARNING level, it was ok to wait for wmf.12.