- Improve hook coverage so that metadata is generated and saved on POST requests for all circumstances where items are added to the PageTriage queue
- Allow graceful failure in the UI if metadata is missing
- Add fallback mechanism to generate and save metadata via the job queue
Scenarios to handle
As far as I can see, the following scenarios need to be addressed as part of this task/patchset:
1. New article is created (POST), user is redirected to view the article (GET) and metadata is not in replica
This scenario was reported in T154719: PageTriage opens master connection on GET for ArticleMetadata cache misses. PageTriageHooks::isArticleNew() is called to determine whether the noindex,nofollow robots policy should be set. isArticleNew() is a bit of a misnomer, as is the string creation_date, because both refer to when the article was added to the PageTriage queue, not when it was created.
In any case, in this scenario the user has POSTed an article, so its metadata is still being compiled via a deferred update. The user is then redirected to view the page, and PageTriageHooks::isArticleNew() is called before the metadata has reached the replica.
This request will not return any metadata for the new article. That's OK because it's not needed for the page view following page creation. The compiled metadata will be available in the replica shortly after this request.
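A minimal sketch of that behaviour, modelled in JavaScript for illustration only (the real check lives in the PHP hook). The function name, the `Map` standing in for the replica, and the `queuedAt` field are all hypothetical; the point is only that a miss on a GET is reported as "unknown" rather than triggering a read from the master.

```javascript
// Hypothetical model of a replica-only lookup; none of these names are PageTriage APIs.
// `replica` is a Map of pageId -> { queuedAt: Date }, standing in for the replica DB.
function isArticleNewFromReplica(replica, pageId, maxAgeDays, now) {
  const row = replica.get(pageId);
  if (row === undefined) {
    // Metadata has not replicated yet: fail gracefully instead of
    // falling through to the master database on a GET request.
    return { known: false, isNew: false };
  }
  const msPerDay = 24 * 60 * 60 * 1000;
  const ageMs = now.getTime() - row.queuedAt.getTime();
  // "New" here means recently added to the queue, not recently created.
  return { known: true, isNew: ageMs <= maxAgeDays * msPerDay };
}
```

Returning `known: false` lets the caller skip the robots policy for this one request without treating the miss as an error.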
The only problem I can see is that the copy of the page cached in Varnish would lack the <meta name="robots" content="noindex,nofollow"/> tag, and I'm not sure how the Varnish entry for this page gets invalidated so that the robots policy can be set.
2. Move sandbox page to main namespace, then edit sandbox page to remove redirect
The page that held the redirect to the main namespace is in the PageTriage queue, so it is returned by ApiPageTriageStats via PageTriageUtil::getArticleFilterStat() and, most importantly, by ApiPageTriageList::execute() via $pages = self::getPageIds( $opts );. getPageIds() checks only whether a page is in the queue, not whether it has metadata. As a result, ApiPageTriageList might return 15 results instead of 20, and the UI then concludes that there are no more articles to load.
- Additional coverage in the hooks implementation in the patchset for this task should ensure that metadata is compiled for the page.
- ApiPageTriageList has been updated to add a warning to the response listing the pages that don't have metadata; the JS has been updated to look for this warning and still allow loading more pages.
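
The client-side half of this fix can be sketched roughly as follows. The response shape (`pages`, `warnings.pagesWithoutMetadata`) and the function name are assumptions made for illustration, not the actual ApiPageTriageList output or PageTriage JS.

```javascript
// Hypothetical response shape:
//   { pages: [...], warnings: { pagesWithoutMetadata: [pageId, ...] } }
function canLoadMore(response, requestedLimit) {
  const returned = response.pages.length;
  const skipped =
    (response.warnings && response.warnings.pagesWithoutMetadata) || [];
  // A short result only means "end of queue" if no pages were
  // skipped because their metadata was missing.
  return returned + skipped.length >= requestedLimit;
}
```

With a limit of 20, a response carrying 15 pages and a warning naming 5 skipped pages still allows another load, whereas 15 pages with no warning is treated as the end of the queue.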
3. Rollback an edit
This one is documented in T202735: Prevent article metadata compilation on rollback actions
The handling mirrors what the patchset introduces for getMetadata() when the metadata can't be returned from the replica. In a POST request, wait for the replica, compile the metadata, then save it to the DB immediately. In a GET request (as is the case with a rollback action), don't do any compilation; just queue a job to compile the metadata later on.
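
The branching described above might look like the sketch below, again modelled in JavaScript rather than the extension's PHP. Every name here (`handleMissingMetadata` and the `deps` callbacks) is hypothetical and stands in for whatever the patchset actually calls.

```javascript
// Hypothetical model of the fallback; `deps` bundles stand-ins for
// replication waiting, compilation, persistence and the job queue.
function handleMissingMetadata(request, pageId, deps) {
  if (request.method === 'POST') {
    // Writes are acceptable on POST: wait for the replica, compile, save now.
    deps.waitForReplication();
    const metadata = deps.compile(pageId);
    deps.save(pageId, metadata);
    return metadata;
  }
  // On GET (e.g. a rollback action) do no compilation; defer it to a job.
  deps.queueJob(pageId);
  return null;
}
```

Callers on the GET path receive `null` and are expected to degrade gracefully until the queued job has run.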