Performance review for the MachineVision extension
Closed, Resolved · Public

Description

Extension source: https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/MachineVision

Description of extension

The MachineVision extension is being developed to support on-wiki usage of AI-generated image metadata. Specifically, it handles:

  • Requesting AI-generated image metadata from machine vision providers
  • Providing storage for AI-generated image metadata
  • Serving AI-generated data to users for verification (and promotion to Structured Data on Commons) and recording verification results

Though the expectation is that we will be working with third-party providers in the near term, it is designed to work with any number of internal or external machine vision providers.

The initial use case for the MachineVision extension is computer-aided tagging. Label suggestions will be requested from a third-party provider for a pool of existing, high-quality Commons images, the provided labels will be translated into Wikidata IDs, and users will have the opportunity to confirm or reject the suggested labels. Suggested labels will also be requested on upload as images are uploaded, and the uploaders will be notified of the opportunity to review suggested labels. SDC depicts statements will be added to images for approved labels.

Target date for enabling in production: week of October 28, 2019

Basic checklist:

  • Brief description of what it does as perceived by end-users (e.g. display X, offer X to use when doing Z)
  • Analysis of backend system performance and ensuring metrics/monitoring/grafana is in place. Assigned to: @aaron.
  • Analysis of perceived performance by end-users (e.g. load/save time, general responsiveness). Assigned to: @Krinkle.
  • Analysis of impact (if any) on site-wide metrics (startup bundle size, Wikipedia page load time). Assigned to: TBD.

Details

Related Gerrit Patches:
[mediawiki/extensions/MachineVision@master] Various performance review tweaks and comments
[operations/mediawiki-config@master] MachineVision: Update Beta settings to (mostly) match production
[mediawiki/extensions/MachineVision@master] resources: Scope RL package module to more specific directory
[mediawiki/extensions/MachineVision@master] docs: Avoid class name strings in PHP, spelling of MediaWiki
[mediawiki/extensions/MachineVision@master] Avoid deprecated wfWikiID()
[mediawiki/extensions/MachineVision@master] Performance: Execute labeling requests in a Job, not a DeferredUpdate
[mediawiki/extensions/MachineVision@master] Avoid generic 'moduleID' global variable
[mediawiki/extensions/MachineVision@master] build: Misc clean ups for new repo

Event Timeline

Jhernandez added a subscriber: Jhernandez.

Hey @Mholloway, can you tag the appropriate teams and add a description with links and what we need, with some approximate wanted date of deployment?

Mholloway updated the task description. Sep 10 2019, 1:09 AM

@Jhernandez I don't think there's anything to link to; this is just to give the Performance team a chance to look over the extension before initial deployment. I know @egardner wanted a chance to update some of the frontend JS code before that happens. I think we'll be ready to ping Performance by late this week or the beginning of next week.

Mholloway triaged this task as High priority. Oct 9 2019, 3:44 PM
Mholloway removed Mholloway as the assignee of this task. Oct 21 2019, 5:12 PM
Mholloway added a project: Performance-Team.

Hello Performance-Team, we'd like to request a review of the new MachineVision extension (https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/MachineVision) prior to enabling the extension in production. We hope to do that next week.

Sorry for the late notice relative to our intended deploy target; we've been waiting for development on the extension to get as near as possible to completion. Please let me know if that timeline is at all feasible. There are still some patches in review, but I think all of the performance-critical pieces are in place.

As far as I know, there's no specific process for this. Please let me know if there's a request template that I should be using or any additional info that you need.

Thank you!

Mholloway updated the task description. Oct 21 2019, 5:14 PM

Pinging @aaron about this since you worked on it last time (and it looks like Imarlier is the only current/former Performance person getting an email notification about it).

("last time" === the WikimediaEditorTasks performance review)

Gilles added a subscriber: Gilles. Edited Oct 22 2019, 2:03 PM

Performance reviews need to be requested before the quarter starts, so we can plan capacity for them and add them to our own OKRs. Sending an email to performance-team@ or filing a task like this one while tagging Performance-Team is enough to get that conversation started.

Also, it's in your best interest to involve us sooner than when everything is fully written. The later performance issues are identified, the more likely they are to block or significantly delay deployment. It seems like you were already working on this last quarter and we could have done a first pass then (again, requested before the beginning of that quarter), increasing the likelihood that the final review before deployment is a formality that is unlikely to reveal big problems.

Thanks for the response, @Gilles. Sorry I didn't request this earlier; it seems I badly miscalculated when it would be appropriate to involve your team. My assumption was that it would not be worth your time to look at code that was under very active development. (I searched the wikis for some guidance on when and how to request a performance review but didn't find any; that would help a lot.) Where does this leave us, then?

I will take a first look this week, but if it turns out to be time-consuming I will have to pass the torch to someone else, since I'm OOO next week.

Gilles claimed this task. Oct 22 2019, 2:26 PM

Thank you, @Gilles. The extension README should be up to date, but please let me know if you run into any setup issues.

FYI the Vagrant role for this extension fails to provision for me:

==> default: Warning: Unknown variable: '::mediawiki::wiki_name'. at /vagrant/puppet/modules/role/manifests/machinevision.pp:16:20
==> default: Error: Evaluation Error: Error while evaluating a Resource Statement, Evaluation Error: Error while evaluating a Function Call, 'regsubst' parameter 'target' expects a value of type Array or String, got Undef at /vagrant/puppet/modules/apache/manifests/site_conf.pp:37:19  at /vagrant/puppet/modules/role/manifests/machinevision.pp:15 on node vagrant.mediawiki-vagrant.dev
The SSH command responded with a non-zero exit status. Vagrant
assumes that this means the command failed. The output for this command
should be in the log above. Please read the output to determine what
went wrong.

Sorry about that. I'm testing a fix for the role now.

Change 545305 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/extensions/MachineVision@master] build: Misc clean ups for new repo

https://gerrit.wikimedia.org/r/545305

Change 545307 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/extensions/MachineVision@master] Avoid generic 'moduleID' global variable

https://gerrit.wikimedia.org/r/545307

Two minor patches. Not a blocker by any means. Ignore me :)

It seems like the Vagrant role is very partial in what it does, compared to all the instructions on README.md. Could you provide a fully set up Docker or Vagrant image to save some time?

Change 545305 merged by jenkins-bot:
[mediawiki/extensions/MachineVision@master] build: Misc clean ups for new repo

https://gerrit.wikimedia.org/r/545305

Change 545307 merged by jenkins-bot:
[mediawiki/extensions/MachineVision@master] Avoid generic 'moduleID' global variable

https://gerrit.wikimedia.org/r/545307

@Gilles After (a lot of) experimenting to find the most convenient way to set up all of the required dependencies and get the most usable result, I ended up creating a Docker-based environment for testing. Please clone https://github.com/mdholloway/machinevision-performance-review and follow the setup instructions at the top of the README file. (The setup script requires access to a separate private repo, to which I've added you as a collaborator.) Hopefully you will have better luck with that than with the Vagrant role.

Gilles removed Gilles as the assignee of this task. Oct 23 2019, 9:50 AM

Thanks, I will look for another volunteer on the perf team today as things have come up for the rest of the week and I don't have enough time left to do this properly.

I tried following the instructions and ./create ended with some errors:

PHP Fatal error:  Uncaught Exception: Unable to open file /var/www/mediawiki/skins/Vector/skin.json: filemtime(): stat failed for /var/www/mediawiki/skins/Vector/skin.json in /var/www/mediawiki/includes/registration/ExtensionRegistry.php:136
Stack trace:
#0 /var/www/mediawiki/includes/GlobalFunctions.php(89): ExtensionRegistry->queue('/var/www/mediaw...')
#1 /var/www/mediawiki/LocalSettings.php(4): wfLoadSkin('Vector')
#2 /var/www/mediawiki/includes/Setup.php(122): require_once('/var/www/mediaw...')
#3 /var/www/mediawiki/maintenance/doMaintenance.php(83): require_once('/var/www/mediaw...')
#4 /var/www/mediawiki/maintenance/update.php(278): require_once('/var/www/mediaw...')
#5 {main}
  thrown in /var/www/mediawiki/includes/registration/ExtensionRegistry.php on line 136

Fatal error: Uncaught Exception: Unable to open file /var/www/mediawiki/skins/Vector/skin.json: filemtime(): stat failed for /var/www/mediawiki/skins/Vector/skin.json in /var/www/mediawiki/includes/registration/ExtensionRegistry.php on line 136

Exception: Unable to open file /var/www/mediawiki/skins/Vector/skin.json: filemtime(): stat failed for /var/www/mediawiki/skins/Vector/skin.json in /var/www/mediawiki/includes/registration/ExtensionRegistry.php on line 136

Call Stack:
    0.2020     431848   1. {main}() /var/www/mediawiki/maintenance/update.php:0
    0.2090     712552   2. require_once('/var/www/mediawiki/maintenance/doMaintenance.php') /var/www/mediawiki/maintenance/update.php:278
    0.2112     872056   3. require_once('/var/www/mediawiki/includes/Setup.php') /var/www/mediawiki/maintenance/doMaintenance.php:83
    0.2790    4211744   4. require_once('/var/www/mediawiki/LocalSettings.php') /var/www/mediawiki/includes/Setup.php:122
    0.2832    4216312   5. wfLoadSkin() /var/www/mediawiki/LocalSettings.php:4
    0.2847    4279248   6. ExtensionRegistry->queue() /var/www/mediawiki/includes/GlobalFunctions.php:89

As a result the MediaWiki install is broken when I try to access it.

All right. Thanks for taking the time to look again, @Gilles. I've tweaked the setup script so that whoever looks next is less likely to hit a permissions issue.

Whoever picks this up, please ping me for access to the required private repo.

Gilles assigned this task to aaron. Oct 23 2019, 7:14 PM
Gilles moved this task from Inbox to Doing on the Performance-Team board. Oct 23 2019, 7:24 PM
Krinkle updated the task description. Oct 23 2019, 7:40 PM
Mholloway updated the task description. Oct 23 2019, 7:57 PM
Mholloway updated the task description.
Mholloway updated the task description. Oct 23 2019, 8:00 PM
Mholloway updated the task description. Oct 23 2019, 8:03 PM

Change 546141 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/extensions/MachineVision@master] [DNM] Various small performance review tweaks

https://gerrit.wikimedia.org/r/546141

aaron added a comment. Oct 30 2019, 5:22 PM

Aside from the things mentioned in the above patch, the overall code looks OK to me.

Thanks very much for the review, @aaron. I'll address the feedback tomorrow.

Change 547688 had a related patch set uploaded (by Mholloway; owner: Michael Holloway):
[mediawiki/extensions/MachineVision@master] [WIP] Performance: Execute labeling requests in a Job, not a DeferredUpdate

https://gerrit.wikimedia.org/r/547688

Change 546141 merged by jenkins-bot:
[mediawiki/extensions/MachineVision@master] Various performance review tweaks and comments

https://gerrit.wikimedia.org/r/546141

Change 547688 merged by jenkins-bot:
[mediawiki/extensions/MachineVision@master] Performance: Execute labeling requests in a Job, not a DeferredUpdate

https://gerrit.wikimedia.org/r/547688

The last remaining FIXME was resolved by https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/MachineVision/+/548583/. Thanks again @aaron for the review!

Mholloway closed this task as Resolved. Nov 5 2019, 2:42 PM
Krinkle updated the task description. Nov 5 2019, 11:17 PM
Krinkle updated the task description.

Change 548924 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/extensions/MachineVision@master] docs: Avoid class name strings in PHP, spelling of MediaWiki

https://gerrit.wikimedia.org/r/548924

Change 548914 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/extensions/MachineVision@master] Avoid deprecated wfWikiID()

https://gerrit.wikimedia.org/r/548914

Change 548926 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/extensions/MachineVision@master] resources: Scope RL package module to more specific directory

https://gerrit.wikimedia.org/r/548926

Change 548914 merged by jenkins-bot:
[mediawiki/extensions/MachineVision@master] Avoid deprecated wfWikiID()

https://gerrit.wikimedia.org/r/548914

Krinkle reopened this task as Open. Nov 8 2019, 9:55 PM

Whoever picks this up, please ping me for access to the required private repo.

I thought maybe the ticket was missing some updates that were communicated elsewhere, but from what I understand our team was unable to install the extension. The review thus far covered the backend and was based on static review of the code. The next step is to figure out a minimal way to install (part of) it for frontend review.

What kind of operational metrics are in place currently?

Change 548924 merged by jenkins-bot:
[mediawiki/extensions/MachineVision@master] docs: Avoid class name strings in PHP, spelling of MediaWiki

https://gerrit.wikimedia.org/r/548924

Mholloway added a comment. Edited Nov 13 2019, 3:01 PM

Sorry for the confusion. The extension should not be difficult to install, per se, but there is a certain amount of work involved (as described in the README) in setting up its core functionality. Do you need to assess the entire label fetching and presentation flow, or do you just need to get the extension installed in order to, e.g., assess the effect of the additional JS modules on page load times?

From a user-facing perspective this is essentially just a Special page. It has no special operational metrics in place, but we'll be happy to add any you suggest.

aaron added a comment. Nov 13 2019, 4:55 PM

Right, I only did backend CR. Are you doing the front-end review?

Change 548926 merged by jenkins-bot:
[mediawiki/extensions/MachineVision@master] resources: Scope RL package module to more specific directory

https://gerrit.wikimedia.org/r/548926

Krinkle claimed this task. Nov 26 2019, 7:45 PM

I guess :)

Krinkle updated the task description. Nov 26 2019, 7:45 PM

Following up from the retrospective, my understanding is that the ideal is to use the Beta Cluster as the review environment. The MachineVision BC configuration is currently quite different from production but I'll work on updating it to match production now.

Change 554630 had a related patch set uploaded (by Mholloway; owner: Michael Holloway):
[operations/mediawiki-config@master] MachineVision: Update Beta settings to (mostly) match production

https://gerrit.wikimedia.org/r/554630

Change 554630 merged by jenkins-bot:
[operations/mediawiki-config@master] MachineVision: Update Beta settings to (mostly) match production

https://gerrit.wikimedia.org/r/554630

OK: https://commons.wikimedia.beta.wmflabs.org/wiki/Special:SuggestedTags and uploaded image labeling on Beta are now fully functional. Please let me know if there's anything else I can do to facilitate the remaining review.

AnneT added a subscriber: AnneT. Dec 5 2019, 9:21 PM
Gilles added a comment. Edited Dec 9 2019, 11:39 AM

I've done a minimal sanity check of the frontend payload and see that the newly introduced RL module is correctly loaded only on the special page, making this very low-risk in terms of site-wide frontend performance regression. The feature itself on the special page may have room for improvement, which I'll let @Krinkle look into, but this can roll out as-is.
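For anyone repeating this check, a rough sketch of how it could be done from the browser console follows. It assumes the extension's ResourceLoader modules share a common name prefix (the `ext.MachineVision` prefix used in the comment is a guess, not taken from this task) and relies only on `mw.loader.getModuleNames()` and `mw.loader.getState()`. The loader is passed in as a parameter so the function can also be exercised outside a wiki page.

```javascript
/**
 * List modules with a given name prefix that are loading or loaded
 * on the current page.
 *
 * @param {Object} loader Object exposing getModuleNames() and getState(),
 *  for example mw.loader on a MediaWiki page
 * @param {string} prefix Module name prefix (hypothetical here)
 * @return {string[]} Names of matching modules that are in use
 */
function activeModules( loader, prefix ) {
	// States meaning the module was requested on this page, as opposed
	// to merely being registered in the startup manifest.
	const active = [ 'loading', 'loaded', 'executing', 'ready' ];
	return loader.getModuleNames()
		.filter( ( name ) => name.startsWith( prefix ) )
		.filter( ( name ) => active.includes( loader.getState( name ) ) );
}

// In the browser console (prefix is an assumption):
// activeModules( mw.loader, 'ext.MachineVision' );
```

On an ordinary article page the call would be expected to return an empty array, and a non-empty one on the special page, if the module is scoped as described.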

I was a bit surprised to see that the Special page needs JS to work at all, but I don't know what the current policy about this is for a Special page. It would be nice to have at least some fallback text there for grade C browsers; currently the experience without JS is just a blank page.

Thank you, @Gilles. I'll make sure that some no-JS fallback text lands in wmf.10.

Krinkle closed this task as Resolved. Fri, Jan 24, 11:36 PM

This is now completed. I found no notable impact on other parts of the frontend or backend. The feature itself doesn't yet have a size or time budget, as far as I know, so its loading process was not audited in that way. I did notice an easy win during the load cycle, which I filed as T242667.

I would recommend at least keeping track of the size in CI, and keeping track of real-user load time with JavaScript instrumentation. For example, use 1:100 random sampling (exporting the config via a package file) and then a one-line Statsv call in JS to record the measurement. This can then be visualised in Grafana and generate alert notifications (e.g. to your team e-mail) if it regresses.
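A minimal sketch of the sampling idea, not the extension's actual code: the tracking function is passed in (on a wiki it would be `mw.track`), and the sketch assumes the usual convention that topics prefixed with `timing.` are forwarded to Statsv. The metric name and the `samplingRate` config key in the usage comment are hypothetical.

```javascript
/**
 * Decide whether this page view falls into the sample.
 *
 * @param {number} oneIn Sample one in this many page views (e.g. 100)
 * @param {Function} [random] Returns a number in [0, 1); injectable for tests
 * @return {boolean}
 */
function isInSample( oneIn, random = Math.random ) {
	return oneIn > 0 && random() < 1 / oneIn;
}

/**
 * Record a load-time measurement for sampled page views only.
 *
 * @param {Function} track Function taking (topic, value), e.g. mw.track
 * @param {string} metric Metric name without the "timing." prefix
 * @param {number} durationMs Measured duration in milliseconds
 * @param {number} oneIn Sampling rate, as in isInSample()
 * @param {Function} [random] Injectable source of randomness
 * @return {boolean} Whether a measurement was sent
 */
function recordLoadTime( track, metric, durationMs, oneIn, random = Math.random ) {
	if ( !isInSample( oneIn, random ) ) {
		return false;
	}
	// The "one-line" call: the topic prefix selects the metric type.
	track( 'timing.' + metric, Math.round( durationMs ) );
	return true;
}

// Possible usage on the special page (names are assumptions):
// const config = require( './config.json' ); // exported via a package file
// recordLoadTime( mw.track, 'MediaWiki.MachineVision.load',
//     performance.now() - start, config.samplingRate );
```

Keeping the rate in exported config rather than hard-coding it would let the sample size be tuned without a frontend code change.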

Restricted Application added a project: Structured-Data-Backlog. Fri, Jan 24, 11:36 PM