
Implement features that allow for tracking and measuring machine-aided depicts usage/activity
Open, Normal, Public

Description

We have this:
Machine-aided depicts (MAD) is designed to happen on a special page on Commons OR via external clients that utilize the tag suggestion and voting features via API. Currently, there are no solid instrumentation plans for this feature.

We want this:
On top of what we have, implement whatever additional setup is needed for instrumentation (tags to identify edits from this tool, etc.).

Acceptance Criteria:

  • An edit tag system that accurately identifies confirmed tags from this tool (for example in recent changes), and is easy to measure
  • Instrumentation capability that allows us to measure how often depicts tag additions from this tool are reverted
  • Instrumentation capability that allows cross-referencing the tag confidence scores and the frequency of reversion
  • Instrumentation capability that allows cross-referencing the tag confidence scores and the frequency of user confirmation/rejection
  • Instrumentation showing how often/which files are skipped

Details

Related Gerrit Patches:
mediawiki/extensions/MachineVision : master · Tag reverted computer-aided tagging revisions
mediawiki/extensions/MachineVision : master · Add depicts statement when a MAD suggestion is accepted

Event Timeline

Restricted Application added a subscriber: Aklapper. · Aug 22 2019, 5:25 PM

I can add a machine-aided depicts tag (or similar) to every revision adding a depicts statement, as a start.

That will be a great first start, thanks! Paging @mpopov for additional input/ideas :)

Ramsey-WMF renamed this task from [Stub] Implement features that allow for tracking and measuring machine-aided depicts usage/activity to Implement features that allow for tracking and measuring machine-aided depicts usage/activity. · Aug 31 2019, 12:09 AM
Ramsey-WMF updated the task description.
Mholloway triaged this task as Normal priority. · Aug 31 2019, 5:28 PM

This doesn't need to hold up technology reviews, but we will want it in place before launch.

Change 533978 had a related patch set uploaded (by Mholloway; owner: Michael Holloway):
[mediawiki/extensions/MachineVision@master] Add depicts statement when a MAD suggestion is accepted

https://gerrit.wikimedia.org/r/533978

Change 533978 merged by jenkins-bot:
[mediawiki/extensions/MachineVision@master] Add depicts statement when a MAD suggestion is accepted

https://gerrit.wikimedia.org/r/533978

mpopov added a comment. · Edited · Sep 10 2019, 9:13 PM

Thank you, Michael! The great news is that an edit tag gets us the second criterion too since it's easy to query edits and their revert status from the mediawiki history monthly snapshots in our data lake. (Calculating revert rate on a daily basis manually is possible but much more difficult.)

For "cross-referencing the tag confidence scores" I'm curious if those are currently stored anywhere. Or is the task to figure out that storage?

For client-side analytics instrumentation, Jason and I are working on a set of cross-platform libraries for standardized and unified product analytics that would simplify this, BUT we won't have production-quality stuff until end of Q2 or Q3. Until then, MachineAidedDepictsUsage is a potential schema we can use and instrument for. It's a relatively simple impression/click-style design that would track:

  • name of service used
  • the confidence score
  • how long the suggestion took

for action = "receive" events. What the user does with the received suggestions is tracked in follow-up "confirm"/"reject" events which don't need to include that information.
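As a rough illustration of the event design described above, here is a minimal sketch of how "receive" events could carry the service name, confidence score, and wait time while "confirm"/"reject" events omit them. The field names (`service`, `confidence`, `wait_time_ms`) and the `build_event` helper are hypothetical and are not taken from the actual MachineAidedDepictsUsage schema; only `action`, `page_id`, and `tag_id` are mentioned in this discussion.

```python
# Hypothetical sketch of the impression/click-style event payloads.
# Field names other than action, page_id and tag_id are assumptions.
VALID_ACTIONS = {"receive", "confirm", "reject"}


def build_event(action, page_id, tag_id, service=None, confidence=None, wait_time_ms=None):
    """Return a dict payload for one event.

    "receive" events must include the suggestion details; follow-up
    "confirm"/"reject" events carry only the identifiers.
    """
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")

    event = {"action": action, "page_id": page_id, "tag_id": tag_id}

    if action == "receive":
        if service is None or confidence is None or wait_time_ms is None:
            raise ValueError("receive events need service, confidence and wait_time_ms")
        event.update(
            {"service": service, "confidence": confidence, "wait_time_ms": wait_time_ms}
        )
    # For confirm/reject, any suggestion details passed in are deliberately dropped.
    return event


received = build_event("receive", 42, "Q1390", service="google", confidence=0.91, wait_time_ms=350)
confirmed = build_event("confirm", 42, "Q1390", service="google")
```

Because the follow-up events share `page_id` and `tag_id` with the "receive" event, the confidence score can be joined back in at analysis time instead of being logged twice.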

I'm starting vacation tomorrow so in the meantime I'd love to get a review of that design from someone on Product Analytics team and also for @Mholloway to let me know whether the tag_id and page_id make sense/are doable. I'm not sure how much logging is going to happen on the backend that's going to act as the mediator between MediaWiki and external services like Google Vision API, so tag_id might have to be replaced with a randomly generated identifier. As for page_id, that's more of a performance concern because it's easier to send an integer than a filename string, but let me know if that would be problematic.

@Ramsey-WMF: Once the schema design has been given a thumbs up, who would be instrumenting it? I don't know that I'm able to work on this in an engineering capacity.

@mpopov Thanks for drafting a schema! Some thoughts:

  • page_id should be fine.
  • tag_id — it seems like you have in mind here either a row ID or some other arbitrary unique identifier, but since page_id is required, maybe we could just use the Wikidata ID string here? (Wikidata ID + image SHA1 digest uniquely identify a label suggestion in the underlying table.)
  • wait_time — is this the time elapsed between submitting the labeling request(s) and receiving a response, or the time between the user opening Special:SuggestedTags and the tags being loaded from the DB and presented? The former may or may not be interesting if we ultimately use Google as our MV provider, since we'll be planning to use batched, asynchronous labeling requests.
  • We'll also be storing voting results in the DB, as well as per-provider confidence scores (see here). Perhaps it's redundant to log these in EL? Actually, now that I look at the needs in the description more closely, reversion is the only piece that isn't accounted for in the DB schema.

As for your last question, I'm guessing the implementation will fall either to Product Infra or to the Structured Data PHP devs.

Taking this off kanban until the final schema is confirmed.

mpopov added a comment. · Oct 2 2019, 2:58 PM

@Mholloway can I please take a look at some db tables that get created with Extension:MachineVision and some existing data in them?

Hey @mpopov,

There are (nearly-up-to-date) table schemas on wiki at https://www.mediawiki.org/wiki/Extension:MachineVision/Schema.

Here are some examples of them with data populated, from my local MW-Vagrant environment:

  • machine_vision_label
+--------+---------------------------------+-----------------+------------+-----------------+--------------------+-----------------+-------------------+
| mvl_id | mvl_image_sha1                  | mvl_wikidata_id | mvl_review | mvl_uploader_id | mvl_suggested_time | mvl_reviewer_id | mvl_reviewed_time |
+--------+---------------------------------+-----------------+------------+-----------------+--------------------+-----------------+-------------------+
|    118 | 9w2x3j4ul5tz8hsthz1bno1bdz5haqu | Q1390           |         -1 |               1 | 15690146515989     |               1 | 15690146599604    |
|    127 | 9w2x3j4ul5tz8hsthz1bno1bdz5haqu | Q43806          |         -1 |               1 | 15690146515989     |               1 | 15690146599691    |
|    136 | 9w2x3j4ul5tz8hsthz1bno1bdz5haqu | Q156449         |         -1 |               1 | 15690146515989     |               1 | 15690146599614    |
|    145 | 9w2x3j4ul5tz8hsthz1bno1bdz5haqu | Q938020         |          1 |               1 | 15690146515989     |               1 | 15690146599025    |
|    154 | 9w2x3j4ul5tz8hsthz1bno1bdz5haqu | Q1745802        |         -1 |               1 | 15690146515989     |               1 | 15690146599633    |
|    172 | 9w2x3j4ul5tz8hsthz1bno1bdz5haqu | Q2707760        |         -1 |               1 | 15690146515989     |               1 | 15690146599644    |
|    181 | 9w2x3j4ul5tz8hsthz1bno1bdz5haqu | Q11946202       |          1 |               1 | 15690146515989     |               1 | 15690146598602    |
|    190 | 9w2x3j4ul5tz8hsthz1bno1bdz5haqu | Q1141466        |          1 |               1 | 15690146515989     |               1 | 15690146597438    |
+--------+---------------------------------+-----------------+------------+-----------------+--------------------+-----------------+-------------------+
  • machine_vision_suggestion
+------------+-----------------+----------------+----------------+
| mvs_mvl_id | mvs_provider_id | mvs_timestamp  | mvs_confidence |
+------------+-----------------+----------------+----------------+
|        194 |               3 | 20190920153620 |       0.910027 |
|        195 |               3 | 20190920153620 |       0.910027 |
|        196 |               3 | 20190920153620 |       0.823577 |
|        197 |               3 | 20190920153620 |       0.822089 |
|        198 |               3 | 20190920153620 |       0.781328 |
|        199 |               3 | 20190920153620 |       0.777754 |
|        200 |               3 | 20190920153620 |       0.761064 |
|        201 |               3 | 20190920153620 |       0.760972 |
|        202 |               3 | 20190920153620 |       0.734942 |
|        203 |               3 | 20190920153620 |       0.688029 |
|        204 |               3 | 20190920153620 |       0.686922 |
+------------+-----------------+----------------+----------------+
  • machine_vision_provider
+--------+----------+
| mvp_id | mvp_name |
+--------+----------+
|      3 | google   |
+--------+----------+

These schemas (particularly machine_vision_label and machine_vision_suggestion) may change as a result of T227355: DBA review for the MachineVision extension.

Mholloway updated the task description. · Wed, Oct 23, 6:58 PM

Hey @mpopov, is this ready for dev? Or should I set up a meeting to talk about the best approach? It seems like the remaining items in the description can be derived from data being stored in MySQL, so I'm not sure EventLogging-style instrumentation is needed.

Mholloway raised the priority of this task from Normal to High. · Wed, Oct 23, 7:00 PM

Looks like we will want an additional tag for when a CAT revision is reverted (i.e., undone or rolled back), which I will add.

Change 547323 had a related patch set uploaded (by Mholloway; owner: Michael Holloway):
[mediawiki/extensions/MachineVision@master] Tag reverted computer-aided tagging revisions

https://gerrit.wikimedia.org/r/547323

The above patch adds a tag (computer-aided-tagging-revert) that is applied to reverted CAT revisions. How often CAT revisions are reverted can be calculated by dividing the count of revisions tagged with computer-aided-tagging-revert by the count of revisions tagged with computer-aided-tagging.
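A minimal sketch of the revert-rate arithmetic, assuming the two tag counts have already been obtained (for example from the change tag tables or the history snapshots) and that each reverted CAT revision carries exactly one computer-aided-tagging-revert tag. The function name and arguments are hypothetical.

```python
def revert_rate(cat_revisions: int, reverted_cat_revisions: int) -> float:
    """Share of computer-aided-tagging revisions that were later reverted.

    cat_revisions: count of revisions tagged computer-aided-tagging
    reverted_cat_revisions: count tagged computer-aided-tagging-revert
    """
    if cat_revisions < 0 or reverted_cat_revisions < 0:
        raise ValueError("counts must be non-negative")
    if cat_revisions == 0:
        # No tagged revisions yet, so there is nothing to revert.
        return 0.0
    return reverted_cat_revisions / cat_revisions


# Illustrative numbers only, not real data.
example_rate = revert_rate(cat_revisions=200, reverted_cat_revisions=50)
```

The reverted count goes in the numerator so that the result is a proportion between 0 and 1 whenever the assumption above holds.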

As for the remainder, to make my thinking explicit:

Instrumentation capability that allows cross-referencing the tag confidence scores and the frequency of reversion

It's probably best if I add a column to machine_vision_label to track whether the label was reverted, in addition to the above added tag; will follow up with another patch.

Instrumentation capability that allows cross-referencing the tag confidence scores and the frequency of user confirmation/rejection

Can be done with the existing DB tables: SELECT mvs_confidence, mvl_review FROM machine_vision_suggestion LEFT JOIN machine_vision_label ON mvs_mvl_id = mvl_id;
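To make the query above concrete, here is a small self-contained sketch that runs it against an in-memory SQLite database and summarizes acceptance by confidence level. The table definitions are reduced to the columns the query needs, the rows are made-up illustrative data, and the interpretation of mvl_review (1 = accepted, -1 = rejected, 0 = not yet reviewed) and the 0.8 threshold are assumptions, not something confirmed in this task.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Reduced versions of the extension tables; only the joined columns are kept.
    CREATE TABLE machine_vision_label (
        mvl_id INTEGER PRIMARY KEY,
        mvl_wikidata_id TEXT,
        mvl_review INTEGER
    );
    CREATE TABLE machine_vision_suggestion (
        mvs_mvl_id INTEGER,
        mvs_provider_id INTEGER,
        mvs_confidence REAL
    );
    -- Made-up rows for illustration only.
    INSERT INTO machine_vision_label VALUES
        (1, 'Q1', 1), (2, 'Q2', -1), (3, 'Q3', 1), (4, 'Q4', 0);
    INSERT INTO machine_vision_suggestion VALUES
        (1, 3, 0.91), (2, 3, 0.55), (3, 3, 0.88), (4, 3, 0.70);
    """
)

# The query from the comment above, unchanged apart from layout.
rows = conn.execute(
    """
    SELECT mvs_confidence, mvl_review
    FROM machine_vision_suggestion
    LEFT JOIN machine_vision_label ON mvs_mvl_id = mvl_id
    """
).fetchall()


def acceptance_rates(rows, threshold=0.8):
    """Acceptance rate among reviewed suggestions, split at a confidence threshold."""
    counts = {"high": [0, 0], "low": [0, 0]}  # bucket -> [accepted, reviewed]
    for confidence, review in rows:
        if review not in (1, -1):
            continue  # skip unreviewed suggestions and unmatched (NULL) rows
        bucket = "high" if confidence >= threshold else "low"
        counts[bucket][1] += 1
        if review == 1:
            counts[bucket][0] += 1
    return {
        bucket: (accepted / reviewed if reviewed else None)
        for bucket, (accepted, reviewed) in counts.items()
    }


rates = acceptance_rates(rows)
```

Splitting at a single threshold keeps the sketch short; a real analysis would more likely bin the scores or fit a curve.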

Instrumentation showing how often/which files are skipped

This is a good candidate for client-side event logging. @AnneT Do you have bandwidth to take on adding some event logging to the front end?

Mholloway lowered the priority of this task from High to Normal. · Wed, Oct 30, 11:10 PM

Upon further reflection, lowering this to normal (but leaving in Kanban) since it won't become critical until the official feature launch.

Change 547323 merged by jenkins-bot:
[mediawiki/extensions/MachineVision@master] Tag reverted computer-aided tagging revisions

https://gerrit.wikimedia.org/r/547323

Instrumentation capability that allows cross-referencing the tag confidence scores and the frequency of reversion

It's probably best if I add a column to machine_vision_label to track whether the label was reverted, in addition to the above added tag; will follow up with another patch.

This is done. The revert rate can be calculated by dividing the number of revisions tagged computer-aided-tagging-revert by the number tagged computer-aided-tagging.

Mholloway updated the task description. · Mon, Nov 4, 11:32 PM