Implement features that allow for tracking and measuring machine-aided depicts usage/activity
Open, Normal, Public

Description

We have this:
Machine-aided depicts (MAD) is designed to happen on a special page on Commons OR via external clients that utilize the tag suggestion and voting features via API. Currently, there are no solid instrumentation plans for this feature.

We want this:
On top of what we have, an implementation of whatever additional setup is needed for instrumentation (tags to identify edits from this tool, etc.)

Acceptance Criteria:

  • An edit tag system that accurately identifies confirmed tags from this tool (for example in recent changes), and is easy to measure
  • Instrumentation capability that allows us to measure how often depicts tag additions from this tool are reverted
  • Instrumentation capability that allows cross-referencing the tag confidence scores and the frequency of reversion
  • Instrumentation capability that allows cross-referencing the tag confidence scores and the frequency of user confirmation/rejection
  • Instrumentation showing how often/which files are skipped

Event Timeline

Restricted Application added a subscriber: Aklapper. · Aug 22 2019, 5:25 PM

I can add a machine-aided depicts tag (or similar) to every revision adding a depicts statement, as a start.

That will be a great first start, thanks! Paging @mpopov for additional input/ideas :)

Ramsey-WMF renamed this task from [Stub] Implement features that allow for tracking and measuring machine-aided depicts usage/activity to Implement features that allow for tracking and measuring machine-aided depicts usage/activity. · Sat, Aug 31, 12:09 AM
Ramsey-WMF updated the task description.
Mholloway triaged this task as Normal priority. · Sat, Aug 31, 5:28 PM

This doesn't need to hold up technology reviews, but we will want it in place before launch.

Change 533978 had a related patch set uploaded (by Mholloway; owner: Michael Holloway):
[mediawiki/extensions/MachineVision@master] Add depicts statement when a MAD suggestion is accepted

https://gerrit.wikimedia.org/r/533978

Change 533978 merged by jenkins-bot:
[mediawiki/extensions/MachineVision@master] Add depicts statement when a MAD suggestion is accepted

https://gerrit.wikimedia.org/r/533978

This comment was removed by mpopov.
mpopov added a comment. · Edited · Tue, Sep 10, 9:13 PM

Thank you, Michael! The great news is that an edit tag gets us the second criterion too since it's easy to query edits and their revert status from the mediawiki history monthly snapshots in our data lake. (Calculating revert rate on a daily basis manually is possible but much more difficult.)
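As a rough illustration of why an edit tag covers the revert criterion, here is a minimal sketch of the calculation over in-memory rows. The tag name `machine-aided-depicts` and the `Revision` fields are assumptions for illustration only; the real tag had not been chosen at this point in the thread, and the history snapshots use their own column names.

```python
from dataclasses import dataclass, field

# Hypothetical tag name; the actual change tag was not decided in this thread.
MAD_TAG = "machine-aided-depicts"


@dataclass
class Revision:
    """Illustrative stand-in for one row of edit history (field names assumed)."""
    rev_id: int
    tags: list = field(default_factory=list)
    is_reverted: bool = False


def revert_rate(revisions, tag=MAD_TAG):
    """Return the share of revisions carrying `tag` that were reverted.

    Returns None when no revision carries the tag, so callers can tell
    "no data" apart from a 0% revert rate.
    """
    tagged = [r for r in revisions if tag in r.tags]
    if not tagged:
        return None
    return sum(r.is_reverted for r in tagged) / len(tagged)


sample = [
    Revision(1, [MAD_TAG], False),
    Revision(2, [MAD_TAG], True),
    Revision(3, [], True),  # untagged edit, ignored
    Revision(4, [MAD_TAG, "mobile edit"], False),
    Revision(5, [MAD_TAG], False),
]
print(revert_rate(sample))  # 1 of 4 tagged edits reverted -> 0.25
```

The same filter-then-aggregate shape would apply to a query against the monthly snapshots, with the tag filter and revert flag mapped to whatever columns the snapshot actually exposes.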

For "cross-referencing the tag confidence scores" I'm curious if those are currently stored anywhere. Or is the task to figure out that storage?

For client-side analytics instrumentation, Jason and I are working on a set of cross-platform libraries for standardized and unified product analytics that would simplify this, BUT we won't have production-quality stuff until end of Q2 or Q3. Until then, MachineAidedDepictsUsage is a potential schema we can use and instrument for. It's a relatively simple impression/click-style design that would track:

  • name of service used
  • the confidence score
  • how long the suggestion took

for action = "receive" events. What the user does with the received suggestions is tracked in follow-up "confirm"/"reject" events which don't need to include that information.
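To make the proposed event shape concrete, here is a minimal sketch of the receive/confirm/reject design with a small validator. It is not the actual MachineAidedDepictsUsage schema: the field names `service`, `confidence`, and `wait_time_ms`, the millisecond unit, and the 0 to 1 confidence range are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional

ACTIONS = {"receive", "confirm", "reject"}


@dataclass(frozen=True)
class MadUsageEvent:
    """Illustrative event record; field names are assumed, not from the schema."""
    action: str
    page_id: int
    tag_id: str
    # Suggestion details, expected only on "receive" events.
    service: Optional[str] = None
    confidence: Optional[float] = None
    wait_time_ms: Optional[int] = None


def validate(event):
    """Return a list of problems with the event; an empty list means it is valid."""
    errors = []
    if event.action not in ACTIONS:
        errors.append(f"unknown action: {event.action!r}")
    details = (event.service, event.confidence, event.wait_time_ms)
    if event.action == "receive":
        if any(v is None for v in details):
            errors.append("receive events need service, confidence, and wait_time_ms")
        elif not 0.0 <= event.confidence <= 1.0:
            errors.append("confidence must be between 0 and 1")
    elif event.action in ("confirm", "reject") and any(v is not None for v in details):
        errors.append("confirm/reject events should omit suggestion details")
    return errors


received = MadUsageEvent("receive", 123, "Q42", "google", 0.9, 350)
confirmed = MadUsageEvent("confirm", 123, "Q42")
print(validate(received), validate(confirmed))
```

Keeping the suggestion details on the "receive" event only means a follow-up event stays small, and the two can be joined later on `page_id` and `tag_id`.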

I'm starting vacation tomorrow so in the meantime I'd love to get a review of that design from someone on Product Analytics team and also for @Mholloway to let me know whether the tag_id and page_id make sense/are doable. I'm not sure how much logging is going to happen on the backend that's going to act as the mediator between MediaWiki and external services like Google Vision API, so tag_id might have to be replaced with a randomly generated identifier. As for page_id, that's more of a performance concern because it's easier to send an integer than a filename string, but let me know if that would be problematic.

@Ramsey-WMF: Once the schema design has been given a thumbs up, who would be instrumenting it? I don't know that I'm able to work on this in an engineering capacity.

@mpopov Thanks for drafting a schema! Some thoughts:

  • page_id should be fine.
  • tag_id — it seems like you have in mind here either a row ID or some other arbitrary unique identifier, but since page_id is required, maybe we could just use the Wikidata ID string here? (Wikidata ID + image SHA1 digest uniquely identify a label suggestion in the underlying table.)
  • wait_time — is this the time elapsed between submitting the labeling request(s) and receiving a response, or the time between the user opening Special:SuggestedTags and the tags being loaded from the DB and presented? The former may or may not be interesting if we ultimately use Google as our MV provider, since we'll be planning to use batched, asynchronous labeling requests.
  • We'll also be storing voting results in the DB, as well as per-provider confidence scores (see here). Perhaps it's redundant to log these in EL? Actually, now that I look at the needs in the description more closely, reversion is the only piece that isn't accounted for in the DB schema.
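If voting results and confidence scores both live in the DB, the cross-referencing in the acceptance criteria could be done without EventLogging. The sketch below assumes suggestions are keyed by (image SHA1 digest, Wikidata ID) as described above; the verdict strings, the sample values, the 0 to 1 confidence range, and the four-bucket split are hypothetical.

```python
from collections import defaultdict


def acceptance_by_confidence(suggestions, n_buckets=4):
    """Group reviewed suggestions into confidence buckets and return acceptance rates.

    `suggestions` maps (image_sha1, wikidata_id) -> (confidence, verdict), with
    confidence assumed to lie in [0, 1]. Unreviewed suggestions are skipped.
    Returns {bucket_index: accepted / (accepted + rejected)}.
    """
    counts = defaultdict(lambda: {"accepted": 0, "rejected": 0})
    for (_sha1, _wikidata_id), (confidence, verdict) in suggestions.items():
        if verdict not in ("accepted", "rejected"):
            continue
        # Clamp so that a confidence of exactly 1.0 falls in the top bucket.
        bucket = min(int(confidence * n_buckets), n_buckets - 1)
        counts[bucket][verdict] += 1
    return {
        bucket: c["accepted"] / (c["accepted"] + c["rejected"])
        for bucket, c in sorted(counts.items())
    }


# Made-up sample data keyed by (image SHA1 digest, Wikidata ID).
sample_suggestions = {
    ("sha1-a", "Q144"): (0.95, "accepted"),
    ("sha1-a", "Q146"): (0.30, "rejected"),
    ("sha1-b", "Q144"): (0.90, "accepted"),
    ("sha1-b", "Q5"): (0.80, "rejected"),
    ("sha1-c", "Q146"): (0.40, "accepted"),
    ("sha1-c", "Q5"): (0.10, "unreviewed"),
}
rates = acceptance_by_confidence(sample_suggestions)
print(rates)
```

Swapping the verdict for a reverted/not-reverted flag would give the confidence-versus-reversion view, which is the piece that still depends on the edit tag.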

As for your last question, I'm guessing the implementation will fall either to Product Infra or to the Structured Data PHP devs.

Taking this off kanban until the final schema is confirmed.