Add Link engineering: Provide a mechanism for recording credit to a user if they review all link recommendations with "no" or "skip"
Closed, Resolved (Public)

Description

This task relates to T266446: Add Link engineering: Provide a mechanism for storing data about which link recommendations were rejected by the user, in that T266446 is also concerned with what happens to "no" or "skip" actions.

Unlike T266446, here we want to specifically account for cases where a user might review (just "reject"? or also "skip" actions?) all link recommendations for an article. The user hasn't made an edit, but we will want to record this as a contribution on the user's impact module.

Some ideas:

  • store some structured data in a user preference.
  • retain the rows in the MySQL table (never delete) used for caching link recommendations, and have a new table where we store information about what actions a user took for a particular set of recommendations
  • create a log type and log to Special:Log. (The UX would have to be clear to the user that even though they didn't make an edit, their activity would still be visible publicly to other users.)

Note: from the design perspective, we are in favor of this information being public. We'll need to include some kind of disclosure in the user experience telling the user that their "no" answers will be available publicly for others to see.

Event Timeline

User preferences are poorly suited for large data, and this would be fairly large for an active user (unless we only keep a counter). Special:Log is a fairly low-effort approach that provides transparency (might be a good or a bad thing) and can be used reasonably efficiently to tally all contributions of a user. It is not efficient at providing per-page information (what review did the user do on a given page?) as the logging table has no user+page index, but we might never reach the level of per-user edit counts where that would matter.

A custom table is slightly more work but more flexible - if we don't have an explicit goal of surfacing the user's work to other community members, I would go with that. Something like (user_id, page_id, revision_id, JSON hash of (recommendation => user action)) - I'm not sure preserving the cache table rows would make things simpler.
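A minimal sketch of that tuple layout, using SQLite from Python only so the example is self-contained. The table name `link_review`, the column names, and the helper functions are hypothetical and not taken from GrowthExperiments; the composite primary key stands in for the user+page lookup that the logging table lacks.

```python
import json
import sqlite3

# Hypothetical table and column names, for illustration only.
SCHEMA = """
CREATE TABLE link_review (
    user_id     INTEGER NOT NULL,
    page_id     INTEGER NOT NULL,
    revision_id INTEGER NOT NULL,
    actions     TEXT    NOT NULL,  -- JSON object: recommendation => user action
    PRIMARY KEY (user_id, page_id, revision_id)
)
"""


def open_store():
    """Create an in-memory database holding the sketch table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def record_review(conn, user_id, page_id, revision_id, actions):
    """Store one review; `actions` maps each recommendation to the user's action."""
    conn.execute(
        "INSERT INTO link_review VALUES (?, ?, ?, ?)",
        (user_id, page_id, revision_id, json.dumps(actions, sort_keys=True)),
    )


def reviews_for_page(conn, user_id, page_id):
    """Per-page lookup for one user, served by the primary key prefix."""
    rows = conn.execute(
        "SELECT revision_id, actions FROM link_review"
        " WHERE user_id = ? AND page_id = ? ORDER BY revision_id",
        (user_id, page_id),
    )
    return [(revision_id, json.loads(actions)) for revision_id, actions in rows]
```

With a layout like this, "what review did the user do on a given page?" is a single indexed query, at the cost of maintaining a separate table.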

> User preferences are poorly suited for large data, and this would be fairly large for an active user (unless we only keep a counter).

This wouldn't be large data though. It would only cover the cases where a user clicked "No" on each instance of a recommended link. I can't imagine that would happen that often for any given user; if it does, there is a bigger problem happening.

That said...

> Special:Log is a fairly low-effort approach that provides transparency (might be a good or a bad thing) and can be used reasonably efficiently to tally all contributions of a user. It is not efficient at providing per-page information (what review did the user do on a given page?) as the logging table has no user+page index, but we might never reach the level of per-user edit counts where that would matter.

It sounds like this is a better approach from the product perspective, in that we do want to surface the activity publicly in a similar way to how edits are surfaced.

It's potentially infinite data in the sense that it can only grow and there's no hard limit. That's an easy source of performance problems, and user options are performance-sensitive, being loaded for and bundled with every pageview.

Wrt transparency, do we care about exposing the details (i.e. should others be able to inspect which word the user rejected in which article)? Logs can only show limited information about what happened (even if they can store all the information).

We should probably double-check with DB-ops before putting a significant amount of data into log_params (currently it's usually just an ID or two) - I don't expect trouble there but it's a very large table so it doesn't hurt to be cautious. Using it would also mean some of the data is on x1 and some in the normal cluster (so transactions won't work reliably) - again, I wouldn't expect any problems from that.

I think my biggest concern with using the log table is the indexing issue mentioned above. If we need to check what rejection information we are storing for a given article, the indexes aren't great for that. Of course we could always duplicate the data elsewhere and use the log table strictly as a transparency measure - that in itself doesn't seem problematic.

Change 653651 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@master] Add Link: API endpoint for submitting the user's choices

https://gerrit.wikimedia.org/r/653651

Change 653652 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@master] Add a log entry when recommendations are reviewed

https://gerrit.wikimedia.org/r/653652

@MMiller_WMF could you please clarify if we should also include "skip" interactions in the count of reviewed items that we record in the log message? The patch currently uses a sum of accepted + rejected links, but does not include skipped links.

FWIW, the log message is <user> reviewed <count> link recommendations for <page>. (It could provide more detail, but it didn't seem valuable since the actual link recommendations are private.)

@kostajh @Tgr -- reading back over this task, it sounds like it would be helpful to have some more specific requirements for what sorts of things we will want to do with the data on "no" and "skip" responses. I'm writing these capabilities out in example statements:

  • You have reviewed 56 articles, in which you have accepted 30 suggestions, skipped 20 suggestions, and rejected 6 suggestions. This has resulted in edits to 49 articles (because in 7 of them, you didn't accept any suggestions).
  • The 49 articles you edited have been viewed 100,000 times in the last 30 days.

Is this helpful? If so, I can include it in the task description.
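To make the first example statement concrete, here is a rough sketch of how those numbers could be tallied from per-article review records. The record shape (one dict per reviewed article, mapping a suggestion to "accepted", "rejected" or "skipped") and the function name are assumptions for illustration; the sketch also assumes an article counts as edited only when at least one suggestion was accepted.

```python
from collections import Counter


def summarize(reviews):
    """Tally review outcomes across articles.

    `reviews` is a list with one dict per reviewed article, mapping each
    suggestion to 'accepted', 'rejected' or 'skipped' (hypothetical shape).
    """
    totals = Counter()
    articles_edited = 0
    for actions in reviews:
        counts = Counter(actions.values())
        totals.update(counts)
        # Assumption: an edit happens only if something was accepted.
        if counts["accepted"] > 0:
            articles_edited += 1
    return {
        "articles_reviewed": len(reviews),
        "accepted": totals["accepted"],
        "rejected": totals["rejected"],
        "skipped": totals["skipped"],
        "articles_edited": articles_edited,
    }
```

The difference between `articles_reviewed` and `articles_edited` is exactly the set of reviews that this task wants to give credit for.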

Regarding your question above: I wonder if it would be least confusing to add a log entry for every review, even if the user accepted the suggestions. We do want to include the "skip" interactions. So perhaps the log message could be like: <user> reviewed <count> link recommendations for <page>: <count> accepted, <count> rejected, and <count> skipped.

What do you think of that?

IMO a separate accept/reject/skip count would not be that useful (logs are mainly for exposing activity by a given user / related to a given page to other editors; for exposing it to the user themselves we'll have the impact module; and other users can't see the actual recommendation so details about it don't convey much meaningful information) and it's annoying to translate (we don't want to say "0 accepted" so we'd have to have seven different messages with all combinations of the various parts missing). It might be mildly useful for detecting problematic behavior that does not result in edits (such as a user rejecting lots of recommendations without ever accepting any), but that doesn't seem too likely to happen.
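For reference, the "seven different messages" figure is the number of non-empty subsets of the three parts (2^3 - 1). A small sketch that enumerates them; the function names are made up for illustration and nothing here reflects how MediaWiki messages are actually selected.

```python
from itertools import combinations

PARTS = ("accepted", "rejected", "skipped")


def message_variants(parts=PARTS):
    """All non-empty subsets of `parts`: one message per subset if zero counts are omitted."""
    return [
        combo
        for size in range(1, len(parts) + 1)
        for combo in combinations(parts, size)
    ]


def variant_for(accepted, rejected, skipped):
    """Pick the variant for one review: the parts whose count is non-zero."""
    counts = dict(zip(PARTS, (accepted, rejected, skipped)))
    return tuple(part for part in PARTS if counts[part] > 0)
```

Allowing zero counts in the text collapses all of these into a single message with three parameters.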

The example statements are helpful, but the log entries will not be useful for that, and the database table used for T266446: Add Link engineering: Provide a mechanism for storing data about which link recommendations were rejected by the user already covers them.

While I think that we should count and "give credit" to rejections, "skips" seem a lot less useful to count, even for users themselves.
One consideration that comes to mind though is whether, by logging all decision types, we can view the different percentages of acceptance/rejections across wikis and use that to monitor algorithm performance?

> While I think that we should count and "give credit" to rejections, "skips" seem a lot less useful to count, even for users themselves.

Ah, OK I might have misunderstood, as I thought we were going to include "skips" in the Impact module along with "rejects".

> One consideration that comes to mind though is whether, by logging all decision types, we can view the different percentages of acceptance/rejections across wikis and use that to monitor algorithm performance?

We will also be recording this data via event logging, and my sense is that this is probably the best place to run queries to assess algorithm performance because we'll have access to all the other instrumentation data as well.

Change 653651 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Add Link: API endpoint for submitting the user's choices

https://gerrit.wikimedia.org/r/653651

Change 653652 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Add a log entry when recommendations are reviewed

https://gerrit.wikimedia.org/r/653652

Change 654034 had a related patch set uploaded (by Kosta Harlan; owner: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@master] Add Link: Store user reviews of link recommendations

https://gerrit.wikimedia.org/r/654034

Change 654034 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Add Link: Store user reviews of link recommendations

https://gerrit.wikimedia.org/r/654034

@kostajh @RHo @Tgr -- I agree that we do not see very many uses for exposing "skip" counts to users. But I have an instinct that if it's trivial to add, we should give ourselves the capability to count and expose it to users. We might want it, and I would prefer to have the option.

@Tgr -- I still think we should have the more detailed log message, because it won't be that much longer and it will be more useful in ways that we may not be able to predict right now. I think the use case you brought up, around seeing if a given user is behaving carelessly or like a vandal, is a good one. Regarding the translations, I do think it would be fine to say things like "0 accepted", and therefore it would require only one set of messages to translate. Although it would not be completely concise, each log message would have the exact same format that way.

What do you think of all this?

That sounds reasonable.

Also, on reflection, at least storing all three numbers in the log record is the future-proof option, even if we wouldn't display them right now. The message can be changed at any time, but the data stored in old log events would be very hard to change after the fact.
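A rough sketch of that separation between stored data and displayed message. The parameter names and functions are hypothetical; the brief form follows the earlier patch (accepted + rejected, skips not counted), and the detailed form assumes the leading count is the total of all three.

```python
def make_log_params(accepted, rejected, skipped):
    """Data stored with the log entry; all three counts are kept (hypothetical keys)."""
    return {"accepted": accepted, "rejected": rejected, "skipped": skipped}


def render_brief(user, page, params):
    """Short message: count is accepted + rejected, skipped links are left out."""
    count = params["accepted"] + params["rejected"]
    return f"{user} reviewed {count} link recommendations for {page}"


def render_detailed(user, page, params):
    """Detailed message: one fixed format, zero counts included.

    Assumption: the leading count is the sum of all three numbers.
    """
    count = params["accepted"] + params["rejected"] + params["skipped"]
    return (
        f"{user} reviewed {count} link recommendations for {page}: "
        f"{params['accepted']} accepted, {params['rejected']} rejected, "
        f"and {params['skipped']} skipped"
    )
```

Because both renderers read from the same stored parameters, switching the displayed message later does not require touching old log entries.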

Change 656532 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@master] Show accepted/rejected/skipped count for addlink log entries

https://gerrit.wikimedia.org/r/656532

Change 656532 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Show accepted/rejected/skipped count for addlink log entries

https://gerrit.wikimedia.org/r/656532