Determine the pageview (and if possible, search impression) impact of automatic SERP translations
Closed, Resolved · Public

Description

What's requested:

This is a request to determine the impact of automatic SERP translations on pageviews, as well as, if possible, search impressions. The launch of automatic inlined SERPs of translated content for Indonesian search results in Google search is the first launch of this functionality, although there will be more examples in the future.

Why it's requested:
This is requested to validate that the feature has a positive impact on pageviews (and, to the degree it's measurable, impressions), and to take any pertinent action or adjust when testing other interventions such as T93213: Improve access to local language wikis by fixing bug in generation of hreflang tags in <head> of article pages.

Practically speaking, the analysis approach should also be made forward compatible so that it's easier to determine impact for future launches of the feature.

When it's requested:
As soon as possible.

Other helpful information:

T212414: Measure the impact of externally-originated contributions contains a bunch of supplementary information and so do other tickets on ExternalGuidance bearing the handle @dr0ptp4kt or @chelsyx. There are multiple ways to draw inferences at the wiki level and article level (e.g., referral ratios such as in P8156), and there may be some information in the search console as well.

As a reminder, as discussed in T116678: Country mapping routine for proxied requests, geocoding is not reliable when attempting to reconstruct pageview data for requests via proxies such as the Google Translate proxy. That is to say, it's possible to reconstruct this with a relative degree of confidence when the full XFF data are available, but not after refinement as it currently stands.

Generally, the automatic translation functionality is shown, and only on certain content verticals, when:

  • Only the post-translated corpus has a hit, OR
  • The native wiki language result is somehow lacking in sufficiency. This is where there's co-mingling of both the native wiki language and autotranslated SERPs. It's conceivable there's some cannibalization, although in general regular search results have higher clickthrough rates, so it seems unlikely the effect would be negative. Still, this would be interesting to understand.

When content on a given topic in the native wiki improves or the ranking strategy adjusts generally, it's conceivable that traffic that had been flowing through automatic translation might in turn dry up. This shouldn't be taken as a sign of failure per se; in fact, it represents closure of a content gap. There are two points here, in any case: (1) it may be necessary to join data from the editing data lake with the pageview data lake, and (2) there's a chain of events which may have a sequencing that can be modeled.

As usual, surges on particular subjects may commonly be the source of traffic spikes on any sort of search-referred traffic, autotranslated SERP or not.

Event Timeline

Restricted Application changed the subtype of this task from "Deadline" to "Task". · Apr 30 2019, 10:50 AM
chelsyx moved this task from Triage to Next Up on the Product-Analytics board.

Hi @dr0ptp4kt , thanks for creating this ticket! I will need some time to think about how to proceed, and I may reach out to you for some clarifying questions.

Meanwhile, I've done an analysis regarding the topics of translated articles: T219660. Please let me know if you have any comments or questions, and also how it may help with this request.

dr0ptp4kt updated the task description. · May 1 2019, 5:50 PM

Hey @dr0ptp4kt, it looks like you want to speak to specific stats around increases, and I wonder what level of precision is needed here?

Through data Chelsy provided in T212414, we see that about 0.5% of Indonesian pageviews are now coming from auto-translated results, and that the auto-translated views are comparable to (or very slightly higher than) views when users initiate translation. These percentages are fairly small, and it may be that our best approach here would be to estimate a rough range of impact based on logical conclusions from the above data points, rather than attempting a more precise estimate.

dr0ptp4kt added a comment. · Edited · May 15 2019, 5:36 PM

Thanks @kzimmerman. All the signals I've seen and that we could think of seem to suggest there's a positive traffic increase attributable to the intervention. My gut instinct given typical user behavior on search engine result pages and the nature of this intervention is that there's very little cannibalization, no detrimental cannibalization, and in fact there's a real boost to content availability and consumption.

An estimate of a rough range of impact based on conventional methods would be sufficient here, I think, as we just literally don't have the proxy-tagged per-article data in either of Bahasa Indonesia or English for the window preceding and following the launch, now that we're several months past the data retention window for the relevant period.

For forthcoming launches, though, it would be fantastic to understand the net impact by analysis of different article population segments (even there it's possible there can be confounding variables and complex systems effects, but I digress). Chelsy and I were discussing the other day how it might make sense to do one or more of the following in the future - would something along these lines be possible to line up for the next intervention(s)?

(1) run queries sufficiently early to determine the net impact of introduction of automated translation initiated traffic. It ought to be possible to understand the provenance of the traffic on a per-article basis for windows two months before and two months after (and narrower windows to get an early gauge).

(2) determine if there's a way to automate this through the refinery pipeline that still sticks to the data retention guidelines and doesn't lead to de-anonymization.
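As a rough illustration of the before/after windows in (1), here is a minimal sketch in Python. It assumes per-article daily view counts are already available as a mapping from date to views; the function names, the launch date, and the sample numbers are all hypothetical and are not taken from any existing query or pipeline.

```python
from datetime import date, timedelta

def window_means(daily_views, launch, window_days=60):
    """Mean daily views in the windows before and after `launch`.

    `daily_views` maps a date to a view count for one article (assumed
    input shape). The launch day itself is excluded from both windows.
    """
    before = [v for d, v in daily_views.items()
              if launch - timedelta(days=window_days) <= d < launch]
    after = [v for d, v in daily_views.items()
             if launch < d <= launch + timedelta(days=window_days)]

    def mean(values):
        return sum(values) / len(values) if values else None

    return mean(before), mean(after)

def relative_change(before, after):
    """Relative change from `before` to `after`, or None if undefined."""
    if not before or after is None:
        return None
    return (after - before) / before

# Hypothetical article: 100 views/day before launch, 110 views/day after.
launch = date(2019, 1, 1)  # placeholder launch date
views = {launch + timedelta(days=k): (100 if k < 0 else 110)
         for k in range(-60, 61)}
before, after = window_means(views, launch)
print(before, after, relative_change(before, after))
```

Passing a smaller `window_days` gives the narrower early-gauge window; seasonality and topic surges would still need to be accounted for separately.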

I've heard things tossed around like maybe the current content coverage is around 40% of topics. So a 0.5% increase in total traffic above baseline figures is nontrivial along lines similar to the sameAs entity pointer change if we could project across the full English (and global) corpus. I mainly wanted to rule out any possibility of the intervention causing harm - in the nearer term context, I wanted to determine how much caution we need to exercise for the hreflang task's implementation given it could confound the benefits of automated SERP translations.

chelsyx moved this task from Next Up to Doing on the Product-Analytics board. · May 30 2019, 4:56 PM

Hey @chelsyx, how is this coming along?

Based on the discussion between @dr0ptp4kt and me, I queried the page IDs of articles translated and read by Indonesian users through Toledo between Mar 18 - Jun 14 2019 (all the existing webrequest data). If these pages have an Indonesian version (linked by the same Wikidata item), I checked their pageviews to see whether there were any significant changes before and after the Toledo deployment (Dec 5 2018). We are assuming that the topics that got translated during Mar 18 - Jun 14 2019 are the same as the topics that got translated around Dec 5 2018, and that if the Toledo project cannibalized Indonesian Wikipedia, we should see a drop in pageviews for those translated topics after the deployment.

There are 371,318 unique articles that got translated and read by Indonesian users through Toledo during Mar 18 - Jun 14 2019; 79,471 (21.4%) of these articles have an Indonesian version. The following graph shows the search engine referred mobile web pageviews of these pages on Indonesian Wikipedia, compared with the pageviews of other pages on Indonesian Wikipedia. For pages that may be cannibalized by Toledo, pageviews (blue line) dropped immediately after Toledo was deployed, but rose back to the same level after the new year. We then compare this with the pageviews of other pages (red line) and see a different pattern after the Toledo deployment. But because of the topic differences, we can't draw any conclusion from this comparison.

In summary, given that the pageviews of articles translated by Toledo rose back to the same level after the holiday season, significant cannibalization from Toledo is unlikely.
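For anyone re-checking the figures, the 21.4% share quoted above follows directly from the two article counts. A short sketch in Python (variable names are illustrative only):

```python
# Counts quoted in this comment (Mar 18 - Jun 14 2019).
translated_articles = 371_318      # unique articles translated and read via Toledo
with_indonesian_version = 79_471   # of those, articles that also have an Indonesian version

share = with_indonesian_version / translated_articles
print(f"{share:.1%}")  # 21.4%
```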

@JTannerWMF and @dr0ptp4kt , sorry about the delay of this analysis. Please let me know if you have any other questions.

Thank you, @chelsyx, great work! Is there a Phabricator paste or Jupyter notebook with the queries and results for our future selves?

Yes. Because of the huge size of webrequest, the analysis process has 2 steps:

1. Get the page IDs of articles translated by Toledo to Indonesian (shell script P8620), and then save the result in chelsyx.toledo_pageid on Hive.
2. Join chelsyx.toledo_pageid with the pageview data (pyspark script P8621).
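The shape of that two-step process can be sketched in plain Python. This is not the content of P8620 or P8621; it is a toy stand-in that assumes step 1 yields a set of page IDs and that pageview rows arrive as (day, page_id, views) tuples. The function name and sample rows are hypothetical.

```python
from collections import defaultdict

def split_pageviews(pageview_rows, toledo_page_ids):
    """Total daily views for Toledo-translated pages versus all other pages.

    `pageview_rows` is an iterable of (day, page_id, views) tuples and
    `toledo_page_ids` is the set of page IDs saved by step 1 (both are
    assumed shapes, chosen for illustration).
    """
    totals = defaultdict(lambda: {"toledo": 0, "other": 0})
    for day, page_id, views in pageview_rows:
        group = "toledo" if page_id in toledo_page_ids else "other"
        totals[day][group] += views
    return dict(totals)

# Step 1 stand-in: page IDs that were translated (made-up values).
toledo_page_ids = {101, 102}

# Step 2 stand-in: pageview rows to join against (made-up values).
rows = [
    ("2018-12-04", 101, 40),
    ("2018-12-04", 200, 15),
    ("2018-12-06", 102, 25),
    ("2018-12-06", 101, 30),
    ("2018-12-06", 201, 10),
]
print(split_pageviews(rows, toledo_page_ids))
```

Materializing the ID list first keeps the expensive scan of the raw request data to a single pass; the join itself then runs against the much smaller aggregated pageview data.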

kzimmerman closed this task as Resolved. · Jun 25 2019, 9:29 PM

@dr0ptp4kt marking as resolved!