Measure the impact of externally-originated contributions
Closed, Resolved, Public

Description

When a page is delivered through an external translation service we want to provide paths for users to still be able to contribute (T212300), and we want to have a clear understanding about how these affect users and the content they produce.

In order to support this, the following aspects will be measured:

Access to contribution

  • Reading to contribution funnel. We want to measure how many people move through the workflow we are providing from reading to contribution: access the translated page → access the contribution options page → access the local/original article to contribute → complete a contribution. Capturing this as both the number of users, and the percentages of those that move/drop-off on each stage will provide a good idea of how users move through the process.
  • Comparison with local workflows. To better understand the above, it would be useful to compare these numbers with the standard contribution workflow on regular articles: in particular, what percentage of readers access the edit action, and what percentage of those make a contribution. This will show whether users coming from an external automatic translation are more or less likely to try to contribute and to succeed in completing a contribution.

We may want to have this analysis both per wiki and aggregated across all wikis.
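The funnel described above can be sketched as follows. This is a minimal illustration, not the analysis that was eventually run: the stage names and the counts are hypothetical, and the input is assumed to be an already-computed number of users per stage.

```python
from typing import Dict, List, Tuple

# Hypothetical stage names mirroring the four steps in the description.
STAGES = [
    "translated_page",
    "contribution_options",
    "article_editor",
    "contribution_saved",
]


def funnel_report(users_per_stage: Dict[str, int]) -> List[Tuple[str, int, float, float]]:
    """Return (stage, users, % of first stage, % dropped since previous stage)."""
    report = []
    first = users_per_stage[STAGES[0]]
    previous = first
    for stage in STAGES:
        users = users_per_stage[stage]
        pct_of_first = 100.0 * users / first if first else 0.0
        drop_off = 100.0 * (previous - users) / previous if previous else 0.0
        report.append((stage, users, pct_of_first, drop_off))
        previous = users
    return report


# Made-up counts, for illustration only.
example = {
    "translated_page": 1000,
    "contribution_options": 50,
    "article_editor": 20,
    "contribution_saved": 5,
}

for row in funnel_report(example):
    print(row)
```

Running the same function once per wiki and once on the summed counts would give both the per-wiki and the aggregated view.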

Content produced

  • Content created. How many edits and pages were created as a result of people coming from an externally translated page. This provides an idea of the volume of content that is generated when coming from an external automatic translation. A revision tag (T209132) is available to identify the content created in this way.
    • Comparison with local wiki. Comparing the above numbers with the overall number of pages/edits created in the local wiki will help us understand what percentage of the total contributions originate from an external automatic translation.
  • Content survival. Checking whether contributions that originated from an external automatic translation have been reverted or not provides an idea of the quality of those contributions.
    • Comparison with local wiki. Comparing the above numbers with the usual revert/deletion rates for the local wiki will show whether users coming from an external automatic translation are more or less likely to meet the community quality standards with their contributions.

We may want to have this analysis both per wiki and aggregated across all wikis.
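The two comparisons above (share of total contributions, and revert rate versus the rest of the wiki) can be sketched like this. The `Edit` record and its two flags are hypothetical: it is assumed that an earlier step has already marked each edit as tagged or not (for example via the revision tag) and as reverted or not.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Edit:
    from_external_translation: bool  # hypothetical flag, e.g. derived from the revision tag
    reverted: bool                   # hypothetical flag, e.g. from a revert-detection step


def pct(part: int, whole: int) -> float:
    """Percentage of part in whole, or 0.0 when whole is empty."""
    return 100.0 * part / whole if whole else 0.0


def summarize(edits: Iterable[Edit]) -> Dict[str, float]:
    edits = list(edits)
    tagged = [e for e in edits if e.from_external_translation]
    others = [e for e in edits if not e.from_external_translation]
    return {
        "tagged_share_pct": pct(len(tagged), len(edits)),
        "tagged_revert_pct": pct(sum(e.reverted for e in tagged), len(tagged)),
        "other_revert_pct": pct(sum(e.reverted for e in others), len(others)),
    }


# Made-up data: 2 tagged edits (1 reverted) and 8 other edits (2 reverted).
sample = (
    [Edit(True, True), Edit(True, False)]
    + [Edit(False, True)] * 2
    + [Edit(False, False)] * 6
)
print(summarize(sample))
```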


Results available in this report.

Event Timeline

@kzimmerman adding you here for visibility. We're still getting the task finalized, but this is how things are shaping up for Toledo

cc/ @dr0ptp4kt

Thanks @atgo ! Tagging with product analytics so we can start incorporating this into our planning in the new year.

atgo added a comment.Jan 3 2019, 7:32 PM

@pau @dr0ptp4kt want to make sure that readership of translated results is also tracked and reported. I think that's represented in the "contribution funnel" piece, but want to call it out and am making some minor adjustments accordingly.

atgo updated the task description. (Show Details)Jan 3 2019, 7:33 PM

Great. Making it more explicit makes sense, and the updated description looks good. Thanks!

On that matter, we should consider making use of the already existing "virtual pageview" framework to track readership of these translated results. This would have various benefits, e.g. making stats for this new way of reading Wikipedia content directly comparable with our data for normal pageviews, in various dimensions.

@Tbayer what do you have in mind? Heads up, T208795 captures the first concrete case where the full transcoding indeed goes all the way through the Wikimedia servers and stuff is already counted as a pageview but there's an X-Analytics key-value made available for query purposes.

Tbayer added a comment.EditedJan 9 2019, 1:24 AM

I see, thanks! Having that X-Analytics tag in the webrequest data is great, but that still leaves open the question how these particular requests should be processed and tallied. It seems that they are currently recorded as regular pageviews, without any possibility (after the data has been aggregated in the pageview_hourly table and the source webrequest data has expired) to distinguish these Google-translated views from normal pageviews where the page is read in the original language. We should discuss whether that's really what we want from a product analytics perspective. Again, an alternative proposal would be to register (and aggregate) them as a virtual pageview instead, using the existing Hive table - i.e. a new way of reading our content (just like page previews was). Or we could add a field to the pageview_hourly table distinguishing translated from regular views. Happy to follow up elsewhere on the details and tradeoffs.

Claiming for now until I have identified an analyst to work on the details. I will attend meetings/discussions and track in the meantime

kzimmerman moved this task from Triage to Backlog on the Product-Analytics board.Jan 10 2019, 9:07 PM

Change 483681 had a related patch set uploaded (by Santhosh; owner: Santhosh):
[mediawiki/extensions/ExternalGuidance@master] Add analytics trackers

https://gerrit.wikimedia.org/r/483681

In https://gerrit.wikimedia.org/r/483681, four events are emitted as counters with the keys given below:

  • MediaWiki.ExternalGuidance.init.serviceName.fromLang.toLang: Emitted when the external context is detected.
  • MediaWiki.ExternalGuidance.specialpage.serviceName.fromLang.toLang: Emitted when the special page is visited from the contribute link.
  • MediaWiki.ExternalGuidance.createpage.serviceName.fromLang.toLang: Emitted when the create page button is clicked on the special page.
  • MediaWiki.ExternalGuidance.mtinfo.serviceName.fromLang.toLang: Emitted when the service information overlay is accessed.
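For anyone post-processing exported counter names, the key layout above can be split into its parts as in the sketch below. The helper name `parse_counter_key` and the example key are illustrative assumptions; only the `MediaWiki.ExternalGuidance.<event>.<serviceName>.<fromLang>.<toLang>` layout and the four event names come from the list above.

```python
from typing import NamedTuple

PREFIX = "MediaWiki.ExternalGuidance."
KNOWN_EVENTS = {"init", "specialpage", "createpage", "mtinfo"}


class CounterKey(NamedTuple):
    event: str
    service: str
    from_lang: str
    to_lang: str


def parse_counter_key(key: str) -> CounterKey:
    """Split a counter key into event, service name and language pair."""
    if not key.startswith(PREFIX):
        raise ValueError(f"not an ExternalGuidance counter: {key!r}")
    parts = key[len(PREFIX):].split(".")
    if len(parts) != 4 or parts[0] not in KNOWN_EVENTS:
        raise ValueError(f"unexpected key layout: {key!r}")
    return CounterKey(*parts)


# Illustrative key: a page from English Wikipedia shown translated to Indonesian.
print(parse_counter_key("MediaWiki.ExternalGuidance.init.Google.en.id"))
```

Splitting on dots assumes that service names and language codes never contain a dot themselves.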

Change 483681 merged by jenkins-bot:
[mediawiki/extensions/ExternalGuidance@master] Add analytics trackers

https://gerrit.wikimedia.org/r/483681

atgo added a comment.Jan 18 2019, 9:43 PM

Hey y'all. @chelsyx is going to help us on the analysis side. She's just getting up to speed and may have some changes, but this looks good at a first pass.

Hi @santhosh , I have a couple of questions about the counters in T212414#4872124:

1, I want to double-check whether my understanding of the keys is correct:

  • MediaWiki.ExternalGuidance.init.serviceName.fromLang.toLang is emitted when a page is requested by the translation service, correct?
  • MediaWiki.ExternalGuidance.mtinfo.serviceName.fromLang.toLang is emitted when user clicks on the "Automatic translation" button (T212329), correct?
  • When MediaWiki.ExternalGuidance.createpage.serviceName.fromLang.toLang is emitted, how do we distinguish whether user clicks to contribute in local language or in original language?

2, Where are we going to save these events? In the public mediawiki databases?

  • MediaWiki.ExternalGuidance.init.serviceName.fromLang.toLang is emitted when a page is requested by the translation service, correct?

No. It is emitted when our code detects that the page is being presented to a user by an external service (also known as external context detection). At this point we do our banner injection. If this event is emitted, it means a user saw a page from the fromLang Wikipedia translated to toLang in an external context such as Google Translate.

  • MediaWiki.ExternalGuidance.mtinfo.serviceName.fromLang.toLang is emitted when user clicks on the "Automatic translation" button (T212329), correct?

Yes.

  • When MediaWiki.ExternalGuidance.createpage.serviceName.fromLang.toLang is emitted, how do we distinguish whether user clicks to contribute in local language or in original language?

I have not added any event for 'contributing to original language'. That is a good catch. Will add one for that.

2, Where are we going to save these events? In the public mediawiki databases?

All these events are special events since they are keyed with 'counter' prefix. They go to https://wikitech.wikimedia.org/wiki/Graphite and can be monitored and analysed using dashboards and graphs at https://grafana.wikimedia.org. I have not set up a dashboard for these counters yet, but once we get events will create one.

Change 488351 had a related patch set uploaded (by Santhosh; owner: Santhosh):
[mediawiki/extensions/ExternalGuidance@master] Add a tracker event for editing the original source article

https://gerrit.wikimedia.org/r/488351

chelsyx added a comment.EditedFeb 6 2019, 8:16 PM
  • When MediaWiki.ExternalGuidance.createpage.serviceName.fromLang.toLang is emitted, how do we distinguish whether user clicks to contribute in local language or in original language?

I have not added any event for 'contributing to original language'. That is a good catch. Will add one for that.

Thanks @santhosh !

2, Where are we going to save these events? In the public mediawiki databases?

All these events are special events since they are keyed with 'counter' prefix. They go to https://wikitech.wikimedia.org/wiki/Graphite and can be monitored and analysed using dashboards and graphs at https://grafana.wikimedia.org. I have not set up a dashboard for these counters yet, but once we get events will create one.

To my understanding, Graphite doesn't support aggregation (e.g. by month) on the dashboard. And if I want to export data from the dashboard to compare with local workflows, the output is in JSON. This is not very convenient for analysis and reporting purposes. Can we send the event, or transform the JSON and then send them to a relational database?

To my understanding, Graphite doesn't support aggregation (e.g. by month) on the dashboard.

Aggregation is possible for any time period. In fact, you have a wide set of analytical operators available for plotting in graphs, such as sum, rate, mean, median, etc.
Some examples where aggregation and fancy charting are used: https://grafana.wikimedia.org/d/000000290/wikidata-query-service-ui?refresh=1m&orgId=1, https://grafana.wikimedia.org/d/000000593/service-cxserver?refresh=5m&orgId=1 and https://grafana.wikimedia.org/d/000000598/content-translation?orgId=1 You can also export data for any time duration. Eventlogging data is by default available for the last 90 days, but in my experience Grafana retains data for more than a year.

Here is an example that shows aggregation per month (you can change it to any time duration):
https://grafana.wikimedia.org/d/000000593/service-cxserver?refresh=5m&orgId=1&panelId=7&fullscreen&from=1549477800000&to=1549513810698

@santhosh Thanks for the links!

The executives are asking for some type of dashboard so that they can access the following metrics in the same place:

  • Readership: The number of page views from the Google translation service, compared with normal page views (T208795)
  • Contributions: Metrics about 1) access to contribution and 2) content produced (described in this ticket)

This means we need to pipe data from multiple sources to the same place: pageviews from hive table, eventlogging (EditAttemptStep table), mediawiki tables (revision table, change_tag table, etc), and the events you generate for this task. Additionally, to compute revert rate of edits, we need to use the mwreverts python package (or a complex query) to pre-process the mediawiki data. In the future, we might need to aggregate the numbers by platform, users' geolocation, etc, which requires the user agent information.

To my understanding, building the "multiple sources -> statsd -> Graphite" pipe is not a trivial effort, if it is possible--we can consult with analytics engineering. And I think eventlogging is more flexible and can accommodate these needs. In fact, we can whitelist the eventlogging table if needed, so that the data won't be purged after 90 days. Or we can just aggregate the data (so PIIs are removed) and keep the data forever.

Hello A-team! We are asked to build a dashboard and need to pipe data from multiple sources to the same place: pageviews, eventlogging, mediawiki tables--see T212414#4937357 and the ticket description for more details. Can you offer some suggestions regarding the data pipeline and the dashboarding tools?

Nuria added a subscriber: Nuria.Feb 11 2019, 5:03 PM

Seems like there are several issues here. From the requests, it is not clear to us that you actually have the data right now to implement a pipeline, correct? It seems that this ticket still needs instrumentation work? Not super clear on that, but provided that you have all the data you need, it seems you need a Spark job that munches the data and aggregates it in a way that is usable for this purpose; once that is done, you could expose it via Superset by loading that data into Druid or MySQL. Not sure whether graphite/statsd has a place here. It seems some prior work persisted events to Graphite, but for multi-dimensional data Graphite is really not the best approach. We can talk in more detail in a meeting if needed.

Nuria added a comment.EditedFeb 11 2019, 5:10 PM

Again, an alternative proposal would be to register (and aggregate) them as a virtual pageview instead, using the existing Hive table - i.e. a new way of reading our content (just like page previews was). Or we could add a field to the pageview_hourly table distinguishing translated from regular views.

If a custom event is needed, one can be emitted similar to (or augmenting) https://meta.wikimedia.org/wiki/Schema:VirtualPageView
As we discussed (extensively) with popups, we will not be modifying pageview_hourly; rather, events can be sent with the data of interest that we need to keep track of.

Change 489966 had a related patch set uploaded (by Santhosh; owner: Santhosh):
[mediawiki/extensions/ExternalGuidance@master] Eventlogging integration

https://gerrit.wikimedia.org/r/489966

@santhosh Thanks so much for the event logging! https://meta.wikimedia.org/w/index.php?title=Schema:ExternalGuidance&oldid=18870706

I think it would be helpful if we can add the following information to the schema. What do you think?

  • I think it would be helpful if we could distinguish whether the user is creating a new page or editing an existing page on the local wiki. To do that, we can 1) add a field named "type" with enum [new, existing] for all events (if possible), or 2) add a new action named "edit_existing"
  • Can we add a session_token (mw.user.sessionId() would be great) so that we can join this table with https://meta.wikimedia.org/wiki/Schema:EditAttemptStep
  • Analytics engineering suggests that we use snake_case instead of camelCase for field names, because SQL/Hive is case-insensitive

@santhosh Thanks so much for the event logging! https://meta.wikimedia.org/w/index.php?title=Schema:ExternalGuidance&oldid=18870706
I think it would be helpful if we can add the following information to the schema. What do you think?

  • I think it would be helpful if we could distinguish whether the user is creating a new page or editing an existing page on the local wiki. To do that, we can 1) add a field named "type" with enum [new, existing] for all events (if possible), or 2) add a new action named "edit_existing"

"createpage" and "edit-original" value for action field should already cover them with https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ExternalGuidance/+/488351

  • Can we add a session_token (mw.user.sessionId() would be great) so that we can join this table with https://meta.wikimedia.org/wiki/Schema:EditAttemptStep
  • Analytics engineering suggests that we use snake_case instead of camelCase for field names, because SQL/Hive is case-insensitive

Done. See https://meta.wikimedia.org/w/index.php?title=Schema:ExternalGuidance&oldid=18870832. Will point the code to this revision of the schema.

"createpage" and "edit-original" distinguish contributing in the local language from contributing in the original language. What I was trying to add, for the local wiki, is a distinction between creating a new page when no page with the same title exists and expanding the existing article when a page with the same title does exist, i.e. page 1 vs 2, or page 3 vs 4 in this design doc https://drive.google.com/file/d/1ua7fNGZM2n66Cr7VxOG1mNG4_B2DvfRe/view

Change 489966 merged by Santhosh:
[mediawiki/extensions/ExternalGuidance@master] Eventlogging integration

https://gerrit.wikimedia.org/r/489966

Change 488351 merged by jenkins-bot:
[mediawiki/extensions/ExternalGuidance@master] Add a tracker event for editing the original source article

https://gerrit.wikimedia.org/r/488351

Change 490281 had a related patch set uploaded (by Santhosh; owner: Santhosh):
[mediawiki/extensions/ExternalGuidance@master] Eventlogging: Add new action name for editing an existing page

https://gerrit.wikimedia.org/r/490281

What I was trying to add, for the local wiki, is a distinction between creating a new page when no page with the same title exists and expanding the existing article when a page with the same title does exist,

Got it. So I added editpage as the action value if the page exists; 'createpage' is for when the page does not exist. Patch https://gerrit.wikimedia.org/r/488351 Schema change:
https://meta.wikimedia.org/w/index.php?title=Schema%3AExternalGuidance&type=revision&diff=18873656&oldid=18870832

Awesome! Thank you!

Nuria added a comment.Feb 13 2019, 7:26 PM

Summarizing the discussion we had in the meeting last Monday: there are two types of insights that @Pginer-WMF is asking for in this ticket. Some of them are exploratory ("comparing content created on a wiki versus content created on a wiki via coming in through an external translation service"); these would benefit from ad-hoc, one-off exploration of data per wiki, per language, per language pair, etc. Others are oriented to really "measure" the population we are dealing with, that is, our userbase when it comes to this feature ("We want to measure how many people move through the workflow we are providing from reading to contribution: access the translated page → access the contribution options page → access the local/original article to contribute → complete a contribution.")

Our recommendation is to first measure the population we are dealing with; once we have an estimate of the feature's users and of funnel usage, we can compare outcomes of this workflow with other contribution workflows.

Change 490281 merged by jenkins-bot:
[mediawiki/extensions/ExternalGuidance@master] Eventlogging: Add new action name for editing an existing page

https://gerrit.wikimedia.org/r/490281

atgo reassigned this task from kzimmerman to chelsyx.Feb 15 2019, 12:51 AM

Moving to @chelsyx per @kzimmerman request.

@dr0ptp4kt and I were chatting earlier and realized that we have a gap in the analysis that we'd like to include: impact to search traffic. What is the impact of this project on search?

chelsyx moved this task from Backlog to Doing on the Product-Analytics board.Feb 15 2019, 12:59 AM

@chelsyx I still don't see a table for ExternalGuidance in db1108 log schema. As per documentation that table should get automatically created. Any idea why that is not happening?

Nuria added a comment.Feb 18 2019, 4:00 PM

@santhosh tables are created in hadoop, not mysql.

fdans moved this task from Incoming to Radar on the Analytics board.Feb 18 2019, 4:51 PM

@santhosh, as @Nuria said, eventlogging table goes to hadoop only unless we ask analytics engineering to white-list it for mysql (see T203596 for more details).

I did a quick check on the table (see the query below) and get 2273 init events and 16 mtinfo events. It doesn't seem to match the results from Grafana (https://grafana.wikimedia.org/d/MbfGSLXmz/externalguidance?orgId=1&from=now-7d&to=now). Any idea?

select event.action, count(*)
from event.externalguidance
where year=2019 and month=2
group by event.action;
Nuria added a comment.Feb 19 2019, 1:50 AM

@chelsyx you can also see errors with a query like: select * from eventerror where event.schema like '%external%' and year=2019 and month=2; but as of now there aren't any.

@santhosh tables are created in hadoop, not mysql.

Sorry, I did not notice that change. Could you please help update the documentation? It says "Deployed schemas will automatically create a MySQL table (named SchemaName_versionNumber) for the collected data on the data store. "

I did a quick check on the table (see the query below) and get 2273 init events and 16 mtinfo events. It doesn't seem to match the results from Grafana (https://grafana.wikimedia.org/d/MbfGSLXmz/externalguidance?orgId=1&from=now-7d&to=now). Any idea?

I am not sure. I see about 1600 init events in Grafana. But I doubt the accuracy of the Grafana stats, since I remember seeing higher numbers for the en.hi and en.id pairs yesterday for the same time period. Is there an accuracy problem with Grafana?

Nuria added a comment.Feb 19 2019, 5:27 PM

@santhosh, where? Our docs are on wikitech and seem up to date: https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging#MariaDB

Nuria added a comment.Feb 19 2019, 5:38 PM

@santhosh I see, that was on the mediawiki.org doc, which describes the frontend of the EL extension well and is generic. The backend, however, is WMF-specific and thus described in the wikitech docs. Corrected some (very outdated) references to MySQL on mediawiki.org.

I did a quick check on the table (see the query below) and get 2273 init events and 16 mtinfo events. It doesn't seem to match the results from Grafana (https://grafana.wikimedia.org/d/MbfGSLXmz/externalguidance?orgId=1&from=now-7d&to=now). Any idea?

I am not sure. I see about 1600 init events in Grafana. But I doubt the accuracy of the Grafana stats, since I remember seeing higher numbers for the en.hi and en.id pairs yesterday for the same time period. Is there an accuracy problem with Grafana?

@santhosh I'm not sure, maybe that has something to do with the non-integer counts on the Grafana dashboard? @Nuria do you know why some of the counts on this dashboard are non-integer? https://grafana.wikimedia.org/d/MbfGSLXmz/externalguidance?orgId=1&from=now-7d&to=now

And I was trying to see whether the event I initiated was recorded--this is the link of my test: https://translate.google.com/translate?sl=en&tl=id&u=https%3A%2F%2Fsimple.wikipedia.org%2Fwiki%2FPowderfinger
but I cannot find this record in eventlogging table, nor in Grafana (init event that match the languages and the time I triggered). @santhosh maybe I did the test in a wrong way? And you didn't implement any sampling in the eventlogging or grafana, right?

Additionally, I found the button that leads users to edit in the original wiki doesn't work--the "Sunting versi asli dalam English" button on https://id.wikipedia.org/w/index.php?title=Special:ExternalGuidance&from=en&to=id&page=Powderfinger&service=Google

And I was trying to see whether the event I initiated was recorded--this is the link of my test: https://translate.google.com/translate?sl=en&tl=id&u=https%3A%2F%2Fsimple.wikipedia.org%2Fwiki%2FPowderfinger
but I cannot find this record in eventlogging table, nor in Grafana (init event that match the languages and the time I triggered). @santhosh maybe I did the test in a wrong way? And you didn't implement any sampling in the eventlogging or grafana, right?

You are testing it in correct way. Perhaps there is delay in getting that data in to tables?

Additionally, I found the button that leads users to edit in the original wiki doesn't work--the "Sunting versi asli dalam English" button on https://id.wikipedia.org/w/index.php?title=Special:ExternalGuidance&from=en&to=id&page=Powderfinger&service=Google

That is a known issue that we have already fixed. The fix is going out with this week's train.

Nuria added a comment.Feb 20 2019, 5:49 AM

The Grafana dashboard has a lot of problems; if you export the data to CSV you can see it is mostly nulls.

You are testing it in correct way. Perhaps there is delay in getting that data in to tables?

Hmm... I will check again tomorrow.

@santhosh I did some additional test on my iPhone and desktop using Charles proxy and saw events got sent to both statsv and eventlogging beacon. The schema revision shown is 18870832, we should probably change it to 18873656, although I don't think it's the reason why we didn't see my test events in the table. Also I only saw init and mtinfo, but no specialpage and createpage although the button works.

So if I still can't see the events in the table tomorrow, that probably means that there may be something wrong in the backend. I checked the eventerror table using the query below, but only saw some errors saying that the schema revision is -1 (invalid) -- I guess that's a fixed issue.

select * from eventerror 
where event.schema like'%ExternalGuidance%' 
and year=2019 and month=2 and day > 13

Update: I checked the table in hadoop again and found all of my testing events in there, but some of them don't show up in the Grafana dashboard. (To be more specific, I got all of my testing events using the query below and tried to find matching events in Grafana through the timestamp, language, and action type.) Also, based on what Nuria said in T212414#4967609, I think we should trust the eventlogging table more than the Grafana dashboard.

select dt
from event.externalguidance
where year=2019 and month=2 and day>=19
and event.title="Powderfinger"
order by dt
limit 1000;
-----------
dt
2019-02-20T04:31:58Z
2019-02-20T04:33:24Z
2019-02-20T04:33:35Z
2019-02-20T04:46:15Z
2019-02-20T04:46:21Z
2019-02-20T07:04:06Z
2019-02-20T07:04:08Z
2019-02-20T07:04:33Z
2019-02-20T07:04:44Z
2019-02-20T07:07:21Z
2019-02-20T07:08:58Z
2019-02-20T07:09:11Z
2019-02-20T07:14:29Z

@santhosh If you can help change the schema revision from 18870832 to 18873656, and fix the events for the specialpage, createpage, editpage, and edit-original actions (currently these events are not sent when the corresponding actions are triggered), I think we are good to go! Thanks!

Nuria added a comment.Feb 21 2019, 4:23 AM

@santhosh Let's please clean up the code that sends data to Grafana if we think we will no longer use it; in its current state the data is hardly of use. Grafana is a great tool for data with one dimension (example: CPU usage over time); for multidimensional data like these counts (actions per language pair) it does not work that well.

@santhosh If you can help change the schema revision from 18870832 to 18873656,

The master version of the code has the schema revision set to 18873656, and it was deployed to simple and en.wiki on the 21st along with the regular deployments. So that should be resolved. That deployment also has fixes for the events that were not captured as a side effect of other bugs. @chelsyx Please check whether those updates are visible in the logged data.

@Nuria, yes, we can remove those events, as we get accurate data from eventlogging.

I did another round of test and confirm that the eventlogging works as we expected. Thanks @santhosh !

chelsyx added a comment.EditedMar 4 2019, 6:01 AM

Here is the link to the temporary dashboard/report: https://analytics.wikimedia.org/datasets/external-automatic-translation/impact%20of%20external%20automatic%20translation%20services.html
I'm planning to update it daily. It's not a finished work, but I think it may be helpful to you. Please let me know if you have any comments/suggestions, or any privacy concern since this is a public page.

TODOs for @chelsyx:

  • Add corresponding metrics using idwiki data for comparison
  • Add the count of completed edits from auto-translation, currently blocked by T216123
  • Still need to figure out a way to publish the notebook automatically from SWAP

Hello @santhosh , I think it would be helpful if we can add a field (invoke_source) to the ExternalGuidance schema, to distinguish whether a user came from the automatic translation on search result page (the query string of the referrer url contains client=srp). Is this doable? And how long will it take to deploy this change? If it delays the deploy on enwiki too much, let's leave it for the future.
Also @Pginer-WMF , do you think this is necessary?
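The referrer check proposed above could look roughly like the sketch below. The function name and the two return labels are hypothetical, and the example referrer URLs are made up; only the `client=srp` condition comes from the comment. The real instrumentation would live in the extension's client-side code, so this is only meant to pin down the logic.

```python
from urllib.parse import parse_qs, urlparse


def invoke_source(referrer: str) -> str:
    """Classify a referrer URL by whether its query string carries client=srp."""
    query = parse_qs(urlparse(referrer).query)
    # parse_qs maps each parameter name to a list of values.
    if "srp" in query.get("client", []):
        return "search_result_page"  # hypothetical label
    return "other"                   # hypothetical label


# Made-up referrers, for illustration only.
print(invoke_source("https://translate.google.com/translate?client=srp&sl=en&tl=id"))
print(invoke_source("https://translate.google.com/translate?sl=en&tl=id"))
```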

Here is the link to the temporary dashboard/report: https://people.wikimedia.org/~chelsyx/Toledo.html

Great. Thanks, @chelsyx. This looks good. Fantastic start.

Hello @santhosh , I think it would be helpful if we can add a field (invoke_source) to the ExternalGuidance schema, to distinguish whether a user came from the automatic translation on search result page (the query string of the referrer url contains client=srp). Is this doable? And how long will it take to deploy this change? If it delays the deploy on enwiki too much, let's leave it for the future.
Also @Pginer-WMF , do you think this is necessary?

If I understand this correctly, the proposal is to be able to distinguish between (a) the users that access an automatic translation that was directly exposed in search results, and (b) the users that access a translation they explicitly requested by going to Google Translate and pasting a Wikipedia link there.

I think this distinction is useful and will provide more details on the user behaviour, so I think it is worth supporting. However, I'd not consider it a blocker for the next deployment. The current numbers, without distinguishing yet between (a) and (b), provide a good-enough initial perspective of the general behaviour, that we can iterate and improve.

Also, in the above classification, I'm curious to know where users who clicked on the "translate this page" action in a search result would fall. I'd expect them to be part of (b) because of the "explicit" action to translate (although their origin is the search results page, as it is for (a)).

dr0ptp4kt added a comment.EditedMar 4 2019, 6:36 PM

@chelsyx As it is the "Access the translated funnel" line makes the other parts of the funnel look compressed in the "Number of events when target language is Indonesian, by action type" graph. It's a true representation of the magnitude, of course, but I was wondering if you had an approach that might aid visual interpretation of the data (e.g., two y-axes, non-constant scale, percentile fluctuation, etc.).

@dr0ptp4kt and @atgo, regarding T212414#4956321, you want to track the impact of this project on Google search, i.e. how often our pages show up in Google search results, their average position, etc., not the impact on the internal search on Wikipedia, correct?

atgo added a comment. Mar 4 2019, 8:31 PM

@chelsyx I'm thinking specifically about our traffic from Google search. So search results > clicks > impact on overall traffic. In particular, does Toledo traffic cannibalize other results?

@chelsyx As it is, the "Access the translated funnel" line makes the other parts of the funnel look compressed in the "Number of events when target language is Indonesian, by action type" graph. It's a true representation of the magnitude, of course, but I was wondering if you had an approach that might aid visual interpretation of the data (e.g., two y-axes, non-constant scale, percentile fluctuation, etc.).

Thanks for the comment, @dr0ptp4kt! I updated those graphs to use a log scale, so the small values should show up now. Also, if you hover over the graph, the exact value of each data point shows up in the tooltip.
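
To illustrate why a log scale helps here, a small sketch with made-up event counts (the action names and numbers below are hypothetical, not taken from the report). On a linear axis the largest series dwarfs the others; after a log10 transform, each factor of ten takes the same vertical distance.

```python
import math

# Hypothetical daily event counts per action type, for illustration only.
event_counts = {
    "access translated page": 100_000,
    "open contribution options": 100,
    "complete an edit": 10,
}

largest = max(event_counts.values())

# Share of the axis height each series would occupy on a linear scale.
linear_share = {name: count / largest for name, count in event_counts.items()}

# Position of each series on a log10 axis.
log_position = {name: math.log10(count) for name, count in event_counts.items()}

for name in event_counts:
    print(f"{name}: linear share {linear_share[name]:.4%}, log10 {log_position[name]:.1f}")
```

With these made-up numbers, the smallest series takes up 0.01% of a linear axis, while on the log axis it sits at 1 against 5 for the largest series, so all three lines remain readable.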

@chelsyx I'm thinking specifically about our traffic from Google search. So search results > clicks > impact on overall traffic. In particular, does Toledo traffic cannibalize other results?

Ok. I will add search engine referred pageviews on idwiki and enwiki to the report. For externally referred pageviews on all wikis, we have https://discovery.wmflabs.org/external/ .

If I understand this correctly, the proposal is to be able to distinguish between (a) the users that access an automatic translation that was directly exposed in search results, and (b) the users that access a translation they explicitly requested by going to Google Translate and pasting a Wikipedia link there.

That's correct.

I think this distinction is useful and will provide more details on the user behaviour, so I think it is worth supporting. However, I'd not consider it a blocker for the next deployment. The current numbers, without distinguishing yet between (a) and (b), provide a good-enough initial perspective of the general behaviour, that we can iterate and improve.

Sounds good. Let's leave this for the future.

Also, in the above classification, I'm curious to know where the users who clicked on the "translate this page" action in the search results would fall. I'd expect them to be part of (b) because of the "explicit" action to translate (although their origin is the search results page, as it is for (a)).

@Pginer-WMF you mean clicking on the "translate this page" link like the one shown in the screenshot below, right? Yes, you're right, this action is part of (b) -- the query string of the translated page URL doesn't contain client=srp.

Yes, that one. Ok, thanks for the clarification!

Update: The url to the report is now changed to https://analytics.wikimedia.org/datasets/external-automatic-translation/impact%20of%20external%20automatic%20translation%20services.html for easier publishing. It is updating daily at 2AM UTC (the page will get updated with a few minutes delay).

Done. @Pginer-WMF can you review the report and close this ticket if everything we want is in the report?

Thanks @chelsyx, this looks good!
I went through the report in more detail and wanted to share my initial interpretations to make sure I'm reading the report as intended. Please confirm this is a valid interpretation or correct any aspect if needed:

  • Based on 1A, the volume of visits coming from integrated translated results in Google search is similar to the visits explicitly requesting a translation (e.g., by going to Google Translate or clicking the "translate this page" option in the search results).
  • Based on 1C, it seems that views to Indonesian translations are a small percentage (<1%) compared to the visitors that regular Indonesian Wikipedia pages get. Thus, most of the Indonesian audience is getting regular community-created Wikipedia articles and not translations.
  • Based on 2, it seems that most people just consume content (as we expected), with less than 0.2% of visits interested in contributing or learning more about translation.
  • Based on 2A, it seems that ~14% of visits entering the path to contribute pick one option to do so, but less than 2% complete such an edit. This is a much lower conversion than for usual edits on regular Wikipedia pages, which is in the 30-40% range (as shown in 3A). Given that the task is the same in both cases, I wonder if the difference in the result is related to (a) translated Google search results being exposed only on mobile (do we know the general edit completion ratio on mobile?), (b) an audience with a different level of editing expertise, or (c) something else.
  • Based on 4, very few edits have been made through translations (13 edits, including 5 test edits by the team).

Thanks @chelsyx, this looks good!
I went through the report in more detail and wanted to share my initial interpretations to make sure I'm reading the report as intended. Please confirm this is a valid interpretation or correct any aspect if needed:

  • Based on 1A, the volume of visits coming from integrated translated results in Google search is similar to the visits explicitly requesting a translation (e.g., by going to Google Translate or clicking the "translate this page" option in the search results).

Correct. And note that this graph only shows the traffic when the translation target language is Indonesian. You may also notice that from January to February, visits from integrated translated results were much higher, and they have dropped since March. I asked @dr0ptp4kt and he thinks Google probably changed their algorithm.

  • Based on 1C, it seems that views to Indonesian translations are a small percentage (<1%) compared to the visitors that regular Indonesian Wikipedia pages get. Thus, most of the Indonesian audience is getting regular community-created Wikipedia articles and not translations.

Correct. Also note that in 1C, the views to Indonesian translations only include visits from integrated translated results -- it doesn't include the visits explicitly requesting a translation.

  • Based on 2, seems that most people just consume content (as we expected) with less than 0.2% of visits interested in contributing or learning more about translation.

Correct. The share of visits interested in learning more about translation is slightly higher (>0.2%), but still very small.

  • Based on 2A, it seems that ~14% of visits entering the path to contribute pick one option to do so, but less than 2% complete such an edit. This is a much lower conversion than for usual edits on regular Wikipedia pages, which is in the 30-40% range (as shown in 3A). Given that the task is the same in both cases, I wonder if the difference in the result is related to (a) translated Google search results being exposed only on mobile (do we know the general edit completion ratio on mobile?), (b) an audience with a different level of editing expertise, or (c) something else.

The 30-40% conversion rate from starting an edit to saving it successfully, shown in the first graph in 3A, is the rate on desktop. If you look at the second graph in 3A, which is the conversion rate on mobile, it's only around 4% (closer to the ratio in our external guidance funnel). So yes, you are right, mobile is the most important factor in the difference.

The other source of the difference may be how we identify the "start" of an edit. In the external guidance funnel (2A), we count every click on either of the edit buttons as an edit "start". In the regular Wikipedia funnel (3A), however, we count an edit "start" only when the editor is ready for user input (when the text is fully loaded into the editor and the cursor starts blinking). This means the denominator in the calculation for regular Wikipedia is smaller, because some users may drop off between clicking the edit button and the editor being ready (although we expect the number of such users to be very small). I chose a different way to identify the edit start on regular Wikipedia because Neil told me they found a lot of "click" events to start an edit in the regular Wikipedia VE that seem to come from undetected bots, and counting from when the editor is ready should eliminate much of that bot behavior.
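
As a rough illustration of the denominator effect described above, here is a sketch with made-up counts (none of these numbers come from the report, and the helper name is hypothetical). The same number of saved edits gives a higher conversion rate when the "start" is counted at editor-ready rather than at click.

```python
def conversion_rate(saved: int, started: int) -> float:
    """Fraction of started edits that were saved."""
    if started <= 0:
        raise ValueError("started must be positive")
    return saved / started


# Hypothetical counts for one funnel, for illustration only.
edit_button_clicks = 1000   # "start" as counted in the external guidance funnel (2A)
editor_ready_events = 900   # "start" as counted in the regular Wikipedia funnel (3A)
saved_edits = 40

rate_from_click = conversion_rate(saved_edits, edit_button_clicks)
rate_from_ready = conversion_rate(saved_edits, editor_ready_events)

print(f"from click: {rate_from_click:.2%}, from editor ready: {rate_from_ready:.2%}")
```

The gap between the two rates grows with the number of users (or bots) that click but never reach a ready editor, which is why the two funnels are not strictly comparable.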

  • Based on 4, very few edits have been made through translations (13 edits, including 5 test edits by the team).

Yes, unfortunately.

  • Based on 1A, the volume of visits coming from integrated translated results in Google search is similar to the visits explicitly requesting a translation (e.g., by going to Google Translate or clicking the "translate this page" option in the search results).

Correct. And note that this graph only shows the traffic when the translation target language is Indonesian. You may also notice that from January to February, visits from integrated translated results were much higher, and they have dropped since March. I asked @dr0ptp4kt and he thinks Google probably changed their algorithm.

The other hypothesis is that users' interests changed (e.g., trending regional interest in topics waned), or that something in the content itself changed (e.g., new or improved articles on popular topics), resulting in a change in traffic patterns. It could be all of these things.

Pginer-WMF closed this task as Resolved. Mar 26 2019, 4:06 AM
Pginer-WMF updated the task description.
Pginer-WMF moved this task from In Review to Done on the ExternalGuidance board.

Thanks for the extra details, @chelsyx and @dr0ptp4kt.
Marking the ticket as resolved!