
Workable CSV for Data need: Explore range of article revision comparisons
Closed, Resolved · Public · 5 Story Points

Description

T134861: Data need: Explore range of article revision comparisons was about logging data.

This task is about creating a CSV file from the log data so that the UX team can make sense of it.

This is what needs to be included in the file:
For the revision view as is (without the new revision slider):

[] date and position of older revision (as in "the nth revision")
[] date and position of younger revision
[] total number of revisions of this article
[] maybe: article id
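
The requested fields can be sketched as one CSV row per diff view. This is only an illustration of a possible layout, not the format the team settled on: the column names, the `DiffView` class and the derived `revision_gap` column (how many revisions lie between the two compared ones) are all assumptions made for this example.

```python
import csv
import io
from dataclasses import dataclass, asdict


@dataclass
class DiffView:
    """One diff page view. All field names are hypothetical."""
    older_rev_date: str        # timestamp of the older revision
    older_rev_position: int    # "the nth revision" of the article
    younger_rev_date: str      # timestamp of the younger revision
    younger_rev_position: int
    total_revisions: int       # total number of revisions of the article
    article_id: int            # listed as "maybe" in the task


def revision_gap(view):
    """Distance between the two compared revisions, in revisions."""
    return view.younger_rev_position - view.older_rev_position


def to_csv(views):
    """Serialise views to CSV text, appending the derived gap column."""
    buf = io.StringIO()
    fields = list(DiffView.__dataclass_fields__) + ["revision_gap"]
    writer = csv.DictWriter(buf, fieldnames=fields)
    writer.writeheader()
    for view in views:
        row = asdict(view)
        row["revision_gap"] = revision_gap(view)
        writer.writerow(row)
    return buf.getvalue()
```

A gap of 1 would correspond to comparing two adjacent revisions; larger values are the "range" this task is meant to explore.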

Background: This is an investigation task related to a wish from the German Community wishlist: https://de.wikipedia.org/wiki/Wikipedia:Umfragen/Technische_W%C3%BCnsche_2015/Topw%C3%BCnsche#Anzeige_aller_Bearbeitungskommentare_im_Diff_.5BUmfrage_2015.5D

Details

Related Gerrit Patches:
mediawiki/extensions/WikimediaEvents : master : Remove dewiki_diffstats logging
mediawiki/extensions/WikimediaEvents : master : dewiki_diffstats add rev timestamps & feature state
operations/mediawiki-config : master : Remove dewiki_diffstats logging
mediawiki/extensions/WikimediaEvents : wmf/1.28.0-wmf.11 : dewiki_diffstats add rev timestamps & feature state

Event Timeline

Restricted Application added subscribers: Zppix, Aklapper. May 19 2016, 4:15 PM
Lea_WMDE renamed this task from "Workable CSV for Data need: User Behaviour when comparing article revisions" to "Workable CSV for Data need: Explore range of article revision comparisons". Jun 6 2016, 3:48 PM
Lea_WMDE updated the task description.

Data should start appearing for this as of this evening.

Addshore moved this task from Unsorted 💣 to Next on the User-Addshore board. Jul 25 2016, 5:43 PM
Addshore moved this task from Maintenance to Doing on the Revision-Slider board. Jul 25 2016, 8:22 PM
Addshore triaged this task as Medium priority. Jul 26 2016, 9:09 AM
Addshore moved this task from Proposed to Doing on the TCB-Team-Sprint-2016-07-14 board.
Addshore moved this task from Next to Back Burner 🏛️ on the User-Addshore board.
Addshore set the point value for this task to 5.
Addshore added a comment. Edited Jul 26 2016, 11:39 AM

"For the revision view as is (without the new revision slider):"

This was not specified in the previous task, so the data currently being collected covers all views of the diff page, with or without the revision slider.
@Jan_Dittrich @Lea_WMDE does this need to change?

Also there was no request for the dates of the revisions being compared.
I will make another patch and wait for it to be deployed before making any CSV.

Hmm, it would be good to distinguish between them (the presence of the revision slider would be a confounding variable), but even without that distinction the data is useful.

Change 301110 had a related patch set uploaded (by Addshore):
dewiki_diffstats add rev timestamps & feature state

https://gerrit.wikimedia.org/r/301110

Change 301110 merged by jenkins-bot:
dewiki_diffstats add rev timestamps & feature state

https://gerrit.wikimedia.org/r/301110

Change 301119 had a related patch set uploaded (by Addshore):
dewiki_diffstats add rev timestamps & feature state

https://gerrit.wikimedia.org/r/301119

Change 301119 merged by jenkins-bot:
dewiki_diffstats add rev timestamps & feature state

https://gerrit.wikimedia.org/r/301119

I have made a script at https://github.com/addshore/dewiki_diffstats that turns the log files into a CSV containing all of the requested data.

Links to the code and an explanation of the data in the resulting CSVs can be found in the README @ https://github.com/addshore/dewiki_diffstats/blob/master/README.md

The Python script can be run on fluorine as follows:

python process.py /a/mw-log/dewiki_diffstats.log
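
For readers without access to the repository, the general shape of such a log-to-CSV conversion might look like the sketch below. It is not the actual `process.py`: it assumes, purely for illustration, that each log line carries a JSON object after some prefix, and the field names in `FIELDS` are hypothetical. The real log format and CSV columns are described in the README linked above.

```python
import csv
import json
import sys

# Hypothetical payload keys; the real ones are documented in the README.
FIELDS = [
    "article_id",
    "older_rev_position",
    "younger_rev_position",
    "total_revisions",
    "feature_enabled",
]


def parse_line(line):
    """Return the requested fields from one log line, or None if unparsable.

    Assumes (for this sketch only) that the payload is a JSON object that
    starts at the first '{' and runs to the end of the line.
    """
    start = line.find("{")
    if start == -1:
        return None
    try:
        payload = json.loads(line[start:])
    except ValueError:
        return None
    return {field: payload.get(field) for field in FIELDS}


def log_to_csv(lines, out):
    """Write one CSV row per parsable line; return the number of rows."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    written = 0
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip lines that do not match the assumed format
        writer.writerow(record)
        written += 1
    return written


if __name__ == "__main__":
    # Usage: python sketch.py /path/to/logfile > out.csv
    with open(sys.argv[1], encoding="utf-8") as log_file:
        log_to_csv(log_file, sys.stdout)
```

Keeping a feature-state column such as the hypothetical `feature_enabled` in every row would allow views with and without the revision slider to be separated later during analysis.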

I guess now we just need to wait for more data to appear.
@Jan_Dittrich what period of data do you want?

Addshore closed this task as Resolved. Jul 28 2016, 11:52 AM
Addshore moved this task from Doing to Done on the WMDE-Analytics-Engineering board.
Addshore moved this task from Back Burner 🏛️ to Closing ✔️ on the User-Addshore board.
Addshore moved this task from Review to Done on the TCB-Team-Sprint-2016-07-14 board.

The data I have now is sufficient, so we can stop the logging. (@Addshore)

Change 302258 had a related patch set uploaded (by Addshore):
Remove dewiki_diffstats logging

https://gerrit.wikimedia.org/r/302258

Change 302259 had a related patch set uploaded (by Addshore):
Remove dewiki_diffstats logging

https://gerrit.wikimedia.org/r/302259

Change 302259 merged by jenkins-bot:
Remove dewiki_diffstats logging

https://gerrit.wikimedia.org/r/302259

"The data I have now is sufficient, so we can stop the logging. (@Addshore)"

Logging is now disabled, and there is a patch up to remove the code that performed the logging as well.

Change 302258 abandoned by Addshore:
Remove dewiki_diffstats logging

Reason:
Done in a revert in https://gerrit.wikimedia.org/r/#/c/302274

https://gerrit.wikimedia.org/r/302258

Change 302258 restored by Addshore:
Remove dewiki_diffstats logging

https://gerrit.wikimedia.org/r/302258

Addshore moved this task from Doing to Done on the Revision-Slider board. Aug 3 2016, 5:13 PM

Change 302258 merged by jenkins-bot:
Remove dewiki_diffstats logging

https://gerrit.wikimedia.org/r/302258