User Details
- User Since
- Aug 9 2016, 7:00 PM (345 w, 6 d)
- Roles
- Disabled
- LDAP User
- Chelsyx
- MediaWiki User
- CXie (WMF)
Jul 19 2019
Jul 18 2019
Hi @Ottomata , we are trying to transfer the ownership of the Toledo notebook to @Neil_P._Quinn_WMF (see the task description for more details). One of the steps is to create an oozie job to update the neilpquinn.toledo_pageviews table daily (code for the oozie job). We tried to deploy the job, but encountered an error:
Just checked the database. It all looks good, except that for the preview action, a source field identifying whether users start from the pencil button or from a highlight is missing. @NHarateh_WMF can you help fix it? Thanks!
[Screenshot: Preview]
Jul 17 2019
Jul 16 2019
Jul 13 2019
Done. https://docs.google.com/presentation/d/1UZAsWKYwWSefFf15jyWG8bBrVXPkcb-2876Npm5_2F0/edit?usp=sharing
Waiting for review.
Jul 12 2019
Thanks so much @Nuria!
Jul 11 2019
@mforns can you please review the patch? Thanks!
@Nuria Thank you for the fix!
Update: Link to the write-up has been changed to https://analytics.wikimedia.org/datasets/ios-reports/iOS%20app%20edit%20and%20registration%20block.html
Update: The link is now changed to https://analytics.wikimedia.org/datasets/ios-reports/ios_baseline_program_metrics_fy18.html
Jul 10 2019
Thanks @Neil_P._Quinn_WMF for the review!
I've pushed the code and result to github https://github.com/wikimedia-research/2018-19-Language-annual-plan-metrics.
Jul 9 2019
@chelsyx Besides translate.googleusercontent.com is there any other third party domain sending us data?
Jul 8 2019
Jul 4 2019
Thanks so much @Neil_P._Quinn_WMF ! Please let me know if you see any issues in the query below.
(I had some problems connecting to SWAP, so these were run in separate Python scripts. Let me see if I can solve the problem and update the notebook later...)
Jul 2 2019
Jul 1 2019
I hash the tokens for the following EL schemas used by the iOS app:
MobileWikiAppEdit, MobileWikiAppLogin, MobileWikiAppNavMenu, MobileWikiAppProtectedEditAttempt, MobileWikiAppSavedPages, MobileWikiAppShareAFact, MobileWikiAppToCInteraction
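A minimal sketch of what hashing a token could look like, assuming a keyed SHA-256 hash (HMAC) with a secret salt. The function name `hash_token` and the salt value are hypothetical and only for illustration; the comment above does not say which hashing method was actually used.

```python
import hashlib
import hmac


def hash_token(token: str, salt: bytes) -> str:
    """Return a hex digest of the token, keyed with a secret salt (assumed scheme)."""
    return hmac.new(salt, token.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical salt and token values, for illustration only.
example_salt = b"replace-with-a-secret-salt"
hashed = hash_token("example-session-token", example_salt)
print(len(hashed))  # a SHA-256 hex digest is 64 characters long
```

The same token and salt always give the same digest, so hashed tokens can still be used to group events without exposing the raw value.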
Jun 28 2019
Thanks so much @Neil_P._Quinn_WMF !
Jun 27 2019
Hi @Neil_P._Quinn_WMF and @Pginer-WMF, I want to make sure I understand the definition of retention here correctly:
Hi @Neil_P._Quinn_WMF , this task description says that we want to measure the number of articles translated by (new) editors that aren’t deleted within 30 days. In the "Surviving translations by newcomers" section of your notebook (T199342#5290129), you're using the revision_is_deleted field to identify whether a revision is deleted. I'm wondering if this needs to be changed to meet the requested 30-day survival period. Also, I'm wondering whether the revision_is_deleted field would be updated if a page is deleted in the future.
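To make the distinction concrete, here is a rough sketch of a 30-day survival check, assuming both a creation timestamp and an optional deletion timestamp are available for each translation. The function and parameter names are hypothetical and are not fields from the notebook or the table.

```python
from datetime import datetime, timedelta
from typing import Optional

SURVIVAL_WINDOW = timedelta(days=30)


def survived_30_days(created: datetime, deleted: Optional[datetime]) -> bool:
    """True if the translation was not deleted within 30 days of creation."""
    if deleted is None:
        return True  # never deleted, as far as the data shows
    return deleted - created >= SURVIVAL_WINDOW


created_at = datetime(2019, 1, 1)
deleted_early = survived_30_days(created_at, datetime(2019, 1, 10))
deleted_late = survived_30_days(created_at, datetime(2019, 3, 1))
never_deleted = survived_30_days(created_at, None)
print(deleted_early, deleted_late, never_deleted)
```

A boolean deleted flag alone cannot separate the second case from the first, which is why a deletion timestamp (or something equivalent) would be needed. The sketch also ignores translations created less than 30 days before the data snapshot, whose survival is not yet known.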
Jun 26 2019
Is there a list of tables that are used by the team? And how can I find their schemas? So far, I have only found these links:
Jun 25 2019
This issue was resolved by T175918. On 2017-09-28 a bug in mw.track was fixed. Before 2017-09-28, if events were logged via mw.track, only those tracked during the first pageview of a user's session were logged. After the bug fix, the number went up again:
Jun 24 2019
This event: {"revision":17984412,"event":{"ts":"2019-06-20T14:53:23-07:00","is_anon":true,"appInstallAgeDays":0,"appInstallID":"5A9CAA64-4A76-4E1D-ACA6-D6ECF6FE023E"},"schema":"MobileWikiAppDailyStats","wiki":"enwiki"}
is not valid according to the latest version of the schema, so I would not expect it to be in the table, even with a null appInstallID.
Jun 21 2019
Closing this ticket as its parent task is closed.
I checked the numbers again for version 6.2.3.1612. I found 250202 unique app install IDs in the MobileWikiAppiOSUserHistory table and 264004 IDs in the MobileWikiAppiOSSessions table, which means the difference is only around 5%. So I'm closing this ticket as the difference is small enough to ignore.
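For reference, the "around 5%" figure can be reproduced with a quick calculation, here taking the larger count as the base (an assumption; the comment does not say which count was used as the denominator). The variable and function names are illustrative.

```python
history_ids = 250202   # unique app install IDs in MobileWikiAppiOSUserHistory
sessions_ids = 264004  # unique app install IDs in MobileWikiAppiOSSessions


def relative_difference(smaller: int, larger: int) -> float:
    """Gap between two counts, as a fraction of the larger count."""
    return (larger - smaller) / larger


diff = relative_difference(history_ids, sessions_ids)
print(f"{diff:.1%}")  # prints 5.2%
```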
Jun 20 2019
Jun 19 2019
Jun 18 2019
@atgo Sorry about the late response to your question:
Closing this ticket as the main purpose of it -- instrumenting for the reading list feature on iOS app -- has already been done.
Jun 17 2019
Thanks @JAllemandou and @Ottomata !
Based on the discussion between @dr0ptp4kt and me, I queried the page IDs of articles translated and read by Indonesian users through Toledo between Mar 18 - Jun 14 2019 (all the existing webrequest data). If these pages have an Indonesian version (linked by the same Wikidata item), I checked their pageviews to see whether there were any significant changes before and after the Toledo deployment (Dec 5 2018). We are assuming that the topics translated during Mar 18 - Jun 14 2019 are the same as the topics translated around Dec 5 2018, and that if the Toledo project cannibalized Indonesian Wikipedia, we should see a drop in pageviews for those translated topics after the deployment.
Jun 13 2019
Thanks @NHarateh_WMF ! I just confirmed that this patch (https://github.com/wikimedia/wikipedia-ios/pull/3124) fixed all the bugs.
Jun 11 2019
Hi @NHarateh_WMF, I've finished the QA. Besides the revID issue, the other thing we need to fix is that when editing Wikipedia articles, the preview and editSummaryShown actions are not sent:
[Screenshots: Preview, Edit summary]
Hi @NHarateh_WMF ! I have a question: I noticed there is a revID associated with every editing event, from start and ready through to saved successfully. To my understanding, the revID is returned by the API only when the edit is saved successfully, is that right?
Jun 7 2019
Thanks @JAllemandou !
I've tried:
spark.conf.set('spark.driver.memory', '4g')
in the first cell in the notebook, but it doesn't seem to work?
Jun 4 2019
Caveat of this analysis: The topic assignment method we used for this project -- ORES draft topic model -- is not perfect for summarizing the content of articles. For example, we have seen a lot of articles being assigned to the topic of Geography.Countries, while to a human eye they would fit better into other topics. We want to try other topic modeling methods in the future to get a better idea of our translated content.
This information would help us increase the number of iOS app users.
May 30 2019
Closing the ticket as it is done. Feel free to reopen it if you have any questions.
May 16 2019
Regarding the draft topic model in other languages, here's the reply from Aaron:
May 7 2019
May 6 2019
May 4 2019
Thanks @JKatzWMF !
May 3 2019
In this analysis, I use the ORES draft topic model to get the topics of articles viewed on English Wikipedia in March 2019. The topics from this model are the WikiProject Directory and their mid-level categories. The WikiProject Directory provides a convenient intermediary ontology of WikiProjects that starts with four broad topics: Culture; Geography; History & Society; and Science, Technology and Mathematics (STEM). From there, the directory drills down into mid-level categories and sub-topics and eventually specific WikiProjects. For example, WikiProject Birds exists underneath the path STEM/Science/Animals. In this analysis, we call the WikiProject Directory "broad topic", and the mid-level categories "topic".
CC BY-SA 4.0, EpochFail, File:WikiProject Directory mid-level category abstraction.svg
May 2 2019
@Isaac Thanks for pointing out my mistake! I've updated my report and T219660#5139254 .
May 1 2019
Apr 30 2019
Hey @Miriam , happy to help if you want to try the Bayesian approach :)
Hi @dr0ptp4kt , thanks for creating this ticket! I will need some time to think about how to proceed, and I may reach out to you with some clarifying questions.
Apr 29 2019
@fdans No objection from me. Thank you for looking into this issue!
Apr 25 2019
The first exploratory analysis is done: https://analytics.wikimedia.org/datasets/external-automatic-translation/Topics%20of%20articles%20translated%20by%20Google.html
Apr 24 2019
Apr 23 2019
Apr 17 2019
Thanks everyone!