User Details
- User Since
- Jul 8 2019, 8:14 PM (252 w, 3 d)
- Availability
- Available
- LDAP User
- Unknown
- MediaWiki User
- MKampurath (WMF) [ Global Accounts ]
Mon, Apr 29
Wed, Apr 24
In today's team meeting we discussed the approach: creating a prototype table with the corrections, and narrowing the scope so the work can be phased by metric area, starting with pageviews or unique devices.
Tue, Apr 23
We are leaning towards switching off the Prefetch feature, since the fraction of requests with no cookies is 17.5%. The numbers are in the notebook (last cell).
Mon, Apr 22
I think classifying the issues by data domain will definitely be useful. As a start, we can use workboard columns to do this, unless we decide it would be more useful to use columns for status (e.g. backlog, investigation, fixing...).
Yes! Let's start with that.
@nshahquinn-wmf, do you have permission to create the yellow project tags? Or can we request to have them created in this task?
Thu, Apr 18
Ty @nshahquinn-wmf for opening this task. Sharing my initial thoughts below:
- Agree that we should tag all the tasks that report data anomalies with this tag, for now. Once we use this method and find that the proportion of bot issues surpasses software bugs, or vice versa, we can think about splitting it out.
- I would like us to use this tag for data issues reported by the Community as well. I'm not sure if you already intended that, but the examples I saw were tasks opened by WMF staff, so I wanted to make it explicit here.
- I'm wondering if, instead of one data-problem tag, we should have functional area tags like data-problem-traffic, data-problem-readership, data-problem-contributors, data-problem-content to keep track of the areas where we have data problems. But this would mean not having tags for tables that fall outside of functional areas (if we have any). And the quintessential problem of too many tags!
Similar to T307883, I'm inclined towards declining this task since -
- they've been open for a while
- we are re-thinking our strategy for the Contributors metric since it is now a core annual plan metric. @OSefu-WMF is there a task I can link here for the work you are doing?
I'm inclined towards declining this task since -
- they've been open for a while
- we are re-thinking our strategy for the Contributors metric since it is now a core annual plan metric. @OSefu-WMF is there a task I can link here for the work you are doing?
Wed, Apr 17
Fri, Apr 12
Observed results:
Querying the daily sum of unique devices across all domains using druid.unique_devices_per_domain_daily, or across all project families using druid.unique_devices_per_project_family_daily, shows no daily data logged for January 2024 and February 2024. There is only one data point logged per month in both datasets, on 2024-01-01 and 2024-02-01. It appears this impacts all project families and domains. Daily data returns to expected values as of March 2024.
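A minimal sketch of how a gap like this could be checked, assuming the dates that have a daily row were already pulled from one of the datasets into a Python list. The function name `missing_days` and the sample data are hypothetical and only mirror the pattern described above; this is not the query used for the report.

```python
from datetime import date, timedelta


def missing_days(observed, start, end):
    """Return every date in [start, end] that has no entry in `observed`."""
    seen = set(observed)
    n_days = (end - start).days + 1
    candidates = (start + timedelta(days=i) for i in range(n_days))
    return [day for day in candidates if day not in seen]


# Hypothetical sample: one data point each for January and February 2024,
# then complete daily data for March 2024.
observed = [date(2024, 1, 1), date(2024, 2, 1)]
observed += [date(2024, 3, 1) + timedelta(days=i) for i in range(31)]

gaps = missing_days(observed, date(2024, 1, 1), date(2024, 3, 31))
print(len(gaps), gaps[0], gaps[-1])
```

With this sample the check flags every day in January and February except the first of each month, and nothing in March.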
Thu, Apr 11
I got the error below when I upgraded to pandas 2.2.2
Wed, Apr 10
Can I add more scope to this?
Apr 9 2024
@nshahquinn-wmf can you share the timeline you have planned for removing them from the repo?
Apr 5 2024
- Dig into our data on user traffic and how we’re segmenting traffic
Mar 18 2024
Hi @VirginiaPoundstone, this work supports our Essential Work of providing movement metrics each month, both to inform internal staff and to share externally.
As a part of the monthly movement metrics, we need to analyze trends and investigate anomalies, to inform executive leadership as well as engage various teams to resolve potential issues. This view of the editor data will help us do that.
Mar 14 2024
Summary
We have decided to close out the hypothesis.
Mar 13 2024
Hi @lbowmaker, since the Contributors metric in SDS1.1.1 was rejected and the hypothesis was disproven, we didn't go ahead with implementing data stewardship even though data stewards were selected.
Mar 7 2024
Mar 1 2024
Feb 29 2024
Feb 28 2024
Thanks @Hghani! Once this is corrected or confirmed as a caveat/known issue, we should document it on Datahub.
Feb 23 2024
Sounds good! Thanks!!
Feb 20 2024
Great, thanks @lbowmaker for reminding me about that task! Will review it.
Yes, it's OK not to implement something this FY, but having a good idea of what is possible and what we should have is a good starting point for the remainder of this FY.
Feb 16 2024
Dependency: need Essential metric criteria (T342472)
Completed and Closed out - https://app.asana.com/0/1206624865116405/1206625207405713
@lbowmaker / @Ahoelzl : I made this task as a placeholder for upcoming work we are going to be doing for Relevance metric data governance. I'll schedule time for us to discuss further in the coming weeks.
Feb 14 2024
Feb 13 2024
Wanted to note here that the smaller project families that were seeing increases during December every year did not see an increase in 2023-12 after this issue was fixed. See chart.
We will be closing out Hypothesis SDS1.1.1 soon and creating new hypotheses to work on Relevance, Content and Programmatic Expense ratio.
Jaime and I have completed the close-out report.
We have also drafted the new hypotheses for SDS1.1 in our project charter (see Relevance and Content metrics).
Both are pending review and approval.
Thank you!! We can continue the conversations on Slack.
Feb 9 2024
@mpopov, I'm curious whether some of us in Movement Insights could be added to this mailing list to get the alerts, instead of creating and maintaining a separate mailing list. Do you see any reason this would be a bad idea?
Feb 8 2024
Feb 6 2024
Thanks @Ladsgroup. Yes, I'm able to query the MariaDB table for a few wikis and can see '0' in the user_is_temp column. I've opened T356701 and requested the Data Platform Engineering team to add this field to the tables required for data analysis.
Feb 5 2024
@Ladsgroup Is this added to the user tables in all the wikis? Do you know exactly when this was completed? Is Jan 22, 2024 the correct date based on the merge comment?
IIUC we will not be seeing any data in the user table for this field as the program is not yet rolled out.
Feb 1 2024
Everything looks good.
Methodology is documented here. Will look into moving this to a wiki page.
Jan 31 2024
Currently in QA.
Jan 26 2024
Another recently discovered issue, T355608, could benefit from improved automated bot detection.
Thanks @leila ! :)
Please keep us posted on your review.
Jan 25 2024
@CMyrick-WMF Since you authored this task, I'm assuming you had use cases in mind for using the cultural model. Would it be possible for you to add some use cases in the description? Thanks a lot!! <3
Jan 24 2024
I wanted to note here that, per the refactored SDS OKRs and the new SDS1.1 KR ("SDS1.1 For three out of the four core metric areas, provide at least 1 metric with documented adherence to essential metric criteria"), we now have a dependency on the approval of the Essential Metrics framework to continue work on SDS1.1.
What's the priority on this?
Jan 22 2024
Jan 3 2024
Hi DPE team, can you please let me know the status of this request? I was not able to get any results for Sec-Purpose: Prefetch; anonymous-client-ip when querying the x_analytics field.
And a follow-up question: can we get this added as a flag or new column to the readers tables derived from webrequest (pageviews_hourly and pageview_daily) so it's easier for us to query and visualize? Please let me know if this requires a new Phab task.