Page MenuHomePhabricator

Run a query to pull the number of Phabricator Tasks reported by community, resolved by WMF
Closed, DeclinedPublic

Description

For the Infrastructure quarterly Learning Session, Product & Tech senior leadership want to speak as to the number of Phabricator Tasks reported by community that were resolved by WMF.
Senior Leadership is also attempting to identify possible reasons as to why we close more tasks in certain quarters, and we are working with Product Analytics to analyze the data.

Given that, I need to request 2 CSV reports this time

  • phabricator tasks resolved by WMF for this FY Q4 (April 2024 - June 2024)
  • phabricator tasks resolved by WMF for the Q2 for (October 2023 - December 2023)

I'd need the reports by end of working day Thursday July 18, 2024 so we can analyze the data before the QLS.

If it helps, the SQL query that was used in the past is included below:

mysql:phstats@m3-slave.eqiad.wmnet [phabricator_maniphest]> SELECT DISTINCT CONCAT("https://phabricator.wikimedia.org/T", t.id) AS url, t.title, u1.userName AS author, u2.userName AS assignee FROM phabricator_maniphest.maniphest_task t INNER JOIN phabricator_user.user u1 ON u1.phid = t.authorPHID INNER JOIN phabricator_user.user u2 ON u2.phid = t.ownerPHID WHERE t.status = "resolved" AND FROM_UNIXTIME(t.closedEpoch,'%Y%m')>=202401 AND FROM_UNIXTIME(t.closedEpoch,'%Y%m')<=202403 AND u1.phid NOT IN (SELECT ua.userPHID FROM phabricator_user.user u INNER JOIN phabricator_user.user_externalaccount ua ON ua.userPHID = u.phid WHERE ua.accountType = "mediawiki" AND ((ua.username LIKE '%(WMF)' OR ua.username LIKE '%-WMF') OR (ua.username LIKE '%(WMDE)' OR ua.username LIKE '%-WMDE'))) AND u1.phid NOT IN (SELECT ue.userPHID FROM phabricator_user.user u INNER JOIN phabricator_user.user_email ue ON ue.userPHID = u.phid WHERE (ue.address LIKE '%@wikimedia.org' OR ue.address LIKE '%@wikimedia.de' OR ue.address LIKE '%@speedandfunction.com')) AND (u1.userName != "dbarratt" AND u1.userName != "Samwilson" AND u1.userName != "Amire80" AND u1.userName != "Dreamy_Jazz" AND u1.userName != "Ladsgroup" AND u1.userName != "Tchanders" AND u1.userName != "Daimona" AND u1.userName != "Nikerabbit" AND u1.userName != "Mooeypoo" AND u1.userName != "demon") AND (u2.phid IN (SELECT ua.userPHID FROM phabricator_user.user u INNER JOIN phabricator_user.user_externalaccount ua ON ua.userPHID = u.phid WHERE ua.accountType = "mediawiki" AND ((ua.username LIKE '%(WMF)' OR ua.username LIKE '%-WMF') OR (ua.username LIKE '%(WMDE)' OR ua.username LIKE '%-WMDE'))) OR u2.phid IN (SELECT ue.userPHID FROM phabricator_user.user u INNER JOIN phabricator_user.user_email ue ON ue.userPHID = u.phid WHERE (ue.address LIKE '%@wikimedia.org' OR ue.address LIKE '%@wikimedia.de' OR ue.address LIKE '%@speedandfunction.com')) OR  (u2.userName = "dbarratt" OR u2.userName = "Samwilson" OR u2.userName = "Amire80" OR u2.userName = "Dreamy_Jazz" OR u2.userName = "Ladsgroup" OR u2.userName = "Tchanders" OR u2.userName = "Daimona" OR u2.userName = "Nikerabbit" OR u2.userName = "Mooeypoo" OR u2.userName = "demon"));

And we will continue to add the disclaimer that as long as staff is allowed to use non-staff accounts and non-staff email addresses for staff activity, this data will always be incomplete/incorrect.

Event Timeline

@ARamirez_WMF: For the most recent quarter this is already automated. See the email Phabricator data for WMF QLS - Q2/2024 triggered 2024-07-01 00:00:01UTC in your email inbox.

@ARamirez_WMF would it be possible to get folks in Product Analytics to help us wrangle this data? I think we can give them read-only access to the phab db. We're struggling to generate CSVs that we feel are accurate and answer the underlying questions you have, so help from Product Analytics in sorting through raw data (and enumerating all the caveats) would be helpful for us.

Hi @thcipriani

As for what I need recurring quarterly, this previous report contains the fields we've needed in the past.

I'm gong to set up a 15 minute check for us next week and I'll be adding Lauren who has been helping us analyze this data (she has some data analysis skills and she's been leveraging Office Hours with Product Analytics because PA has no more bandwidth and cannot take this on). Lauren has a good understanding of what data she needs to work on the quarterly analysis we are trying to wrap up. And that's the data I'm parsing for the QLS.

Hi @thcipriani

As for what I need recurring quarterly, this previous report contains the fields we've needed in the past.

I'm gong to set up a 15 minute check for us next week and I'll be adding Lauren who has been helping us analyze this data (she has some data analysis skills and she's been leveraging Office Hours with Product Analytics because PA has no more bandwidth and cannot take this on). Lauren has a good understanding of what data she needs to work on the quarterly analysis we are trying to wrap up. And that's the data I'm parsing for the QLS.

That'd be great! And, yeah, we can definitely generate the dump needed, but we could use some more eyes and thoughts on how to increase the overall accuracy of the dumps we've been providing. My hope is if we can put our heads together as the folks who know phab + the folks who are used to wrangling messy data, then future dumps will be improve!

Aklapper claimed this task.
  • phabricator tasks resolved by WMF for this FY Q4 (April 2024 - June 2024)

This was automatically done per T369176#9949730 (with all its many caveats why not to use this data)

  • phabricator tasks resolved by WMF for the Q2 for (October 2023 - December 2023)

I believe that this is not a good idea for numerous reasons (e.g. changes of affiliation of the very same Phabricator account in the meantime, due to lack of WMF account policies) so I'm going to decline this ticket to clean my backlog.