Page MenuHomePhabricator

Try to identify new developers (via assignee field) in Phab tasks and potentially follow up
Closed, ResolvedPublic

Description

Related to T188244: Develop an easy to use workflow to monitor new developer patches that land in Gerrit - we might have new potential developers entering via Phab (assigning a task to themselves?) before maybe managing to get a patch up on Gerrit.

Been wondering lately if there's a quick'n'dirty way to track that a bit better (but: perfect is the enemy of good).

Could (ab)use the weekly Phab projects email in https://phabricator.wikimedia.org/source/operations-puppet/browse/production/modules/phabricator/templates/project_changes.sh.erb by adding yet another SQL query.

Event Timeline

SELECT DISTINCT CONCAT("https://phabricator.wikimedia.org/p/", u.userName) AS userName
FROM phabricator_user.user u 
JOIN phabricator_maniphest.maniphest_transaction t
ON u.phid = TRIM(BOTH '\"' FROM t.newValue)
WHERE u.isDisabled = 0
AND t.transactionType = "reassign"
AND t.dateModified > UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 1 WEEK))
AND t.newValue IN 
(SELECT t2.newValue 
FROM phabricator_maniphest.maniphest_transaction t2 
WHERE t2.transactionType = "reassign" 
AND t2.newValue != "null" 
GROUP BY t2.newValue 
HAVING COUNT(*) < 5)
ORDER BY userName;

lists all Phab user account names (as links) who are enabled and got a task assigned in the last week and in total had less than five tasks ever assigned to themselves in their Phab history. (Not sure about the performance of the JOIN. Might be horrible.)

Tested locally (newly created user account claiming tasks or getting assigned by another comment; unassigning and reclaiming; threshold of 4) and the SQL query works as expected.

Change 435984 had a related patch set uploaded (by Aklapper; owner: Aklapper):
[operations/puppet@production] phabricator: List new and recent assignees

https://gerrit.wikimedia.org/r/435984

Querying the complete transaction table and counting for each person how often they got assigned to a task is way too expensive and was stupid of me.
Way cheaper is querying for the registration date of the user account (static info, no COUNT calculation needed) which sounds also okay.

So https://gerrit.wikimedia.org/r/#/c/435984/3/ (PS3) uses:
List all users who got a task assigned in the last week && who registered their Phab account in the last six weeks:

SELECT DISTINCT CONCAT("https://phabricator.wikimedia.org/p/", u.userName) AS userName
FROM phabricator_user.user u 
JOIN phabricator_maniphest.maniphest_transaction t
ON u.phid = TRIM(BOTH '\"' FROM t.newValue)
WHERE u.dateCreated > UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 6 WEEK))
AND t.transactionType = "reassign"
AND t.dateModified > UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 1 WEEK))
ORDER BY userName;

@Aklapper have you looked to see if it would be possible to get the data you need from https://dumps.wikimedia.org/other/misc/phabricator_public.dump and avoid the sql layer entirely?

Hmm, interesting idea, thanks.
Looks like the 4th value under [info] is the assignee ID. But looking at "transactions": { "core:edge" } there are only {\"edge:type\":41} edges which describes project-task relations. No info included when a user registered. Maybe I could query on the total number of assignments ever for each user ID but that means I need to iterate over all user accounts, plus I still could not filter by recent assignment except for by comparing to a previous dump. At least that is my understanding after looking at the JSON dump for 30 minutes via more phabricator_public.dump | jq .. Meh.

Last SQL idea: Find a way to avoid querying expensive phabricator_maniphest.maniphest_transaction at all.
Let's dig:

  • SELECT * FROM phabricator_maniphest.maniphest_task WHERE dateModified > 1528465401; provides the new assignee value in ownerPHID but dateModified obviously covers any changes to tasks and no way to identify that a change was an assignee change.
  • Taking the phid of the task from the previous query results and looking at edges via SELECT * FROM phabricator_maniphest WHERE src = "PHID-TASK-fooooo"; I only get unrelated edge.types. That is because 1:1 relations (such as assignee:task) are stored in phabricator_maniphest.task.
  • But... phabricator_maniphest.maniphest_task.ownerOrdering exists which shows the name of the task assignee. Huh! So...
SELECT DISTINCT CONCAT("https://phabricator.wikimedia.org/p/", u.userName) AS userName
FROM phabricator_maniphest.maniphest_task t
JOIN phabricator_user.user u ON u.userName = t.ownerOrdering
WHERE t.dateModified > UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 1 WEEK))
AND u.dateCreated > UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 6 WEEK))
AND t.ownerOrdering IS NOT NULL
ORDER BY userName;

We'd also get registered-within-last-6-weeks users listed who task assignees and anything in that task changed in the last week, but I don't consider that a big problem because I consider the number of "users registered in the last six weeks && being a task assignee" small enough. Ignore the actual assignment date, too expensive to query on that.

MariaDB [phabricator_maniphest]> explain SELECT DISTINCT CONCAT("https://phabricator.wikimedia.org/p/", u.userName) AS userName FROM phabricator_maniphest.maniphest_task t LEFT JOIN phabricator_user.user u ON u.userName = t.ownerOrdering WHERE t.dateModified > 0 AND u.dateCreated > 0 AND t.ownerOrdering IS NOT NULL;
+------+-------------+-------+--------+--------------------------------+----------+---------
+---------------------------------------+-------+------------------------------------+
| id | select_type | table | type   | possible_keys                  | key      | key_len 
| ref                                   | rows  | Extra                              |
+------+-------------+-------+--------+--------------------------------+----------+---------
+---------------------------------------+-------+------------------------------------+
|  1 | SIMPLE      | t     | ALL    | ownerOrdering,key_dateModified | NULL     | NULL    
| NULL                                  | 58091 | Using where; Using temporary       |
|  1 | SIMPLE      | u     | eq_ref | userName                       | userName | 258     
| phabricator_maniphest.t.ownerOrdering |     1 | Using index condition; Using where |
+----+-------------+-------+--------+--------------------------------+----------+---------
+---------------------------------------+-------+------------------------------------+
2 rows in set (0.00 sec)

Committed as PS4 in https://gerrit.wikimedia.org/r/#/c/435984/

+---------------------------------------------------+
| userName                                          |
+---------------------------------------------------+
| https://phabricator.wikimedia.org/p/ASiplas       |
| https://phabricator.wikimedia.org/p/Kaioduarte-TB |
| https://phabricator.wikimedia.org/p/Shikeishu     |
| https://phabricator.wikimedia.org/p/Sparrow       |
+---------------------------------------------------+
4 rows in set (0.04 sec)

Change 435984 merged by Dzahn:
[operations/puppet@production] phabricator: List new and recent assignees

https://gerrit.wikimedia.org/r/435984

Thanks Dzahn (for patience and reviewing and merging)!

(We'll see the results of this once https://gerrit.wikimedia.org/r/439581 has been merged.)

Vvjjkkii renamed this task from Try to identify new developers (via assignee field) in Phab tasks and potentially follow up to z4baaaaaaa.Jul 1 2018, 1:07 AM
Vvjjkkii reopened this task as Open.
Vvjjkkii removed Aklapper as the assignee of this task.
Vvjjkkii raised the priority of this task from Low to High.
Vvjjkkii updated the task description. (Show Details)
Vvjjkkii removed subscribers: gerritbot, Aklapper.
CommunityTechBot renamed this task from z4baaaaaaa to Try to identify new developers (via assignee field) in Phab tasks and potentially follow up.Jul 2 2018, 9:05 AM
CommunityTechBot closed this task as Resolved.
CommunityTechBot assigned this task to Aklapper.
CommunityTechBot lowered the priority of this task from High to Low.
CommunityTechBot updated the task description. (Show Details)
CommunityTechBot added subscribers: gerritbot, Aklapper.

Change 443401 had a related patch set uploaded (by Aklapper; owner: Aklapper):
[operations/puppet@production] phabricator weekly project changes email: Ignore disabled new assignees

https://gerrit.wikimedia.org/r/443401

| https://phabricator.wikimedia.org/p/Arch2all        |
| https://phabricator.wikimedia.org/p/BPirkle         |
| https://phabricator.wikimedia.org/p/Celestinesucess |
| https://phabricator.wikimedia.org/p/nray            |
| https://phabricator.wikimedia.org/p/pseol2190       |

Change 443401 merged by Dzahn:
[operations/puppet@production] phabricator weekly project changes email: Ignore disabled new assignees

https://gerrit.wikimedia.org/r/443401