Points data not completely available in dump used by Phlogiston
Closed, Resolved · Public · 8 Story Points

Description

UPDATE 26 April 2016: This task is probably the cause of tasks in VE having the wrong points value: tasks are pointed one way in the Phabricator UI but show up as null in Phlogiston, owing to complications in how points data has evolved with WMF's Phabricator install. So the scope of this task is to 1) try to fix that, and 2) wrap up the related issues below. Once any overlap is determined, this should probably be broken up.

Original Issue

Looks like story points may not be fully represented

T120912 has 0.5 story points. That's missing from the dump:

>>> data['task']['120912']['storypoints']
''

T117474 has 2 story points, which is present:

>>> data['task']['117474']['storypoints']
[2395689, 'PHID-TASK-g66nzaiw3btbu6vdpagj', '2']

Story points are not provided in the transaction log.

>>> data['task']['120912']['transactions'].keys()
dict_keys(['core:edge', 'status', 'projectcolumn', 'reassign', 'priority'])
>>>
>>> data['task']['117474']['transactions'].keys()
dict_keys(['core:edge', 'status', 'projectcolumn', 'reassign', 'priority'])
>>>

Phlogiston builds most of its data from the transaction log, so that it can do historical reconstruction. Story Points were not previously available in the transaction log.

Questions:

  1. Why isn't T120912's point value present in the dump?
  2. With the upgrade, are point values now in the transaction log?
  3. If so, can that information be added to the dump?
  4. If so, why are story points also provided as task-level, as-of-now data?
  5. If so, what value should I use to set point value retroactively for all dates prior to the upgrade?

Event Timeline

Restricted Application added subscribers: StudiesWorld, scfc, Aklapper. · Feb 22 2016, 10:28 PM
JAufrecht updated the task description. · Feb 22 2016, 10:36 PM

Wild guesses / theories:

  1. Why isn't T120912's point value present in the dump?

In our config, maniphest.fields contains "isdc:sprint:storypoints": { "key": "isdc:sprint:storypoints", "disabled": true } (since storypoints was a custom field).
Story points in T117474 were set before the upgrade, hence using the custom field storypoints which likely still exists in our DB.
https://phabricator.wikimedia.org/diffusion/PHTO/browse/master/public_task_dump.py;9f03717045bb01bc762b7404fe8cb8282b175be7$37 calls https://phabricator.wikimedia.org/diffusion/PHTO/browse/master/wmfphablib/phabdb.py;9f03717045bb01bc762b7404fe8cb8282b175be7$58 which queries FROM maniphest_customfieldstringindex.

However, points in T120912 were set in https://phabricator.wikimedia.org/T120912#2041850 after the upgrade (which introduced maniphest.points in the config) using the non-custom field points. Hence the query in the script might require updating.

  2. With the upgrade, are point values now in the transaction log?

Yes for the non-custom field points (tested only on my local instance): The field transactionType can have the value points in the maniphest_transaction table in the phabricator_maniphest database:

MariaDB [phabricator_maniphest]> SELECT * FROM maniphest_transaction WHERE transactionType = "points";
+--------+--------------------------------+--------------------------------+--------------------------------+------------+--------------------------------+-------------+----------------+-----------------+----------+----------+------------------------------+----------+-------------+--------------+
| id     | phid                           | authorPHID                     | objectPHID                     | viewPolicy | editPolicy                     | commentPHID | commentVersion | transactionType | oldValue | newValue | contentSource                | metadata | dateCreated | dateModified |
+--------+--------------------------------+--------------------------------+--------------------------------+------------+--------------------------------+-------------+----------------+-----------------+----------+----------+------------------------------+----------+-------------+--------------+
| 236999 | PHID-XACT-TASK-2c2x6sxmvxi55ky | PHID-USER-cgilgxteicxndvcw5w2t | PHID-TASK-oxn4g77oav5gw6jlldup | public     | PHID-USER-cgilgxteicxndvcw5w2t | NULL        |              0 | points          | null     | 0.5      | {"source":"web","params":[]} | []       |  1456226979 |   1456226979 |
| 237001 | PHID-XACT-TASK-364pzzqe2ub7vq2 | PHID-USER-cgilgxteicxndvcw5w2t | PHID-TASK-oxn4g77oav5gw6jlldup | public     | PHID-USER-cgilgxteicxndvcw5w2t | NULL        |              0 | points          | 0.5      | 3.1418   | {"source":"web","params":[]} | []       |  1456227730 |   1456227730 |
+--------+--------------------------------+--------------------------------+--------------------------------+------------+--------------------------------+-------------+----------------+-----------------+----------+----------+------------------------------+----------+-------------+--------------+

  4. If so, why are story points also provided as task-level, as-of-now data?

I guess (I did not check) the script simply dumps all columns available in the maniphest_task table in the phabricator_maniphest database, and points is one of those columns:

MariaDB [phabricator_maniphest]> SELECT * FROM maniphest_task WHERE points IS NOT NULL;
+-------+--------------------------------+--------------------------------+-----------+--------+----------+--------------------------------------------+--------------------+-------------------+-------------+--------------+----------------------+---------------+---------------------+-------------+------------+------------+-----------+------------+--------+
| id    | phid                           | authorPHID                     | ownerPHID | status | priority | title                                      | originalTitle      | description       | dateCreated | dateModified | mailKey              | ownerOrdering | originalEmailSource | subpriority | viewPolicy | editPolicy | spacePHID | properties | points |
+-------+--------------------------------+--------------------------------+-----------+--------+----------+--------------------------------------------+--------------------+-------------------+-------------+--------------+----------------------+---------------+---------------------+-------------+------------+------------+-----------+------------+--------+
| 60002 | PHID-TASK-oxn4g77oav5gw6jlldup | PHID-USER-cgilgxteicxndvcw5w2t | NULL      | open   |       90 | Some non sec task (with 0.5 Story points)  | Some non sec task  | Some non sec task |  1426846896 |   1456227730 | u6ycaouxzl3xwcvzyhgr | NULL          | NULL                |           0 | users      | users      | NULL      | []         | 3.1418 |
+-------+--------------------------------+--------------------------------+-----------+--------+----------+--------------------------------------------+--------------------+-------------------+-------------+--------------+----------------------+---------------+---------------------+-------------+------------+------------+-----------+------------+--------+

I can modify the script, but it sounds like we should just use the transaction log going forward.
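
For what it's worth, reconstructing a historical point value from rows like the two in the maniphest_transaction output above could look roughly like this. It is only a sketch: it assumes each transaction has been reduced to a (dateCreated, oldValue, newValue) tuple and that the values are JSON-encoded strings, which matches the `null` seen above. The function name `points_as_of` is made up for illustration and is not part of the dump script or Phlogiston.

```python
import json


def points_as_of(transactions, timestamp):
    """Return the point value in effect at `timestamp`, or None if unset.

    `transactions` is an iterable of (dateCreated, oldValue, newValue)
    tuples; this reduced shape is an assumption made for the sketch.
    """
    value = None
    # Replay the transactions in chronological order up to the cutoff.
    for date_created, _old_value, new_value in sorted(transactions, key=lambda t: t[0]):
        if date_created > timestamp:
            break
        value = json.loads(new_value)
    return value


# The two rows from the query above, reduced to (dateCreated, oldValue, newValue).
rows = [
    (1456226979, "null", "0.5"),
    (1456227730, "0.5", "3.1418"),
]

print(points_as_of(rows, 1456227000))  # between the two edits
print(points_as_of(rows, 1456230000))  # after both edits
print(points_as_of(rows, 1456000000))  # before points were ever set
```

For dates before the first points transaction this returns None, so tasks pointed before the upgrade would still need some other source for their value.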

JAufrecht triaged this task as High priority. · Feb 24 2016, 11:05 PM
JAufrecht added a project: User-JAufrecht.
JAufrecht set the point value for this task to 8. · Mar 10 2016, 6:50 PM
Restricted Application added a subscriber: TerraCodes. · Apr 19 2016, 10:41 PM

Still not getting points correctly. See https://phabricator.wikimedia.org/T132495 for an example: it has 40 points in the UI but is null according to Phlogiston.

JAufrecht updated the task description. · Apr 27 2016, 12:04 AM

@JAufrecht: The changes in rPHTOe5bd9268a7eb should fix this by adding the 'points' field and transactions of type 'points'. However, I haven't deployed this change yet because I want to wait until we have had one successful dump, to be sure that everything is working. There is a chance that this code won't work in production and I don't want to mix multiple changes together, so one thing at a time.

@JAufrecht: OK, I've deployed e5bd9268a7eb; points data should be in the next dump.

I'm seeing that points is now available as the last field in task info:

>>> data['task']['100275']['info']
[100275, 'PHID-TASK-uyatxib4hsmcqrnzn44f', 'PHID-USER-fovtl67ew4l4cc3oeypc', None, 'open', 50, "Highlights don't adequately block link hover effects in Chrome", 1432547376, 1453738628, 103.68972124558, 8.0]

Which is consistent with the patch.

It's not necessarily available in transactions:

>>> data['task']['100275']['transactions']['points']
>>>

But I think that may be an issue with the underlying data, where tasks created with the Sprint extension or before the February 2016 upgrade have points but no transactions. In a sample task created and pointed after February, there is point data in the transaction log:

>>> data['task']['129687']['transactions']['points']
[[2112769, 'PHID-XACT-TASK-7c3dnihrybht7rb', 'PHID-USER-mzjfuzwqhxgtksmcqpn3', 'PHID-TASK-7hbi4ulfsjvlkgx3budv', None, 0, 'points', 'null', '1', '{"core.create":true}', 1457724498, 1457724498]]
>>>

I think I can work with this by using transactional data where available and falling back to as-is data if necessary. This should be an improvement over current data quality. Thanks!
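
That fallback could be sketched as follows. It assumes the dump layout seen in the samples above: points as the last element of `info`, and each `points` transaction row carrying the JSON-encoded new value at index 8 and `dateCreated` at index 10. The helper name `current_points` and the trimmed `data` dict are made up for illustration; this is not Phlogiston's actual code.

```python
import json

# Column positions assumed from the sample transaction row above.
NEW_VALUE, DATE_CREATED = 8, 10


def current_points(task):
    """Prefer the latest 'points' transaction; fall back to the task-level value."""
    rows = task.get('transactions', {}).get('points') or []
    if rows:
        latest = max(rows, key=lambda row: row[DATE_CREATED])
        return json.loads(latest[NEW_VALUE])
    # No transaction history (e.g. pointed before the Feb 2016 upgrade):
    # use the as-of-now value, the last field of the task info.
    return task['info'][-1]


# Trimmed copies of the two samples above; fields the sketch does not use are omitted.
data = {'task': {
    '100275': {
        'info': [100275, 'PHID-TASK-uyatxib4hsmcqrnzn44f', 'PHID-USER-fovtl67ew4l4cc3oeypc',
                 None, 'open', 50,
                 "Highlights don't adequately block link hover effects in Chrome",
                 1432547376, 1453738628, 103.68972124558, 8.0],
        'transactions': {'points': None},
    },
    '129687': {
        'transactions': {'points': [
            [2112769, 'PHID-XACT-TASK-7c3dnihrybht7rb', 'PHID-USER-mzjfuzwqhxgtksmcqpn3',
             'PHID-TASK-7hbi4ulfsjvlkgx3budv', None, 0, 'points', 'null', '1',
             '{"core.create":true}', 1457724498, 1457724498],
        ]},
    },
}}

print(current_points(data['task']['100275']))  # falls back to info
print(current_points(data['task']['129687']))  # taken from the transaction log
```

One caveat with this approach: the as-of-now value has no date attached, so for historical reconstruction it can only be applied as a constant across all dates.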

Phabricator data as loaded into Phlogiston seems (based on spot checking) to have the current points value in the task info (maniphest_task) and, if points has been changed since the Feb 2016 upgrade, to have transaction information from maniphest_transaction.

JAufrecht removed JAufrecht as the assignee of this task. · May 10 2016, 5:43 PM

@JAufrecht who on TPG needs to review this and what should be reviewed?

This task is basically internal to Phlogiston, and the change is only indirectly visible in the reports, so probably I'm the only possible reviewer other than somebody investigating.

JAufrecht closed this task as Resolved. · May 12 2016, 9:06 PM
JAufrecht moved this task from Needs Review to Done on the Team-Practices (This-Week) board.