
Phab feature request: Cycle time for a task entering a column to resolution, with support for wildcards
Open, In Progress, Medium | Public | Feature

Description

As someone doing program management, I would like to know how long it takes a task to go from creation to resolution (and/or "start work" to resolution), so that I can support teams in prioritizing, retrospecting, and setting expectations with stakeholders about their work.

Two approaches I can think of:

  • The pure age of a task, with a timer that starts as soon as the task is created
  • The age of a task after certain criteria are met. For instance, a task might not be relevant just sitting in a backlog, but becomes relevant as soon as it is "pulled into" a Kanban board of some kind. I speculate that this could be done with a "start aging" toggle on a task, which a column could then trigger when the task enters a new workboard.

Event Timeline

JAufrecht renamed this task from Phlogiston: Request -- Cycle time for a task entering a column to resolution, with support for wildcards to Cycle time for a task entering a column to resolution, with support for wildcards. Oct 26 2016, 5:44 PM
JAufrecht triaged this task as Low priority.
JAufrecht added a project: Team-Practices.

@MBinder_WMF What is the priority for this? What other activities is it blocking?

I am interested in this for teams who are interested in a kanban approach. But not sure I would call it high priority...

What Kristen said. The main thing that Phab does not do well when it comes to kanban is cycle time.

There are varying levels of scope here. In order:

  1. Build a cycle time report that works for just this project, with custom cycle steps
  2. Try to genericize the cycle time report steps, so that it works for this project but is a starting point for future projects.

Cost:

  • We will have to build the report more or less from scratch, including data definition, data storage, chart definition, and workflow
  • This will apply only to this specific Phabricator project; other Phabricator projects will need their own column setup to document steps in the cycle (copying from this example or inventing their own), and Phlogiston will need some way to define them.

Benefit:

  • Direct benefit to the team asking for it
  • Having a live example of this report could help other teams understand its benefits
  • Should be much cheaper to replicate this report for other projects

> Build a cycle time report that works for just this project, with custom cycle steps

This would work as an experiment, but is not sustainable for a team (such as iOS) that frequently changes release boards.

> Try to genericize the cycle time report steps, so that it works for this project but is a starting point for future projects.

This is preferable, so that a team like iOS could have the same column on each of their release/sprint boards and have it automatically picked up by Phlog (the way wildcards pick up the new boards themselves).

Per monthly grooming, removing this from Team-Practices knowing that it can be found on the Phlogiston board. @JAufrecht please adjust this action as needed (and kindly describe your reasoning for keeping it on Team-Practices if you do restore it to that board).

JAufrecht set the point value for this task to 20. Dec 12 2016, 8:31 PM
MBinder_WMF renamed this task from Cycle time for a task entering a column to resolution, with support for wildcards to Phab feature request: Cycle time for a task entering a column to resolution, with support for wildcards. May 28 2020, 5:36 PM
MBinder_WMF raised the priority of this task from Low to Needs Triage.
MBinder_WMF edited projects, added Phabricator; removed Phlogiston (Reporting).
MBinder_WMF updated the task description.
MBinder_WMF removed the point value for this task.
MBinder_WMF added a subscriber: Naike.

Resurrecting this as a feature request for the Phabricator board, since it's still relevant but Phlogiston is basically dead.

It should be possible to create a custom field class for the start date and apply that field with a trigger attached to a column.

Column triggers have very limited options, currently. We could add a new "started" status which could be triggered by a workboard column trigger.

Not sure I understand correctly, but in case a vague idea here is to add more global task statūs then I'd like to oppose that, as I'd prefer KISS to covering some not-so-common (?) use cases by adding more complexity which is visible for everybody.
Bugzilla had both an assignee field (which could not be empty though) and an "ASSIGNED" task status and I'm quite sure that basically nobody understood why.

In an ideal world, if someone "started" to work on a task then they should set themselves as a task assignee (and that could be queried in the DB).

Thanks, both.

I like the idea of a status, though I don't think it would be mutually exclusive with "Open" or "Resolved" etc, so I'm imagining a different field.

I also appreciate keeping it simple. I'm not sure if this qualifies as an edge-case or not. Pretty much every team I have worked with wants to track cycle time, and do so within the UI (as opposed to, say, tools like Phlogiston).

This is an example of Trello doing something similar (it's purely visual and more fun than practical, but it's the same idea): https://trello.com/power-ups/55a5d917446f517774210012/card-aging

Perhaps this could be an opt-in feature? Then it would not be visible to anyone save those people that need it (or perhaps by project).

> In an ideal world, if someone "started" to work on a task then they should set themselves as a task assignee (and that could be queried in the DB).

Ideally, yes. However, I can think of times when someone assigns themselves to a task for other reasons than starting to work on it, such as fleshing out a stub ticket in a backlog, or grooming old tasks. It also doesn't take into account when someone else assigns a task to a user, as opposed to one being picked up.

> Ideally, yes. However, I can think of times when someone assigns themselves to a task for other reasons than starting to work on it, such as fleshing out a stub ticket in a backlog, or grooming old tasks. It also doesn't take into account when someone else assigns a task to a user, as opposed to one being picked up.

Indeed. On the other side, I can also think of people who won't set (or remember to set) some imaginary "started" field (which still feels wrong to me). :)
The more steps, the more likely people won't perform them.

@MBinder_WMF: Are there some examples / use cases for concrete situations? Apart from the last comment and "The age of a task after a certain criteria is met" it feels like there isn't much info. I'm trying to better understand which underlying problem to solve and what would be some specific examples for "certain criteria".

@JMinor @Naike @ARamirez_WMF @aezell Tagging you here as people I know have had an interest in cycle time for tasks. Would you be willing to elaborate on your needs so @Aklapper can think about it more holistically?

> Indeed. On the other side, I can also think of people who won't set (or remember to set) some imaginary "started" field (which still feels wrong to me). :)
> The more steps, the more likely people won't perform them.

You make a good point! I think in this particular case only one person could set a start time. Similarly, if triggers could be set, that would simplify the weak links in the process that rely on individual engagement.

Another option, besides a status, is a tag called "Started" which gets added by a column trigger.

I think a column trigger would work well, with the one standout limitation being that, as I understand it, column triggers only occur when a task is moved to a column rather than tagged into it. That is, it wouldn't be enough to tag, say, a Kanban board to start a timer. A task would need to be tagged and then moved to something that represented "start."

I can imagine, using real-life examples, that the first, default column is "Incoming" and the trigger column is "Ready for Development" or "Doing" or some such. However, another weak link would be if tasks needed to skip columns because of the work required.

For example, if a task is tagged with the board and then moved to "Code Review" right away, it would not get triggered. The only way around that is to either get triggers working beyond moving between columns (so tagging directly onto a board would start the trigger), or give every column on the board the same trigger, with some logic that says, "if 'Started' is not on yet, turn it on, otherwise ignore."
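
The "turn it on once, otherwise ignore" logic could look roughly like the sketch below. It is only an illustration in Python, not Phabricator trigger code: `Task`, `started_at`, and `on_enter_column` are hypothetical names, and it assumes a trigger fires with the destination column each time a task is moved.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Task:
    """Hypothetical stand-in for a task; not a Phabricator class."""
    title: str
    started_at: Optional[datetime] = None  # None means "Started" is not on yet


def on_enter_column(task: Task, column: str, now: datetime,
                    default_column: str = "Incoming") -> Task:
    """Same trigger on every column: start the timer once, then ignore."""
    if column == default_column:
        return task            # still in the intake column, nothing to start
    if task.started_at is None:
        task.started_at = now  # first move into any working column
    return task


task = Task("Fix login bug")
on_enter_column(task, "Code Review", datetime(2020, 6, 1, 9, 0))  # skipped "Doing"
on_enter_column(task, "Done", datetime(2020, 6, 3, 17, 0))        # already started
print(task.started_at)  # 2020-06-01 09:00:00
```

Because the check is idempotent, it does not matter which column a task lands in first, which covers the skipped-column case. It does not help with tasks that are tagged onto a board without any movement.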

Aside: We could just use task assignments for this purpose if it weren't already a common practice to assign the task long before working on it.

@MBinder_WMF: You are right about the current limitations of column triggers. However, since we are looking for a way to explicitly mark the start of work, the limitations don't seem like too much of a problem. The teams who are using this would just need to incorporate an explicit step into their workflow:

  • during the sprint kickoff (or whenever is appropriate)
    • drag the tasks for the current sprint into the Doing column. If a task belongs elsewhere it can then be immediately dragged a second time into the desired column.

Yeah, if column triggers are simple enough, it's definitely the most straightforward way, for now. The manual task dragging isn't ideal for mitigating process cruft, but overall I think this would be a big step forward.

I can imagine several potential solutions which each have pros and cons. To evaluate them I'd like to know which "certain criteria" people think about, as it's easy to spend time implementing something that isn't the best approach due to not knowing what people actually and specifically want to measure.

Thanks for focusing us, @Aklapper :)

(I realize this may not get to the heart of your question about specificity, but I hope it assists with shared understanding of why this is valuable).

Simply put, Kanban as a practice focuses on visualizing flow, and bottlenecks therein. Being able to see how long a card is in one place is part of forming a narrative about how the flow can be optimized. If this is possible at a high level (e.g. Trello's "decaying card" imagery) it lends itself to immediate unbottlenecking, but even at a more detailed level (e.g. a timer you can see after opening a task) it is useful. It's also handy in retrospect, where, as a team discusses the cause and effect for how process impacted delivery, the team can analyze how long something took to go from commitment to completion and dig deeper as to why.

Related, it also helps teams set expectations (e.g., "on average, we finish a task of X type at Y pace"), not unlike story points (which not all teams leverage). Finally, it can also quickly identify which tasks are regularly carried over, for reasons of prioritization, poor estimation, or something else. As with most process tools or approaches, it is a data point to facilitate efficient and accurate conversation about delivering work.

From another angle, imagine 2 common models I've seen at WMF: Scrum and Kanban. Scrum theoretically improves delivery by insisting teams focus on removing impediments to commitments. If something is blocked, the team devotes effort to unblocking that. Kanban theoretically improves delivery by insisting teams focus on working on what is ready (that is, not blocked), and minimizing complexity ("waste") thereof. In both cases, because that means tasks are being set aside, card age is a useful reminder of just how long something has gone without attention.

Perhaps trying to simplify it even further, Kanban-style approaches work best when cycle times are as short as possible. Short feedback loops make for, theoretically, better feedback, by wasting less time doing the wrong things. If you can track how long something takes to progress, you create a foundation from which to improve how fast you get feedback, at both micro (each task) and macro (each "sprint" or cycle) levels.

It's also worth noting that sometimes people distinguish between "cycle time" and "lead time," where the latter is typically the time a task spends on the board (To Do --> Doing --> Done), and the former is typically the time spent actually working on said task (Doing --> Done). Both are useful, and I would argue the scope of this request, regardless of nomenclature, is To Do --> Doing --> Done.
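
To make the distinction concrete, here is a minimal sketch that derives both numbers from three timestamps. The function name and the idea that each task has exactly one entry time per column are assumptions for illustration only.

```python
from datetime import datetime, timedelta
from typing import Tuple


def lead_and_cycle_time(entered_todo: datetime, entered_doing: datetime,
                        entered_done: datetime) -> Tuple[timedelta, timedelta]:
    """Lead time spans To Do --> Done; cycle time spans Doing --> Done."""
    lead = entered_done - entered_todo
    cycle = entered_done - entered_doing
    return lead, cycle


lead, cycle = lead_and_cycle_time(
    datetime(2020, 6, 1),  # entered "To Do"
    datetime(2020, 6, 4),  # entered "Doing"
    datetime(2020, 6, 9),  # entered "Done"
)
print(lead.days, cycle.days)  # 8 5
```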

So I couldn't resist working on this one over the weekend and I've got something that will probably be useful:

Modern MySQL versions can query against fields stored as JSON blobs. So I was able to concoct a query that pulls all of the column movement transactions for a given project workboard. I think this captures every movement for every task that was ever on the workboard.

It's a challenge because Phabricator databases are partitioned by application, and the column movements happen to be stored with the tasks instead of with the workboard. So in order to find out when a task was in each column you have to query the task database, not the column database. Even more fun is had because the information about the source and destination columns is encoded in JSON, which is where JSON_VALUE comes in handy.

Example query:

-- Column movements for one workboard, read from the task (Maniphest) transaction table.
SELECT
	objectPHID,
	authorPHID,
	JSON_VALUE(newValue, '$[0].boardPHID') AS projectPHID,
	JSON_VALUE(newValue, '$[0].columnPHID') AS ToColumnPHID,
	JSON_VALUE(newValue, '$[0].fromColumnPHIDs.*') AS fromColumnPHID
FROM
	phabricator_maniphest.maniphest_transaction x
WHERE
	transactionType = 'core:columns'
	AND JSON_VALUE(newValue, '$[0].boardPHID') IN ('PHID-PROJ-nnnsyj7vwcumrkykebbx')
-- Note: GROUP BY objectPHID returns one row per task; drop it to list every movement.
GROUP BY
	objectPHID
ORDER BY
	dateModified

Example result set:

| objectPHID | authorPHID | projectPHID | ToColumnPHID | fromColumnPHID |
| --- | --- | --- | --- | --- |
| PHID-TASK-olufdmdoyisoaxcrd5lh | PHID-USER-iermt5ebieihtsbuwabp | PHID-PROJ-nnnsyj7vwcumrkykebbx | PHID-PCOL-3mrotf5wa6dxe547nvhv | PHID-PCOL-3mrotf5wa6dxe547nvhv |
| PHID-TASK-w6hgmfbdcjyyspdgfelo | PHID-USER-iermt5ebieihtsbuwabp | PHID-PROJ-nnnsyj7vwcumrkykebbx | PHID-PCOL-3mrotf5wa6dxe547nvhv | PHID-PCOL-3mrotf5wa6dxe547nvhv |
| PHID-TASK-hegrajt4zwu4fgba27mh | PHID-USER-iermt5ebieihtsbuwabp | PHID-PROJ-nnnsyj7vwcumrkykebbx | PHID-PCOL-3mrotf5wa6dxe547nvhv | PHID-PCOL-3mrotf5wa6dxe547nvhv |
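
For anyone post-processing such rows outside the database, the same JSON decoding could be sketched in Python as below. The shape of `newValue` (a one-element array with `boardPHID`, `columnPHID`, and a `fromColumnPHIDs` object keyed by PHID) is inferred from the JSON paths in the query and is an assumption, as are the `Movement` record, the `epoch` argument, and the placeholder PHIDs.

```python
import json
from typing import NamedTuple, Optional


class Movement(NamedTuple):
    task_phid: str
    board_phid: str
    to_column: str
    from_column: Optional[str]
    epoch: int  # transaction timestamp in seconds (assumed to come from the row)


def parse_movement(task_phid: str, new_value: str, epoch: int) -> Movement:
    """Decode one 'core:columns' transaction value into a flat record."""
    entry = json.loads(new_value)[0]
    # fromColumnPHIDs is assumed to be an object keyed by PHID; it may be
    # empty when the task had no previous column on this board.
    from_columns = list(entry.get("fromColumnPHIDs") or {})
    return Movement(task_phid, entry["boardPHID"], entry["columnPHID"],
                    from_columns[0] if from_columns else None, epoch)


raw = ('[{"columnPHID": "PHID-PCOL-doing", "boardPHID": "PHID-PROJ-board", '
       '"fromColumnPHIDs": {"PHID-PCOL-todo": "PHID-PCOL-todo"}}]')
m = parse_movement("PHID-TASK-example", raw, 1598918400)
print(m.from_column, "->", m.to_column)  # PHID-PCOL-todo -> PHID-PCOL-doing
```
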
Next steps:

There is still some remaining uncertainty in need of more thought and discussion.

With a little bit more filtering (e.g. limit to a date range) I think this is exactly the data that's needed to produce the metric. However, we need a way to identify which columns are "special" for the purpose of calculating the cycle time metrics.

Perhaps the easiest / dumbest way to do it is to use magical column names: If the column name matches "To Do" or "Done" then treat it as such and calculate the metrics and if a project doesn't have the magic column names it will not get a cycle time report?
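
A rough sketch of that magic-name lookup, in Python for illustration. The names `START_NAMES`, `END_NAMES`, and `find_cycle_columns` are hypothetical, and matching case-insensitively is an extra assumption not stated above.

```python
from typing import Dict, Iterable, Optional

START_NAMES = {"to do"}  # magic name that marks the start of the cycle
END_NAMES = {"done"}     # magic name that marks the end of the cycle


def find_cycle_columns(column_names: Iterable[str]) -> Optional[Dict[str, str]]:
    """Return the start/end columns of a board, or None if either is missing."""
    start = end = None
    for name in column_names:
        key = name.strip().lower()
        if key in START_NAMES and start is None:
            start = name
        elif key in END_NAMES and end is None:
            end = name
    if start is None or end is None:
        return None  # no magic names: this board gets no cycle time report
    return {"start": start, "end": end}


print(find_cycle_columns(["Backlog", "To Do", "Doing", "Done"]))
# {'start': 'To Do', 'end': 'Done'}
print(find_cycle_columns(["Incoming", "In Progress", "Shipped"]))
# None
```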

Great stuff, @mmodell !

> Perhaps the easiest / dumbest way to do it is to use magical column names: If the column name matches "To Do" or "Done" then treat it as such and calculate the metrics and if a project doesn't have the magic column names it will not get a cycle time report?

I think you're onto something. Long term, this is probably not the most sustainable, as teams name columns different things, and it's generally in line with whatever works for them. That is, if the generic column name doesn't work for them (for any number of reasons, including but not limited to language), they will be shaping their process to the tool (not a good practice), rather than shaping the tool to their process (generally a preferred practice).

However, I could see it as a useful way to pilot this (especially for the teams that do use naming conventions that match the generic).

We have timestamps in the DB when someone got assigned to a task, when a task status value was changed, when a task received some project tag, and when a task entered or left the workboard column of some project (though trickier, as Mukunda explained already).

If "time spent on a task" (lead time, or maybe also cycle time) means the timespan of a task in a certain workboard column (e.g. "Doing"), or the time until it reaches either a "Done" column or "Resolved" status, then this sounds like a measure to look at.

I don't see an argument for introducing any new global task status values as I wrote in T261493, or why to introduce labels (in T261498).

> Long term, this is probably not the most sustainable, as teams name columns different things

That's a problem indeed. I see that as a social problem to encourage or enforce organization-wide terminology conventions and consistency ("Freezer" vs "Icebox" vs "Frozen", etc), and not a problem to solve by technical means (as we'd end up maintaining an ever-changing list of workboard column names for an ever-changing list of team projects for a number of ever-changing engineering teams in an organization which reorgs from time to time).
I am very wary of technical workarounds (and their maintenance costs for years to come) for (potentially) broken processes, instead of fixing the processes.

If the unstable, beta "Phrequent" application in Phab was enabled (and maybe restricted to only some staff or such, because of so much interface clutter), there would be a "Start Tracking Time" item in the sidebar of a task. No idea if that's good or bad, though. Also see https://secure.phabricator.com/T4853 and subtasks.

For the sake of keeping folks in the loop, I'll repost my comment from T261493:
I cannot say what's the best option as long as nobody has provided clear, specific, exact, non-abstract cases for named specific teams of some organization explaining which specific activity (start point in time; end point in time) they would like to measure and how that activity (action performed at start point in time; action performed at end point in time) is defined.

Until that happens, I do not see a reason for complicating things and creating maintenance costs by creating new statuses or developing labels:

  • We have assignee fields ("none" or an individual) so people can express that they work on a task.
  • We have workboard columns ("in progress", etc) so people can move tasks around to express that a task is in some state or that something happened.
  • We have task statuses (e.g. "resolved", "stalled") to express that a task has been done, or that it cannot be worked on.
  • I don't see why there should be a fourth item to add to that mix. See the second sentence of this comment.

Hi, @Aklapper !

>> Long term, this is probably not the most sustainable, as teams name columns different things
>
> That's a problem indeed. I see that as a social problem to encourage or enforce organization-wide terminology conventions and consistency ("Freezer" vs "Icebox" vs "Frozen", etc), and not a problem to solve by technical means (as we'd end up maintaining an ever-changing list of workboard column names for an ever-changing list of team projects for a number of ever-changing engineering teams in an organization which reorgs from time to time).
> I am very wary of technical workarounds (and their maintenance costs for years to come) for (potentially) broken processes, instead of fixing the processes.

I respect that. I would posit there is a balance to strike here. Specifically, it's generally a recommended practice to focus on "people over tools" in Agile development. With that in mind, if everything is equal and the choice is "change the people" or "change the tool" the latter is preferred and encouraged. Now, obviously everything isn't equal, and we're talking about a complex system. I can say from experience, though, that even when it's ultimately more convenient to have everyone follow engineered protocols, it's difficult to realize in practice.

Philosophy aside, I agree with you that we need specific examples of specific team needs. To that effect, I've reached out to some folks with whom I've had conversations about this recently. Hopefully, they chime in here! :)

Thanks Max, really appreciate the reach out! I agree with "change the tool if it doesn't fit the needs of people".

Hello @Aklapper,

My teams want to deliver working software that creates value for our stakeholders more often, but we don't have automated data to understand where we're running into problems between the start and finish lines, which makes it difficult for us to improve our processes. To give you a concrete example, as a project manager, I should be able to know the baseline number of days a ticket spends in each Kanban column to be able to measure variations. If there were to be a sudden spike, then I would know that something needs my teams' attention. Apart from using cycle time to get retroactive signals when our performance declines, I can also use it to set performance improvement goals and to know if we're hitting our target. Unfortunately, at the moment, I harvest data manually from Phabricator to generate weekly status reports for my team when most project management tools have cycle time built into their reporting capability. The time I spend compiling and processing metrics would be better spent on planning and implementing changes instead. Further, having automated reports would allow us to improve key data quality dimensions such as accuracy, completeness, consistency, timeliness, to name a few.

In short, cycle time is a vital metric for my team, and the way I'm attempting to collect it manually now is not sustainable as it's time-consuming and error-prone. Phabricator should be able to do this for me, and if that's not the case, then I don't have a useful tool to fulfill one of my fundamental duties to my team.
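
The per-column baseline described above could be computed from movement data along these lines. This is a sketch under assumptions: the `(task, column_entered, epoch_seconds)` tuples and the function name are made up for illustration, and a task's stay in its current (last) column is left out because it has no end yet.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

DAY = 86400  # seconds per day


def days_per_column(moves: List[Tuple[str, str, int]]) -> Dict[str, float]:
    """Mean number of days tasks spent in each column they have already left."""
    by_task = defaultdict(list)
    for task, column, epoch in moves:
        by_task[task].append((epoch, column))
    durations = defaultdict(list)
    for events in by_task.values():
        events.sort()
        # A stay in a column lasts until the task's next movement.
        for (start, column), (end, _next_column) in zip(events, events[1:]):
            durations[column].append((end - start) / DAY)
    return {column: mean(values) for column, values in durations.items()}


moves = [
    ("T1", "To Do", 0), ("T1", "Doing", 2 * DAY), ("T1", "Done", 5 * DAY),
    ("T2", "To Do", 0), ("T2", "Doing", 4 * DAY), ("T2", "Done", 5 * DAY),
]
print(days_per_column(moves))  # {'To Do': 3.0, 'Doing': 2.0}
```

Comparing each week's numbers against such a baseline is one way to notice the sudden spikes mentioned above.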

Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 13 2021, 11:41 AM

This metric was identified during the Data³ MVP meeting as a metric needing experimentation.

We came up with a narrow definition during the meeting:

Cycle time: a per-task, per-workboard timer that measures the time from (1) the moment a task moves out of the default column on a workboard until (2) the moment either of the following conditions is met: (i) the task leaves the workboard or (ii) the task enters any resolved status (i.e., Resolved, Declined, or Invalid).

Notably absent from the definition is any mention of the Stalled status, and there is no notion of assignment or priority.
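
The narrow definition above could be sketched as follows. The event tuples, their kind labels, and the function name are hypothetical; the sketch also assumes that any move to a non-default column counts as "moving out of the default column".

```python
from typing import List, Optional, Tuple

RESOLVED_STATUSES = {"resolved", "declined", "invalid"}

# Event kinds (hypothetical, for this sketch only):
#   ("column", epoch, column_name)  task moved to a column on this board
#   ("status", epoch, status_name)  task status changed
#   ("removed", epoch, "")          task left the workboard


def cycle_time(events: List[Tuple[str, int, str]],
               default_column: str) -> Optional[int]:
    """Seconds from leaving the default column to leaving the board or resolving."""
    start = None
    for kind, epoch, value in sorted(events, key=lambda event: event[1]):
        if start is None:
            if kind == "column" and value != default_column:
                start = epoch  # (1) task moved out of the default column
        elif kind == "removed" or (
                kind == "status" and value.lower() in RESOLVED_STATUSES):
            return epoch - start  # (2) left the board or entered a resolved status
    return None  # timer never started, or it is still running


events = [
    ("column", 0, "Backlog"),
    ("column", 86400, "Doing"),
    ("status", 2 * 86400, "stalled"),   # ignored, per the definition
    ("status", 4 * 86400, "resolved"),
]
print(cycle_time(events, "Backlog") / 86400)  # 3.0
```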

thcipriani triaged this task as Medium priority. Apr 23 2021, 6:50 PM
mmodell changed the task status from Open to In Progress. Sep 24 2021, 7:03 PM