Define Metrics for Change Failure Percentage
Open, Needs Triage, Public

Description

Provide support to the Release Engineering team for the following KR:
Objective: Culture, Equity and Team Practices
Key Result 1: [...] For all supporting services within this slice developed at the Foundation, including MediaWiki, change failure percentage is reduced by 50% while keeping the deployment frequency steady.

  • Analyze data from all deployment trains since 2016, collected by @thcipriani. Initial code and data are available at this GitLab repo.
  • Discuss and prototype different candidate metrics for change failure percentage.
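As a starting point for discussion, one candidate metric could be sketched as below. This is only a sketch under stated assumptions, not the metric the team settled on: the `Train` record and its fields (`patches`, `rollbacks`, `blockers`) are hypothetical names, and treating a rollback or a blocker as one "failure" is an assumption that the real train data may not support directly.

```python
from dataclasses import dataclass


@dataclass
class Train:
    """One deployment train. All field names are hypothetical."""
    version: str    # label for the train
    patches: int    # number of changes shipped in the train
    rollbacks: int  # rollbacks recorded during the train
    blockers: int   # blocker tasks filed against the train


def change_failure_percentage(trains, count_blockers=True):
    """Failures per 100 shipped patches, pooled over all trains.

    Assumption: each rollback (and optionally each blocker) counts
    as exactly one failed change.
    """
    total_patches = sum(t.patches for t in trains)
    if total_patches == 0:
        return 0.0
    failures = sum(
        t.rollbacks + (t.blockers if count_blockers else 0) for t in trains
    )
    return 100.0 * failures / total_patches


# Made-up numbers, for illustration only.
example_trains = [
    Train("train-a", patches=200, rollbacks=1, blockers=3),
    Train("train-b", patches=300, rollbacks=0, blockers=1),
]
pooled = change_failure_percentage(example_trains)
rollbacks_only = change_failure_percentage(example_trains, count_blockers=False)
```

Pooling over trains, rather than averaging per-train percentages, keeps small trains from dominating the result. Whether that is the right choice is one of the things to discuss.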

Event Timeline

brennen moved this task from Backlog to Radar on the User-brennen board.
brennen added a subscriber: brennen.

Presented the results yesterday at Release Engineering's lunch and learn:

Todos after the meeting:

  • Rethink the final metric so that it includes all signals that cause overhead work for the team: bugs, rollbacks, and blockers, with special focus on those that occur on the third day of deployment.
  • Analyze in more depth the filenames and their relation to train delays, to potentially produce a list of "problematic filenames" that could flag potentially problematic patches.
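The filename analysis in the last item could start from something like the sketch below. It assumes, hypothetically, that each train can be reduced to a pair of (filenames touched, whether the train was delayed); the function name and both thresholds are illustrative and are not part of the existing code or data.

```python
from collections import Counter


def problematic_filenames(trains, min_trains=2, min_delay_rate=0.5):
    """Rank filenames by the share of trains touching them that were delayed.

    `trains` is an iterable of (filenames, was_delayed) pairs; this input
    shape is an assumption made for this sketch.
    """
    touched = Counter()  # trains that touched each file
    delayed = Counter()  # of those, trains that were delayed
    for filenames, was_delayed in trains:
        for name in set(filenames):  # count each file once per train
            touched[name] += 1
            if was_delayed:
                delayed[name] += 1
    rates = {
        name: delayed[name] / touched[name]
        for name in touched
        if touched[name] >= min_trains  # skip files seen too rarely
    }
    flagged = [(n, r) for n, r in rates.items() if r >= min_delay_rate]
    # Highest delay rate first, ties broken alphabetically.
    return sorted(flagged, key=lambda item: (-item[1], item[0]))


# Made-up input, for illustration only.
sample = [
    (["a.php", "b.php"], True),
    (["a.php"], True),
    (["b.php"], False),
    (["c.php"], True),
]
flagged = problematic_filenames(sample)
```

A high delay rate only shows that a file co-occurs with delayed trains. Frequently edited files will appear in many trains regardless, so the minimum-support threshold and a comparison against the overall delay rate matter before calling any file "problematic".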