Page MenuHomePhabricator

Monthly report of total / active Phabricator users
Closed, ResolvedPublic

Description

Thanks to @Dzahn we have been collecting the number of active Bugzilla users every month for almost two years now. It would be useful to collect this data --or any approximation-- on a monthly basis:

  • Number of accounts created
  • Number of users logging
  • Number of users performing any action in Maniphest

Extended Play

It would be interesting to know whether it is possible to extract these values as well (total and past month):

  • Number of task creators
  • Number of comment posters
  • Number of task closers
  • Number of workboard movers
  • Number of tasks created
  • Number of tasks closed
  • Median age in days of open tasks by priority

Very Extended Play

Then we could open two parallel ways:

  • Other metrics per app: who is subscribing to Projects, who uses Differential...
  • Filter these metrics by project, with a possibility to aggregate projects (i.e. let's count only the engineering projects)

The extended plays are inspirational only. This task can be closed as soon as we can publish monthly metrics for total amount of users and active in the past month.

First report, sent on January 1st (for the record)

Hi Community Metrics team,

this is your automatic monthly Phabricator statistics mail.

Number of accounts created in (2014-12): 395
Number of active users (any activity) in (2014-12): 619
Number of task authors in (2014-12): 384
Number of users who have closed tasks in (2014-12): 213
Number of tasks created in (2014-12): 3424
Number of tasks closed in (2014-12): 2529
Number of tasks in the shell project closed as resolved,fixed in (2014-12): 12

Number of open and stalled tasks in total: 18125

Median age in days of open tasks by priority:
Unbreak now: 115
Needs Triage: 147
High: 199
Normal: 437
Low: 686
Needs Volunteer: 536

TODO: Numbers which refer to closed tasks might not be correct, as described in T1003.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
In T1003#793868, @Dzahn wrote:

Once we are at the point where we have more specific SQL queries you want to be executed and results mailed to people, ping me and I can puppetize that similar to how the BZ mails worked in the past.

@Dzahn: I hereby ping you. :)
If you could put the above SQL queries (which at least worked locally here for me) into a monthly cronjob email like the previous "bugzilla stats - 2014-10" ones (we can test now by triggering the one for November), that would be lovely.
The last query listed here needs to be included several times to also cover those other x.priority values available.
And I can probably finetune&review a patch in Gerrit...

Nemo_bis added a project: wikimedia.biterg.io.
Nemo_bis edited projects, added acl*sre-team; removed Phabricator.

Note to myself: Add another query for "shell requests closed in last month" for Guillaume as the UI does not offer that and add Guillaume to recipients. (I've also had two other ideas how to get that data but for this one I know it'll work.)

Volume of tasks in the "shell" project resolved/fixed in last month, requested by Guillaume

Hmm, no, I can't https://phabricator.wikimedia.org/conduit/method/maniphest.query/ see supporting that query either. Meh.

This query seems to work locally for me, but it has the same problem as the other "closed" queries above:

SELECT COUNT(DISTINCT objectPHID) FROM maniphest_transaction JOIN maniphest_nameindex JOIN edge WHERE (maniphest_nameindex.indexedObjectName="shell" AND maniphest_nameindex.indexedObjectPHID=edge.dst AND edge.src=maniphest_transaction.objectPHID) AND (transactionType="status" AND (oldValue="\"open\"" OR oldValue="\"stalled\"") AND newValue="\"resolved\"") AND FROM_UNIXTIME(maniphest_transaction.dateCreated,'%Y%m')=date_format(NOW() - INTERVAL 1 MONTH,'%Y%m');

we could probably create a 'stats' app in phabricator that would provide a simple ui for executing these queries and displaying the data in a reasonably nice way. It would be a significant amount of work but not huge or difficult.

Let's have a separate ticket for some stats app in Phab.
To avoid confusion:
This ticket was originally about the cronjob that once a month sent an email with a single number to Quim and me: The number of active users in Bugzilla in the last month.
We want that for Phab too as we like to have some basic metrics.

As Phab's reporting tools are a bit limited currently and as we have not ported the backend of korma.wmflabs.org to support Phab (see T28) we're now missing some more statistics that we had on korma.
Hence as a poor man's solution, I'd like to extend that cronjob script by some more sql queries, for the time being. Some of them provide data that we had on korma, though now in a non-visual way, but better than nothing for the time being.

Let's have a potential future "Stats" app in Phab either handled in T28, or in a separate ticket. :)

There is a "Fact" app that is pretty young, but it may make more sense to put our time into that https://secure.phabricator.com/fact/

There is a "Fact" app that is pretty young, but it may make more sense to put our time into that https://secure.phabricator.com/fact/

Upstream told us clearly that it's not worth for us to try to step in to improve Fact or surroundings, before they do some homework in the background. This is one of the reasons why I'm proposing to start humble with sql queries. See https://secure.phabricator.com/T1562#80158 and below.

start humble with sql queries

Speaking of which. As the Bugzilla weekly report is no more :[, this update could be sent directly to wikitech-l and called "Phabricator monthly report", with an expectation that new content might be added to it later (or not).

Change 177792 had a related patch set uploaded (by Nemo bis):
phabricator: community metrics stats mail

https://gerrit.wikimedia.org/r/177792

Patch-For-Review

When I saw the wiki with number of active bugzilla users, I wondered if you knew about the graph extension that lets you embed graphs into wiki pages. This was also an excuse for me to play with the extension:

bugzilla users: https://www.mediawiki.org/wiki/Talk:KLeduc_(WMF)/test_graph_extension
the data: https://www.mediawiki.org/wiki/Talk:KLeduc_(WMF)/test_graph_extension_data
the extension: https://www.mediawiki.org/wiki/Extension:Graph

This might be a cheap way to get a visualization of the data.

This might be a cheap way to get a visualization of the data.

Thanks. We might play with this once we have the actual queries in place and their text output working.

Status update:

  • Daniel triggered a "Phabricator monthly statistics" test email to my account last week.
  • For unknown reasons (those SQL queries work locally, maybe the cut -d " " -f3 in the script itself but hard to imagine), the following two values in that email are just empty:
    • Number of tasks created in (2014-11):
    • Number of open and stalled tasks in total:
  • Daniel had to kill the queries I had pasted above under Median age of open tasks by priority as they have a horrible runtime. Yesterday I created 2500 tasks in local Phab instance and it already took 50-75sec so I stole a better query from stackoverflow. Output difference is small enough as we're talking about days here. Comparison below.
MariaDB [phabricator_maniphest]> SELECT x.dateCreated from maniphest_task x, maniphest_task y WHERE (x.priority = "80" AND x.status = "open") GROUP BY x.dateCreated HAVING SUM(SIGN(1-SIGN(CAST(y.dateCreated AS SIGNED)-CAST(x.dateCreated AS SIGNED))))/COUNT(*) > .5 LIMIT 1;                           +-------------+
| dateCreated |
+-------------+
|  1418715703 |
+-------------+
1 row in set (59.33 sec)

MariaDB [phabricator_maniphest]> SELECT avg(t1.dateCreated) as median_val FROM (SELECT @rownum:=@rownum+1 as `row_number`, d.dateCreated FROM maniphest_task d, (SELECT @rownum:=0) r WHERE (d.priority = "80" AND d.status = "open") ORDER BY d.dateCreated) as t1, (SELECT COUNT(*) AS total_rows FROM maniphest_task d WHERE (d.priority = "80" AND d.status = "open")) as t2 WHERE 1 AND t1.row_number IN ( floor((total_rows+1)/2), floor((total_rows+2)/2));
+-----------------+
| median_val      |
+-----------------+
| 1418715752.0000 |
+-----------------+
1 row in set (0.24 sec)
  • For unknown reasons (those SQL queries work locally, maybe the cut -d " " -f3 in the script itself but hard to imagine), the following two values in that email are just empty:
    • Number of tasks created in (2014-11):
    • Number of open and stalled tasks in total:

when i run the query for "tasks created" directly on the database i get this:

MariaDB MISC m3 localhost phabricator_maniphest > SELECT COUNT(*) FROM maniphest_task WHERE 
    ->     FROM_UNIXTIME(dateCreated,'%Y%m')=date_format(NOW() - INTERVAL 1 MONTH,'%Y%m');
+----------+
| COUNT(*) |
+----------+
|     1841 |
+----------+

in the script the variable $result_taskscreated" becomes this though:

"created: COUNT(*)
1841"

i added the " , but the variable contains COUNT(*) and the line break and then the actual value.

I've added a few more patchsets / revisions to https://gerrit.wikimedia.org/r/#/c/177792/ :

  • worked around (fixed?) the two empty SQL query results by setting an empty column name via AS ' '
  • added the way more performant "median age of open tasks by priority" queries (see last comment)
  • removed backticks surrounding one variable to not get an invalid SQL query error for the median-age queries when calling those SQL queries from a bash script, works also as expected without them
  • as the new median-age queries' output is a float instead of an integer, cut the .0000 resp .5000 suffixes so there's no "invalid arithmetic operator" syntax error anymore

Works locally. Now waiting for Daniel to trigger another test email to me.

Works locally. Now waiting for Daniel to trigger another test email to me.

yep, works for me. i ran it on iridium and triggered the test email

Change 177792 merged by Dzahn:
phabricator: community metrics stats mail

https://gerrit.wikimedia.org/r/177792

Dzahn mentioned this in Unknown Object (Diffusion Commit).Dec 16 2014, 3:19 PM

merged PS10 https://gerrit.wikimedia.org/r/#/c/177792/

puppet-compiler: http://puppet-compiler.wmflabs.org/558/change/177792/html/iridium.eqiad.wmnet.html

on iridium:

Notice: /Stage[main]/Role::Phabricator::Main/Phabricator::Logmail[communitymetrics]/Cron[phabstatscron_communitymetrics]/ensure: created
Notice: /Stage[main]/Role::Phabricator::Main/Phabricator::Logmail[communitymetrics]/File[/srv/phab/tools/community_metrics.sh]/ensure: created

needed one more small fix:

https://gerrit.wikimedia.org/r/#/c/180193/

but now it works, puppet applied it and i tested it.

the communitymetrics@ mail alias should now receive this. (currently qgil and aklapper)

i ran it once manually for you to get it now without having to wait for the cronjob.

let me know if you don't receive it daily from now on

In T1003#850735, @Dzahn wrote:

let me know if you don't receive it daily from now on

Uh, daily? The idea was monthly (see topic), otherwise the queries using INTERVAL 1 MONTH would likely create some weird results. :)

This is just great. I'm so happy. Thank you!

Is it possible to add these to the report?

Number of accounts created
Number of users logging

And I agree with @Nemo_bis, sending this email to wikitech-l every month would be great.

can I get on this list?

You need to bribe mutante.

In T1003#851812, @Qgil wrote:

Is it possible to add these to the report?

Should be. A bit late so dropping that here instead of cooking up a Gerrit patch directly:

Number of accounts created

SELECT COUNT(id) FROM user WHERE FROM_UNIXTIME(dateCreated,'%Y%m')=date_format(NOW() - INTERVAL 1 MONTH,'%Y%m');
but using $sql_name in the script won't work here as that is on the "phabricator_user" DB instead.

Number of users logging

Already covered by "Volume of contributors (any activity) in last month" already, I think? If not, what's the 𝝙?
There's a "dateMotified" column but not sure what that really refers to, probably not helpful.

Hi Community Metrics team,

this is your automatic monthly Phabricator statistics mail.

Number of active users (any activity) in (2014-11): 433
Number of task authors in (2014-11): 269
Number of users who have closed tasks in (2014-11): 115
Number of tasks created in (2014-11): 1841
Number of tasks closed in (2014-11): 814
Number of tasks in the shell project closed as resolved,fixed in (2014-11): 4

Number of open and stalled tasks in total: 17463

Median age in days of open tasks by priority:
Unbreak now: 130
Needs Triage: 143
High: 206
Normal: 444
Low: 683
Needs Volunteer: 545

TODO: Numbers which refer to closed tasks might not be correct, as described in T1003.

Yours sincerely,
Fab Rick Aytor

(via community_metrics.sh on iridium at Tue Dec 16 15:50:10 UTC 2014)
In T1003#850735, @Dzahn wrote:

let me know if you don't receive it daily from now on

Uh, daily? The idea was monthly (see topic), otherwise the queries using INTERVAL 1 MONTH would likely create some weird results. :)

--> https://gerrit.wikimedia.org/r/#/c/180444/

can I get on this list?

You need to bribe mutante.

No, there's no need for bribes. It's an exim alias defined in the private puppet repo.

@palladium:~/private/modules/privateexim/files/wikimedia.org

commit 38f50add7650b9ab524a7a7202bf6a3226a9d695

done: dzahn: add Chase to communitymetrics@ alias

but using $sql_name in the script won't work here as that is on the "phabricator_user" DB instead.

Access denied for user 'phstats'@... The phstats user (and also the phuser user) don't have access to the "phabricator_user" database, only to the "phabricator_maniphest" database. This requires a change in DB GRANTS.

Dzahn mentioned this in Unknown Object (Diffusion Commit).Dec 17 2014, 11:31 AM

Uh, daily? The idea was monthly (see topic), otherwise the queries using INTERVAL 1 MONTH would likely create some weird results. :)

The cron job is changed now to run on the 1st of every month:

Puppet Name: phabstatscron_communitymetrics
0 0 1 * * /srv/phab/tools/community_metrics.sh

The third field is "day of the month".

In T1003#852987, @Dzahn wrote:

Access denied for user 'phstats'@... [...] This requires a change in DB GRANTS.

@Dzahn: Does that require a separate ticket, or how to proceed?

@Dzahn: Does that require a separate ticket, or how to proceed?

It does, but it already exists. I reopened T78311 with the additional requirement and it's linked here as a blocker.

Change 181564 had a related patch set uploaded (by Dzahn):
phab metrics: add number of accounts created

https://gerrit.wikimedia.org/r/181564

Patch-For-Review

Change 181564 merged by Dzahn:
phab metrics: add number of accounts created

https://gerrit.wikimedia.org/r/181564

Dzahn mentioned this in Unknown Object (Diffusion Commit).Dec 23 2014, 1:37 PM

Quim said: ( for some reason I can't hit "quote" on his comment)

Is it possible to add these to the report?
Number of accounts created

This has been added now with https://gerrit.wikimedia.org/r/#/c/181564/

"Number of accounts created in (2014-11): 299"

resolving per: "This task can be closed as soon as we can publish monthly metrics for total amount of users and active in the past month."

Dzahn removed Dzahn as the assignee of this task.Dec 23 2014, 1:43 PM

The report send on January 1 looks good! Thank you very much. I have forwarded it to wikitech-l, wondering whether there is a way to sending it there automatically.

Also, would it possible to generate one report for November 2014? It's the month of the Bugzilla migration and any data needs to be taken with a big grain of salt. Still, having Bugzilla data until October and Phabricator data from December, it would be good to cover November for posterity.

Is it possible to get teamwise statistics? (eg: Language Engineering) (Just wishlist) :)

Is it possible to get teamwise statistics? (eg: Language Engineering) (Just wishlist) :)

We are no thee yet (at all), but see T28.

In T1003#961920, @Qgil wrote:

The report send on January 1 looks good! Thank you very much. I have forwarded it to wikitech-l, wondering whether there is a way to sending it there automatically.

Yes, i can send it to wikitech by just adding the list address as a recipient. But one of the list admins has to also list the sender address in list settins somewhere as a verified "non-member" or we have to subscribe a fake user, otherwise the list will reject mail from an address it doesn't recognize as a subscribed member.

Also, would it possible to generate one report for November 2014? It's the month of the Bugzilla migration and any data needs to be taken with a big grain of salt. Still, having Bugzilla data until October and Phabricator data from December, it would be good to cover November for posterity.

Yes, let me try to make one for November manually. I will attach it here.

In T1003#963724, @Dzahn wrote:

Yes, let me try to make one for November manually. I will attach it here.

I can give you this part:

Number of accounts created in (2014-11): 299
Number of active users (any activity) in (2014-11): 443
Number of task authors in (2014-11): 284
Number of users who have closed tasks in (2014-11): 115
Number of tasks created in (2014-11): 1916
Number of tasks closed in (2014-11): 814
Number of tasks in the shell project closed as resolved,fixed in (2014-11): 4

Number of open and stalled tasks in total: 18296

Is that sufficient? It was easier to make than the lower part.

I doubt the November numbers because "Number of open and stalled tasks in total" is 18296 while it was 18125 in December. This really goes into details how metadata was migrated so I would not use any November numbers. Number of tasks created in Dec is 3424 - Number of tasks closed in Dec is 2529 = +895, but 18125-18296 = -171. And as written in that script, the numbers are not entirely correctly for December either (and I encourage anybody who wants to investigate the SQL and Phab'sDB structure to fix that).

Is it possible to get teamwise statistics? (eg: Language Engineering) (Just wishlist) :)

Separate tickets with specific criteria can be created. Such tickets will be wishlist/needs volunteer though; I don't think that I should spend much more time fiddling with Phabricator database schemes and SQL queries.

As requested in comments above i have added wikitech-l to the recipients of this mail. More specifically wikitech-l is now one of the recipients that receive communitrymetrics@ mail which is the sender of this as configured in the puppet phabricator role.

Now we need one of the admins of wikitech-l to allow that sender as a "non-member". Admins of wikitech-l are demon, aklapper and platonides.

andre__ added the address as allowed sender to wikitech-l. That means the list should now receive stats starting on Feb 1st.

Thank you very much!

In T1003#963841, @Dzahn wrote:

Number of accounts created in (2014-11): 299
Number of active users (any activity) in (2014-11): 443

These are the two numbers I really care about for November. Look credible...

Number of task authors in (2014-11): 284
Number of users who have closed tasks in (2014-11): 115

These are also interesting. Do you think that they can be trusted?

It is no problem if the stats on tasks are messed up in the month of the Bugzilla migration. Even for December (with the RT migration) I'm not expecting to read much there in task related stats.

Having a continuity of user related stats is great. Thank you again.

Number of tasks created in (2014-11): 1916
Number of tasks closed in (2014-11): 814
Number of tasks in the shell project closed as resolved,fixed in (2014-11): 4

Number of open and stalled tasks in total: 18296

Is that sufficient? It was easier to make than the lower part.

I would like to propose some changes in the data ranges used in this report, but before let's agree on T86630: Consolidating time ranges across tech community metrics. You opinions are welcome there.

Trying to get some comparable numbers pre and post-migration. It seems that the sum of MediaWiki (including VisualEditor and MobileFrontend) and Wikimedia products is what mapped best from Bugzilla to Phabricator. Some numbers (I hate IRC invisible characters with passion, btw):

# Number of active users in MediaWiki and Wikimedia products
# January 2015
$ curl -s 'http://bots.wmflabs.org/~wm-bot/logs/%23mediawiki-feed/%23mediawiki-feed.tar.gz' | tar xzf - -O --wildcards '201501*' | grep wikibugs | grep -E '(MediaWiki|MobileFrontend|VisualEditor|Wikimedia)' | grep -Eo ' [(][^)]{0,3}10[^)]+' | sort | uniq -c | wc -l
473
# October 2014
$ ... --wildcards '201410*' ...
462

Hello! I just received the monthly stats email ([Wikitech-l] Phabricator monthly statistics - 2017-10) and I'm wondering if there's a way to show a 3 or 6 month trend. I do not know from the email if new Phab users have slowed down to a trickle or doubled...and the same holds true for active users and other metrics. This is a very very low-priority ask, but I would love to see additional stats added that compare the current figures to the past 3 or 6 months, so we can learn more about the ongoing nature of the userbase. Thank you! Mel

@MelodyKramer: I nowadays dump two of these numbers at meta:Technical_Collaboration/Metrics#Project_management_.28Phabricator.29.
(Beforehand I dumped two of these numbers at mw:Community_metrics but the page scope was too blurry so I split wikimedia.biterg.io documentation, displaying of some graphs with data copied out of other systems, and a link hub to random metrics pages in Wikimedia into three separate pages.)

(And the usual disclaimer when it comes to interpreting numbers: "New Phab accounts" does not mean "active Phab accounts" for more than half of these accounts.)

Ah, thanks @Aklapper - is it possible to get that link (or the information contained within) added to the monthly email update I receive? I'm not sure who mans the wheels on that.

@MelodyKramer: The file to edit is https://phabricator.wikimedia.org/source/operations-puppet/browse/production/modules/phabricator/templates/community_metrics.sh.erb
If you feel like writing and proposing a patch in Gerrit: See https://www.mediawiki.org/wiki/Gerrit/Getting_started . The repo in this case is operations/puppet.git, the file is modules/phabricator/templates/community_metrics.sh.erb, and you'd have to replace the command git checkout master by git checkout production in this very case. And if you don't feel like I can also give it a shot. :)