[Spike] Numbers of patrol actions in Wikidata and reasons behind it
Closed, Resolved (Public)

Description

99.01% of the rows in Wikidata's logging table are patrol actions, but 50% of them are autopatrol. This doesn't make sense and needs to be investigated before moving forward.

Event Timeline

Ladsgroup created this task.

This is done. The reason is pretty obvious from this table:

mysql:research@analytics-store.eqiad.wmnet [wikidatawiki]> select left(log_timestamp,6), count(*) from logging where log_type = 'patrol' and log_action != 'autopatrol' group by left(log_timestamp, 6);
+-----------------------+----------+
| left(log_timestamp,6) | count(*) |
+-----------------------+----------+
| 201210                |      931 |
| 201211                |   315267 |
| 201212                |  2179490 |
| 201301                |  2584194 |
| 201302                |  2747071 |
| 201303                |  9365166 |
| 201304                | 15039752 |
| 201305                | 15518277 |
| 201306                |  4635903 |
| 201307                |  9003285 |
| 201308                |  4794089 |
| 201309                |  5865640 |
| 201310                |  9059273 |
| 201311                |  7799062 |
| 201312                |  6539851 |
| 201401                |  9317568 |
| 201402                |  5922768 |
| 201403                |  5579434 |
| 201404                |  5512672 |
| 201405                | 11356184 |
| 201406                |  6105953 |
| 201407                |  6727928 |
| 201408                |  5959171 |
| 201409                |  6023303 |
| 201410                |  8753257 |
| 201411                |  8433379 |
| 201412                |  6795697 |
| 201501                |  7296656 |
| 201502                |  7469811 |
| 201503                |  8020426 |
| 201504                |  5422669 |
| 201505                |  7160377 |
| 201506                |  4610084 |
| 201507                | 12639598 |
| 201508                |  8656328 |
| 201509                |  7376620 |
| 201510                | 12343839 |
| 201511                | 11102836 |
| 201512                |  9546019 |
| 201601                | 10770807 |
| 201602                | 10084854 |
| 201603                |  8511197 |
| 201604                |    61832 |
| 201605                |    17048 |
| 201606                |     6726 |
| 201607                |     7135 |
| 201608                |    18273 |
| 201609                |     9928 |
| 201610                |    11828 |
| 201611                |    15226 |
| 201612                |    10220 |
| 201701                |    22360 |
| 201702                |    22743 |
| 201703                |    18654 |
| 201704                |    19703 |
| 201705                |    24596 |
| 201706                |    15471 |
| 201707                |    15659 |
| 201708                |    17062 |
| 201709                |    16964 |
| 201710                |    34329 |
| 201711                |    16142 |
| 201712                |    21531 |
| 201801                |    18878 |
| 201802                |    22566 |
| 201803                |      115 |
+-----------------------+----------+
66 rows in set (4 min 46.31 sec)

Before April 2016, we didn't distinguish between manual patrol and autopatrol (the first entry with log_action = 'autopatrol' has the timestamp 20160330190748), so the monthly counts above include automatic patrols up to that point and drop sharply afterwards. What happens next is really a PM/Platform decision, IMO.
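
To double-check the cut-over, one could split the same monthly counts by log_action and look up the earliest autopatrol row. This is only a sketch that reuses the columns from the query above (log_type, log_action, log_timestamp). It has not been run here, the aliases log_month and entries are made up for readability, and on a table this size it would probably take several minutes too.

```sql
-- Monthly patrol-log rows, split by action (sketch; aliases are illustrative).
SELECT LEFT(log_timestamp, 6) AS log_month,
       log_action,
       COUNT(*) AS entries
FROM logging
WHERE log_type = 'patrol'
GROUP BY LEFT(log_timestamp, 6), log_action;

-- Earliest row that was logged as an autopatrol.
SELECT MIN(log_timestamp)
FROM logging
WHERE log_type = 'patrol'
  AND log_action = 'autopatrol';
```

If the explanation holds, the first query should show no 'autopatrol' rows before 201603, and the second should return the 20160330190748 timestamp quoted above.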

Reason and way forward are clear now.