Monuments database dropped to 10% of its contents
Closed, Resolved, Public

Description

It suddenly dropped from 1.5M to 119K.

Event Timeline

Mentioned in SAL (#wikimedia-cloud) [2017-11-17T19:30:26Z] <JeanFred> Started a new harvest to better investigate T180833

ERROR: Unknown error occurred when processing country gb-sct in lang en
(2006, "MySQL server has gone away (error(32, 'Broken pipe'))")
ERROR: Unknown error occurred when processing country es-vc in lang ca
(0, '')
ERROR: Unknown error occurred when processing country th in lang th
(0, '')
ERROR: Unknown error occurred when processing country es-ct in lang ca
(0, '')
ERROR: Unknown error occurred when processing country es in lang es
(0, '')
…

Mentioned in SAL (#wikimedia-cloud) [2017-11-17T21:31:33Z] <JeanFred> Reverted to old database replicas (via git reset HEAD~1 && git stash) as part of T180833 investigation

Same thing

ERROR: Unknown error occurred when processing country gb-sct in lang en
(2006, "MySQL server has gone away (error(32, 'Broken pipe'))")
ERROR: Unknown error occurred when processing country es-vc in lang ca
(0, '')
ERROR: Unknown error occurred when processing country th in lang th
(0, '')
ERROR: Unknown error occurred when processing country es-ct in lang ca
(0, '')
ERROR: Unknown error occurred when processing country es in lang es
(0, '')

Might be related to the move to pymysql :-/

Latest run, with MySQLdb and old replicas, did succeed. Monuments DB is back to 1.5M. Now, is that just a coincidence...

Which change did you revert for it to work?

So the reverted commit was https://gerrit.wikimedia.org/r/#/c/390895/.

@JeanFred Should we revert this in the repo as well (to get deployed code in sync with master again)?

The issue has been resolved (monuments are back). Cleanup and figuring out the underlying issue is part of T200101: Resolve usage of pymysql vs. MySQLdb.

I am re-opening this task, because the same thing happened again. Many of the countries "lost" all their monuments.

Oh geez, indeed: https://commons.wikimedia.org/wiki/Special:Diff/345626245

Thanks for flagging this @Atsirlin − will investigate.

Retrieving 50 pages from wikipedia:de.
/mnt/nfs/labstore-secondary-tools-project/heritage/heritage/bin/update_monuments.sh: line 29: 31241 Killed                  $PYWIKIBOT_BIN $ERFGOED_PATH/update_database.py -fullupdate -log -skip_wd
/mnt/nfs/labstore-secondary-tools-project/heritage/heritage/bin/update_monuments.sh: line 32: jstop: command not found
2019-04-12_05:52:42 Update monuments_all table...
/mnt/nfs/labstore-secondary-tools-project/heritage/heritage/bin/update_monuments.sh: line 39: jsub: command not found
2019-04-12_05:53:42 Make statistics...
WARNING: /data/project/heritage/heritage/erfgoedbot/database_statistics.py:28: Warning: Truncated incorrect DOUBLE value: ''
  cursor.execute(query)

Page [[commons:Commons:Monuments database/Statistics]] saved

So update_database.py was killed somehow, and then the shell script went on to replace the database table anyway…

We should set set -e (errexit), possibly together with pipefail, or something similar, so that the Python script being killed aborts the entire shell script; otherwise this is bound to happen again.

(We likely will not be able to keep the shell script anyway, since it calls different runtimes that need different containers >_>)
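
For reference, a minimal sketch of the fail-fast behaviour suggested above. This is not the actual update_monuments.sh: the echo lines and the self-killing bash -c are hypothetical stand-ins for the harvest and the monuments_all table update, and run_update is an illustrative name. The inner script runs in its own bash with -e, so a killed step stops everything after it.

```shell
#!/bin/bash
# Sketch only: every step below is a hypothetical stand-in, not the real
# heritage tooling.

run_update() {
    # -e (errexit): stop at the first command that exits non-zero.
    # -o pipefail: a pipeline fails if any stage fails (no pipelines are
    # used here; it is included because it usually goes along with -e).
    bash -e -o pipefail <<'EOF'
echo "harvest started"
# Stand-in for update_database.py being killed: exit status 128 + 9 = 137.
bash -c 'kill -KILL $$'
# Stand-in for the table update; not reached when the harvest step fails.
echo "replacing monuments_all"
EOF
}

status=0
# stderr is discarded only to hide bash's "Killed" job notice.
output=$(run_update 2>/dev/null) || status=$?

echo "exit status: $status"
echo "steps that ran: $output"
```

Run as-is, this should report a non-zero exit status and list only the harvest step, i.e. the table replacement is skipped. One known caveat: errexit is ignored for commands run as part of an if condition or an || / && list, which is why the sketch starts a separate bash instead of relying on set -e inside a function.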

Mentioned in SAL (#wikimedia-cloud) [2019-04-12T10:04:17Z] <JeanFred> Started a new harvest to better investigate T180833

This time it failed for ru: only. Could anyone take a look, please?

Aklapper subscribed.

Removing task assignee due to inactivity, as this open task has been assigned to the same person for more than two years (see the emails sent to the task assignee on Oct 27 and Nov 23). Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome.
(See https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips on how to best manage your individual work in Phabricator.)

Atsirlin claimed this task.

Changing to Resolved, because the database is working fine now.