
LocalisationUpdate not working since 2017-04-11
Closed, Resolved (Public)

Description

Last daily run of l10nupdate was on 2017-04-11 per Server Admin Log.

Event Timeline

19:36 <+thcipriani> well. FWIW, scap 3.5.5 was put on the servers on the 11th
19:37 <+thcipriani> which is https://github.com/wikimedia/scap/blob/release/debian/changelog#L1-L41

So with the new version of scap, cache_git_info now runs on every sync command (sync-file, sync-dir, etc.). Previously it only ran as part of a full scap sync. See T38271: [scap] Recompute and sync git version cache when sync-* are used.

In the logs (/var/log/l10nupdatelog/l10nupdate.log-20170424) it seems that's causing some l10nupdate breakage here:

02:13:49 Started cache_git_info
02:13:49 Finished cache_git_info (duration: 00m 00s)
02:13:49 Unhandled error:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/scap/cli.py", line 301, in run
    exit_status = app.main(app.extra_arguments)
  File "/usr/lib/python2.7/dist-packages/scap/main.py", line 620, in main
    return super(SyncL10n, self).main(*extra_args)
  File "/usr/lib/python2.7/dist-packages/scap/main.py", line 60, in main
    self._after_sync_common()
  File "/usr/lib/python2.7/dist-packages/scap/main.py", line 162, in _after_sync_common
    tasks.cache_git_info(version, self.config)
  File "/usr/lib/python2.7/dist-packages/scap/tasks.py", line 119, in cache_git_info
    with open(cache_file, 'w') as f:
IOError: [Errno 13] Permission denied: u'/srv/mediawiki-staging/php-1.29.0-wmf.20/cache/gitinfo/info.json'
02:13:49 sync-l10n failed: <IOError> [Errno 13] Permission denied: u'/srv/mediawiki-staging/php-1.29.0-wmf.20/cache/gitinfo/info.json'
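
To narrow down a failure like the one in the traceback, one could check whether the user running l10nupdate is able to write the cache file at all. The following is only a rough diagnostic sketch: `can_write_cache` is a hypothetical helper, not part of scap, and `os.access` reports permissions for the real user ID at the moment of the call, so it can disagree with what a later `open()` does.

```python
import os


def can_write_cache(cache_file):
    """Roughly check whether the current user could open cache_file for writing.

    Hypothetical helper for diagnosis only; it is not part of scap.
    """
    if os.path.exists(cache_file):
        # Existing file: the file itself must be writable.
        return os.access(cache_file, os.W_OK)
    # Missing file: creating it needs write and search permission on the parent.
    parent = os.path.dirname(cache_file) or "."
    return os.path.isdir(parent) and os.access(parent, os.W_OK | os.X_OK)


if __name__ == "__main__":
    # Path taken from the log above.
    path = "/srv/mediawiki-staging/php-1.29.0-wmf.20/cache/gitinfo/info.json"
    print(path, "writable:", can_write_cache(path))
```

Run as the same user that runs l10nupdate, this would show whether the Permission denied error is a plain ownership or mode problem on the file or its directory.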

It would be appropriate to revert scap to a known-working version.

That update also includes fixes for other scap commands that deployers use multiple times per week, often to fix production breakage quickly. Given that this breakage went unnoticed for 2 weeks, and that any needed localization updates can go out via a SWAT until this is fixed, we won't revert scap to a previous version. Please do feel free to request an update during a SWAT if one is needed.

To be clear, we (RelEng) are actively looking into fixing this issue (sorry for not making that clear previously).

As a simple workaround, perhaps cache_git_info failures could be made non-fatal? In other words, set up scap such that if cache_git_info fails, it still proceeds with the rest of the sync.
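
A minimal sketch of what that workaround could look like, assuming the cache step is wrapped so that I/O errors are logged instead of raised. This is not the actual scap fix: `cache_git_info_stub`, `run_non_fatal`, and `after_sync` are hypothetical names used for illustration, and the stub only stands in for the real `tasks.cache_git_info` seen in the traceback.

```python
import logging

logger = logging.getLogger("sync_sketch")


def cache_git_info_stub(cache_file):
    """Stand-in for the real cache step: write a small JSON file."""
    with open(cache_file, "w") as f:
        f.write("{}")


def run_non_fatal(step, *args):
    """Run an optional step; log and continue on I/O errors.

    Returns True if the step succeeded, False if it failed and was skipped.
    """
    try:
        step(*args)
        return True
    except (IOError, OSError) as exc:
        logger.warning("%s failed, continuing: %s", step.__name__, exc)
        return False


def after_sync(cache_file):
    """Hypothetical post-sync hook where the cache refresh is best-effort."""
    cached = run_non_fatal(cache_git_info_stub, cache_file)
    # The rest of the sync would proceed here regardless of `cached`.
    return {"cache_updated": cached, "sync_completed": True}
```

The trade-off is that a stale or missing git info cache would then only show up as a warning in the log, so someone would still need to watch for it.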

See activity above ^^. Fix is ready and will roll out with the next scap release.

Seems to work again. Thank you!