internal_api_error_LocalFileLockError while uploading file via API to commons.wikimedia.beta.wmflabs.org
Closed, ResolvedPublic
Actions

Description

I am using commons_upload to upload files using the API.

It works fine when targeting MediaWiki-Vagrant or production Commons, but it fails when targeting beta cluster.

$ ls screenshots/
VisualEditor_toolbar-ta.png

vagrant (works fine)

$ export MEDIAWIKI_API_UPLOAD_URL=http://127.0.0.1:8080/w/api.php
$ export MEDIAWIKI_USER=Admin
$ export MEDIAWIKI_PASSWORD=vagrant
$ bundle exec upload
Uploading ./screenshots/VisualEditor_toolbar-ta.png

production (works fine)

$ export MEDIAWIKI_API_UPLOAD_URL=https://commons.wikimedia.org/w/api.php
$ export MEDIAWIKI_USER=LanguageScreenshotBot
$ export MEDIAWIKI_PASSWORD=not-the-real-one
$ bundle exec upload
Uploading ./screenshots/VisualEditor_toolbar-ta.png

beta (fails)

$ export MEDIAWIKI_API_UPLOAD_URL=https://commons.wikimedia.beta.wmflabs.org/w/api.php
$ export MEDIAWIKI_USER=LanguageScreenshotBot
$ export MEDIAWIKI_PASSWORD=not-the-real-one
$ bundle exec upload
Uploading ./screenshots/VisualEditor_toolbar-ta.png
/usr/local/lib/ruby/gems/2.3.0/gems/mediawiki_api-0.7.0/lib/mediawiki_api/client.rb:211:in `send_request': [V7wzDQpEEH8AAHb1ERgAAAAD] Exception Caught: Could not acquire lock for "Array". (internal_api_error_LocalFileLockError) (MediawikiApi::ApiError)
	from /usr/local/lib/ruby/gems/2.3.0/gems/mediawiki_api-0.7.0/lib/mediawiki_api/client.rb:232:in `raw_action'
	from /usr/local/lib/ruby/gems/2.3.0/gems/mediawiki_api-0.7.0/lib/mediawiki_api/client.rb:36:in `action'
	from /usr/local/lib/ruby/gems/2.3.0/gems/mediawiki_api-0.7.0/lib/mediawiki_api/client.rb:166:in `upload_image'
	from /Users/z/Documents/github/amire80/commons_upload/lib/commons_upload.rb:34:in `image'
	from /Users/z/Documents/github/amire80/commons_upload/lib/commons_upload.rb:45:in `block in images'
	from /Users/z/Documents/github/amire80/commons_upload/lib/commons_upload.rb:43:in `each'
	from /Users/z/Documents/github/amire80/commons_upload/lib/commons_upload.rb:43:in `images'
	from /Users/z/Documents/github/amire80/commons_upload/bin/upload:4:in `<top (required)>'
	from /usr/local/lib/ruby/gems/2.3.0/bin/upload:23:in `load'
	from /usr/local/lib/ruby/gems/2.3.0/bin/upload:23:in `<main>'

I have found V7wzDQpEEH8AAHb1ERgAAAAD in the logs.

{
  "_index": "logstash-2016.08.23",
  "_type": "mediawiki",
  "_id": "AVa3J5W_hLHGnzHlEuTQ",
  "_score": 1,
  "_source": {
    "message": "Failed to lock 'VisualEditor_toolbar-ta.png'",
    "@version": 1,
    "@timestamp": "2016-08-23T11:27:19.000Z",
    "type": "mediawiki",
    "host": "deployment-mediawiki02",
    "level": "WARNING",
    "tags": [
      "syslog",
      "es",
      "es"
    ],
    "channel": "LocalFile",
    "normalized_message": "Failed to lock '{file}'",
    "url": "/w/api.php",
    "ip": "10.68.18.103",
    "http_method": "POST",
    "server": "commons.wikimedia.beta.wmflabs.org",
    "referrer": null,
    "unique_id": "V7wzDQpEEH8AAHb1ERgAAAAD",
    "wiki": "commonswiki",
    "mwversion": "1.28.0-alpha",
    "reqId": "V7wzDQpEEH8AAHb1ERgAAAAD",
    "file": "VisualEditor_toolbar-ta.png"
  },
  "fields": {
    "@timestamp": [
      1471951639000
    ]
  }
}

Related Objects
Search...

Status	Assigned	Task
Resolved	zeljkofilipin	T94150 Fix scenarios that fail at en.wikipedia.beta.wmflabs.org or do not run them daily
Resolved	zeljkofilipin	T94162 Fix VisualEditor scenarios that fail at en.wikipedia.beta.wmflabs.org or do not run them daily
Invalid	None	T138751 Update VisualEditor user guide
Resolved	Esanders	T136675 Make new screenshots to show off the LaTeX editor
Resolved	Esanders	T119482 Fix language screenshots browser test on VisualEditor
Resolved	zeljkofilipin	T132574 Port language screenshots for VisualEditor from Ruby to JavaScript
Resolved	zeljkofilipin	T139613 Run language screenshots script for VisualEditor in Jenkins
Resolved	zeljkofilipin	T143563 Upload language screenshot files using commons_upload ruby gem
Resolved	zeljkofilipin	T143655 internal_api_error_LocalFileLockError while uploading file via API to commons.wikimedia.beta.wmflabs.org

Event Timeline

zeljkofilipin created this task.Aug 23 2016, 11:53 AM

Restricted Application added subscribers: Poyekhali, Aklapper. · View Herald TranscriptAug 23 2016, 11:53 AM

zeljkofilipin added a project: Beta-Cluster-Infrastructure.Aug 23 2016, 11:53 AM

Anomie moved this task from Unsorted to Non-core-API stuff on the MediaWiki-Action-API board.Aug 23 2016, 1:23 PM

Anomie added a project: MediaWiki-Uploading.Aug 23 2016, 1:26 PM

Restricted Application added a project: Multimedia. · View Herald TranscriptAug 23 2016, 1:26 PM

Looking at other log messages in that request (https://logstash-beta.wmflabs.org/goto/03d19a17b4404c7708e2782fb44cbe59), I see a few things that could use checking:

A lot of "Lua script error on server "10.68.16.231": LOADING Redis is loading the dataset in memory", which if it's using redis for locking might mean the locking failed because redis happened to be restarting at that moment. You might just try again.
An error message "Could not acquire lock for "Array".", which could mean something somewhere in there is getting an array where it expected a string. If it's trying to lock "Array" for everything rather than the correct file paths, that could be trouble.

Restricted Application added a subscriber: Matanya. · View Herald TranscriptAug 23 2016, 1:51 PM

In T143655#2575545, @Anomie wrote:

A lot of "Lua script error on server "10.68.16.231": LOADING Redis is loading the dataset in memory", which if it's using redis for locking might mean the locking failed because redis happened to be restarting at that moment. You might just try again.

It did not happen just once, it is reproducible every time. Sorry, forgot to make that explicit.

Do the reproductions lack the "Redis is loading" messages during the request?

Sorry, I am not familiar with logstash. How did I get something like https://logstash-beta.wmflabs.org/goto/03d19a17b4404c7708e2782fb44cbe59 for another request?

In the search bar near the top of the page, change the V7wzDQpEEH8AAHb1ERgAAAAD in "reqId:V7wzDQpEEH8AAHb1ERgAAAAD" to whatever the reqId is for your retry request.

Tried again.

$ ls screenshots/
VisualEditor_category_item-hr.png

$ export MEDIAWIKI_API_UPLOAD_URL=https://commons.wikimedia.beta.wmflabs.org/w/api.php
$ export MEDIAWIKI_USER=LanguageScreenshotBot
$ export MEDIAWIKI_PASSWORD=not-the-real-one

$ bundle exec upload
Uploading ./screenshots/VisualEditor_category_item-hr.png
/usr/local/lib/ruby/gems/2.3.0/gems/mediawiki_api-0.7.0/lib/mediawiki_api/client.rb:211:in `send_request': [V71kiwpEEaoAAEaiEmoAAAAF] Exception Caught: Could not acquire lock for "mwstore&#58;//local-multiwrite/local-public/d/de/VisualEditor_category_item-hr.png". (internal_api_error_LocalFileLockError) (MediawikiApi::ApiError)
	from /usr/local/lib/ruby/gems/2.3.0/gems/mediawiki_api-0.7.0/lib/mediawiki_api/client.rb:232:in `raw_action'
	from /usr/local/lib/ruby/gems/2.3.0/gems/mediawiki_api-0.7.0/lib/mediawiki_api/client.rb:36:in `action'
	from /usr/local/lib/ruby/gems/2.3.0/gems/mediawiki_api-0.7.0/lib/mediawiki_api/client.rb:166:in `upload_image'
	from /Users/z/Documents/github/amire80/commons_upload/lib/commons_upload.rb:34:in `image'
	from /Users/z/Documents/github/amire80/commons_upload/lib/commons_upload.rb:45:in `block in images'
	from /Users/z/Documents/github/amire80/commons_upload/lib/commons_upload.rb:43:in `each'
	from /Users/z/Documents/github/amire80/commons_upload/lib/commons_upload.rb:43:in `images'
	from /Users/z/Documents/github/amire80/commons_upload/bin/upload:4:in `<top (required)>'
	from /usr/local/lib/ruby/gems/2.3.0/bin/upload:23:in `load'
	from /usr/local/lib/ruby/gems/2.3.0/bin/upload:23:in `<main>'

https://logstash-beta.wmflabs.org/goto/0c416dc6114efee609ed469aa4ded6c3

zeljkofilipin added a comment.Aug 24 2016, 9:19 AM

This comment was removed by zeljkofilipin.

Date filter prevent that reqId from being shown. Change it to "Last 7 days" and it finds it: https://logstash-beta.wmflabs.org/goto/b2d8adc1f8c4ec2b673a5a1fdc30e2c6 :-}

Thanks @hashar, did not notice time filter.

zeljkofilipin moved this task from Backlog 🪒 to Deep work 🌊 on the User-zeljkofilipin board.Aug 24 2016, 9:25 AM

On beta in wmf-config/fileback-labs.php the lock manager is configured line 157 via $wgLockManagers[] which in turns point to:

wmf-config/LabsServices.php

'redis_lock' => [
    'rdb1' => '10.68.16.177', // deployment-redis01.deployment-prep.eqiad.wmflabs
    'rdb2' => '10.68.16.231', // deployment-redis02.deployment-prep.eqiad.wmflabs
],

The reqId query in logstash shows a bunch of:

Lua script error on server "10.68.16.231": LOADING Redis is loading the dataset in memory

Maybe that is just a warning. Then looking at 10.68.16.231 which is deployment-redis02.deployment-prep.eqiad.wmflabs , looks like redis has some issue:

[675072.671326] init: redis-instance-tcp_6379 main process (31402) terminated with status 1
[675076.660993] init: redis-instance-tcp_6379 main process ended, respawning
[675076.660975] init: redis-instance-tcp_6379 main process (31407) terminated with status 1
[675080.677410] init: redis-instance-tcp_6379 main process ended, respawning
....

Mentioned in SAL [2016-08-24T09:38:10Z] <hashar> deployment-redis02 initctl stop redis-instance-tcp_6379 && initctl start redis-instance-tcp_6379 | That did not fix it magically though T143655

On deployment-redis02:

/var/log/redis/tcp_6379.log

[1627] 24 Aug 09:41:18.216 # Unable to set the max number of files limit to 10032 (Operation not permitted), setting the max clients configuration to 3984.
                _._                                                  
           _.-``__ ''-._                                             
      _.-``    `.  `_.  ''-._           Redis 2.8.4 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._                                   
 (    '      ,       .-`  | `,    )     Running in stand alone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
 |    `-._   `._    /     _.-'    |     PID: 1627
  `-._    `-._  `-./  _.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |           http://redis.io        
  `-._    `-._`-.__.-'_.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |                                  
  `-._    `-._`-.__.-'_.-'    _.-'                                   
      `-._    `-.__.-'    _.-'                                       
          `-._        _.-'                                           
              `-.__.-'                                               

[1627] 24 Aug 09:41:18.218 # Server started, Redis version 2.8.4
[1627] 24 Aug 09:41:22.102 # Bad file format reading the append only file: make a backup of your AOF file, then use ./redis-check-aof --fix <filename>

I have no idea what AOF stands for nor which AOF file to fix up :(

Mentioned in SAL [2016-08-24T09:43:51Z] <hashar> T143655 stopping redis 6379 on deployment-redis02 : initctl stop redis-instance-tcp_6379

Found the config in /etc/redis/tcp_6379.conf which has the AOF set to /srv/redis/deployment-redis02-6379.aof

# sudo su - redis -s /bin/bash
$ redis-check-aof /srv/redis/deployment-redis02-6379.aof
0x        1d4509a4: Expected prefix '
AOF analyzed: size=491065344, ok_up_to=491063716, diff=1628
AOF is not valid

$ redis-check-aof --fix /srv/redis/deployment-redis02-6379.aof
0x        1d4509a4: Expected prefix '
AOF analyzed: size=491065344, ok_up_to=491063716, diff=1628
This will shrink the AOF from 491065344 bytes, with 1628 bytes, to 491063716 bytes
Continue? [y/N]: y
Successfully truncated AOF

$ redis-check-aof /srv/redis/deployment-redis02-6379.aof
AOF analyzed: size=491063716, ok_up_to=491063716, diff=0
AOF is valid

Restarted it with initctl start redis-instance-tcp_6379 and the log file shows:

[2850] 24 Aug 09:48:01.076 # Unable to set the max number of files limit to 10032 (Operation not permitted), setting the max clients configuration to 3984.
                _._                                                  
           _.-``__ ''-._                                             
      _.-``    `.  `_.  ''-._           Redis 2.8.4 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._                                   
 (    '      ,       .-`  | `,    )     Running in stand alone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
 |    `-._   `._    /     _.-'    |     PID: 2850
  `-._    `-._  `-./  _.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |           http://redis.io        
  `-._    `-._`-.__.-'_.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |                                  
  `-._    `-._`-.__.-'_.-'    _.-'                                   
      `-._    `-.__.-'    _.-'                                       
          `-._        _.-'                                           
              `-.__.-'                                               

[2850] 24 Aug 09:48:01.077 # Server started, Redis version 2.8.4
[2850] 24 Aug 09:48:05.050 * DB loaded from append only file: 3.972 seconds
[2850] 24 Aug 09:48:05.051 * The server is now ready to accept connections on port 6379

Mentioned in SAL [2016-08-24T09:50:33Z] <hashar> deployment-redis02 fixed AOF file /srv/redis/deployment-redis02-6379.aof and restarted the redis instance should fix T143655 and might help T142600

Stashbot mentioned this in T142600: Various browser tests failing with MediawikiApi::LoginError.Aug 24 2016, 9:50 AM

Thanks @hashar, the problem seems to be fixed! :)

This instance probably crashed or something recently - I caught it failing puppet due to apt-get failures, because something hadn't been able to fully install.

Aklapper removed a subscriber: Anomie.Oct 16 2020, 5:42 PM

internal_api_error_LocalFileLockError while uploading file via API to commons.wikimedia.beta.wmflabs.orgClosed, ResolvedPublicActions

Description

Related ObjectsSearch...

Event Timeline

internal_api_error_LocalFileLockError while uploading file via API to commons.wikimedia.beta.wmflabs.org
Closed, ResolvedPublic
Actions

Related Objects
Search...