
Adding a New Item Manually or Using a Bot After Importing a Dump
Open, Needs Triage · Public · BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

  • 1. I produced an XML dump from my Wikibase instance using [[ https://github.com/mediawiki-client-tools/mediawiki-dump-generator | dumpgenerator ]].
  • 2. I followed the directions here (installing Docker, checking out the files, preparing the files, and customizing the environment).
  • 3. I ran the following Bash script with no issues:
# Load config.ini
source config.ini

cd "$WBDOCKERPATH"

# Run docker-compose to set up Wikibase Suite instance.
docker-compose -f docker-compose.yml -f docker-compose.extra.yml up -d

# Run update and install packages.
winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c "apt-get -y update && apt-get -y install vim && apt-get -y install python3 && apt-get -y install python3-pip"

# Update for pip.
winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c "python3 -m pip config --global set global.break-system-packages true"

# Clone WikibaseQualityConstraints, making sure to use the REL1_41 branch.
if [ ! -d "$WBWRAPPERPATH/WikibaseQualityExtensions" ]; then
    git clone -b REL1_41 https://gerrit.wikimedia.org/r/p/mediawiki/extensions/WikibaseQualityConstraints.git "$WBWRAPPERPATH/WikibaseQualityExtensions"
fi
docker cp "$WBWRAPPERPATH/WikibaseQualityExtensions" wbdocker-wikibase-1:/var/www/html/extensions/WikibaseQualityConstraints

# Clone WikibaseLexeme, making sure to use the REL1_41 branch.
if [ ! -d "$WBWRAPPERPATH/WikibaseLexeme" ]; then
    git clone -b REL1_41 https://gerrit.wikimedia.org/r/p/mediawiki/extensions/WikibaseLexeme.git "$WBWRAPPERPATH/WikibaseLexeme"
fi
docker cp "$WBWRAPPERPATH/WikibaseLexeme" wbdocker-wikibase-1:/var/www/html/extensions/WikibaseLexeme

# Load WikibaseQualityConstraints.
if winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c "grep -Fxq \"wfLoadExtension( 'WikibaseQualityConstraints' );\" /var/www/html/LocalSettings.php";
then
    :
else
    winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c "echo \"wfLoadExtension( 'WikibaseQualityConstraints' );\" >> /var/www/html/LocalSettings.php"
fi

# Load WikibaseLexeme.
if winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c "grep -Fxq \"wfLoadExtension( 'WikibaseLexeme' );\" /var/www/html/LocalSettings.php";
then
    :
else
    winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c "echo \"wfLoadExtension( 'WikibaseLexeme' );\" >> /var/www/html/LocalSettings.php"
fi

# Run update script.
winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c "php /var/www/html/maintenance/update.php"

# Copy over XML dump to upload.
docker cp "$XMLDUMPPATH" wbdocker-wikibase-1:/var/tmp/dump.xml

# Update LocalSettings.php to allow for entity import.
if winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c "grep -Fxq \"\$wgWBRepoSettings['allowEntityImport'] = true;\" /var/www/html/LocalSettings.php";
then
    :
else
    winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c "echo '$'\"wgWBRepoSettings['allowEntityImport'] = true;\" >> /var/www/html/LocalSettings.php"
fi
  • 4. Next I manually created a bot account.
  • 5. Then I ran this Bash script:
# Load config.ini
source config.ini

cd "$WBDOCKERPATH"

# Create an entity and then delete it using this script.
winpty python "$WBWRAPPERPATH/o2wb/create_and_destroy.py" -c "$WBWRAPPERPATH/config.json"

# Upload XML dump.
winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c "php /var/www/html/maintenance/importDump.php < /var/tmp/dump.xml"

# Run rebuild script.
winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c "php /var/www/html/maintenance/rebuildall.php"

# Run jobs script.
winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c "php /var/www/html/maintenance/runJobs.php --memory-limit 512M"

# Run site statistics script.
winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c "php /var/www/html/maintenance/initSiteStats.php --update"

# Download the SQL script to rebuild Wikibase identifiers.
# Originates from: https://www.wikibase.consulting/transferring-wikibase-data-between-wikis/
curl https://gist.githubusercontent.com/JeroenDeDauw/c86a5ab7e2771301eb506b246f1af7a6/raw/rebuildWikibaseIdCounters.sql -o "$WBWRAPPERPATH/rebuildWikibaseIdCounters.sql"
docker cp "$WBWRAPPERPATH/rebuildWikibaseIdCounters.sql" wbdocker-wikibase-1:/var/www/html/maintenance/rebuildWikibaseIdCounters.sql

# Run SQL script to rebuild Wikibase identifiers.
winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c "php /var/www/html/maintenance/sql.php /var/www/html/maintenance/rebuildWikibaseIdCounters.sql"

# Create an entity and then delete it using this script.
winpty python "$WBWRAPPERPATH/o2wb/create_and_destroy.py" -c "$WBWRAPPERPATH/config.json"

The Python script (create_and_destroy.py) is written as follows:

#!/usr/bin/env python

import argparse
import json
import wikibaseintegrator

from wikibaseintegrator import WikibaseIntegrator
from wikibaseintegrator.wbi_config import config as wbi_config
from wikibaseintegrator import wbi_login

def main():
    # Create the parser.
    parser = argparse.ArgumentParser()

    # Add argument.
    parser.add_argument("-c", '--config', type=str, required=True)

    # Parse argument.
    args = parser.parse_args()

    # Read in configuration file.
    with open(args.config) as f:
        data = json.load(f)

    # Configure instance.
    wbi_config['MEDIAWIKI_API_URL'] = data['MEDIAWIKI_API_URL']
    wbi_config['SPARQL_ENDPOINT_URL'] = data['SPARQL_ENDPOINT_URL']
    wbi_config['WIKIBASE_URL'] = data['WIKIBASE_URL']

    # Log in.
    login_instance = wbi_login.Login(user=data['USER'], password=data['PASSWORD'])

    # Instantiate.
    wbi = WikibaseIntegrator(login=login_instance)

    # Start new item.
    entity = wbi.item.new()
    entity.aliases.set('en', 'Test123')
    entity.descriptions.set('en', 'A test entity.')

    # Write new item.
    entity.write()

    # Get item QID.
    entity_qid = str(entity.id)

    # Get item.
    entity = wbi.item.get(entity_qid)

    # Check JSON response.
    entity_json = entity.get_json()
    success_1 = False
    success_2 = False
    if 'descriptions' in entity_json:
        if 'en' in entity_json['descriptions']:
            if 'value' in entity_json['descriptions']['en']:
                if entity_json['descriptions']['en']['value'] == 'A test entity.':
                    success_2 = True
    if 'aliases' in entity_json:
        if 'en' in entity_json['aliases']:
            if len(entity_json['aliases']['en']) > 0:
                if 'value' in entity_json['aliases']['en'][0]:
                    if entity_json['aliases']['en'][0]['value'] == 'Test123':
                        success_1 = True

    if success_1 and success_2:
        print('Entity successfully created.')
    else:
        print('Entity not found. Creation failed.')

    # Delete item.
    entity.delete()
    try:
        entity = wbi.item.get(entity_qid)
        print("Entity deletion unsuccessful. Something went wrong. Exiting...")
        exit()
    except wikibaseintegrator.wbi_exceptions.MissingEntityException:
        print("Entity deletion successful. Exiting...")
        exit()

if __name__ == "__main__":
    main()
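
The config.json passed via -c only supplies the endpoints and bot credentials read at the top of main(). Its shape is roughly the following (all values are placeholders; USER/PASSWORD is the bot account created in step 4, in the usual Username@botname bot-password form). A quick way to write it from the shell:

# Sketch of the config.json read by create_and_destroy.py (placeholder values only).
cat > config.json <<'EOF'
{
    "MEDIAWIKI_API_URL": "http://localhost/w/api.php",
    "SPARQL_ENDPOINT_URL": "http://localhost:8989/bigdata/namespace/wdq/sparql",
    "WIKIBASE_URL": "http://localhost",
    "USER": "BotAccount@create_and_destroy",
    "PASSWORD": "bot-password-token"
}
EOF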

What happens?:

The Python script succeeds in creating and destroying an item at the beginning of the Bash script. It also appears to succeed if it is run before runJobs.php in the second script. It fails if it is run after runJobs.php, after initSiteStats.php, or after sql.php.

In all circumstances of failure, the Python script fails with the same error: wikibaseintegrator.wbi_exceptions.MWApiError: 'The save has failed.'

In all circumstances of failure, when attempting to create a test item manually, the GUI shows the error "Could not create a new page. It already exists." even though manual checks confirm that the page does not exist.

If the Python script is removed entirely and I try to create a new item manually after the import, I also get the error "Could not create a new page. It already exists."

What should have happened instead?:

I should be able to add new items, both via a bot and manually, after importing an XML dump.

Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):

MediaWiki 1.41.1
PHP 8.1.28 (apache2handler)
ICU 72.1
MariaDB 10.11.8-MariaDB-ubu2204
Elasticsearch 7.10.2
Pygments 2.16.1

Other information (browser name/version, screenshots, etc.):

Tested in Chrome, Firefox, Internet Explorer, and Edge; the GUI gives the same results in all of them.

I posted about this issue in the Wikibase community Telegram chat yesterday as well (20 Jun. 2024).

Event Timeline

Adding output of SQL script (php /var/www/html/maintenance/sql.php /var/www/html/maintenance/rebuildWikibaseIdCounters.sql), as mentioned on Telegram:

stdClass Object
(
    [id_type] => wikibase-item
    [id_value] => 1
)

Query OK, 1 row(s) affected
Query OK, 1 row(s) affected
stdClass Object
(
    [id_type] => 21849
    [id_value] => 0
)
stdClass Object
(
    [id_type] => 555
    [id_value] => 0
)
stdClass Object
(
    [id_type] => wikibase-item
    [id_value] => 1
)

Output of initSiteStats.php:

Refresh Site Statistics

Counting total edits...138761
Counting number of articles...0
Counting total pages...25605
Counting number of users...2
Counting number of images...0

Updating site statistics...done.

Done.

I managed to put together (essentially) a full log of everything in the second script above, just in case it's helpful. Unfortunately, it's too large to upload in its entirety, so here are the highlights:

STEP 1 # Running creation/deletion test script.
Entity successfully created.
Entity deletion successful. Exiting...

STEP 2 # php /var/www/html/maintenance/sql.php /var/www/html/maintenance/rebuildWikibaseIdCounters.sql

*******************************************************************************
NOTE: Do not run maintenance scripts directly, use maintenance/run.php instead!
      Running scripts directly has been deprecated in MediaWiki 1.40.
      It may not work for some (or any) scripts in the future.
*******************************************************************************

stdClass Object
(
    [id_type] => 0
    [id_value] => 0
)
stdClass Object
(
    [id_type] => wikibase-item
    [id_value] => 1
)

Query OK, 1 row(s) affected
Query OK, 1 row(s) affected
stdClass Object
(
    [id_type] => 0
    [id_value] => 0
)
stdClass Object
(
    [id_type] => wikibase-item
    [id_value] => 1
)

STEP 3 # Running creation/deletion test script.
Entity successfully created.
Entity deletion successful. Exiting...

STEP 4 # Upload XML dump.

*******************************************************************************
NOTE: Do not run maintenance scripts directly, use maintenance/run.php instead!
      Running scripts directly has been deprecated in MediaWiki 1.40.
      It may not work for some (or any) scripts in the future.
*******************************************************************************

100 (1.99 pages/sec 20.10 revs/sec)
...
25600 (8.13 pages/sec 44.07 revs/sec)
Done!
You might want to run rebuildrecentchanges.php to regenerate RecentChanges,
and initSiteStats.php to update page and revision counts

STEP 5 # Running creation/deletion test script. [Nothing printed here; since this logfile is stdout, it printed to stderr]

STEP 6 # php /var/www/html/maintenance/rebuildall.php

*******************************************************************************
NOTE: Do not run maintenance scripts directly, use maintenance/run.php instead!
      Running scripts directly has been deprecated in MediaWiki 1.40.
      It may not work for some (or any) scripts in the future.
*******************************************************************************

** Rebuilding fulltext search index (if you abort this will break searching; run this script again to fix)
Dropping index...
Clearing searchindex table...Done
Rebuilding index fields for 25607 pages...
500
...
25500

Rebuild the index...
Done.


** Rebuilding recentchanges table:
Rebuilding $wgRCMaxAge=31536000 seconds (365 days)
Clearing recentchanges table for time range...
Loading from page and revision tables...
Inserting from page and revision tables...
Updating links and size differences...
Loading from user and logging tables...
Flagging bot account edits...
Flagging auto-patrolled edits...
Removing duplicate revision and logging entries...
Deleting feed timestamps.
Done.



** Rebuilding links tables -- this can take a long time. It should be safe to abort via ctrl+C if you get bored.
Refreshing links from pages...
Estimated page count: 25008
100
...
25600
Deleting illegal entries from the links tables...
  Checking interval (-INF, INF)
    pagelinks: 0 deleted.
    imagelinks: 0 deleted.
    categorylinks: 0 deleted.
    templatelinks: 0 deleted.
    externallinks: 0 deleted.
    iwlinks: 0 deleted.
    langlinks: 0 deleted.
    redirect: 0 deleted.
    page_props: 0 deleted.
Done.

STEP 7 # Running creation/deletion test script. [Nothing printed here; since this logfile is stdout, it printed to stderr]

STEP 8 # php /var/www/html/maintenance/runJobs.php --memory-limit 512M

*******************************************************************************
NOTE: Do not run maintenance scripts directly, use maintenance/run.php instead!
      Running scripts directly has been deprecated in MediaWiki 1.40.
      It may not work for some (or any) scripts in the future.
*******************************************************************************

2024-06-21 21:49:33 recentChangesUpdate Special:RecentChanges type=cacheUpdate namespace=-1 title=RecentChanges requestId=9b4f9955b1d1c35bc8774d28 (id=9,timestamp=20240621204754) STARTING
...
2024-06-21 22:56:55 cirrusSearchElasticaWrite Special: method=sendData arguments=["general",[{"data":{"version":17503,"wiki":"my_wiki","page_id":2574,"namespace":146,"namespace_text":"Lexeme","title":"L363","timestamp":"2024-04-05T22:47:04Z","create_timestamp":"2023-09-29T15:52:42Z","redirect":[],"incoming_links":0},"params":{"_id":"2574","_index":"","_cirrus_hints":{"BuildDocument_flags":0,"noop":{"version":"documentVersion","incoming_links":"within 20%"}}},"upsert":true}]] cluster=default jobqueue_partition=default-0 update_kind=page_refresh root_event_time=1719003251 createdAt=1719008486 errorCount=0 retryCount=0 requestId=057067d6f37e8458371157dc namespace=-1 title= (id=310825,timestamp=20240621222126) t=8 good

STEP 9 # php /var/www/html/maintenance/initSiteStats.php --update

*******************************************************************************
NOTE: Do not run maintenance scripts directly, use maintenance/run.php instead!
      Running scripts directly has been deprecated in MediaWiki 1.40.
      It may not work for some (or any) scripts in the future.
*******************************************************************************

Refresh Site Statistics

Counting total edits...138762
Counting number of articles...0
Counting total pages...25605
Counting number of users...2
Counting number of images...0

Updating site statistics...done.

Done.

STEP 10 # Running creation/deletion test script. [Nothing printed here; since this logfile is stdout, it printed to stderr]

STEP 11 # php /var/www/html/maintenance/sql.php /var/www/html/maintenance/rebuildWikibaseIdCounters.sql

*******************************************************************************
NOTE: Do not run maintenance scripts directly, use maintenance/run.php instead!
      Running scripts directly has been deprecated in MediaWiki 1.40.
      It may not work for some (or any) scripts in the future.
*******************************************************************************

stdClass Object
(
    [id_type] => 0
    [id_value] => 0
)
stdClass Object
(
    [id_type] => wikibase-item
    [id_value] => 4
)

Query OK, 1 row(s) affected
Query OK, 1 row(s) affected
stdClass Object
(
    [id_type] => 0
    [id_value] => 0

STEP 12 # Running creation/deletion test script. [Nothing printed here; since this logfile is stdout, it printed to stderr]

The stderr is the same in all instances; here's the full trace:

Error while writing to the Wikibase instance
Traceback (most recent call last):
  File "C:\Users\[USERNAME]\AppData\Local\anaconda3\Lib\site-packages\wikibaseintegrator\entities\baseentity.py", line 243, in _write
    json_result: dict = edit_entity(data=data, id=entity_id, type=self.type, summary=summary, clear=clear, is_bot=is_bot, allow_anonymous=allow_anonymous,
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\[USERNAME]\AppData\Local\anaconda3\Lib\site-packages\wikibaseintegrator\wbi_helpers.py", line 333, in edit_entity
    return mediawiki_api_call_helper(data=params, is_bot=is_bot, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\[USERNAME]\AppData\Local\anaconda3\Lib\site-packages\wikibaseintegrator\wbi_helpers.py", line 215, in mediawiki_api_call_helper
    return mediawiki_api_call('POST', mediawiki_api_url=mediawiki_api_url, session=session, data=data, headers=headers, max_retries=max_retries, retry_after=retry_after, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\[USERNAME]\AppData\Local\anaconda3\Lib\site-packages\wikibaseintegrator\wbi_helpers.py", line 129, in mediawiki_api_call
    raise MWApiError(json_data['error'])
wikibaseintegrator.wbi_exceptions.MWApiError: 'The save has failed.'
Traceback (most recent call last):
  File "C:\Users\[USERNAME]\Documents\Research Projects\[DBNAME]\wbdocker-wrapper\o2wb\create_and_destroy.py", line 98, in <module>
    main()
    ^^^^^^
  File "C:\Users\[USERNAME]\Documents\Research Projects\[DBNAME]\wbdocker-wrapper\o2wb\create_and_destroy.py", line 58, in main
    entity.write()
  File "C:\Users\[USERNAME]\AppData\Local\anaconda3\Lib\site-packages\wikibaseintegrator\entities\item.py", line 166, in write
    json_data = super()._write(data=self.get_json(), **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\[USERNAME]\AppData\Local\anaconda3\Lib\site-packages\wikibaseintegrator\entities\baseentity.py", line 243, in _write
    json_result: dict = edit_entity(data=data, id=entity_id, type=self.type, summary=summary, clear=clear, is_bot=is_bot, allow_anonymous=allow_anonymous,
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\[USERNAME]\AppData\Local\anaconda3\Lib\site-packages\wikibaseintegrator\wbi_helpers.py", line 333, in edit_entity
    return mediawiki_api_call_helper(data=params, is_bot=is_bot, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\[USERNAME]\AppData\Local\anaconda3\Lib\site-packages\wikibaseintegrator\wbi_helpers.py", line 215, in mediawiki_api_call_helper
    return mediawiki_api_call('POST', mediawiki_api_url=mediawiki_api_url, session=session, data=data, headers=headers, max_retries=max_retries, retry_after=retry_after, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\[USERNAME]\AppData\Local\anaconda3\Lib\site-packages\wikibaseintegrator\wbi_helpers.py", line 129, in mediawiki_api_call
    raise MWApiError(json_data['error'])
wikibaseintegrator.wbi_exceptions.MWApiError: 'The save has failed.'
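
Since "The save has failed." doesn't say much on its own, I may also try turning on the standard MediaWiki debug switches to get more detail out of the API. This is just a sketch (the same append-to-LocalSettings.php approach as in the setup script; /var/tmp/mw-debug.log is an arbitrary location I picked), and I haven't captured output from it yet:

# Surface more detail behind "The save has failed" via standard MediaWiki debug settings.
winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c \
    "echo '\$wgShowExceptionDetails = true;' >> /var/www/html/LocalSettings.php"
winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c \
    "echo '\$wgDebugLogFile = \"/var/tmp/mw-debug.log\";' >> /var/www/html/LocalSettings.php"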

Just in case, I'm also posting the output of the first script here:

time="2024-06-22T08:48:06-04:00" level=warning msg="C:\\Users\\[USERNAME]\\Documents\\Research Projects\\[DBNAME]\\wbdocker\\docker-compose.yml: `version` is obsolete"
time="2024-06-22T08:48:06-04:00" level=warning msg="C:\\Users\\[USERNAME]\\Documents\\Research Projects\\[DBNAME]\\wbdocker\\docker-compose.extra.yml: `version` is obsolete"
 Network wbdocker_default  Creating
 Network wbdocker_default  Created
 Volume "wbdocker_elasticsearch-data"  Creating
 Volume "wbdocker_elasticsearch-data"  Created
 Volume "wbdocker_query-service-data"  Creating
 Volume "wbdocker_query-service-data"  Created
 Volume "wbdocker_mediawiki-mysql-data"  Creating
 Volume "wbdocker_mediawiki-mysql-data"  Created
 Volume "wbdocker_shared"  Creating
 Volume "wbdocker_shared"  Created
 Volume "wbdocker_quickstatements-data"  Creating
 Volume "wbdocker_quickstatements-data"  Created
 Container wbdocker-wdqs-1  Creating
 Container wbdocker-elasticsearch-1  Creating
 Container wbdocker-mysql-1  Creating
 Container wbdocker-wdqs-1  Created
 Container wbdocker-wdqs-proxy-1  Creating
 Container wbdocker-elasticsearch-1  Created
 Container wbdocker-mysql-1  Created
 Container wbdocker-wikibase-jobrunner-1  Creating
 Container wbdocker-wikibase-1  Creating
 Container wbdocker-wdqs-proxy-1  Created
 Container wbdocker-wdqs-frontend-1  Creating
 Container wbdocker-wikibase-jobrunner-1  Created
 Container wbdocker-wdqs-frontend-1  Created
 Container wbdocker-wikibase-1  Created
 Container wbdocker-quickstatements-1  Creating
 Container wbdocker-wdqs-updater-1  Creating
 Container wbdocker-quickstatements-1  Created
 Container wbdocker-wdqs-updater-1  Created
 Container wbdocker-elasticsearch-1  Starting
 Container wbdocker-mysql-1  Starting
 Container wbdocker-wdqs-1  Starting
 Container wbdocker-mysql-1  Started
 Container wbdocker-wikibase-1  Starting
 Container wbdocker-wikibase-jobrunner-1  Starting
 Container wbdocker-elasticsearch-1  Started
 Container wbdocker-wdqs-1  Started
 Container wbdocker-wdqs-proxy-1  Starting
 Container wbdocker-wikibase-jobrunner-1  Started
 Container wbdocker-wdqs-proxy-1  Started
 Container wbdocker-wdqs-frontend-1  Starting
 Container wbdocker-wikibase-1  Started
 Container wbdocker-quickstatements-1  Starting
 Container wbdocker-wdqs-updater-1  Starting
 Container wbdocker-wdqs-updater-1  Started
 Container wbdocker-wdqs-frontend-1  Started
 Container wbdocker-quickstatements-1  Started
Get:1 http://deb.debian.org/debian bookworm InRelease [151 kB]
Get:2 http://deb.debian.org/debian bookworm-updates InRelease [55.4 kB]
Get:3 http://deb.debian.org/debian-security bookworm-security InRelease [48.0 kB]
Get:4 http://deb.debian.org/debian bookworm/main amd64 Packages [8786 kB]
27% [4 Packages 1342 kB/8786 kB 15%]                               134 kB/s 56s
What's next?
  Try Docker Debug for seamless, persistent debugging tools in any container or image → docker debug wbdocker-wikibase-1
  Learn more at https://docs.docker.com/go/debug-cli/
/usr/bin/python3: No module named pip


*******************************************************************************
NOTE: Do not run maintenance scripts directly, use maintenance/run.php instead!
      Running scripts directly has been deprecated in MediaWiki 1.40.
      It may not work for some (or any) scripts in the future.
*******************************************************************************

MediaWiki 1.41.1 Updater

Your composer.lock file is up to date with current dependencies!
Going to run database updates for my_wiki
Depending on the size of your database this may take a while!
Abort with control-c in the next five seconds (skip this countdown with --quick)
 ...0
...collations up-to-date.
...have rev_actor field in revision table.
...watchlist_expiry table already exists.
...page_restrictions field does not exist in page table, skipping modify field patch.
...index ipb_address_unique already set on ipblocks table.
...archive table does not contain ar_text_id field.
...lc_lang is up-to-date.
...ll_lang is up-to-date.
...site_language is up-to-date.
...index ipb_address_unique on table ipblocks has no field ipb_anon_only; added.
...ipb_address_unique index up-to-date.
...actor_name in table actor already modified by patch patch-actor-actor_name-varbinary.sql.
...site_global_key in table sites already modified by patch patch-sites-site_global_key.sql.
...iwl_prefix in table iwlinks already modified by patch patch-extend-iwlinks-iwl_prefix.sql.
...rd_title in table redirect already modified by patch patch-redirect-rd_title-varbinary.sql.
...pl_title in table pagelinks already modified by patch patch-pagelinks-pl_title-varbinary.sql.
...tl_title field does not exist in templatelinks table, skipping modify field patch.
...il_to in table imagelinks already modified by patch patch-imagelinks-il_to-varbinary.sql.
...ll_title in table langlinks already modified by patch patch-langlinks-ll_title-varbinary.sql.
...iwl_title in table iwlinks already modified by patch patch-iwlinks-iwl_title-varbinary.sql.
...cat_title in table category already modified by patch patch-category-cat_title-varbinary.sql.
...qc_title in table querycache already modified by patch patch-querycache-qc_title-varbinary.sql.
...qcc_title in table querycachetwo already modified by patch patch-querycachetwo-qcc_title-varbinary.sql.
...wl_title in table watchlist already modified by patch patch-watchlist-wl_title-varbinary.sql.
...user_last_timestamp in table user_newtalk already modified by patch patch-user_newtalk-user_last_timestamp-binary.sql.
...pt_title in table protected_titles already modified by patch patch-protected_titles-pt_title-varbinary.sql.
...ir_type in table ipblocks_restrictions already modified by patch patch-ipblocks_restrictions-ir_type.sql.
...index wl_namespace_title already set on watchlist table.
...job_title in table job already modified by patch patch-job-job_title-varbinary.sql.
...job_timestamp in table job already modified by patch patch-job_job_timestamp.sql.
...job_token_timestamp in table job already modified by patch patch-job_job_token_timestamp.sql.
...wl_notificationtimestamp in table watchlist already modified by patch patch-watchlist-wl_notificationtimestamp.sql.
...role_id in table slot_roles already modified by patch patch-slot_roles-role_id.sql.
...model_id in table content_models already modified by patch patch-content_models-model_id.sql.
...cl_to in table categorylinks already modified by patch patch-categorylinks-cl_to-varbinary.sql.
...log_title in table logging already modified by patch patch-logging-log_title-varbinary.sql.
...us_timestamp in table uploadstash already modified by patch patch-uploadstash-us_timestamp.sql.
...index up_property already set on user_properties table.
...index site_global_key already set on sites table.
...index log_type_time already set on logging table.
...fa_name in table filearchive already modified by patch patch-filearchive-fa_name.sql.
...oi_name in table oldimage already modified by patch patch-oldimage-oi_name-varbinary.sql.
...exptime in table objectcache already modified by patch patch-objectcache-exptime-notnull.sql.
...index ar_name_title_timestamp already set on archive table.
...img_name in table image already modified by patch patch-image-img_name-varbinary.sql.
...img_timestamp in table image already modified by patch patch-image-img_timestamp.sql.
...index si_key already set on site_identifiers table.
...rc_title in table recentchanges already modified by patch patch-recentchanges-rc_title-varbinary.sql.
...rc_timestamp in table recentchanges already modified by patch patch-recentchanges-rc_timestamp.sql.
...rc_id in table recentchanges already modified by patch patch-recentchanges-rc_id.sql.
...index rc_new_name_timestamp already set on recentchanges table.
...ar_title in table archive already modified by patch patch-archive-ar_title-varbinary.sql.
...page_title in table page already modified by patch patch-page-page_title-varbinary.sql.
...user_name in table user already modified by patch patch-user_table-updates.sql.
...index rev_page_timestamp already set on revision table.
...have modtoken field in objectcache table.
...index oi_timestamp already set on oldimage table.
...index page_name_title already set on page table.
...index ct_rc_tag_id already set on change_tag table.
...page_restrictions table does not contain pr_user field.
...fa_id in table filearchive already modified by patch patch-filearchive-fa_id.sql.
...img_major_mime in table image already modified by patch patch-image-img_major_mime-default.sql.
...linktarget table already exists.
...rev_page_id key doesn't exist.
...pr_page in table page_restrictions already modified by patch patch-page_restrictions-pr_page.sql.
...pp_page in table page_props already modified by patch patch-page_props-pp_page.sql.
...ir_value in table ipblocks_restrictions already modified by patch patch-ipblocks_restrictions-ir_value.sql.
...have tl_target_id field in templatelinks table.
...user_autocreate_serial table already exists.
...ir_ipb_id in table ipblocks_restrictions already modified by patch patch-ipblocks_restrictions-ir_ipb_id.sql.
...ipb_id in table ipblocks already modified by patch patch-ipblocks-ipb_id.sql.
...user_editcount in table user already modified by patch patch-user-user_editcount.sql.
Running maintenance/migrateRevisionActorTemp.php...
revision_actor_temp does not exist, so nothing to do.
done.
...revision_actor_temp doesn't exist.
Running maintenance/updateRestrictions.php...
Migration is not needed.
done.
...page table does not contain page_restrictions field.
...templatelinks table has already been migrated.
...tl_namespace field does not exist in templatelinks table, skipping modify field patch.
...templatelinks table does not contain tl_title field.
...have el_to_path field in externallinks table.
...have user_is_temp field in user table.
Running maintenance/migrateRevisionCommentTemp.php...
revision_comment_temp does not exist, so nothing to do.
done.
...revision_comment_temp doesn't exist.
Running maintenance/migrateExternallinks.php...
Old fields don't exist. There is no need to run this script
done.
...el_to field does not exist in externallinks table, skipping modify field patch.
...have pl_target_id field in pagelinks table.
...externallinks table does not contain el_to field.
Running maintenance/fixInconsistentRedirects.php...
Fixing inconsistent redirects ...
Estimated redirect page count: 1
0/0
Done, updated 0 of 0 rows.
done.
...img_size in table image already modified by patch patch-image-img_size_to_bigint.sql.
...fa_size in table filearchive already modified by patch patch-filearchive-fa_size_to_bigint.sql.
...oi_size in table oldimage already modified by patch patch-oldimage-oi_size_to_bigint.sql.
...us_size in table uploadstash already modified by patch patch-uploadstash-us_size_to_bigint.sql.
...wb_changes table already exists.
...wb_id_counters table already exists.
...wb_items_per_site table already exists.
...ips_site_page in table wb_items_per_site already modified by patch /var/www/html/extensions/Wikibase/repo/includes/Store/Sql/../../../sql/mysql/archives/MakeIpsSitePageLarger.sql.
...wb_ips_site_page key doesn't exist.
...change_info in table wb_changes already modified by patch /var/www/html/extensions/Wikibase/repo/includes/Store/Sql/../../../sql/mysql/archives/MakeChangeInfoLarger.sql.
...wbt_text table already exists.
...wb_terms doesn't exist.
...wb_changes_change_type key doesn't exist.
...index change_object_id already set on wb_changes table.
...change_time in table wb_changes already modified by patch /var/www/html/extensions/Wikibase/repo/includes/Store/Sql/../../../sql/mysql/archives/patch-wb_changes-change_timestamp.sql.
...wb_id_counters_type key doesn't exist.
...wb_changes_dispatch doesn't exist.
...wbc_entity_usage table does not contain eu_touched field.
...babel table already exists.
...babel_lang in table babel already modified by patch /var/www/html/extensions/Babel/sql/babel-babel_lang-length-type.sql.
...babel_level in table babel already modified by patch /var/www/html/extensions/Babel/sql/babel-babel_level-type.sql.
...entityschema_id_counter table already exists.
...oauth_registered_consumer table already exists.
...have oarc_oauth_version field in oauth_registered_consumer table.
...have oarc_oauth2_is_confidential field in oauth_registered_consumer table.
...have oarc_oauth2_allowed_grants field in oauth_registered_consumer table.
...have oaac_oauth_version field in oauth_accepted_consumer table.
...oauth2_access_tokens table already exists.
...index oaat_acceptance_id already set on oauth2_access_tokens table.
...oaac_accepted in table oauth_accepted_consumer already modified by patch /var/www/html/extensions/OAuth/schema/mysql/patch-oauth_accepted_consumer-timestamp.sql.
...oarc_email_authenticated in table oauth_registered_consumer already modified by patch /var/www/html/extensions/OAuth/schema/mysql/patch-oauth_registered_consumer-timestamp.sql.
Creating wbqc_constraints table...done.
...have constraint_id field in wbqc_constraints table.
...index wbqc_constraints_guid_uniq already set on wbqc_constraints table.
...site_stats is populated...done.
Populating rev_len column
...doing rev_id from 1 to 200
Populating ar_len column
...archive table seems to be empty.
rev_len and ar_len population complete [0 revision rows, 0 archive rows].
Populating rev_sha1 column
...doing rev_id from 1 to 200
Populating ar_sha1 column
...archive table seems to be empty.
rev_sha1 and ar_sha1 population complete [0 revision rows, 0 archive rows].
Populating and recalculating img_sha1 field

Done 0 files in 0.0 seconds
Populating fa_sha1 field from fa_storage_key

Done 0 files in 0.0 seconds
Updating *_from_namespace fields in links tables.
...doing page_id from 1 to 200
Adding empty categories with description pages...
Removing empty categories without description pages...
Category cleanup complete.
Populating page_props.pp_sortkey...
Populating page_props.pp_sortkey complete.
Updated a total of 0 rows
Copying IP revisions to ip_changes, from rev_id 0 to rev_id 1
Attempted to insert 0 IP revisions, 0 actually done.
Purging caches...done.

Done in 0.3 s.


Maybe I'm not shutting the Docker containers down correctly between runs? I've been running docker-compose -f docker-compose.yml -f docker-compose.extra.yml down --volumes --remove-orphans between tests... Is there something else I should run? Could the MySQL database be persisting somehow and interfering with the success of the SQL script? Is there a different or better way to cleanly shut down the Docker components?

I thought maybe wikibase-lexeme was the issue, since the SQL script didn't rebuild the counters for lexemes, so I added the following line: REPLACE INTO /*_*/wb_id_counters VALUE((SELECT COALESCE(MAX(CAST(SUBSTRING(page_title, 2) AS UNSIGNED)), 0) FROM page WHERE page_namespace = 146), 'wikibase-lexeme');. The file then outputs the following:

stdClass Object
(
    [id_type] => 21849
    [id_value] => 0
)
stdClass Object
(
    [id_type] => 555
    [id_value] => 0
)
stdClass Object
(
    [id_type] => wikibase-item
    [id_value] => 5
)

Query OK, 1 row(s) affected
Query OK, 1 row(s) affected
Query OK, 1 row(s) affected
stdClass Object
(
    [id_type] => 21849
    [id_value] => 0
)
stdClass Object
(
    [id_type] => 3240
    [id_value] => 0
)
stdClass Object
(
    [id_type] => 555
    [id_value] => 0
)
stdClass Object
(
    [id_type] => wikibase-item
    [id_value] => 5
)

Unfortunately, after running it I still can't create a new item.
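
For debugging, a quick way to compare the stored counters against the highest imported item ID is the following sketch. It assumes the container name above, no table prefix, that items live in namespace 120 (as in the default Wikibase example settings), and that this MediaWiki version's sql.php supports --query; if it doesn't, the same statements can go into a file and be run like rebuildWikibaseIdCounters.sql above:

# Inspect the Wikibase ID counters.
winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c \
    "php /var/www/html/maintenance/sql.php --query 'SELECT id_type, id_value FROM wb_id_counters;'"

# Find the highest imported item ID (Q-number) for comparison.
winpty docker exec -it wbdocker-wikibase-1 //bin//bash -c \
    "php /var/www/html/maintenance/sql.php --query 'SELECT MAX(CAST(SUBSTRING(page_title, 2) AS UNSIGNED)) AS highest_q FROM page WHERE page_namespace = 120;'"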

I think this may be an issue with the XML dump created by MediaWiki Dump Generator, so I also opened a ticket there, just in case (https://github.com/mediawiki-client-tools/mediawiki-dump-generator/issues/249).

Hi,

thanks a lot for the detailed report. Moving it into the sprint so we can analyze this shortly.

I've been running docker-compose -f docker-compose.yml -f docker-compose.extra.yml down --volumes --remove-orphans between tests... Is there something else I should run?

This should cleanly remove your volumes, so it also removes your MariaDB. You can double-check after shutdown with docker volume ls.

Is there any chance you could provide us with your data dump? I suspect the problem is not specific to your dump, but in case it is, having it would make it easier for us to reproduce the issue.

Thanks a lot,
Best regards,
Robert

Robert,

Thanks so much for reaching out! I'll run another test for the MariaDB shortly, just in case!

The dump I've been testing is available here: https://archive.org/details/wiki-lgbtdb.wikibase.cloud_w-20240604 (thanks to the folks at the MediaWiki Dump Generator for hosting, re: https://github.com/mediawiki-client-tools/mediawiki-dump-generator/issues/249).

Thank you so much for your help!

I just checked docker volume ls. The output when the stack is down is:

$ docker volume ls
DRIVER    VOLUME NAME

So it appears everything is gone when running docker-compose -f docker-compose.yml -f docker-compose.extra.yml down --volumes --remove-orphans.
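
Put together, the teardown-and-verify sequence between test runs now looks like this (with a leftover-container check added for good measure; both filtered listings should come back empty):

# Tear down the Wikibase Suite stack and confirm nothing persists between runs.
docker-compose -f docker-compose.yml -f docker-compose.extra.yml down --volumes --remove-orphans
docker volume ls --filter "name=wbdocker"
docker ps -a --filter "name=wbdocker"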

Just in case, I ran a re-import right after checking this, and hit the same issue: "Could not create a new page. It already exists." I also tested this with a dump from https://furry.wikibase.cloud/ (much smaller, so testing ran faster; I generated this dump using dumpgenerator https://furry.wikibase.cloud/ --xml --exnamespaces 640 to remove entity schemas, and again with dumpgenerator https://furry.wikibase.cloud/ --xml --exnamespaces 640,146 to also exclude lexemes; same result for both: "Could not create a new page. It already exists.").