Unexpected values in labels for units
Closed, Resolved | Public | 8 Estimated Story Points | BUG REPORT

Description

As a Wikibase.Cloud user, labels for units in properties that have the ‘quantity’ data type are showing up as the URI from wbstack instead of the labels from new quantity entities.

Example:
When visiting https://wikifcd.wikibase.cloud/wiki/Item:Q335904 I see the URI from wbstack instead of the labels from new quantity entities. Here is how it looked in wbstack: https://wikifcd.wiki.opencura.com/wiki/Item:Q335904

The impact of this issue is that it confuses people visiting item pages on WikiFCD. Because most of our items have multiple statements using quantity properties, and many provide units, this makes the data harder to understand.

We believe the cause of this issue is that the entity data refers to URIs and in the process of our migration we did not rewrite any entity data.

AC:

  • replace the wbstack (opencura.com) URIs for units of quantities with the corresponding wikibase.cloud Item URI
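
A minimal sketch of what this rewrite amounts to on a single statement, assuming the standard Wikibase entity JSON layout (quantity datavalues carrying a `unit` URI). It walks the whole structure, so main snaks, qualifiers and references are all covered. The helper names, the property IDs and Q11 are hypothetical; this is not the maintenance script itself.

```python
from typing import Any

OLD = "http://wikifcd.wiki.opencura.com"
NEW = "https://wikifcd.wikibase.cloud"


def rewrite_units(node: Any, old_prefix: str, new_prefix: str) -> int:
    """Rewrite quantity unit URIs in place and return how many were changed."""
    changed = 0
    if isinstance(node, dict):
        value = node.get("value")
        # A quantity datavalue looks like {"type": "quantity", "value": {...}}.
        if node.get("type") == "quantity" and isinstance(value, dict):
            unit = value.get("unit", "")
            if isinstance(unit, str) and unit.startswith(old_prefix):
                value["unit"] = new_prefix + unit[len(old_prefix):]
                changed += 1
        for child in node.values():
            changed += rewrite_units(child, old_prefix, new_prefix)
    elif isinstance(node, list):
        for child in node:
            changed += rewrite_units(child, old_prefix, new_prefix)
    return changed


def quantity_snak(prop: str, unit: str) -> dict:
    """Build a hypothetical quantity snak for the example below."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {"type": "quantity", "value": {"amount": "+5", "unit": unit}},
    }


# Hypothetical statement: P1, P2, P3 and Q11 are made-up IDs.
statement = {
    "mainsnak": quantity_snak("P1", OLD + "/entity/Q11"),
    "qualifiers": {"P2": [quantity_snak("P2", OLD + "/entity/Q11")]},
    "references": [{"snaks": {"P3": [quantity_snak("P3", OLD + "/entity/Q11")]}}],
}
changed = rewrite_units(statement, OLD, NEW)
print(changed)  # one unit each in the main snak, the qualifier and the reference
```

Matching on the URI prefix rather than on the full unit URI keeps the sketch in line with the from/to values this task talks about; unitless quantities (unit `"1"`) are left untouched.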

Status of rebuilding these quantity statements with migrated units:

| Broken Site | Status |
| ----- | ----- |
| cocreate-cologne.wikibase.cloud | in progress |
| zerowastecities.wikibase.cloud | in progress |
| wikifcd.wikibase.cloud | in progress |
| enrich-nfdi4culture.wikibase.cloud | in progress |
| lod-working-group.wikibase.cloud | in progress |
| tdwg-cd.wikibase.cloud | in progress |
| sweopendata.wikibase.cloud | in progress |
| ld4-2021-conference.wikibase.cloud | in progress |

Note: "in progress" means that the Wikibase maintenance script ran successfully for these wikis, but the issue isn't completely resolved, because at least qualifiers and references do not seem to have been fixed by it. Example: https://wikifcd.wikibase.cloud/wiki/Item:Q568345

**Update:**

  • Tweak the script to pick up where it left off when it stops

Event Timeline

Change 833742 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Add maintenance script to change units of quantities

https://gerrit.wikimedia.org/r/833742

I wrote a quick bash script to run a SPARQL query to check if a wiki is affected by this issue. I ran it against all wikis currently on wikibase.cloud and this is the list I collected:

cocreate-cologne.wikibase.cloud
zerowastecities.wikibase.cloud
wikifcd.wikibase.cloud
enrich-nfdi4culture.wikibase.cloud
lod-working-group.wikibase.cloud
tdwg-cd.wikibase.cloud
sweopendata.wikibase.cloud
ld4-2021-conference.wikibase.cloud

Some of them are indeed using quantity values, but it looks like not all of them are; this needs some more investigation.
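
The original bash script is not shown here. As a rough sketch of the same kind of check, the following counts quantity value nodes whose unit still starts with an old prefix. `wikibase:quantityUnit` comes from the Wikibase RDF ontology; the endpoint path `/query/sparql` and all function names are assumptions for illustration.

```python
import json
import urllib.parse
import urllib.request


def affected_units_query(old_prefix: str) -> str:
    """Build a SPARQL query counting quantity values with an old unit prefix.

    old_prefix is pasted into a string literal, so it must not contain quotes.
    """
    return f"""
PREFIX wikibase: <http://wikiba.se/ontology#>
SELECT (COUNT(?value) AS ?count) WHERE {{
  ?value wikibase:quantityUnit ?unit .
  FILTER(STRSTARTS(STR(?unit), "{old_prefix}"))
}}
"""


def parse_count(result: dict) -> int:
    """Extract the count from a SPARQL JSON results document."""
    return int(result["results"]["bindings"][0]["count"]["value"])


def count_affected(domain: str, old_prefix: str) -> int:
    """Run the query against a wiki's query service (endpoint path is assumed)."""
    params = urllib.parse.urlencode({"query": affected_units_query(old_prefix)})
    request = urllib.request.Request(
        f"https://{domain}/query/sparql?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return parse_count(json.load(response))
```

Calling `count_affected("wikifcd.wikibase.cloud", "http://wikifcd.wiki.opencura.com")` would then report how many value nodes are still affected. Because value nodes are shared by statements, qualifiers and references, a check like this should not miss units outside the main snak.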


Also there is this PR underway that will introduce the maintenance script to fix this problem: https://github.com/wbstack/mediawiki/pull/282

This code is now running in all environments; we started running the maintenance script against https://wikifcd.wikibase.cloud on Friday but decided to hold off before the long weekend.

This has now started again.

Specifically we are running something like

$ time k exec mediawiki-137-fp-app-api-65cfc8d995-jtvnf -- bash -c 'MW_INSTALL_PATH=/var/www/html/w/ WBS_DOMAIN=wikifcd.wikibase.cloud php w/extensions/Wikibase/repo/maintenance/rebuildEntityQuantityUnit.php --from-value="http://wikifcd.wiki.opencura.com" --to-value="https://wikifcd.wikibase.cloud"'

Tarrow updated the task description.

The last item successfully adjusted was Q135866, before the run crashed with "command terminated with exit code 137".
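
For reading that exit code: shells and container runtimes conventionally report 128 + N when a process is terminated by signal N, so 137 corresponds to signal 9 (SIGKILL). In Kubernetes this is often, though not always, an out-of-memory kill; the thread does not confirm the cause here. A small sketch with a hypothetical helper name:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a process exit code using the 128 + signal number convention."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with status {code}"


print(describe_exit_code(137))
```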

draft PR created for job; https://github.com/wmde/wbaas-deploy/pull/546 ; need to check this works locally but unfortunately my local setup is broken

Tried to run the job today, but it failed. Investigation needed.

This is the command that was used:

 $ WBS_DOMAIN=wikifcd.wikibase.cloud WBS_UNIT_FROM="http://wikifcd.wiki.opencura.com" WBS_UNIT_TO="https://wikifcd.wikibase.cloud" ./rebuildQuantityUnitsJob.sh 
job.batch/rebuild-quantity-units-btlqp created

These are the pods that got spun up:

rebuild-quantity-units-btlqp-4kxb8              0/1     Error       0          4h56m
rebuild-quantity-units-btlqp-5x9pd              0/1     Error       0          4h47m
rebuild-quantity-units-btlqp-c2qjb              0/1     Error       0          4h56m
rebuild-quantity-units-btlqp-c7fgp              0/1     Error       0          4h55m
rebuild-quantity-units-btlqp-crshn              0/1     Error       0          4h52m
rebuild-quantity-units-btlqp-csz7z              0/1     Error       0          4h56m
rebuild-quantity-units-btlqp-nvhgn              0/1     Error       0          4h55m

I'd like to check in on the status of this ticket. We are looking forward to having unit labels display as expected on more of the items in the WikiFCD wikibase.

Hi!

We're back to looking at this now; we had to pause running this due to some urgent tasks that came in.

The job is currently running. It is starting at Q1 and working its way up to the end of wikifcd.wikibase.cloud. We expect it to take many hours to run (and it will also need to be run on a number of other wikis).

Been going nearly 4 days: up to Q175641

Last error message:

Updating Q217797: revision: 612953 updates: 17
Wikibase\DataModel\Services\Lookup\EntityLookupException from line 51 of /var/www/html/w/extensions/Wikibase/lib/includes/Store/RevisionBasedEntityLookup.php: Missing lazy connection arguments.
#0 /var/www/html/w/extensions/Wikibase/lib/packages/wikibase/data-model-services/src/Lookup/RedirectResolvingEntityLookup.php(51): Wikibase\Lib\Store\RevisionBasedEntityLookup->getEntity(Object(Wikibase\DataModel\Entity\ItemId))
#1 /var/www/html/w/extensions/Wikibase/repo/maintenance/EntityQuantityUnitRebuilder.php(119): Wikibase\DataModel\Services\Lookup\RedirectResolvingEntityLookup->getEntity(Object(Wikibase\DataModel\Entity\ItemId))
#2 /var/www/html/w/extensions/Wikibase/repo/maintenance/EntityQuantityUnitRebuilder.php(91): Wikibase\Repo\Maintenance\EntityQuantityUnitRebuilder->rebuildEntityQuantityForUnit(Array)
#3 /var/www/html/w/extensions/Wikibase/repo/maintenance/rebuildEntityQuantityUnit.php(77): Wikibase\Repo\Maintenance\EntityQuantityUnitRebuilder->rebuild()
#4 /var/www/html/w/maintenance/doMaintenance.php(112): Wikibase\Repo\Maintenance\RebuildEntityQuantityUnit->execute()
#5 /var/www/html/w/extensions/Wikibase/repo/maintenance/rebuildEntityQuantityUnit.php(112): require_once('/var/www/html/w...')
#6 {main}
InvalidArgumentException from line 58 of /var/www/html/w/includes/libs/rdbms/database/DBConnRef.php: Missing lazy connection arguments.
#0 /var/www/html/w/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1098): Wikimedia\Rdbms\DBConnRef->__construct(Object(Wikimedia\Rdbms\LoadBalancer), false, -1)
#1 /var/www/html/w/includes/libs/rdbms/connectionmanager/ConnectionManager.php(92): Wikimedia\Rdbms\LoadBalancer->getConnectionRef(-1, Array, 'mwdb_wbstack_be...')
#2 /var/www/html/w/includes/libs/rdbms/connectionmanager/ConnectionManager.php(169): Wikimedia\Rdbms\ConnectionManager->getConnectionRef(-1, Array)
#3 /var/www/html/w/extensions/Wikibase/lib/includes/Store/Sql/WikiPageEntityMetaDataLookup.php(90): Wikimedia\Rdbms\ConnectionManager->getReadConnectionRef()
#4 /var/www/html/w/extensions/Wikibase/lib/includes/Store/Sql/TypeDispatchingWikiPageEntityMetaDataAccessor.php(88): Wikibase\Lib\Store\Sql\WikiPageEntityMetaDataLookup->loadRevisionInformation(Array, 'replica')
#5 /var/www/html/w/extensions/Wikibase/lib/includes/Store/Sql/PrefetchingWikiPageEntityMetaDataAccessor.php(267): Wikibase\Lib\Store\Sql\TypeDispatchingWikiPageEntityMetaDataAccessor->loadRevisionInformation(Array, 'replica')
#6 /var/www/html/w/extensions/Wikibase/lib/includes/Store/Sql/PrefetchingWikiPageEntityMetaDataAccessor.php(189): Wikibase\Lib\Store\Sql\PrefetchingWikiPageEntityMetaDataAccessor->doFetch('replica')
#7 /var/www/html/w/extensions/Wikibase/lib/includes/Store/Sql/WikiPageEntityRevisionLookup.php(105): Wikibase\Lib\Store\Sql\PrefetchingWikiPageEntityMetaDataAccessor->loadRevisionInformation(Array, 'replica')
#8 /var/www/html/w/extensions/Wikibase/lib/includes/Store/TypeDispatchingEntityRevisionLookup.php(52): Wikibase\Lib\Store\Sql\WikiPageEntityRevisionLookup->getEntityRevision(Object(Wikibase\DataModel\Entity\ItemId), 0, 'replica')
#9 /var/www/html/w/extensions/Wikibase/data-access/src/ByTypeDispatchingEntityRevisionLookup.php(55): Wikibase\Lib\Store\TypeDispatchingEntityRevisionLookup->getEntityRevision(Object(Wikibase\DataModel\Entity\ItemId), 0, 'replica')
#10 /var/www/html/w/extensions/Wikibase/lib/includes/Store/TypeDispatchingEntityRevisionLookup.php(52): Wikibase\DataAccess\ByTypeDispatchingEntityRevisionLookup->getEntityRevision(Object(Wikibase\DataModel\Entity\ItemId), 0, 'replica')
#11 /var/www/html/w/extensions/Wikibase/lib/includes/Store/CacheRetrievingEntityRevisionLookup.php(75): Wikibase\Lib\Store\TypeDispatchingEntityRevisionLookup->getEntityRevision(Object(Wikibase\DataModel\Entity\ItemId), 0, 'replica')
#12 /var/www/html/w/extensions/Wikibase/lib/includes/Store/RevisionBasedEntityLookup.php(46): Wikibase\Lib\Store\CacheRetrievingEntityRevisionLookup->getEntityRevision(Object(Wikibase\DataModel\Entity\ItemId), 0, 'replica')
#13 /var/www/html/w/extensions/Wikibase/lib/packages/wikibase/data-model-services/src/Lookup/RedirectResolvingEntityLookup.php(51): Wikibase\Lib\Store\RevisionBasedEntityLookup->getEntity(Object(Wikibase\DataModel\Entity\ItemId))
#14 /var/www/html/w/extensions/Wikibase/repo/maintenance/EntityQuantityUnitRebuilder.php(119): Wikibase\DataModel\Services\Lookup\RedirectResolvingEntityLookup->getEntity(Object(Wikibase\DataModel\Entity\ItemId))
#15 /var/www/html/w/extensions/Wikibase/repo/maintenance/EntityQuantityUnitRebuilder.php(91): Wikibase\Repo\Maintenance\EntityQuantityUnitRebuilder->rebuildEntityQuantityForUnit(Array)
#16 /var/www/html/w/extensions/Wikibase/repo/maintenance/rebuildEntityQuantityUnit.php(77): Wikibase\Repo\Maintenance\EntityQuantityUnitRebuilder->rebuild()
#17 /var/www/html/w/maintenance/doMaintenance.php(112): Wikibase\Repo\Maintenance\RebuildEntityQuantityUnit->execute()
#18 /var/www/html/w/extensions/Wikibase/repo/maintenance/rebuildEntityQuantityUnit.php(112): require_once('/var/www/html/w...')
#19 {main}

Last successful looking log line before the errors started:

Updating Q217796: revision: 612952 updates: 17

Rerunning WBS_DOMAIN=wikifcd.wikibase.cloud WBS_UNIT_FROM="http://wikifcd.wiki.opencura.com" WBS_UNIT_TO="https://wikifcd.wikibase.cloud" ./rebuildQuantityUnitsJob.sh, with the job YAML adjusted slightly to try to speed past the start:

- name: rebuild-quantity-units
  command:
    - 'bash'
    - '-c'
    - >
      MW_INSTALL_PATH=/var/www/html/w/
      php
      /var/www/html/w/extensions/Wikibase/repo/maintenance/rebuildEntityQuantityUnit.php
      --from-value="${WBS_UNIT_FROM}"
      --to-value="${WBS_UNIT_TO}"
      --sleep=1
      --batch-size=2500

In future we may regret not having utilised \Wikibase\DataModel\Services\EntityId\SeekableEntityIdPager::setPosition in order to skip ahead to where the thing fell over.
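
To illustrate the idea of skipping ahead, here is a minimal sketch of batching that resumes after a checkpoint. It only mimics the concept behind a seekable pager and is not the Wikibase API; the function names are hypothetical, and it assumes contiguous numeric item IDs, which a real pager over existing pages would not.

```python
from typing import Iterator, List, Optional


def numeric_id(item_id: str) -> int:
    """Return the numeric part of an item ID such as 'Q217796'."""
    return int(item_id[1:])


def batches_from(
    last_done: Optional[str], max_id: str, batch_size: int
) -> Iterator[List[str]]:
    """Yield batches of item IDs, starting right after the last finished one."""
    start = numeric_id(last_done) + 1 if last_done else 1
    end = numeric_id(max_id)
    for first in range(start, end + 1, batch_size):
        last = min(first + batch_size - 1, end)
        yield [f"Q{n}" for n in range(first, last + 1)]


# Resume after the last successful item instead of starting again at Q1.
first_batch = next(batches_from("Q217796", "Q536300", 2500))
print(first_batch[0], first_batch[-1])
```

Persisting the last successfully processed ID after each batch is what would let a restarted job call this with the right `last_done` value.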

Made it to Q220576 from Q217797 in around one hour. Given that we are aiming for Q536300, we're still under halfway(!), so this thing might need to be accelerated somehow.

This is approx. 2.7k items per hour; at this rate it will take around another 100 hours to finish. That would be fine, but it seems like the rate drops off drastically over time.
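
The estimate can be reproduced from the item IDs quoted in the thread. This is a small sketch that assumes progress is linear in the numeric part of the ID and treats "around one hour" as exactly 1.0; with those assumptions the rate comes out slightly higher, and the remaining time slightly longer, than the rounded figures quoted in the thread.

```python
def numeric_id(item_id: str) -> int:
    """Return the numeric part of an item ID such as 'Q220576'."""
    return int(item_id[1:])


def estimate(start: str, current: str, target: str, elapsed_hours: float):
    """Return (items per hour, hours remaining) under a constant-rate assumption."""
    rate = (numeric_id(current) - numeric_id(start)) / elapsed_hours
    remaining_items = numeric_id(target) - numeric_id(current)
    return rate, remaining_items / rate


rate, hours_left = estimate("Q217797", "Q220576", "Q536300", 1.0)
print(f"{rate:.0f} items/hour, about {hours_left:.0f} hours left")
```

Since the rate appears to drop over time, a figure like this is best read as a lower bound on the remaining duration.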

Made it as far as Q263558 before falling over in the same way:

[error] [DBConnection] Error connecting to sql-mariadb-secondary.default.svc.cluster.local as user mwu_8d1ca99504: :real_connect(): (HY000/2002): Connection refused                        
#0 /var/www/html/w/includes/libs/rdbms/database/DatabaseMysqlBase.php(146): Wikimedia\Rdbms\Database->newExceptionAfterConnectError(string)                                                 
#1 /var/www/html/w/includes/libs/rdbms/database/Database.php(5150): Wikimedia\Rdbms\DatabaseMysqlBase->open(string, string, string, string, NULL, string)                                   
#2 /var/www/html/w/includes/libs/rdbms/database/Database.php(1502): Wikimedia\Rdbms\Database->replaceLostConnection(string)                                                                 
#3 /var/www/html/w/includes/libs/rdbms/database/Database.php(1398): Wikimedia\Rdbms\Database->executeQueryAttempt(string, string, boolean, string, integer)                                 
#4 /var/www/html/w/includes/libs/rdbms/database/Database.php(1323): Wikimedia\Rdbms\Database->executeQuery(string, string, integer)                                                         
#5 /var/www/html/w/includes/libs/rdbms/database/Database.php(2012): Wikimedia\Rdbms\Database->query(string, string, integer)                                                                
#6 /var/www/html/w/includes/libs/rdbms/database/Database.php(1874): Wikimedia\Rdbms\Database->select(string, array, array, string, array, array)

Thank you for your work to address this issue. I'd like to check in on the status of this ticket. We are looking forward to having unit labels display as expected on more of the items in the WikiFCD wikibase.

Visitors to WikiFCD have told us that it is confusing to see the URLs on the statements for these items. Because many of the people visiting WikiFCD are not already familiar with Wikibase, it is not clear to them that these will be replaced by the labels for the units in the future. We would like to see the labels for the units on these statements so that visitors to WikiFCD can more easily understand the data we are presenting.

Hi @YULdigitalpreservation , thanks for the extra information provided in your latest comments, and apologies for not having resolved this issue yet.
Due to the size of the data, it takes a significant amount of time to run the operations required to fix this, so the work has stalled somewhat. I'll move this up in our priorities to address it as soon as possible; you will see an update here once it has moved into one of our Sprints again.

@ 2023-04-18T09:38:53+01:00
Updating Q556295: revision: 862243 updates: 14 on WikiFCD

@YULdigitalpreservation I'm happy to report that our script has finally finished processing the Items on the WikiFCD wikibase. Please let us know if you still experience similar issues, and thanks for your incredible patience! This took us a very long time.

I also just ran the script for all the other affected wikis mentioned here:

cocreate-cologne.wikibase.cloud
zerowastecities.wikibase.cloud
enrich-nfdi4culture.wikibase.cloud
lod-working-group.wikibase.cloud
tdwg-cd.wikibase.cloud
sweopendata.wikibase.cloud
ld4-2021-conference.wikibase.cloud

Found one remaining affected Item in WikiFCD, not sure why it was not fixed but it seems to be the only one for that wiki: https://wikifcd.wikibase.cloud/wiki/Item:Q568010

That's helpful to know, thanks! I'm afraid that means our approach to fix this didn't really work out. :(

It looks like our maintenance script didn't take care of (at least) qualifiers and references with the same issue

Tarrow claimed this task.

Closed in favour of this new ticket, which we need to reprioritise, since what the title of this ticket describes is indeed done.