
maintenance/dumpBackup.php does not dump when 'actor' is in $wgSharedTables
Open, Needs Triage | Public | BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

  1. Set up a wiki farm with a shared DB. At minimum: one base wiki that holds the accounts (wiki_1) and one wiki that uses the shared tables (wiki_2).
  2. Add actor to $wgSharedTables:
// minimal change
$wgSharedDB = 'wiki_1_shared_db';
$wgSharedTables = array_merge($wgSharedTables, ['actor']);
  3. Run the dump on wiki_2: php maintenance/dumpBackup.php --wiki wiki_2 --current

What happens?:
The dump contains only the siteinfo block with the namespace list; no pages are exported:

<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.11/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.11/ http://www.mediawiki.org/xml/export-0.11.xsd" version="0.11" xml:lang="en">
  <siteinfo>
    <sitename>XXXXXXXXXXXXXX</sitename>
    <dbname>XXXXXXXXXXXXXX</dbname>
    <base>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX</base>
    <generator>MediaWiki 1.39.4</generator>
    <case>first-letter</case>
    <namespaces>
      <namespace key="-2" case="first-letter">Media</namespace>
      ...
      <namespace key="2303" case="case-sensitive">Gadget definition talk</namespace>
    </namespaces>
  </siteinfo>
</mediawiki>

What should have happened instead?:
The full dump should be output.

Software version (skip for WMF-hosted wikis like Wikipedia):
REL1_39

Other information (browser name/version, screenshots, etc.):

  1. patch that fixed this issue:
diff --git a/maintenance/includes/BackupDumper.php b/maintenance/includes/BackupDumper.php
index 74f8202868c..3f5b2de3920 100644
--- a/maintenance/includes/BackupDumper.php
+++ b/maintenance/includes/BackupDumper.php
@@ -405,8 +405,7 @@ abstract class BackupDumper extends Maintenance {
 			return $this->forcedDb;
 		}
 
-		$lbFactory = MediaWikiServices::getInstance()->getDBLoadBalancerFactory();
-		$this->lb = $lbFactory->newMainLB();
+		$this->lb = MediaWikiServices::getInstance()->getDBLoadBalancer();
 		$db = $this->lb->getMaintenanceConnectionRef( DB_REPLICA, 'dump' );
 
 		// Discourage the server from disconnecting us if it takes a long time
  2. This also affects the DumpsOnDemand extension. Fix applied:
diff --git a/src/Jobs/DoDatabaseDumpJob.php b/src/Jobs/DoDatabaseDumpJob.php
index cbf96f6..061a9ad 100644
--- a/src/Jobs/DoDatabaseDumpJob.php
+++ b/src/Jobs/DoDatabaseDumpJob.php
@@ -47,7 +47,10 @@ class DoDatabaseDumpJob extends Job implements GenericParameterJob {
         * @return bool
         */
        public function run(): bool {
-           $dbr = $this->lbFactory->newMainLB()->getMaintenanceConnectionRef( DB_REPLICA, 'dump' );
+         $dbr = MediaWikiServices::getInstance()
+                 ->getDBLoadBalancer()
+                 ->getMaintenanceConnectionRef( DB_REPLICA, 'dump' );
+
                $dbr->setSessionOptions( [ 'connTimeout' => 3600 ] );
 
                if ( $this->params['fullHistory'] ) {
  3. newMainLB() does not seem to apply table aliases when $wgSharedTables and $wgSharedDB are defined. It seems the following needs to be executed for each new LB:
if ( $wgSharedDB && $wgSharedTables ) {
	// Apply $wgSharedDB table aliases for the local LB (all non-foreign DB connections)
	MediaWikiServices::getInstance()->getDBLoadBalancer()->setTableAliases(
		array_fill_keys(
			$wgSharedTables,
			[
				'dbname' => $wgSharedDB,
				'schema' => $wgSharedSchema,
				'prefix' => $wgSharedPrefix
			]
		)
	);
}
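
To illustrate the point above, here is a minimal toy model of the suspected mechanism. It is only a sketch: ToyLoadBalancer, ToyLBFactory, getMainLB() and qualifiedTableName() are hypothetical stand-ins and not MediaWiki's real classes or methods. It assumes that table aliases are stored per load balancer instance, so a freshly created LB starts without them until setTableAliases() is called on it.

```php
<?php
// Toy model only: hypothetical stand-ins, NOT MediaWiki's LoadBalancer/LBFactory.

class ToyLoadBalancer {
	/** @var array<string,array> table name => alias info */
	private array $tableAliases = [];

	public function setTableAliases( array $aliases ): void {
		$this->tableAliases = $aliases;
	}

	/** Resolve a table to "dbname.table", using an alias when one is set. */
	public function qualifiedTableName( string $localDb, string $table ): string {
		$db = $this->tableAliases[$table]['dbname'] ?? $localDb;
		return "$db.$table";
	}
}

class ToyLBFactory {
	private ?ToyLoadBalancer $mainLB = null;

	/** Shared instance, loosely analogous to getDBLoadBalancer(). */
	public function getMainLB(): ToyLoadBalancer {
		$this->mainLB ??= new ToyLoadBalancer();
		return $this->mainLB;
	}

	/** Fresh instance on every call, loosely analogous to newMainLB(). */
	public function newMainLB(): ToyLoadBalancer {
		return new ToyLoadBalancer();
	}
}

$sharedDb = 'wiki_1_shared_db';
$sharedTables = [ 'actor' ];
$aliases = array_fill_keys( $sharedTables, [ 'dbname' => $sharedDb ] );

$factory = new ToyLBFactory();
// Setup applies the aliases to the shared instance only.
$factory->getMainLB()->setTableAliases( $aliases );

$shared = $factory->getMainLB()->qualifiedTableName( 'wiki_2', 'actor' );
$fresh = $factory->newMainLB()->qualifiedTableName( 'wiki_2', 'actor' );

// Applying the same aliases to the fresh instance restores the mapping.
$freshLB = $factory->newMainLB();
$freshLB->setTableAliases( $aliases );
$fixed = $freshLB->qualifiedTableName( 'wiki_2', 'actor' );

echo "$shared\n$fresh\n$fixed\n";
// prints:
// wiki_1_shared_db.actor
// wiki_2.actor
// wiki_1_shared_db.actor
```

In this model the fresh instance resolves actor against the local database, which matches the behaviour described for newMainLB(). Whether the real LoadBalancer behaves the same way in every release is an assumption here.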

Event Timeline

Change 975878 had a related patch set uploaded (by Mainframe98; author: Mainframe98):

[mediawiki/extensions/DumpsOnDemand@REL1_40] Respect $wgSharedTables in DoDatabaseDumpJob

https://gerrit.wikimedia.org/r/975878

Mainframe98 subscribed.

Hmm, the reason BackupDumper (and by extension MediaWiki-extensions-DumpsOnDemand) uses newMainLB is because of rMW506a19b7af78: Cleanup live hack from wmf-deployment r53208 a bit: DB selection using load… (r56347), where it mentions that not doing this might "have caused problems when we were fetching other things from the local database in the middle of an export". For MediaWiki-extensions-DumpsOnDemand, the release for 1.41 uses IConnectionProvider, which performs as proposed. I'm not sure what the impact would be, but given that IConnectionProvider does not have special handling for this (and the underlying internals have been greatly overhauled since 2009), I suspect this should be harmless. I'll create a patch for DumpsOnDemand for MediaWiki 1.40, which I'll backport to 1.39 for posterity, as 1.41 has not yet been released.

@Wildtron, did you want to create an accompanying patch for MediaWiki?

Change 975886 had a related patch set uploaded (by Mainframe98; author: Mainframe98):

[mediawiki/extensions/DumpsOnDemand@REL1_39] Respect $wgSharedTables in DoDatabaseDumpJob

https://gerrit.wikimedia.org/r/975886

What about other usages of newMainLB()? It seems to be a common pattern in dump-related code, for example:
maintenance\includes\TextPassDumper.php
maintenance\includes\BackupDumper.php

Change #975878 abandoned by Umherirrender:

[mediawiki/extensions/DumpsOnDemand@REL1_40] Respect $wgSharedTables in DoDatabaseDumpJob

Reason:

MediaWiki 1.40 is End of Life

https://gerrit.wikimedia.org/r/975878

Change #1078491 had a related patch set uploaded (by Aaron Schulz; author: Aaron Schulz):

[mediawiki/core@master] Cleanup connection handling in BackupDumper/TextPassDumper

https://gerrit.wikimedia.org/r/1078491

Change #1078491 merged by jenkins-bot:

[mediawiki/core@master] Cleanup connection handling in BackupDumper/TextPassDumper

https://gerrit.wikimedia.org/r/1078491

Change #975886 abandoned by Mainframe98:

[mediawiki/extensions/DumpsOnDemand@REL1_39] Respect $wgSharedTables in DoDatabaseDumpJob

Reason:

No need for this change anymore

https://gerrit.wikimedia.org/r/975886