Page MenuHomePhabricator

Can't delete a page on 1.24.0 with postgresql 8.4 on CentOS 6.6
Closed, ResolvedPublic

Description

When trying to delete a page on mediawiki 1.24.0, the page fails to delete. With debugging on, this is shown:

Database error

A database query error has occurred. This may indicate a bug in the software.

Backtrace:

#0 /var/www/plan.alteeve.ca/html/includes/db/DatabasePostgres.php(537): DatabaseBase->reportQueryError('ERROR: insert ...', '23503', 'COMMIT', 'WikiPage::doDel...', false)
#1 /var/www/plan.alteeve.ca/html/includes/db/Database.php(1182): DatabasePostgres->reportQueryError('ERROR: insert ...', '23503', 'COMMIT', 'WikiPage::doDel...', false)
#2 /var/www/plan.alteeve.ca/html/includes/db/Database.php(3596): DatabaseBase->query('COMMIT', 'WikiPage::doDel...')
#3 /var/www/plan.alteeve.ca/html/includes/db/Database.php(3580): DatabaseBase->doCommit('WikiPage::doDel...')
#4 /var/www/plan.alteeve.ca/html/includes/page/WikiPage.php(2894): DatabaseBase->commit('WikiPage::doDel...')
#5 /var/www/plan.alteeve.ca/html/includes/page/Article.php(1793): WikiPage->doDeleteArticleReal('content was: "s...', false, 0, true, '')
#6 /var/www/plan.alteeve.ca/html/includes/page/Article.php(1606): Article->doDelete('content was: "s...', false)
#7 /var/www/plan.alteeve.ca/html/includes/actions/DeleteAction.php(51): Article->delete()
#8 /var/www/plan.alteeve.ca/html/includes/MediaWiki.php(414): DeleteAction->show()
#9 /var/www/plan.alteeve.ca/html/includes/MediaWiki.php(282): MediaWiki->performAction(Object(Article), Object(Title))
#10 /var/www/plan.alteeve.ca/html/includes/MediaWiki.php(584): MediaWiki->performRequest()
#11 /var/www/plan.alteeve.ca/html/includes/MediaWiki.php(435): MediaWiki->main()
#12 /var/www/plan.alteeve.ca/html/index.php(46): MediaWiki->run()

#13 {main}

An email gets sent out saying the page was deleted. In the webserver's error logs is a single entry:

[Fri Nov 28 16:18:38 2014] [error] [client 72.64.152.130] PHP Notice: Uncommitted DB writes (transaction from DatabaseBase::query (LCStoreDB::get)). in /var/www/plan.alteeve.ca/html/includes/db/Database.php on line 4266, referer: http://plan.alteeve.ca/index.php?title=The_power_Of_your_Intellect_In_Money_making&action=delete

digimer

Event Timeline

Digimer raised the priority of this task from to Needs Triage.
Digimer updated the task description. (Show Details)
Digimer changed Security from none to None.
Digimer subscribed.

This error was shown when I enabled simple SQL debugging:

A database query error has occurred. This may indicate a bug in the software.

Query:
COMMIT
Function: WikiPage::doDeleteArticleReal
Error: 23503 ERROR: insert or update on table "recentchanges" violates foreign key constraint "recentchanges_rc_cur_id_fkey" DETAIL: Key (rc_cur_id)=(7) is not present in table "page".

bug was introduced by change I1c7f3a84f10df05d6b37dccbad4c8232edf51580

It seems like this should be caught by one of the phpunit tests, specifically includes/EditPageTest.php. Does anyone know why it is not being caught there?

bug was introduced by change I1c7f3a84f10df05d6b37dccbad4c8232edf51580

Full links are welcome: https://gerrit.wikimedia.org/r/#/c/113525/

CC'ing the author and reviewers of that patch.

It seems like this should be caught by one of the phpunit tests, specifically includes/EditPageTest.php. Does anyone know why it is not being caught there?

Because we don't test against a database that uses foreign key constraints.

The intention of I1c7f3a84f10df05d6b37dccbad4c8232edf51580 was to log the former page_id if the now-deleted page, which breaks the assumption that rc_cur_id is going to always refer to a row in the page table. There are two possible fixes:

  1. Revert that change.
  2. Remove the assumption.

What do you mean about us not testing against "a database that uses foreign key constraints"?

Just that the CI server only tests on sqlite and not against postgresql or mysql? I know about that, but surely people run such tests outside of the CI framework. At least I do, and didn't find this error.

Or is there a separate issue?

Change 178271 had a related patch set uploaded (by Jjanes):
PostgreSQL: Drop unneeded foreign key constraint

https://gerrit.wikimedia.org/r/178271

Patch-For-Review

@Anomie: If you have time:

What do you mean about us not testing against "a database that uses foreign key constraints"?

I mean that our pre-merge unit tests run by Jenkins don't run against such a database.

As noted, such things can be run locally by someone who has such a database configured.

Change 187808 had a related patch set uploaded (by Legoktm):
PostgreSQL: Drop unneeded foreign key constraint

https://gerrit.wikimedia.org/r/187808

Patch-For-Review

Change 178271 merged by jenkins-bot:
PostgreSQL: Drop unneeded foreign key constraint

https://gerrit.wikimedia.org/r/178271

Aklapper triaged this task as Medium priority.Jan 31 2015, 8:14 PM

Change 178271 merged by jenkins-bot:
PostgreSQL: Drop unneeded foreign key constraint

https://gerrit.wikimedia.org/r/178271

@Legoktm: with that merge, is this now just in need of a backport to the releases?

Now I see why this problem was not caught by the unit test includes/EditPageTest.php

Unit tests are run against a clone of the database, and the clone is lacking constraints.

Since the unit tests prominently warn not to run them against a production database, I don't know why we would run against a clone rather than the primary set of tables. It seems to be the worst of both worlds, we still treat it as being destructive, but get a test which is not strictly adherent to the real conditions.

The unit test mess is covered by T39702; it causes inter alia T39600 and T38759.

Change 187808 merged by jenkins-bot:
PostgreSQL: Drop unneeded foreign key constraint

https://gerrit.wikimedia.org/r/187808

Umherirrender claimed this task.
Umherirrender subscribed.

Backported to Release 1.24

Jdforrester-WMF subscribed.

Migrating from the old tracking task to a tag for PostgreSQL-related tasks.