Page MenuHomePhabricator

SqlBagOStuff for PostgreSQL should not re-use main connection
Closed, ResolvedPublic

Description

MediaWiki	1.30.0-alpha (1ba29a5)
PHP	7.1.1 (apache2handler)
PostgreSQL	9.3.12
ICU	57.1

"Rejected set ... due to pending writes" is constantly reported with:

[MessageCache] MessageCache::load: Loading en... local cache is empty, global cache is expired/volatile, loading from database
[objectcache] Rejected set() for global:resourceloader:titleinfo:mwmasterpg:432bad786e4c35990f47bc567a0c523303fe1c48 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:jquery.accessKeyLabel:en:f91d56e8550123299aa6a008325e6a10 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:jquery.confirmable:en:0e66522d242595cb343f4a7191b58641 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:jquery.makeCollapsible:en:2e1aec29500269620cc8fe965cd4f966 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:jquery.tablesorter:en:12800011052df520c7810d5356349e24 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.feedback:en:72f32a6ea2edd8e32d6751d2ee1eac88 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.htmlform:en:87aae6a3d20a24b31993b5cb638a1c64 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.searchSuggest:en:d36d2470c7b5851ce578c54ba956bbf0 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.ForeignUpload:en:bb36660cdb75ce155469babe5104acf4 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.ForeignStructuredUpload:en:e92f2f4df425c159a113f15d13fa553b due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.Upload.Dialog:en:2025422b649e11cb5eb9a7741a194c41 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.Upload.BookletLayout:en:fbaf02a3a219c1feed950c8f362080a2 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.ForeignStructuredUpload.BookletLayout:en:7127b5dc6f65db4b26b61e0825047ee2 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.toc:en:11c0571606cd3d304339566a72dae9f2 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.action.edit.preview:en:edaff49d2d365c28d7ddbaef4e48c70f due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.action.view.metadata:en:191a7b6d83712a8e27678c835d2d6bb0 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.action.view.postEdit:en:38f42593b1c2e69fe7918a892311cb58 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.action.edit.editWarning:en:34a87ba55d376b1b89dd29fc7e4a93b7 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.language:en:3a9f499e2e83e3fb39217173e73ad1c1 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.language.months:en:73af9e0086587a82daaf89854b1d5aa0 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.language.specialCharacters:en:6c63baaca3399f8893662ffe6206ade0 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.page.gallery.slideshow:en:7b71d5c963c72a156de94ba32ec74b36 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.page.patrol.ajax:en:e623c8a17c7b58f1b99f47c9787d1845 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.page.watch.ajax:en:e23b9ddd5ea484cdfd7ffeb48e0af93e due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.page.rollback:en:cd005c971fc9dae149b49ac9f0b9061a due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.rcfilters.filters.ui:en:2f7f0aaeebf1c4e8ea0c51b26f9c512c due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.special.apisandbox:en:97aa5c359bd4e879a8a394df05accfb4 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.special.edittags:en:01e3f8e45d97a4ddcb7bb5cb12d14be8 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.special.preferences:en:ed16e7d232dfc6a4ebd67d7d93ee3c4a due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.special.search:en:4dbe2b042bdc997ca60239586aec3cfe due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.special.search.commonsInterwikiWidget:en:a8e0068cc00875ce14599886df6f8594 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.special.unwatchedPages:en:9217d503ffba30579df8bad359632980 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.special.upload:en:09fcab4aaa44d45db01d81d353b0a38f due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.special.userlogin.signup.js:en:c2117cbe2be555982396251992e6ab62 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.legacy.protect:en:cdaff2a99116d4d803737c71a50a2a44 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.widgets:en:66a66131af61df7bf02ec5bead2af9ad due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.widgets.DateInputWidget:en:eeb737376629d6da9647b8c8812eea5d due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.widgets.datetime:en:e0aa3ea0e47ff391fce39fc8029b31b4 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.widgets.CategoryMultiselectWidget:en:7dd073b2861bbd2700d585db817c7f7b due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:mediawiki.widgets.MediaSearch:en:9412b80bc8c4de91bcf08e862dd7ee96 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:oojs-ui-widgets:en:c8222ef4212351752e5595676be2a6bb due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:oojs-ui-toolbars:en:95f32f46a33e3d104160f47c8d3a2126 due to pending writes.
[objectcache] Rejected set() for mwmasterpg:MessageBlobStore:oojs-ui-windows:en:86ef3a9ba0fc10cbab1e061d34ed17a2 due to pending writes.

Event Timeline

Krinkle renamed this task from Mass "Rejected set ... due to pending writes" being reported to "Rejected set() ... due to pending writes" from WANObjectCache when using SqlBagOStuff (via MessageBlobStore).Jul 6 2017, 4:04 AM

MessageBlobStore is used by ResourceLoader.

The warning in question comes from WANObjectCache::set()

	function set( $key, $value, $ttl = 0, array $opts = [] ) {
		..

		// Do not cache potentially uncommitted data as it might get rolled back
		if ( !empty( $opts['pending'] ) ) {
			$this->logger->info( "Rejected set() for $key due to pending writes." );
			..

@mwjames

  1. I presume this setup uses CACHE_DB?
  2. Can you verify whether this is a recent regression? The warning in question was introduced 2 years ago and released in 1.27.0. I don't see any changes to this in 1.30-alpha development. Narrowing it down to a specific release would help.
  3. Does this happen only on POST requests, or also on GET requests? Specifically, is it failing due to wrongly caring about genuine pending writes (like on a POST request that makes an edit), or is this a GET request with no real master writes, but being triggered by other cache writes? Isolating it to one of these scenarios woud help.

@aaron Assuming for the moment this is on a POST request with genuine writes, I can see how it might make sense. MessageBlobStore gets its information from MessageCache via wfMessage. As such, in MessageBlobStore we pass the result of Database::getCacheSetOptions( wfGetDB( DB_REPLICA ) ) to wanCache->set(). On a POST request, this cache get-with-set can easily happen mid-way in the request when the main transaction is still open. I imagine that isn't meant to trigger the "pending writes" warning, and yet it does. Perhaps this is because slave/master are the same? What can we do here?

Krinkle triaged this task as High priority.Jul 6 2017, 4:13 AM
Krinkle moved this task from General to libs/objectcache on the MediaWiki-libs-BagOStuff board.

If SqlBagOStuff::getSeparateMainLB() return a new LB for postgres like mysql, this would probably not be an issue since writesOrCallbacksPending() would not return true on the main DB_MASTER connection but on separate one.

Krinkle renamed this task from "Rejected set() ... due to pending writes" from WANObjectCache when using SqlBagOStuff (via MessageBlobStore) to SqlBagOStuff for PostgreSQL should not re-use main connection.Jul 6 2017, 7:46 PM

I didn't see this was specific to PostgreSQL, but after failing to reproduce the premise, Aaron pointed out "pg" in the wiki name.

Indeed, as per T154424, this is all because SqlBagOStuff on PostgreSQL ends up using the same exact LoadBalancer and Database object, and underlying connection.

At the moment the code intentionally doesn't try to create multiple simultaneous PostgreSQL connections due to a known bug (from years ago) about PG driver in PHP silently de-duplicating connections, which would cause lots of state problems.

Unless PostgreSQL in PHP supports making separate connections, and we make use of that, these warnings cannot be fixed, other than by configuring wgDebugLogGroups to silence SQLBagOStuff, objectcache and DBPerformance (or at least raise them to only report severity ERROR and above).

Change 363727 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/core@master] [WIP] Use a separate postgres connection in SqlBagOStuff

https://gerrit.wikimedia.org/r/363727

@mwjames Could you try the above patch and let us know if this works for you? It looks like PGSQL_CONNECT_FORCE_NEW makes pg_connect behave similar my MySQL connect. This was adopted in fda4d46fc4f810, and the above patch would apply it to SqlBagOStuff.

... specific to PostgreSQL, but after failing to reproduce the premise, Aaron pointed out "pg" in the wiki name.

As noted in the first four lines of the report.

MediaWiki	1.30.0-alpha (1ba29a5)
PHP	7.1.1 (apache2handler)
PostgreSQL	9.3.12
ICU	57.1

[mediawiki/core@master] [WIP] Use a separate postgres connection in SqlBagOStuff
https://gerrit.wikimedia.org/r/363727

I did a quick checkout and it seems that it fixes the connection issue with logs only containing:

[MessageCache] MessageCache::load: Loading en... local cache is empty, global cache is expired/volatile, loading from database
Unstubbing $wgParser on call of $wgParser::firstCallInit from MessageCache->transform
Parser: using preprocessor: Preprocessor_DOM
Unstubbing $wgLang on call of $wgLang::_unstub from ParserOptions->__construct
...

This was adopted in fda4d46fc4f810, ...
https://gerrit.wikimedia.org/r/363727

@demon I think this should be back-ported to 1.29.

Krinkle assigned this task to aaron.
Krinkle edited projects, added Performance-Team; removed Patch-For-Review.

@mwjames Thanks!

Change 363727 merged by jenkins-bot:
[mediawiki/core@master] objectcache: Use a separate postgres connection in SqlBagOStuff

https://gerrit.wikimedia.org/r/363727

Change 363995 had a related patch set uploaded (by Chad; owner: Aaron Schulz):
[mediawiki/core@REL1_29] objectcache: Use a separate postgres connection in SqlBagOStuff

https://gerrit.wikimedia.org/r/363995

This was adopted in fda4d46fc4f810, ...
https://gerrit.wikimedia.org/r/363727

@demon I think this should be back-ported to 1.29.

In which case, I suggest those driving the patch press the cherry-pick button and get a patch up for review on the target branch. I get thousands of bug mails, and pinging me like this is a bad way to make sure I remember to backport it (especially if the task is closed). It's now backported.

In which case, I suggest those driving the patch press the cherry-pick button and get a patch up for review on the target branch.

There is surely a query you can run to verify that those marked with a specific release have been back-ported without relying on human to verify that this has actually happened or?

In which case, I suggest those driving the patch press the cherry-pick button and get a patch up for review on the target branch.

There is surely a query you can run to verify that those marker with a specific release have been back-ported without relying on human to verify that this has actually happened or?

Of course there's nothing like that, we've never been that fancy at tracking backports. If something is needing backporting, the task should not be closed just because it landed in master. It isn't fixed everywhere

EDIT: I'll add that this isn't trivial to build a tool around either. There's no clear mapping between a task and all the changes that exist and need to be backported (or modified before backporting). There's too much human-driven context here to write a tool right now. Nowwww, if someone wanted to build such a tool, I'm sure it'd find uses outside of MW general releases (cf: T280)

Change 363995 merged by jenkins-bot:
[mediawiki/core@REL1_29] objectcache: Use a separate postgres connection in SqlBagOStuff

https://gerrit.wikimedia.org/r/363995