Increase $wgHTTPImportTimeout to a higher value on WMF wikis
Closed, Resolved, Public

Description

To allow longer transwiki imports, let's try increasing $wgHTTPImportTimeout to, say, 50 or 55 seconds. Operations requested that it be kept below 60 seconds.

Event Timeline

Change 331946 merged by jenkins-bot:
Increase $wgHTTPImportTimeout to 50 seconds

https://gerrit.wikimedia.org/r/331946

Mentioned in SAL (#wikimedia-operations) [2017-01-18T14:39:31Z] <zfilipin@tin> Synchronized wmf-config/CommonSettings.php: SWAT: [[gerrit:331946|Increase $wgHTTPImportTimeout to 50 seconds (T155209)]] (duration: 00m 39s)

The HTTP timeout has been increased to 50 seconds. I managed to import all 2,063 revisions of "Digital television" from enwiki to testwiki.
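For anyone wanting to try the same value on their own wiki, the setting amounts to a one-line configuration fragment. This is a minimal sketch, not the actual diff from the merged change: the variable name and the value of 50 come from this task, while the surrounding file layout and comments are assumptions.

```php
<?php
// Sketch only: goes in LocalSettings.php on a standalone wiki
// (on WMF wikis the change was made in wmf-config/CommonSettings.php).

// Timeout, in seconds, for the HTTP request MediaWiki makes to the
// source wiki during a transwiki import. Operations asked that this
// stay below 60 seconds.
$wgHTTPImportTimeout = 50;
```

A longer timeout only gives the source wiki more time to deliver the export; it does not raise memory limits, so very large histories may still fail for other reasons.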

Please try some imports on your local wikis and see what this does for you. I'll leave this task open for a few days for people's feedback.

I tried to import 45k+ revisions of [[w:en:George W. Bush]] on https://test.wikipedia.org/w/index.php?title=Special:Import (which is a clearly unreasonable thing to do) and I got a blank page. We could try more tests just to document how many revisions sysops can reasonably expect to be able to import, but it's IMHO not too bad if a privileged user performing a rather extraordinary action sometimes gets an unfriendly result.

@Nemo_bis a blank page usually means something different than a timeout has happened. Probably a memory limit was hit; if we want to be able to import tens of thousands of revisions we might want to transform that into an async job instead, too.

> @Nemo_bis a blank page usually means something different than a timeout has happened.

True. On the other hand, the timeout on the API request is a de facto limitation on the quantity of data you receive (and probably the amount of memory required to process it), since I expect it to be highly correlated to processing and transfer time. Or not?

> Probably a memory limit was hit; if we want to be able to import tens of thousands of revisions we might want to transform that into an async job instead, too.

Yes, this is a separate feature request to consider seriously. This task should only be considered a way to get the most out of the current system, I think.

The import system could really do with a substantial overhaul to provide features like warnings/confirmations, more options on where to store the imported pages, and asynchronicity. But it doesn't seem worth the effort, given how infrequently its limitations cause problems.

LSobanski claimed this task.
LSobanski subscribed.

It doesn't seem like there's anything actionable left in this task, resolving.