Page MenuHomePhabricator

Parsoid exceptionally slow for one user's set-up (IIS), so editing pages times out every time
Closed, ResolvedPublic

Description

Author: jazeckel

Description:
wiki debug log for navigating to and attempting to edit a page

I am attempting to configure the VisualEditor REL1_23 to run with my MediaWiki instance. I am using MediaWiki 1.23 and it is a private wiki. It's served by IIS on Windows Server 2012 R2. I am able to create a very simple page using it, although it runs slowly. However, any attempt to edit the page -- even a page containing only one simple sentence -- times out with Parsoid. I ran all of the Parsoid tests following installation and they all passed, and from Parsoid's test webpage (localhost:8000) I was able to run those tests and they succeed quickly. However, I can't edit any pages of my wiki. This is the output from the Parsoid service during an edit attempt:

[warning/api][null/Test] Failed API request, 8 retries remaining.
[info][localhost/Test] starting parsing
[info][localhost/Test?oldid=834] completed parsing in 45006 ms

The completed parsing line occurs after I click cancel on the prompt telling me the request timed out. When I had the timeout set to 60s, I would see "8 tries remaining" and then "7 tries remaining" before it timed out (I have since changed the timeout to 45s). The content of the page in this example is simply "this is a visual editor test."

Relevant information:

  • Windows Server 2012 R2
  • IIS v8.5.9600.16384
  • 32-bit node.js
  • Parsoid on port 8000
  • Browser used is Chrome 35.0.1916.153 m
  • MediaWiki 1.23, set to require log in to view/edit
  • same problem in all skins that VisualEditor runs on

A debug log from MediaWiki is attached.


Version: unspecified
Severity: normal
OS: other
Platform: PC

Attached:

Details

Reference
bz67313

Event Timeline

bzimport raised the priority of this task from to Needs Triage.Nov 22 2014, 3:37 AM
bzimport added a project: Parsoid.
bzimport set Reference to bz67313.

This is pretty unhelpful, yes. 45 seconds is an insanely long time to parse the page - [[en:Barack Obama]] takes less than 12 seconds to parse, so it sounds like something is wrong there.

Punting to the Parsoid team to ask for advice (and they'll probably need some more information to work out why Parsoid is so slow for you).

If this server is going to be used by a number of people, I'd recommend putting a caching layer in front of Parsoid and using the Parsoid PHP extension to push changes to pages into the cache so they're available instantly.

As a first step, it would be useful to have some more information on your node.js setup, since we don't habitually test Parsoid on Windows. Let's start with the output of 'node --version', some hardware information on the server in question (is it a virtual server? If not, what is the CPU and clock speed and cache size), and how long 'tests/parserTests.js --quiet' takes to run. With that we should at least narrow down the problem to node performance in general, or some quirk peculiar to how parsoid and mediawiki are interconnected.

(Oh, and as a first guess at a diagnosis: it seems suspiciously like one or more of the 30-second timeouts are expiring internal to Parsoid -- parsoid uses internal requests to the api.php of the containing wiki; if those are timing out it is likely that parsoid will take a long time to reply. But let's double-check the speed of parserTests and underlying node to be sure.)

jazeckel wrote:

  1. node --version reports v0.10.29
  2. running tests/parserTests.js --quiet took 30 seconds, and reported no unexpected results.
  3. it is not a virtual server. It has 8GB RAM, and here's all the info about the CPU. Not sure if CPU cache is the cache you wanted to know about or a different cache.

Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz
Intel64 Family 6 Model 45 Stepping 7, GenuineIntel
HTT * Hyperthreading enabled
HYPERVISOR - Hypervisor is present
VMX * Supports Intel hardware-assisted virtualization
SVM - Supports AMD hardware-assisted virtualization
EM64T * Supports 64-bit mode

SMX * Supports Intel trusted execution
SKINIT - Supports AMD SKINIT

NX * Supports no-execute page protection
SMEP - Supports Supervisor Mode Execution Prevention
SMAP - Supports Supervisor Mode Access Prevention
PAGE1GB * Supports 1 GB large pages
PAE * Supports > 32-bit physical addresses
PAT * Supports Page Attribute Table
PSE * Supports 4 MB pages
PSE36 * Supports > 32-bit address 4 MB pages
PGE * Supports global bit in page tables
SS * Supports bus snooping for cache operations
VME * Supports Virtual-8086 mode
RDWRFSGSBASE - Supports direct GS/FS base access

FPU * Implements i387 floating point instructions
MMX * Supports MMX instruction set
MMXEXT - Implements AMD MMX extensions
3DNOW - Supports 3DNow! instructions
3DNOWEXT - Supports 3DNow! extension instructions
SSE * Supports Streaming SIMD Extensions
SSE2 * Supports Streaming SIMD Extensions 2
SSE3 * Supports Streaming SIMD Extensions 3
SSSE3 * Supports Supplemental SIMD Extensions 3
SSE4a - Supports Sreaming SIMDR Extensions 4a
SSE4.1 * Supports Streaming SIMD Extensions 4.1
SSE4.2 * Supports Streaming SIMD Extensions 4.2

AES * Supports AES extensions
AVX * Supports AVX intruction extensions
FMA - Supports FMA extensions using YMM state
MSR * Implements RDMSR/WRMSR instructions
MTRR * Supports Memory Type Range Registers
XSAVE * Supports XSAVE/XRSTOR instructions
OSXSAVE * Supports XSETBV/XGETBV instructions
RDRAND - Supports RDRAND instruction
RDSEED - Supports RDSEED instruction

CMOV * Supports CMOVcc instruction
CLFSH * Supports CLFLUSH instruction
CX8 * Supports compare and exchange 8-byte instructions
CX16 * Supports CMPXCHG16B instruction
BMI1 - Supports bit manipulation extensions 1
BMI2 - Supports bit manipulation extensions 2
ADX - Supports ADCX/ADOX instructions
DCA * Supports prefetch from memory-mapped device
F16C - Supports half-precision instruction
FXSR * Supports FXSAVE/FXSTOR instructions
FFXSR - Supports optimized FXSAVE/FSRSTOR instruction
MONITOR * Supports MONITOR and MWAIT instructions
MOVBE - Supports MOVBE instruction
ERMSB - Supports Enhanced REP MOVSB/STOSB
PCLULDQ * Supports PCLMULDQ instruction
POPCNT * Supports POPCNT instruction
LZCNT - Supports LZCNT instruction
SEP * Supports fast system call instructions
LAHF-SAHF * Supports LAHF/SAHF instructions in 64-bit mode
HLE - Supports Hardware Lock Elision instructions
RTM - Supports Restricted Transactional Memory instructions

DE * Supports I/O breakpoints including CR4.DE
DTES64 * Can write history of 64-bit branch addresses
DS * Implements memory-resident debug buffer
DS-CPL * Supports Debug Store feature with CPL
PCID * Supports PCIDs and settable CR4.PCIDE
INVPCID - Supports INVPCID instruction
PDCM * Supports Performance Capabilities MSR
RDTSCP * Supports RDTSCP instruction
TSC * Supports RDTSC instruction
TSC-DEADLINE * Local APIC supports one-shot deadline timer
TSC-INVARIANT * TSC runs at constant rate
xTPR * Supports disabling task priority messages

EIST * Supports Enhanced Intel Speedstep
ACPI * Implements MSR for power management
TM * Implements thermal monitor circuitry
TM2 * Implements Thermal Monitor 2 control
APIC * Implements software-accessible local APIC
x2APIC * Supports x2APIC

CNXT-ID - L1 data cache mode adaptive or BIOS

MCE * Supports Machine Check, INT18 and CR4.MCE
MCA * Implements Machine Check Architecture
PBE * Supports use of FERR#/PBE# pin

PSN - Implements 96-bit processor serial number

PREFETCHW * Supports PREFETCHW instruction

Maximum implemented CPUID leaves: 0000000D (Basic), 80000008 (Extended).

Logical to Physical Processor Map:

  • Physical Processor 0
  • Physical Processor 1
  • Physical Processor 2
  • Physical Processor 3

Logical Processor to Socket Map:

  • Socket 0

Logical Processor to NUMA Node Map:

  • NUMA Node 0

Logical Processor to Cache Map:

  • Data Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64
  • Instruction Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64
  • Unified Cache 0, Level 2, 256 KB, Assoc 8, LineSize 64
  • Unified Cache 1, Level 3, 10 MB, Assoc 20, LineSize 64
  • Data Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64
  • Instruction Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64
  • Unified Cache 2, Level 2, 256 KB, Assoc 8, LineSize 64
  • Data Cache 2, Level 1, 32 KB, Assoc 8, LineSize 64
  • Instruction Cache 2, Level 1, 32 KB, Assoc 8, LineSize 64
  • Unified Cache 3, Level 2, 256 KB, Assoc 8, LineSize 64
  • Data Cache 3, Level 1, 32 KB, Assoc 8, LineSize 64
  • Instruction Cache 3, Level 1, 32 KB, Assoc 8, LineSize 64
  • Unified Cache 4, Level 2, 256 KB, Assoc 8, LineSize 64

Logical Processor to Group Map:

  • Group 0

Ok, I think we can rule out platform slowness. So now I think my theory in comment 3 is probably on the right track. Subbu, gwicke: do we have any tools to diagnose timeout issues in API requests made by Parsoid?

Created attachment 15795
Debugging patch to dump timing and status code of API requests

Well, we don't have anything built in, but the attached patch is a hack which will dump the request timing and status codes. If you apply this and then visit (say) http://localhost:8765/enwiki/Main_Page you should get a bunch of output on console about all the upstream API requests which Parsoid is making. I suspect one or more of them is timing out.

Attached:

jazeckel wrote:

Can you point me to instructions on how to apply the patch? I won't be able to do it until tomorrow morning. Thanks!

It might be easiest just to open lib/mediawiki.ApiRequest.js and apply the patch manually in your favorite editor. It's only nine lines to copy-and-paste. The lines with '+' at the front are additions. The single line with '-' is a removal.

jazeckel wrote:

Okay, I applied the patch, ran "node api/server.js" in the console, then loaded up my test page and clicked edit for the visual editor. Some time passed with no output, then I got this:

Request 1: http://localhost/auc/wiki/api.php?format=json&action=query&meta=sitei
nfo&siprop=namespaces%7Cnamespacealiases%7Cmagicwords%7Cfunctionhooks%7Cextensio
ntags%7Cgeneral%7Cinterwikimap%7Clanguages%7Cprotocols: 40016ms

TypeError: Cannot read property 'statusCode' of undefined

at Request.cb [as _callback] (D:\public\jazeckel\parsoid\lib\mediawiki.ApiRe

quest.js:99:49)

at self.callback (D:\public\jazeckel\parsoid\node_modules\request\request.js

:121:22)

at Request.emit (events.js:95:17)
at null._onTimeout (D:\public\jazeckel\parsoid\node_modules\request\request.

js:679:12)

at Timer.listOnTimeout [as ontimeout] (timers.js:110:15)

worker 3128 died (8), restarting.

  • worker(3040) loading...
  • worker(3040) ready on :8000

After that I hit my 45 second timeout.

@jazeckel: so, when you visit http://localhost/auc/wiki/api.php?format=json&action=query&meta=siteinfo&siprop=namespaces%7Cnamespacealiases%7Cmagicwords%7Cfunctionhooks%7Cextensiontags%7Cgeneral%7Cinterwikimap%7Clanguages%7Cprotocols manually, what happens? Does anything show up in your PHP server logs?

It sounds like there's something wrong with your wiki's API entrypoint.

jazeckel wrote:

Hmm, so this is kind of interesting. If I go to the URL but with the server name instead of "localhost," I get what I assume is the correct result -- a big block of key value pairs about locales and such. If I go to the URL with "localhost", I get this:

{"error":{"code":"readapidenied","info":"You need read permission to use this module"}}

I don't see any PHP error logs created though perhaps I am not looking in the right place.

It sounds like your configured API URL should actually include the real server name instead of 'localhost'. Could you change the config & verify that this fixes your issue?

jazeckel wrote:

This is extremely confusing. Here are the current issues:

  1. I don't understand why changing out localhost for servername should make a difference, considering they are the same machine.
  2. If I change out localhost for servername in parsoid/api/localsettings.js, then when I try to edit the visual editor loads, but it loads with the content of a page from a different wiki that runs on the same server. Is there a missing database configuration in the Parsoid localsettings?
  3. If I change the wiki/localsettings.js to also use servername instead of localhost, it times out again.

jazeckel wrote:

To clarify, when it succeeds and loads the page from the wrong wiki, the URL I see in the debug output is "http://localhost/wiki/api.php" whereas the correct wiki is "http://localhost/auc/wiki/api.php", which is as specified in Parsoid's localsettings (although in localsettings it is specified now with servername instead of localhost).

jazeckel wrote:

Also, if I comment out the entire parsoidConfig.setInterwiki line in my localsettings.js, it loads the visual editor but with the wrong wiki's content. It appears that having "localhost" as the key on that line will cause it to fail regardless, and if I change "localhost" to anything else it loads the wrong content. For example, I could make the line:
parsoidConfig.setInterwiki( 'foo', 'http://localhost/auc/wiki/api.php' );

and it will load the editor but with the wrong content.

(In reply to jazeckel from comment #11)

Hmm, so this is kind of interesting. If I go to the URL but with the server
name instead of "localhost," I get what I assume is the correct result -- a
big block of key value pairs about locales and such. If I go to the URL
with "localhost", I get this:

{"error":{"code":"readapidenied","info":"You need read permission to use
this module"}}

I don't see any PHP error logs created though perhaps I am not looking in
the right place.

This is generated if the anonymous user doesn't have appropriate permissions to access the API. In include/DefaultSettings.php there is a group of permissions including:

$wgGroupPermissions['*']['writeapi'] = true;

I'm guessing that one of these is prohibiting anonymous access to the API. I bet it's not actually the case that it works when you use the server name, I bet that somehow you're actually accessing the wrong wiki in that case. I'm not a mediawiki permissions expert though, so take what I say with a grain of salt.

If you are mucking with permissions, this link may be useful:
https://www.mediawiki.org/wiki/Extension:VisualEditor#Linking_with_Parsoid_in_private_wikis

gwicke: we should probably not be timing out if the api request returns an error. Maybe we should look into this?

(In reply to jazeckel from comment #15)

Also, if I comment out the entire parsoidConfig.setInterwiki line in my
localsettings.js, it loads the visual editor but with the wrong wiki's
content. It appears that having "localhost" as the key on that line will
cause it to fail regardless, and if I change "localhost" to anything else it
loads the wrong content. For example, I could make the line:
parsoidConfig.setInterwiki( 'foo', 'http://localhost/auc/wiki/api.php' );

and it will load the editor but with the wrong content.

Yes, the 'localhost' part here has to match the string you set $wgVisualEditorParsoidPrefix to.

In particular, one running parsoid can service multiple different mediawikis. They are distinguished by the $wgVisualEditorParsoidPrefix and corresponding localsettings.js. For example, you should be able to visit your running parsoid instance on http://localhost:8080 (or whatever port you used) and then get different pages for:

http://localhost:8080/enwiki/Main_Page (links to wikipedia using 'enwiki' prefix)
http://localhost:8080/localhost/Main_Page
http://localhost:8080/foo/Main_Page

given

parsoidConfig.setInterwiki( 'localhost', 'some api endpoint' );
parsoidConfig.setInterwiki( 'foo', 'some different api endpoint' );

Our docs could probably explain this better. Suggestions welcome.

jazeckel wrote:

(In reply to C. Scott Ananian from comment #16)

I'm guessing that one of these is prohibiting anonymous access to the API.
I bet it's not actually the case that it works when you use the server name,
I bet that somehow you're actually accessing the wrong wiki in that case.
I'm not a mediawiki permissions expert though, so take what I say with a
grain of salt.

Yes, this is exactly what happens -- it loads the content from the wrong wiki.

If you are mucking with permissions, this link may be useful:
https://www.mediawiki.org/wiki/Extension:
VisualEditor#Linking_with_Parsoid_in_private_wikis

I had already set $wgVisualEditorParsoidForwardCookies = true as instructed in that portion. I didn't uncomment the line about session cacheing because it didn't seem to make a difference.

Yes, the 'localhost' part here has to match the string you set
$wgVisualEditorParsoidPrefix to.

Okay, thanks for letting me know that. It wasn't clear from the installation/configuration instructions.

(In reply to C. Scott Ananian from comment #18)

In particular, one running parsoid can service multiple different
mediawikis. They are distinguished by the $wgVisualEditorParsoidPrefix and
corresponding localsettings.js. For example, you should be able to visit
your running parsoid instance on http://localhost:8080 (or whatever port you
used) and then get different pages for:

http://localhost:8080/enwiki/Main_Page (links to wikipedia using 'enwiki'
prefix)
http://localhost:8080/localhost/Main_Page
http://localhost:8080/foo/Main_Page

given

parsoidConfig.setInterwiki( 'localhost', 'some api endpoint' );
parsoidConfig.setInterwiki( 'foo', 'some different api endpoint' );

Our docs could probably explain this better. Suggestions welcome.

Thanks, that explains some more. I have set them both to "auc" and now it's not loading data from the wrong wiki anymore, but it's timing out again. Perhaps there is some other detail of configuration for a private wiki that is causing a problem. Can I go into Parsoid's code and hard code a username and password and make a private account just for Parsoid to resolve this?

jazeckel wrote:

FWIW if I go to http://localhost:8000/auc/Main_Page from a browser where my login credentials are cached, the content will load. If I'm logged out of the wiki in that browser, I will get an error.

jazeckel wrote:

Okay, I finally have it working. I think this boils down to configuration errors so the instructions for setup for Parsoid and the VisualEditor should probably be clarified, but also it would be nice to improve the error messaging so that a permissions error doesn't cause a timeout.

Here's what I did to fix it:

  • made the key in parsoidConfig.setInterwiki equal to $wgVisualEditorParsoidPrefix, both of which I set to "auc" (rather than "localhost/auc")
  • set $wgSessionsInObjectCache = true (this is the line that didn't seem to have any effect before, but that was probably because the rest was not configured right)
  • set $wgVisualEditorParsoidForwardCookies = true (I already had that one set)

Thanks for your help!

(In reply to C. Scott Ananian from comment #16)

gwicke: we should probably not be timing out if the api request returns an
error. Maybe we should look into this?

My understanding of the code is that we only retry when getting an error object, but not if we got an HTTP status code. There's either no proper HTTP status returned by this particular wiki setup, or the request library returns an error when it should not.

I just tested this locally, and there's no retrying on unauthorized requests. This means that this is probably an IIS-specific issue, where IIS does not return a proper HTTP status on authentication errors.