
Edit Schema module loaded by EL client side is not being updated
Closed, Resolved · Public

Description

The current Edit instrumentation is asking for schema 11448630 but getting 11319708 instead. This is causing events to go to the wrong place.

Event Timeline

Milimetric claimed this task.
Milimetric raised the priority of this task from to Unbreak Now!.
Milimetric updated the task description.
Milimetric added a subscriber: Milimetric.
Restricted Application added a subscriber: Aklapper. · Mar 26 2015, 6:26 PM
Milimetric added a comment. · Edited · Mar 26 2015, 6:32 PM

Bypassing cache serves the new version, so it's definitely deployed correctly and being exposed:
http://bits.wikimedia.org/en.wikipedia.org/load.php?lang=en&debug=true&modules=schema.Edit&only=scripts

... ,"revision":11448630 ...

Fetching the startup module shows:

	    [
	        "schema.Edit",
	        1381493430,

Converting that value to a date:

	new Date(1381493430 * 1000)
	Fri Oct 11 2013 13:10:30 GMT+0100 (BST)

That looks like $wgCacheEpoch, so ResourceLoaderSchemaModule is probably failing to fetch the RemoteSchema to determine the current cache v....
Wait a second, that class isn't computing an actual timestamp. It adds the revision ID to the cache epoch, which produces a value that increases with each revision but is unrelated to the time the change actually happened.
This works for an individual module, but fails in practice when multiple modules are requested in a batch, because the max() timestamp across the batch is used as the leading URL parameter. So I suspect that deploying a change to an EventLogging schema would never take effect unless another module requested together with that schema is also changed in the same deployment.
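The batch problem described above can be illustrated with a minimal sketch. It assumes the version formula is literally "cache epoch plus revision ID", as the comment describes, and that a batch is versioned by the max() of its members; `OTHER_MODULE_MTIME` is a hypothetical modification time for some other module in the same batch, not a value from this task.

```python
# Values quoted in this task.
CACHE_EPOCH = 1381493430   # value seen in the startup module, believed to be $wgCacheEpoch
OLD_REVISION = 11319708    # schema revision clients were still getting
NEW_REVISION = 11448630    # schema revision the instrumentation asks for

# Hypothetical: real modification time of another module in the same batch.
OTHER_MODULE_MTIME = 1427392000


def fake_timestamp(revision_id):
    """Assumed formula: cache epoch plus revision ID."""
    return CACHE_EPOCH + revision_id


def batch_version(timestamps):
    """Assumed batch rule: the newest timestamp among the requested modules."""
    return max(timestamps)


before = batch_version([fake_timestamp(OLD_REVISION), OTHER_MODULE_MTIME])
after = batch_version([fake_timestamp(NEW_REVISION), OTHER_MODULE_MTIME])

# The schema module's own value did increase with the new revision...
print(fake_timestamp(OLD_REVISION), "->", fake_timestamp(NEW_REVISION))
# ...but the batch version is unchanged, so the batch URL stays the same.
print("batch version changed:", before != after)
```

Under these assumptions, the fake timestamp falls well before any recently modified module, so it never wins the max() and a schema change alone cannot change the batch URL.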

The events coming into the system are from both schemas. Numbers for today:

nuria@vanadium:~/VE-events-march-26$ wc -l Edit_events.txt
2427409 Edit_events.txt
nuria@vanadium:~/VE-events-march-26$ more Edit_events.txt | grep 11448630 | wc -l
1410050

So, as you can see, this points to a client-side caching issue in SOME clients; others are retrieving your latest schema just fine.
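For reference, the two counts above can be turned into proportions with a few lines of arithmetic. This is only a sketch: it treats every line matching `11448630` as an event on the new schema, which is an assumption, since grep matches that string anywhere in a line.

```python
TOTAL_LINES = 2427409        # from: wc -l Edit_events.txt
MATCHING_NEW = 1410050       # from: grep 11448630 | wc -l

# Lines that did not match the new revision ID.
remaining = TOTAL_LINES - MATCHING_NEW
share_new = MATCHING_NEW / TOTAL_LINES

print(f"lines matching 11448630: {MATCHING_NEW} ({share_new:.1%})")
print(f"remaining lines:         {remaining} ({1 - share_new:.1%})")
```

Roughly 58% of the day's lines reference the new revision, which is why the problem looks partial rather than total.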

I really do not think this ticket belongs in analytics. @Krenair can probably shed light on caching issues with MW code.

Nuria added a comment. · Mar 26 2015, 7:11 PM

Or it could be that your newest code is not yet deployed everywhere and is (legitimately) requesting the older schema. Let me do some greps on the vanadium data.

See @Krinkle's comment. It may be that the correct schema numbers are all coming from the server-side logging, which naturally is not affected by ResourceLoader/EventLogging interactions.

Nuria added a comment. · Mar 26 2015, 7:42 PM

Never mind my last two comments. It looks like the client-side caching issue depends on which JS requests a user has made: if the schema is requested together with newer modules that the user doesn't have yet, the right schema is returned.

Nuria added a comment. · Mar 26 2015, 7:51 PM

See @Krinkle's comment. It may be that the correct schema numbers are all coming from the server-side logging,

No, no, those are from the client side only.

Krinkle claimed this task. · Mar 26 2015, 8:11 PM
Krinkle set Security to None.

Change 199986 had a related patch set uploaded (by Krinkle):
api: Send Last-Modified header with revision timestamp

https://gerrit.wikimedia.org/r/199986

Change 200009 had a related patch set uploaded (by Krinkle):
[WIP] RemoteSchema: Expose timestamp from API

https://gerrit.wikimedia.org/r/200009

Change 200028 had a related patch set uploaded (by Krinkle):
ResourceLoaderSchemaModule: Use definition hash instead of fake timestamp

https://gerrit.wikimedia.org/r/200028
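The patch title suggests versioning the module by a hash of its definition rather than by an arithmetic pseudo-timestamp. The following is a rough sketch of that idea, not the code from the patch: the `definition_hash` helper, the shape of the definition dict, and the use of SHA-1 over canonical JSON are all assumptions made for illustration.

```python
import hashlib
import json


def definition_hash(definition):
    """Hash a canonical JSON serialization of a module definition (hypothetical helper)."""
    # sort_keys makes the serialization independent of dict key order.
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


# Hypothetical definitions that differ only in the schema revision.
old = definition_hash({"schema": "Edit", "revision": 11319708})
new = definition_hash({"schema": "Edit", "revision": 11448630})

# Any change to the definition yields a different version string.
print("version changed:", old != new)
```

Unlike the epoch-plus-revision value, a content hash is not ordered, so it cannot be masked by taking max() against other modules' timestamps; it only needs to differ when the definition differs.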

Change 200034 had a related patch set uploaded (by Jforrester):
ResourceLoaderSchemaModule: Use definition hash instead of fake timestamp

https://gerrit.wikimedia.org/r/200034

Change 200035 had a related patch set uploaded (by Jforrester):
ResourceLoaderSchemaModule: Use definition hash instead of fake timestamp

https://gerrit.wikimedia.org/r/200035

Change 200028 merged by jenkins-bot:
ResourceLoaderSchemaModule: Use definition hash instead of fake timestamp

https://gerrit.wikimedia.org/r/200028

Krinkle closed this task as Resolved. · Mar 26 2015, 10:16 PM
Krinkle removed a project: Patch-For-Review.
Krinkle removed a subscriber: gerritbot.

Change 200034 merged by jenkins-bot:
ResourceLoaderSchemaModule: Use definition hash instead of fake timestamp

https://gerrit.wikimedia.org/r/200034

Change 200035 merged by jenkins-bot:
ResourceLoaderSchemaModule: Use definition hash instead of fake timestamp

https://gerrit.wikimedia.org/r/200035

kevinator moved this task from Next Up to Done on the Analytics-Kanban board. · Mar 27 2015, 2:27 PM

Are the events going into the correct table etc. now?

Yes! Thanks very much everyone. I'm running the queries against wikitext data for the first time now, pretty exciting.

Change 199986 merged by jenkins-bot:
api: Send Last-Modified header with revision timestamp

https://gerrit.wikimedia.org/r/199986