Page MenuHomePhabricator

Make all Event Logging instrumentation target the same Schema:Edit revision
Closed, ResolvedPublic8 Estimated Story Points

Description

Currently some Wikitext events go into Edit_11319708 and others go into Edit_11448630. With different sampling rates, this is a recipe for bad data. More detail in my comment here: T93242

Event Timeline

Milimetric assigned this task to Jdforrester-WMF.
Milimetric raised the priority of this task from to Needs Triage.
Milimetric updated the task description. (Show Details)
Milimetric added a project: VisualEditor.
Milimetric subscribed.

It's not clear how this is happening at all:

1krenair@tin:/srv/mediawiki-staging/php-1.25wmf21$ grep 11319708 extensions/* -R
2krenair@tin:/srv/mediawiki-staging/php-1.25wmf21$ grep 11448630 extensions/* -R
3extensions/WikiEditor/WikiEditor.hooks.php: return EventLogging::logEvent( 'Edit', 11448630, $data );
4extensions/WikimediaEvents/WikimediaEvents.php: 'revision' => 11448630,
5krenair@tin:/srv/mediawiki-staging/php-1.25wmf21$ cd ../php-1.25wmf22
6krenair@tin:/srv/mediawiki-staging/php-1.25wmf22$ grep 11319708 extensions/* -R
7krenair@tin:/srv/mediawiki-staging/php-1.25wmf22$ grep 11448630 extensions/* -R
8extensions/WikiEditor/WikiEditor.hooks.php: return EventLogging::logEvent( 'Edit', 11448630, $data );
9extensions/WikimediaEvents/WikimediaEvents.php: 'revision' => 11448630,
10krenair@tin:/srv/mediawiki-staging/php-1.25wmf22$

Krenair added a subscriber: Krinkle.

@Krinkle has done some digging around on the ResourceLoader side of things. It appears that the cached version of the schema actually getting loaded by RL for the startup module is an old version.

Krenair set Security to None.
Krenair added a subscriber: ori.

Bypassing cache serves the new version, so it's definitely deployed correctly and being exposed:
http://bits.wikimedia.org/en.wikipedia.org/load.php?lang=en&debug=true&modules=schema.Edit&only=scripts

... ,"revision":11448630 ...

Fetching the startup module shows:

	    [
	        "schema.Edit",
	        1381493430,
new Date(1381493430* 1000)
Fri Oct 11 2013 13:10:30 GMT+0100 (BST)

Looks like that is $wgCacheEpoch. So ResourceLoaderSchemaModule is probably failing to fetch the RemoteSchema to determine the current cache v....

Wait a second, that class isn't computing an actual timestamp. It's doing mathematical addition on cache epoch with a revision ID. Thus creating an incrementing timestamp, but not actually related to the time the change happened.

This works on an individual module but will fail in practice where multiple modules are requested in a batch with the max() timestamp as the leading url parameter. Thus deploying a change to EventLogging, I suspect, would never result in successful deployment of said change unless another module requested with that schema is also changed in the same deployment.

Milimetric claimed this task.

The instrumentation from the Edit team seems fine here. The problem is with EL itself, so I've opened an appropriate task and I'll start working on it with top priority.