Performance review of Wikidata Bridge
Closed, Resolved (Public)

Assigned To

Authored By

	darthmon_wmde
	Feb 28 2020, 4:22 PM

Description

The Wikidata Bridge (formerly known as “client editing”) is a project that aims to make it possible to edit Wikidata’s data directly from Wikipedia. This will be achieved through an interface, connected to the infobox, that users can access directly from their local wiki. Development will proceed in several phases, working together with the communities to understand their needs and build a tool that connects the data and the communities in a better way.

Preview environment

https://en.wikipedia.beta.wmflabs.org/wiki/Wikidata_Bridge_Showcase
https://en.wikipedia.beta.wmflabs.org/wiki/Data_bridge

Which code to review

(Provide links to all proposed changes and/or repositories. It should also describe changes which have not yet been merged or deployed but are planned prior to deployment. E.g. production Puppet, wmf config, or in-flight features expected to complete prior to launch date, etc.).

Everything in the client/data-bridge directory in the Wikibase extension (the file paths below are relative to this directory)

The modules that contain data-bridge or DataBridge in client/resources/Resources.php in the Wikibase extension
- Including the contents of client/includes/DataBridge/

Config changes (git log -G DataBridge):
- Enable DataBridge on Beta
- bridge: enable EditTags for beta
- Rename data bridge config variable names
- Set dataBridgeEnabled repo setting on beta
- Fix typo in beta repo data bridge config
- Upcoming: Set dataBridgeEnabled repo setting on Wikidata
- Upcoming: Set dataBridgeEnabled client setting on certain client wikis
- Upcoming: Set dataBridgeEnabled client setting on all Wikibase clients

Performance assessment

Please initiate the performance assessment by answering the below:

What work has been done to ensure the best possible performance of the feature?

a) Splitting the code to be loaded into a thin wrapper/module “init” that checks if there is actually a bridge enabled link on the page and a second, conditionally and lazily loaded module “app” containing the actual application:

The “init” module is built for target app
- Receives browser compatibility support, tree shaking, and minification courtesy of Vue CLI
The “app” module is built as a library (commonjs) to allow for runtime dispatching by other code (the “init” module)
- The vue dependency is externalized
- Currently not tree-shaken or minified (see “potential optimisations”)
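
The split described in (a) can be sketched roughly as follows. This is a minimal illustration, not the actual Bridge code: `createBridgeInit`, `AppModule`, `AppLoader` and the link identifiers are hypothetical names, and the real “init” module works against the DOM and ResourceLoader rather than injected callbacks.

```typescript
// Hypothetical sketch of the "init"/"app" split; names are illustrative only.
type AppModule = { launch: (link: string) => void };
type AppLoader = () => Promise<AppModule>;

function createBridgeInit(findBridgeLinks: () => string[], loadApp: AppLoader) {
  // Cheap check that runs on every page view.
  const links = findBridgeLinks();
  // The heavy "app" module is requested at most once, and only on demand.
  let appPromise: Promise<AppModule> | null = null;

  return {
    hasLinks: links.length > 0,
    // Called when the user clicks a bridge-enabled link.
    async open(link: string): Promise<void> {
      if (appPromise === null) {
        appPromise = loadApp();
      }
      const app = await appPromise;
      app.launch(link);
    },
  };
}
```

The point of the pattern is that pages without bridge-enabled links only pay for the small check, while the cost of the “app” module is deferred until a user actually asks for it.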

b) Combining API requests as much as possible (code)

c) Load items from Special:EntityData rather than via the API (amendment T240223)

There is a trade-off here. The output of Special:EntityData is cached in Varnish, which reduces load on the app servers; on the other hand, the wbgetentities API makes it possible to limit the amount of data one receives (e.g. “I only need statements, not labels or sitelinks”). See also the related bullet point of the “weak areas” section. (Note, however, that wbgetentities currently does not support limiting the data to only the statements for a certain property.)
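
To make the trade-off concrete, the two request shapes can be compared side by side. This is only a sketch: the helper names are hypothetical and the repo base URL is an assumption (the beta environment uses a different host), though the `props` parameter of wbgetentities and the `Special:EntityData/<id>.json` path are the regular public forms.

```typescript
// Assumed repo base URL, for illustration only.
const REPO = 'https://www.wikidata.org';

// Cacheable in Varnish, but always returns the whole entity.
function entityDataUrl(entityId: string): string {
  return `${REPO}/wiki/Special:EntityData/${encodeURIComponent(entityId)}.json`;
}

// Not cached the same way, but "props" can narrow the response (e.g. to claims only).
function wbgetentitiesUrl(entityId: string, props: string[]): string {
  const query = [
    ['action', 'wbgetentities'],
    ['ids', entityId],
    ['props', props.join('|')],
    ['format', 'json'],
  ]
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join('&');
  return `${REPO}/w/api.php?${query}`;
}
```

For example, `wbgetentitiesUrl('Q42', ['claims'])` asks for statements only, whereas `entityDataUrl('Q42')` has no equivalent way to narrow the response.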

What are likely to be the weak areas (e.g. bottlenecks) of the code in terms of performance?

A thin wrapper (dist/data-bridge.init.js) is needed to check whether there is a bridge-enabled link on the page; it runs on the client side on every article impression.
There is a time delay between the page becoming interactive and our init code being called. That code attaches listeners to bridge-enabled links that disable the default behavior, i.e. the link to Wikidata. If the user clicks a bridge-enabled link during this delay, expecting to see the bridge app, they are instead sent to Wikidata.
When “app” is dispatched, it loads the entire entity from Wikidata, which might be quite large, but uses only the statements of a single property.
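
The last point can be illustrated with a simplified entity shape. The types below are a reduced, hypothetical model of the entity JSON (real statements carry far more fields), and `statementsFor` is an illustrative helper, not a function from the Bridge code.

```typescript
// Simplified, hypothetical model of an entity as returned by the repo.
interface Statement {
  id: string;
}

interface Entity {
  id: string;
  labels?: Record<string, { language: string; value: string }>;
  sitelinks?: Record<string, { site: string; title: string }>;
  claims: Record<string, Statement[]>;
}

// Everything except claims[propertyId] is transferred but never used.
function statementsFor(entity: Entity, propertyId: string): Statement[] {
  return entity.claims[propertyId] ?? [];
}
```

With an entity that has statements for many properties, only one entry of `claims` is read, which is why the full download shows up as a weak area.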

Are there potential optimisations that haven't been performed yet?

Reducing the size of the dist/data-bridge.common.js file/module that is loaded if a bridge-enabled link is detected on the page T228857
- One possible idea would be to use uglify.js in the build step (mediawiki/extensions/Wikibase/+/571482)
  - Optional: allow debug=true (cf.) for unminified version
- Apply tree shaking
- Externalize more dependencies shared with other micro frontends (e.g. vue-class-component - also used in termbox)
Reducing the data actually transmitted when saving by sending only the statement(s) for the actually edited property instead of all statements for all properties T230343
Using svgo via cssnano to further decrease the size of our assets T234070
Enable storing ResourceLoader modules on Firefox LocalStorage again once the next gen LocalStorage is enabled and stable in Firefox mediawiki/core/+/544183
Reuse vue ResourceLoader module from MediaWiki instead of shipping our own copy (T247519)
Reuse vuex ResourceLoader module from MediaWiki instead of shipping our own copy (T250264)
Explore differential loading to further reduce payload sizes for modern browsers (again, the difference is in the build step, not the implementation)
Determine and integrate a performance budget into project CI
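
A performance budget check in CI could look roughly like the sketch below. It assumes the build step can report file sizes in bytes; the function name, the budget format and the numbers in the usage note are all hypothetical and not taken from the project.

```typescript
// Hypothetical budget format: one size limit per built file.
interface BudgetEntry {
  file: string;
  maxBytes: number;
}

// Returns a list of human-readable violations; an empty list means the build is within budget.
function checkBudget(sizes: Record<string, number>, budget: BudgetEntry[]): string[] {
  const violations: string[] = [];
  for (const { file, maxBytes } of budget) {
    const actual = sizes[file];
    if (actual === undefined) {
      violations.push(`${file}: missing from build output`);
    } else if (actual > maxBytes) {
      violations.push(`${file}: ${actual} bytes exceeds budget of ${maxBytes} bytes`);
    }
  }
  return violations;
}
```

A CI job would call this with the measured sizes of, say, dist/data-bridge.init.js and dist/data-bridge.common.js, and fail the build when the returned list is not empty.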

Please list which performance measurements are in place for the feature and/or what you've measured ad-hoc so far.

There are two performance related measurements in place so far, both for the performance of the initialization step:
- The time from the MediaWiki performance mark mwStartup to the link listeners being attached and the bridge being ready to open. (This relates to the user unintentionally following the link to Wikidata if they click too fast, as mentioned in the “weak areas” section.)
- The time from the user clicking on a bridge-enabled link to the bridge app actually opening for them.
- Both can be seen in beta at: https://grafana-labs-admin.wikimedia.org/d/000000020/wikidata-and-base-on-labs?refresh=5m&orgId=1&from=now-7d&to=now
Calls to Special:EntityData (used by Data Bridge to load the entity being edited) should be virtually unaffected – we do not expect Bridge requests to make up a significant volume of these, compared to the many requests that are already being made.
Calls to wbgetentities (used by Data Bridge to load the label of the property being edited, in content language) should likewise be virtually unaffected.
Wikidata edits are also not expected to change significantly, seeing as most of the edits come from various automated and semi-automated systems (including bulk editing tools on behalf of editors).
We offer a shy CTA for the user to edit on the local wiki under some circumstances (T235753), but do not expect this to have a measurable impact on the number of edits performed.
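
The first initialization measurement described above boils down to subtracting the start time of a performance mark from the current time. A minimal sketch, assuming the marks are passed in as plain objects (`MarkLike` and `durationSinceMark` are hypothetical names, not the ones used in the Bridge code):

```typescript
// Minimal shape of a performance mark entry; browsers provide richer objects.
interface MarkLike {
  name: string;
  startTime: number;
}

// Milliseconds elapsed between the named mark and "now", or null if the mark was never set.
function durationSinceMark(marks: MarkLike[], markName: string, now: number): number | null {
  const mark = marks.find((m) => m.name === markName);
  return mark ? now - mark.startTime : null;
}
```

In a browser, the inputs would typically come from `performance.getEntriesByType('mark')` and `performance.now()`, with `'mwStartup'` as the mark name, evaluated at the moment the link listeners have been attached.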

Related Objects

Status	Assigned	Task
Resolved	aaron	T246456 Performance review of Wikidata Bridge
Invalid	None	T248840 Don’t polyfill Promise in Bridge modules
Resolved	Lucas_Werkmeister_WMDE	T250264 Bridge: Use "vuex" ResourceLoader module from core

Event Timeline

darthmon_wmde created this task. Feb 28 2020, 4:22 PM

Restricted Application added a subscriber: Aklapper. Feb 28 2020, 4:22 PM

Aklapper added a project: Wikidata-Bridge. Mar 1 2020, 5:41 PM

Restricted Application added a project: Wikidata. Mar 1 2020, 5:41 PM

Lydia_Pintscher merged a task: T234847: Step 1: Performance review for Wikidata Bridge. Mar 1 2020, 5:43 PM

Lydia_Pintscher added subscribers: Krinkle, Addshore, Lydia_Pintscher.

Pablo-WMDE updated the task description. Mar 12 2020, 8:06 PM

Pablo-WMDE updated the task description. Mar 12 2020, 8:08 PM

Pablo-WMDE mentioned this in T247519: Bridge: Use "vue" ResourceLoader module from core. Mar 13 2020, 8:37 AM

Gilles assigned this task to aaron. Mar 30 2020, 7:49 PM

Gilles moved this task from Inbox, needs triage to Doing (old) on the Performance-Team board. Mar 30 2020, 7:52 PM

Jakob_WMDE subscribed. Apr 1 2020, 11:45 AM

Michael mentioned this in T232584: Step 1: Production deployment checklist. Apr 1 2020, 1:45 PM

Krinkle added a subtask: T248840: Don’t polyfill Promise in Bridge modules. Apr 15 2020, 12:22 AM

Lucas_Werkmeister_WMDE added a subtask: T250264: Bridge: Use "vuex" ResourceLoader module from core. Apr 15 2020, 12:02 PM

Lucas_Werkmeister_WMDE closed subtask T250264: Bridge: Use "vuex" ResourceLoader module from core as Resolved. Apr 21 2020, 2:21 PM

Pablo-WMDE updated the task description. Apr 22 2020, 8:04 AM

Pablo-WMDE updated the task description. Apr 22 2020, 8:08 AM

hey @aaron, @Gilles,

could you, please, give us an update on this task? could you also tell us something we could tackle proactively that may help?

We are currently implementing the last stories and that information would help us enormously to shape our next steps.

Thanks a lot in advance!

In T246456#6116523, @darthmon_wmde wrote:

hey @aaron, @Gilles,

could you, please, give us an update on this task? could you also tell us something we could tackle proactively that may help?

We are currently implementing the last stories and that information would help us enormously to shape our next steps.

Thanks a lot in advance!

Sorry, I was reading through the documentation before a 1 week vacation, and then getting distracted by debugging T249069 and coordinating on T236414 and T250205 related issues. Getting back to this is my next priority after fixing T249069.

In T246456#6118897, @aaron wrote:

In T246456#6116523, @darthmon_wmde wrote:

hey @aaron, @Gilles,

could you, please, give us an update on this task? could you also tell us something we could tackle proactively that may help?

We are currently implementing the last stories and that information would help us enormously to shape our next steps.

Thanks a lot in advance!

Sorry, I was reading through the documentation before a 1 week vacation, and then getting distracted by debugging T249069 and coordinating on T236414 and T250205 related issues. Getting back to this is my next priority after fixing T249069.

Understood. Thanks a lot!

darthmon_wmde mentioned this in T249039: Security Readiness Review For Wikidata Bridge. May 12 2020, 10:56 AM

Lucas_Werkmeister_WMDE changed the status of subtask T248840: Don’t polyfill Promise in Bridge modules from Open to Stalled. May 26 2020, 9:15 AM

I've been looking at this from time to time, and haven't found any real problems yet. Some of the things I'm looking out for are:

Pageview critical path effects:
- Bytes (JS)
- Bytes (CSS+images)
- Page load delay
- First input delay
- Content reflows
Post-load pageview interaction effects:
- Hover delays
- Input delays
Backend effects
- DB I/O usage
- DB contention issues
- Search index store usage
- Key/value store usage
- Cache usage

I'll be looking at the frontend some more this weekend, but I expect to sign off as "LGTM" on Monday.

From the perspective of popular/major articles, likely to have infoboxes, the extra 42.1 KB for loading the "app" JS doesn't seem crazy. I've looked through the code several times and it seems reasonable. Testing with fast/slow 3G doesn't reveal obnoxious reflows or delays either. Having the edit link go directly to a Q<X> page when the JS hasn't fully loaded felt somewhat jarring, though I don't imagine that happening often. I don't see much editing at all given how discreet the icon is (a good thing).

The bootstrapping "init" JS is also pretty tiny. It does have a fair number of module references in the using() call for pages with editable elements. OTOH, those seem to be loaded anyway, with the "app" being the only new thing triggered. The DOM search for editable entity links is just a simple CSS selector call with reasonable metadata extraction. I don't see (nor did I perceive) any CPU use or long task issue there.

The client <=> api.php layer looks reasonable and well abstracted. Given the low-key nature of the GUI, I don't foresee any obvious edit rate, DB overhead, nor contention issues.

I don't see any reason to block the wikidata-bridge deployment and consider this task resolved from my end.

In T246456#6183741, @aaron wrote:

From the perspective of popular/major articles, likely to have infoboxes, the extra 42.1 KB for loading the "app" JS doesn't seem crazy. I've looked through the code several times and it seems reasonable. Testing with fast/slow 3G doesn't reveal obnoxious reflows or delays either. Having the edit link go directly to a Q<X> page when the JS hasn't fully loaded felt somewhat jarring, though I don't imagine that happening often. I don't see much editing at all given how discreet the icon is (a good thing).

The bootstrapping "init" JS is also pretty tiny. It does have a fair number of module references in the using() call for pages with editable elements. OTOH, those seem to be loaded anyway, with the "app" being the only new thing triggered. The DOM search for editable entity links is just a simple CSS selector call with reasonable metadata extraction. I don't see (nor did I perceive) any CPU use or long task issue there.

The client <=> api.php layer looks reasonable and well abstracted. Given the low-key nature of the GUI, I don't foresee any obvious edit rate, DB overhead, nor contention issues.

I don't see any reason to block the wikidata-bridge deployment and consider this task resolved from my end.

That sounds awesome! thanks a lot for the assessment :)

Gilles closed this task as Resolved. Jun 2 2020, 1:41 PM

Lucas_Werkmeister_WMDE changed the status of subtask T248840: Don’t polyfill Promise in Bridge modules from Stalled to Open. Jun 12 2020, 11:37 AM

Michael closed subtask T248840: Don’t polyfill Promise in Bridge modules as Invalid. Jan 18 2022, 12:01 PM