Page MenuHomePhabricator

Performance review of Wikistories
Open, MediumPublic

Description

Description

The Wikistories MediaWiki extension introduces visual stories as a new contribution format in a new namespace. A story is a list of frames, each with an image, text, references, and metadata. They are stored as page/revisions using a new StoryContent/StoryContentHandler.

Stories are created, curated, and viewed using Vuejs apps with minimal no-js fallbacks.

Preview environment

On the beta cluster

Which code to review

The extension repo is https://gerrit.wikimedia.org/r/q/project:mediawiki%252Fextensions%252FWikistories

Performance assessment

Please initiate the performance assessment by answering the below:

  • What work has been done to ensure the best possible performance of the feature?
    • Tried to keep the 'discover' module as light as possible since it runs on article views. Building HTML using jquery instead of loading Vuejs. Load the 'viewer' module dynamically later.
    • Cache the rendered stories for each article in one bundle so the rest API return as quickly as possible.
  • What are likely to be the weak areas (e.g. bottlenecks) of the code in terms of performance?
    • RL modules breakdown and content.
    • Interaction with the Commons image search API
  • Are there potential optimisations that haven't been performed yet?
    • We plan on caching the image search results and article content in the Story Builder so they don't reload every time the user navigates to those pages.
    • We talked about a smarter cache invalidation approach when a new story is associated to an article but we don't have any plan about it.
  • Please list which performance measurements are in place for the feature and/or what you've measured ad-hoc so far. If you are unsure what to measure, ask the Performance Team for advice: performance-team@wikimedia.org.
    • There's no measurement in place but we are quite concerned about the time it takes to load the stories thumbnails below the title on article pages. We would welcome ideas to track that time and hopefully speed it up.

Event Timeline

SBisson edited projects, added Wikistories (MVP); removed Wikistories.

Hi @Krinkle and Performance team, the Wikistories extension is ready for performance review. We hope to deploy to idwiki in June. Having at least an initial performance sanity check before that would be greatly appreciated.

Hello @Krinkle and performance team.. following up on this request.

Let us discuss it in our next team meeting happening the coming week.

There's no measurement in place but we are quite concerned about the time it takes to load the stories thumbnails below the title on article pages. We would welcome ideas to track that time and hopefully speed it up.

For browser that supports it (at the moment Chrome/Edge) you can use the Element Timing API. If you can add an attribute to the element you want to measure then we can pick that up in the future in the navigation timing extension (that we use to measure metrics from real users). That attribute will also make our synthetic tools measure it automatically (when something painted on screen) but we need to add a URL/user journey for our tools where we test Wikistories. That way you can see the diff between the article paint and the thumbnails and take decision if its slow or not.

When you view a story I would check that the Google Web Vitals are ok, things like Cumulative layout shift and Largest contentful paint, just to make sure it do not increase those metrics.

For browser that supports it (at the moment Chrome/Edge) you can use the Element Timing API. If you can add an attribute to the element you want to measure then we can pick that up in the future in the navigation timing extension (that we use to measure metrics from real users). That attribute will also make our synthetic tools measure it automatically (when something painted on screen) but we need to add a URL/user journey for our tools where we test Wikistories. That way you can see the diff between the article paint and the thumbnails and take decision if its slow or not.

This sounds like a very good option. If I understand correctly, we'll add elementtiming="<something>" attributes to the elements we really care about (there's 1 or 2 atm) and minimum configuration on your side (URL/user journey) and we'll get those metrics from production. We'll do that soon. We are currently reworking our loading sequence to make sure there is no layout shift in any case. That's a red flag we saw on the performance dev tools.

@Peter / @Krinkle we're currently reworking the loading process of stories on article pages and we don't really like any of the solutions we looked at. Wondering if you have any recommendations.

Basically, we show thumbnails for the stories attached to an article just below the title. (See screenshot below)

localhost_8080_w_index.php_title=Boat&wikistories=1.png (898×650 px, 96 KB)

This is how it's currently done:

  1. The BeforePageDisplayMobile adds the wikistories.discover module when needed
  2. The discover module fetches the stories from the API
  3. With the API response, the discover module adds the stories, along with a "create" button, to the page just below the article title. This causes the rest of the page to shift down.

We have a patch to change the logic in the following way:

  1. The BeforePageDisplayMobile adds the wikistories.discover module when needed
  2. The BeforePageDisplayMobile adds the wikistories.discover-styles module when needed. This module uses a margin-bottom on the main h1 to reserve 96px for the stories section.
  3. The rest is more or less the same: when the API returns, the discover module replaces the margin-bottom with the stories and there is no content shift.

We considered including the stories to the article page on the server but we decided not to for the following reasons:

  1. We couldn't find a hook to add exactly below the title. One can be created but it is extra work.
  2. It would slow down the initial rendering of the article content.
  3. It would require us to invalidate the cached article more often to account for stories changing.

Note: This is (mostly) not yet a performance review. I'm getting a sense of how it works and sharing some questions and general impressions along the way.

Enabling the feature

It seems the instructions are no longer correct as the example URL in the task descriptions (with wikistories=1) does not result in a page view with Wikistories loaded on it.

Looking through some recent commits on Gerrit, I find change 810350 which, while not the change that changes how this works, it does mention a beta feature. I assume this means it is now implicitly logged-in only.

Unfortunately, after logging-in from the m-dot domain for MobileFrontend, one ends up on a Session error page on a canonical/desktop URL. Manually changing back to m-dot does however lead to a logged-in view.

Unfortunately, MobileFrontend, despite being deployed and enabled by default on mobile for nearly ten years now, still doesn't feature as much as a link to Special:Preferences for logged-in (at least not as configured in Prod and Beta Cluster for enwiki). And even its non-standard "Mobile Settings" page doesn't extend or refer to it.

Changing the beta preference from desktop, and then switching to back to m-dot https://en.m.wikipedia.beta.wmflabs.org/wiki/Apollo_12 does lead to a page with wikistories enabled.

Bundle registration

I'm trying to locate the module loader instruction, but on view-source there are no matches for wikistories besides in mw.config and mw.user.options. Iterating through all ext.* module names from the console, I found no related registrations. After checking extension.json in the repository, I noticed the module is named mw.ext.story, thus under the bundle prefix for MediaWiki core instead of extensions, and not containing the extension name.

NOTE: I suggest renaming the bundles to ext.wikistories.* for consistency and improved discovery.

This will help with automated integrations and metric reporting in Grafana, and various other workflows and automations. E.g. habits like the one I describe above. Or for example using Quick Open in a text editor and navigating to ext.wikistories … quickly from anywhere else in the production code base.

This could be a minor oversight or typo of course, but it is my understanding that this is the team's first production extension. As such, I'll mention that the following pages are recommended reading material, if you haven't already! Consider fitting the code to any recommendations and general practices described here. We consider these guidelines (as opposed to policies), which means basically that locally optimising something to differ from the guideline in a few cases is generally fine — if it directly improves UX or local code quality.

This last guideline expects three or fewer bundles for most new extensions. Looking at the repository's resources/ directory, I see four module-like subdirectories, which looks like it's in pretty good shape. Presumably:

  • One stylesheet for the subset of styles needed on pageviews that have wikistories, for server-rendered interface.
  • One minimal JS bundle for interactivity on those page views.
  • One for everything else, such as any functionality on the Story and Special pages.

Looking at extension.json though, I see there are actually 6 bundles registered.

Quick review
  1. mw.ext.story.viewer-nojs:

To estimate the size of this one one, I loaded it from load.php?module=:name&only=styles. However starting from the mw.ext.wikistories.viewer-nojs/ subdirectory this leads to an Unknown module error as the actual name is spelled in yet another way.

In any event, it contains about 0.4KB of raw CSS, which when batched with other extensions' CSS modules compresses down to 0.2KB or less (e.g measure difference in size of a batch request with and without |:name at the end). Registering something as a separate bundle is an optimisation ("bundle splitting") to consider in balance with the bigger picture (added site-wide cost, and potentially block every kind of interaction on loading missing code vs loading slightly more in advance and having it cached).

This particular one is used for the Story: pages I believe. I wasn't able to navigate to such page directly from either the pageview or builder interface. Perhaps there's a way, but this probably means it's not a Tier 1 destination but on par with the builder or less common than that. Perhaps it can be combined with the discovery or builder styles into a single ext.wikistories.styles module to load in all three contexts, or as part of ext.wikistories.builder.styles to then load in both Story page and Builder context, both of which presumably have plenty of budget left for a small increase in both directions.

  1. mw.ext.story.viewer

I understand this is a currently-needed fourth module so as to defer the Vue and components payload until interaction, lazy-loaded from the discovery module. I wonder whether the discover module would still be needed in addition to this if we had something like T183720. In that case, the server-rendered elements would have a declarative attribute (taken care of by MW core) that will load Vue and the ext.wikistories.viewer module if one of the elements is clicked.

  1. mw.ext.story.builder.styles

This module might be an easy one to consolidate with another. If I understand correctly, it is only relevant on e.g. Special:StoryBuilder/Apollo_12 and contains one line of code, to toggle an element based on .client-js. That one line can perhaps be merged into the same destination as mw.ext.story.viewer-nojs.

  1. mw.ext.story.builder

LGTM.

  1. mw.ext.story.discover

It appears wgTitle is used incorrectly. To construct mw.Title and thus avoid misinterpreted values on drafts and other pages outside the main namespace, use wgPageName instead. (docs).

The existence of convertUrlToThumbnail() looks suspicious. Afaik there are not new kinds of thumbnails in the Wikistories backend that mw.util.parseImageUrl() couldn't handle. If you found a bug, please report it and let me know so others can improve it and/or help use it correctly in case it supports it already.

Looking at wgWikistoriesCreateUrl — presumably the reason the JS isn't using mw.Title to construct this URL from scratch, is because we haven't yet given mw.Title the ability to format localised special page URLs without incurring a redirect. Nicely done to avoid that redirect :)

Perhaps we can avoid the payload in mw.config, though, by bundling the name of that special page and still use mw.Title to format the URL to its subpage.

Looking at consumptionEvents.js — I'm confused by new mw.Uri().host. I think you might be looking for the location.host built-in. No need to parse location.href generally speaking as the location object itself is essentially an instance of new URL(location.href). Perhaps there's something subtle here that I'm missed, though. Looking forward to find out :-)

In the same file, the use of mw.eventLog looks unstable. I see no mechanism or declaration to ensure this module has been loaded and executed before your code executes. The EventLogging extension is meant to be dependency-free, lightweight and no-op by default. Thus you can and should depend on it explicitly if you need to call its methods. Either via extension.json or via mw.loader.using(). For production, we load it on all page views by default and it's optimised to be small and require zero configuration for third-party wikis and dev wikis. We generally don't bother with extra conditional code or runtime lazy-load code, and instead add EventLogging among dependencies to ensure our code executes it in the right order. We can then safely call its interface.

In the same file:

	activity_session_id: mw.eventLog.id.getSessionId(),
	pageview_id: mw.user.getPageviewToken(),
	page_title: mw.config.get( 'wgTitle' )

This is a lot of information to log and may be problematic in terms of privacy. Afaik it's unprecedented at WMF, and unlikely to be needed for a concrete product question, to associate literal article names that people read with an identifier such as a browser session. Especially considering other instrumentions share those same identifiers. Likewise, we probably shouldn't be logging the pageview token in association with the page title, as that kind of defeats the purpose of that token.

While I may be wrong, this gives the impression of "log everything and sort it out later". My understanding of our transparancy and privacy are such that we generally lead with a question first (made public via Phab or in the commit message), prior to shipping instrumentation. This also derisks impact of data leak, and increases chances of statistically correct data use.

The const $discover = getDiscoverSection().insertAfter( '.page-heading' ); statement looks unsafe (dom-ready might not have fired yet), as well as suboptimal in terms of perf as its doing notable DOM manipulation during the main JS response batch where all core and extension module exports are being resolved. Given the code is loaded async and may run after first paint, I assume this can be deferred further, both to ensure faster module resolution, but also to ensure that the target actually exists by then. In other words, wait for dom-ready first. In the common case, "waiting" for dom-ready will not noticably delay this code, it only re-orders it such that we first resolve all module exports, and then after optional breathing room to ensure device responsiveness, immediately the already-resolved dom-ready callbacks run. This also reduces chance of duplicate DOM and paint costs by avoiding read-after-write before the browser would have "naturally" incurred these costs outside the critical path.

In getStories.js, data is transformed from an internal REST API, called from one place, to add id properties. This .id property appears unused (ran a basic search, I might've missed it?). If needed, perhaps it could be served in the needed format directly by the REST API to avoid running the extra transformation (and shipping the code for performing that transformation).

As a general tip, If non-Vue code feels complex, it can sometimes be simpler to write HTML as a string literal (with e.g. mw.html to escape as-needed), rather than using jQuery.

  1. mw.ext.story.discover.styles

[…]

  • We couldn't find a hook to add exactly below the title. One can be created but it is extra work.

There might be prior art from Wikivoyage that we can leverage, e.g. check https://en.wikivoyage.org/wiki/Belzoni and https://en.m.wikivoyage.org/wiki/Belzoni, and how it modifies the Minerva page header.

Alternatively, if it can be below the page action tabs, there are plenty of ways to insert something server-side atop page content. Inserting there might fit better in the overall MediaWiki model, as these stories are associated with the page view, and e.g. would not remain when on the talk page or history action, hence part of the "tab".

See also https://www.mediawiki.org/wiki/Help:Page_status_indicators for a similar concept we have already, for small badges associated with the content and displayed atop the page.

  • It would slow down the initial rendering of the article content.

The Wikistories REST API accesses the information from an indexed database table with Memcache in front. This looks good! I believe that would generally respond < 1ms in production. Beyond that there is only string concatenation, which is immeasurable. I expect no significant impact from adding this to a backend page view hook. Even more so with CDN caching in front, if we get to unregistered users.

  • It would require us to invalidate the cached article more often to account for stories changing.

If the story pages are attached to articles with MCR and connected by templatelinks, then this will happen automatically when the Story content model is internally edited. This would also have the benefit of naturally accomodating abuse and quality control by manifesting the addition and removal of such Story-connections with a watchlist/RC/pagehistory directly on the article where it visually appears.

Lack of this integration and notification played a major role in the initial (negative) reception of Wikidata on Wikipedia — until such integration was developed.

Alternatively, to keep the connection the other way around as it is now, you'd only need to call Title->invalidateCache() from StoryContentHandler::getSecondaryDataUpdates.

So long as invalidations only happen when humans (or bots) make substantial changes to a Story, the CDN purge rate won't be a problem. We already process thousands of purges per second, and the CDN generally doesn't hold on to even popular pages for more than 24 hours, so invalidating often is no problem.

I'll also mention that we have Mustache available for easy templating on the server-side.

[…]

  • It would require us to invalidate the cached article more often to account for stories changing.

If the story pages are attached to articles with MCR and connected by templatelinks, then this will happen automatically when the Story content model is internally edited. This would also have the benefit of naturally accomodating abuse and quality control by manifesting the addition and removal of such Story-connections with a watchlist/RC/pagehistory directly on the article where it visually appears.

Lack of this integration and notification played a major role in the initial (negative) reception of Wikidata on Wikipedia — until such integration was developed.

A few additional findings around this:

  • The images in question do not show up in "What links here". E.g. on the file page in Beta Cluster, it says this file is not used anywhere, but it is in fact used on the Story page and on the article page that displays the story. Without this, we will likely suffer issues from images being deleted as appearing to be unused, as well as decreased confidence/trust in link APIs being reliable.
  • A step further, we have GlobalUsage which propagates that data across wikis to the central Commons repository for further monitoring and automation, in particular CommonsDelinker and things like "universal replace" which take care of removing or updating thumbnails and items in galleries as-needed after copyright issues are identified.
  • Similarly, the story page itself is also currently reported as orphaned page which with association from the article might be avoided. This also ensures you can't have stories linking to a page that has since been deleted, as deleting a page would then naturally remove its outgoing links. And, if stories may be attached to multiple articles, this mechanism can help the community to identify orphaned/abandoned stories.
Krinkle triaged this task as Medium priority.Jul 7 2022, 12:32 AM

Change 812926 had a related patch set uploaded (by Sbisson; author: Sbisson):

[mediawiki/extensions/Wikistories@master] Discover: use wgPageName instead of wgTitle

https://gerrit.wikimedia.org/r/812926

Change 812930 had a related patch set uploaded (by Sbisson; author: Sbisson):

[mediawiki/extensions/Wikistories@master] Instrumentation: use location.host instead of new mw.Uri().host

https://gerrit.wikimedia.org/r/812930

Change 812931 had a related patch set uploaded (by Sbisson; author: Sbisson):

[mediawiki/extensions/Wikistories@master] Register explicit dependency to ext.eventLogging

https://gerrit.wikimedia.org/r/812931

Change 812930 merged by jenkins-bot:

[mediawiki/extensions/Wikistories@master] Instrumentation: use location.host instead of new mw.Uri().host

https://gerrit.wikimedia.org/r/812930

Enabling the feature

[...]

The task description was indeed out of date. I've updated it. We've decided to hide it behind a beta feature at the last minute to make the first release a bit more quiet. We are aware of the awkwardness of enabling the beta feature in desktop mode then using the extension in mobile mode but since this will be done as part of a well-guided editing campaign it was deemed acceptable.

Bundle registration

[...]

NOTE: I suggest renaming the bundles to ext.wikistories.* for consistency and improved discovery.

We'll rename the modules. See T312802: Rename RL modules to ext.wikistories.*

This last guideline expects three or fewer bundles for most new extensions. Looking at the repository's resources/ directory, I see four module-like subdirectories, which looks like it's in pretty good shape. Presumably:

  • One stylesheet for the subset of styles needed on pageviews that have wikistories, for server-rendered interface.
  • One minimal JS bundle for interactivity on those page views.

This is actually 2 modules: "discover" and "viewer". The goal is to make "discover" load and execute as fast as possible and let "viewer" load later with its vuejs dependency. If you don't think it makes a real difference I'm happy to combine them.

  • One for everything else, such as any functionality on the Story and Special pages.

Looking at extension.json though, I see there are actually 6 bundles registered.

Quick review
  1. mw.ext.story.viewer-nojs:

To estimate the size of this one one, I loaded it from load.php?module=:name&only=styles. However starting from the mw.ext.wikistories.viewer-nojs/ subdirectory this leads to an Unknown module error as the actual name is spelled in yet another way.

In any event, it contains about 0.4KB of raw CSS, which when batched with other extensions' CSS modules compresses down to 0.2KB or less (e.g measure difference in size of a batch request with and without |:name at the end). Registering something as a separate bundle is an optimisation ("bundle splitting") to consider in balance with the bigger picture (added site-wide cost, and potentially block every kind of interaction on loading missing code vs loading slightly more in advance and having it cached).

This particular one is used for the Story: pages I believe. I wasn't able to navigate to such page directly from either the pageview or builder interface. Perhaps there's a way, but this probably means it's not a Tier 1 destination but on par with the builder or less common than that. Perhaps it can be combined with the discovery or builder styles into a single ext.wikistories.styles module to load in all three contexts, or as part of ext.wikistories.builder.styles to then load in both Story page and Builder context, both of which presumably have plenty of budget left for a small increase in both directions.

OK, I see what you mean. It feels counterintuitive to us to combine resources that are never going to be used on the same page but by considering the hidden (to us) cost of over splitting then it makes sense. However, as the features grow in all these places, will it become a problem to load all of it? How would we know?

The Story pages are certainly not a tier 1 destination. They exist for the purpose of providing a stable home for the stories and show in various feeds (Log, RecentChanges, Contributions, etc). They provide a fallback no-js editing and have an associated talk page. Stories association with articles is transient (top 10 recent) and will change over time as we explore different features.

  1. mw.ext.story.viewer

I understand this is a currently-needed fourth module so as to defer the Vue and components payload until interaction, lazy-loaded from the discovery module. I wonder whether the discover module would still be needed in addition to this if we had something like T183720. In that case, the server-rendered elements would have a declarative attribute (taken care of by MW core) that will load Vue and the ext.wikistories.viewer module if one of the elements is clicked.

  1. mw.ext.story.builder.styles

This module might be an easy one to consolidate with another. If I understand correctly, it is only relevant on e.g. Special:StoryBuilder/Apollo_12 and contains one line of code, to toggle an element based on .client-js. That one line can perhaps be merged into the same destination as mw.ext.story.viewer-nojs.

Indeed.

Would a little bit of inline style on the special page be an option instead of this module or is that worst?

  1. mw.ext.story.builder

LGTM.

  1. mw.ext.story.discover

It appears wgTitle is used incorrectly. To construct mw.Title and thus avoid misinterpreted values on drafts and other pages outside the main namespace, use wgPageName instead. (docs).

Thanks. We'll change that.

The existence of convertUrlToThumbnail() looks suspicious. Afaik there are not new kinds of thumbnails in the Wikistories backend that mw.util.parseImageUrl() couldn't handle. If you found a bug, please report it and let me know so others can improve it and/or help use it correctly in case it supports it already.

According to code comment, this is to handle URLs without the last part with the size (/XXpx-). I'm trying to find which article had a meta og:image with this format.

Looking at wgWikistoriesCreateUrl — presumably the reason the JS isn't using mw.Title to construct this URL from scratch, is because we haven't yet given mw.Title the ability to format localised special page URLs without incurring a redirect. Nicely done to avoid that redirect :)

Perhaps we can avoid the payload in mw.config, though, by bundling the name of that special page and still use mw.Title to format the URL to its subpage.

Sure we can pass in the localized special page name instead of the re-constructed url. how would we specify it if not via mw.config?

Looking at consumptionEvents.js — I'm confused by new mw.Uri().host. I think you might be looking for the location.host built-in. No need to parse location.href generally speaking as the location object itself is essentially an instance of new URL(location.href). Perhaps there's something subtle here that I'm missed, though. Looking forward to find out :-)

You're right: location.host is the same but simpler. I'll change it.

In the same file, the use of mw.eventLog looks unstable. I see no mechanism or declaration to ensure this module has been loaded and executed before your code executes. The EventLogging extension is meant to be dependency-free, lightweight and no-op by default. Thus you can and should depend on it explicitly if you need to call its methods. Either via extension.json or via mw.loader.using(). For production, we load it on all page views by default and it's optimised to be small and require zero configuration for third-party wikis and dev wikis. We generally don't bother with extra conditional code or runtime lazy-load code, and instead add EventLogging among dependencies to ensure our code executes it in the right order. We can then safely call its interface.

It exits early if mw.eventLog is not defined but I see what you mean, we could be missing events if it's simply not loaded yet. I'll add it as a dependency for the modules using it.

In the same file:

	activity_session_id: mw.eventLog.id.getSessionId(),
	pageview_id: mw.user.getPageviewToken(),
	page_title: mw.config.get( 'wgTitle' )

This is a lot of information to log and may be problematic in terms of privacy. Afaik it's unprecedented at WMF, and unlikely to be needed for a concrete product question, to associate literal article names that people read with an identifier such as a browser session. Especially considering other instrumentions share those same identifiers. Likewise, we probably shouldn't be logging the pageview token in association with the page title, as that kind of defeats the purpose of that token.

While I may be wrong, this gives the impression of "log everything and sort it out later". My understanding of our transparancy and privacy are such that we generally lead with a question first (made public via Phab or in the commit message), prior to shipping instrumentation. This also derisks impact of data leak, and increases chances of statistically correct data use.

I believe that's specified in the instrumentation plan written by @nshahquinn-wmf and approved by legal. These are valid concerns but I would appreciate if the conversation can happen somewhere else.

The const $discover = getDiscoverSection().insertAfter( '.page-heading' ); statement looks unsafe (dom-ready might not have fired yet), as well as suboptimal in terms of perf as its doing notable DOM manipulation during the main JS response batch where all core and extension module exports are being resolved. Given the code is loaded async and may run after first paint, I assume this can be deferred further, both to ensure faster module resolution, but also to ensure that the target actually exists by then. In other words, wait for dom-ready first. In the common case, "waiting" for dom-ready will not noticably delay this code, it only re-orders it such that we first resolve all module exports, and then after optional breathing room to ensure device responsiveness, immediately the already-resolved dom-ready callbacks run. This also reduces chance of duplicate DOM and paint costs by avoiding read-after-write before the browser would have "naturally" incurred these costs outside the critical path.

Do I understand correctly that most things in the "discover" module (load stories via rest api, start loading the "viewer" module) can happen like they do now but adding content to the DOM should done in a dom-ready callback?

In getStories.js, data is transformed from an internal REST API, called from one place, to add id properties. This .id property appears unused (ran a basic search, I might've missed it?). If needed, perhaps it could be served in the needed format directly by the REST API to avoid running the extra transformation (and shipping the code for performing that transformation).

That's a relic of the previous code design. We removed most of the use for it by moving the logic from the Viewer component to a store but it looks like it is still being used in a few places. I have filed T312804: Stop assigning story frame id client-side but I don't know when we'll get to it.

As a general tip, If non-Vue code feels complex, it can sometimes be simpler to write HTML as a string literal (with e.g. mw.html to escape as-needed), rather than using jQuery.

I'll look into that.

  1. mw.ext.story.discover.styles

[…]

  • We couldn't find a hook to add exactly below the title. One can be created but it is extra work.

There might be prior art from Wikivoyage that we can leverage, e.g. check https://en.wikivoyage.org/wiki/Belzoni and https://en.m.wikivoyage.org/wiki/Belzoni, and how it modifies the Minerva page header.

I'll look into that.

Alternatively, if it can be below the page action tabs, there are plenty of ways to insert something server-side atop page content. Inserting there might fit better in the overall MediaWiki model, as these stories are associated with the page view, and e.g. would not remain when on the talk page or history action, hence part of the "tab".

I'll bring it up with our designer and product manager.

See also https://www.mediawiki.org/wiki/Help:Page_status_indicators for a similar concept we have already, for small badges associated with the content and displayed atop the page.

  • It would slow down the initial rendering of the article content.

The Wikistories REST API accesses the information from an indexed database table with Memcache in front. This looks good! I believe that would generally respond < 1ms in production. Beyond that there is only string concatenation, which is immeasurable. I expect no significant impact from adding this to a backend page view hook. Even more so with CDN caching in front, if we get to unregistered users.

We expect to launch to all users and remove the beta feature in weeks or months. We'll seriously try to find a way to include them on the server before then.

  • It would require us to invalidate the cached article more often to account for stories changing.

If the story pages are attached to articles with MCR and connected by templatelinks, then this will happen automatically when the Story content model is internally edited. This would also have the benefit of naturally accomodating abuse and quality control by manifesting the addition and removal of such Story-connections with a watchlist/RC/pagehistory directly on the article where it visually appears.

We don't know what the technical approach will be but we are talking about making the stories inclusion explicit on the article pages for the purpose of moderating which stories show where.

Lack of this integration and notification played a major role in the initial (negative) reception of Wikidata on Wikipedia — until such integration was developed.

Alternatively, to keep the connection the other way around as it is now, you'd only need to call Title->invalidateCache() from StoryContentHandler::getSecondaryDataUpdates.

Got it.

So long as invalidations only happen when humans (or bots) make substantial changes to a Story, the CDN purge rate won't be a problem. We already process thousands of purges per second, and the CDN generally doesn't hold on to even popular pages for more than 24 hours, so invalidating often is no problem.

I'll also mention that we have Mustache available for easy templating on the server-side.

[…]

  • The images in question do not show up in "What links here". E.g. on the file page in Beta Cluster, it says this file is not used anywhere, but it is in fact used on the Story page and on the article page that displays the story. Without this, we will likely suffer issues from images being deleted as appearing to be unused, as well as decreased confidence/trust in link APIs being reliable.

The StoryBuilder on the beta cluster is setup to use production Commons because beta commons was just unusable for demo purposes. However, I think you can put filename from beta commons using the no-js edit form and that should work since both prod and beta commons are configure as foreign repos in beta.

  • A step further, we have GlobalUsage which propagates that data across wikis to the central Commons repository for further monitoring and automation, in particular CommonsDelinker and things like "universal replace" which take care of removing or updating thumbnails and items in galleries as-needed after copyright issues are identified.

I was trying to make that work with $parserOutput->addImage( $file ); in StoryContentHandler::fillParserOutput to register file dependencies. If you go to this file: https://commons.wikimedia.org/wiki/File:Jakarta_Cathedral_Afternoon.JPG and scroll down to the usages on idwiki, you'll find a story there.

  • Similarly, the story page itself is also currently reported as orphaned page which with association from the article might be avoided. This also ensures you can't have stories linking to a page that has since been deleted, as deleting a page would then naturally remove its outgoing links. And, if stories may be attached to multiple articles, this mechanism can help the community to identify orphaned/abandoned stories.

I see. We don't have a solution to this at the moment.

[…]
Looking at wgWikistoriesCreateUrl — […]

Perhaps we can avoid the payload in mw.config, though, by bundling the name of that special page and still use mw.Title to format the URL to its subpage.

Sure we can pass in the localized special page name instead of the re-constructed url. how would we specify it if not via mw.config?

With Package files bundling, you can add a virtual data.json payload on the server with a key-value to read with require() on the client. This is similar to how the current value is generated server-side for mw.config, but from a static (cached) load.php callback, rather than an OutputPage hook. Something like:

@ class Hooks {
    public function onBeforePageDisplay( .. ) {
-     $out->addJsConfigVars( [ 'wgWikistoriesCreateUrl' => SpecialPage::getTitleFor( 'StoryBuilder', $title )->getLinkURL() ] );
    }
    public function getDiscoverBundleData( .. ): array {
+     return [ 'storyBuilderPage' => SpecialPage::getTitleFor( 'StoryBuilder' )->getPrefixedText() ];
    }

@ Discover.js
- var storyBuilderPage = mw.config.get( 'wgWikistoriesCreateUrl' );
+ var storyBuilderPage = require( './data.json' ).storyBuilderPage;
  var url = mw.Title.newFromText( storyBuilderPage + '/' + pageName ).getUrl();

Or more minimalistic, by using a pure TitleValue and the unprefixed getText to skip server-side namespace localisation and skip client-side title prefix parsing, as the client already knows to localise namespaces (we only lack the special page name today):

@ Hooks.php
  public function getDiscoverBundleData( .. ): array {
    return [ 'storyBuilder' => SpecialPage::getTitleValueFor( 'StoryBuilder' )->getText() ];
  }

@ Discover.js
  var storyBuilder = require( './data.json' ).storyBuilder;
  var url = mw.Title.newFromText( 'Special:' + storyBuilder + '/' + pageName ).getUrl();
  //idwiki> "/wiki/Istimewa:Story/Some_page_name" 

Package files example at https://www.mediawiki.org/wiki/ResourceLoader/Package_files#Generated_content.

In the same file:

	activity_session_id: mw.eventLog.id.getSessionId(),
	pageview_id: mw.user.getPageviewToken(),
	page_title: mw.config.get( 'wgTitle' )

This is a lot of information to log and may be problematic in terms of privacy. Afaik it's unprecedented at WMF, and unlikely to be needed for a concrete product question, to associate literal article names that people read with an identifier such as a browser session. Especially considering other instrumentions share those same identifiers. Likewise, we probably shouldn't be logging the pageview token in association with the page title, as that kind of defeats the purpose of that token.

While I may be wrong, this gives the impression of "log everything and sort it out later". My understanding of our transparancy and privacy are such that we generally lead with a question first (made public via Phab or in the commit message), prior to shipping instrumentation. This also derisks impact of data leak, and increases chances of statistically correct data use.

I just got back from a three-month sabbatical, so it's a little difficult to remember my full thought process off the top of my head. However, I think I included page title so that we can see whether there are meaningful differences in story performance by quality and topic, and it would be difficult to know those dimensions accurately just by capturing those things up front. This is mainly needed when the project is at an early stage and only deployed at a few wikis, so I anticipate removing it from the instrumentation later on.

However, I think you make a good point and I should revisit this decision. I've filed T312890 for that and will pick that up after Inspiration Week.

This would also have the benefit of naturally accomodating abuse and quality control by manifesting the addition and removal of such Story-connections with a watchlist/RC/pagehistory directly on the article where it visually appears.

Lack of this integration and notification played a major role in the initial (negative) reception of Wikidata on Wikipedia — until such integration was developed.

@SBisson this is a good point—I agree surfacing story changes to watchers of the pages where they're attached is essential for community quality control. I think I saw somewhere that someone at Indonesian Wikipedia had already raised this point.

Change 812931 merged by jenkins-bot:

[mediawiki/extensions/Wikistories@master] Register explicit dependency to ext.eventLogging

https://gerrit.wikimedia.org/r/812931

Change 812926 merged by jenkins-bot:

[mediawiki/extensions/Wikistories@master] Discover: use wgPageName instead of wgTitle

https://gerrit.wikimedia.org/r/812926

Krinkle edited projects, added Performance-Team (Radar); removed Performance-Team.