
Expose page view info to Lua
Open, Needs Triage · Public · Feature

Description

Feature summary (what you would like to be able to do and where):

  • we should provide a Lua function to get page view info for a page
  • the function should probably be considered "expensive"
  • I specifically do not think we should support a magic word (see T298170)

Use case(s) (list the steps that you performed to discover that problem, and describe the actual underlying problem which you want to solve. Do not describe only a solution):

  • Replacing the Graph extension's page view graphs on talk pages with a generic template, e.g. via {{chart}}
  • Anything else users put their mind to (perhaps comparing pages)

Event Timeline

One issue here is that $wgPageViewInfoWikimediaRequestLimit (currently 5) might not make sense, because multiple pages could be rendered in a single request (especially in the job queue).
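To make the concern above concrete, here is a minimal sketch of how a per-request counter behaves when one request renders several pages. It is not the extension's code: `RequestLimiter` and `render_pages` are hypothetical names, and the only value taken from this task is the limit of 5.

```python
class RequestLimiter:
    """Counts backend lookups within one request (hypothetical helper)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def try_acquire(self) -> bool:
        # Refuse once the shared budget for this request is spent.
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


def render_pages(pages: dict[str, int], limiter: RequestLimiter) -> dict[str, int]:
    """Return how many of the wanted lookups each page was actually granted."""
    granted = {}
    for title, wanted in pages.items():
        granted[title] = sum(1 for _ in range(wanted) if limiter.try_acquire())
    return granted


# One job renders three pages in a single request; each wants 3 lookups.
job = {"Talk:A": 3, "Talk:B": 3, "Talk:C": 3}

# Shared budget: later pages are starved by earlier ones.
shared = render_pages(job, RequestLimiter(limit=5))

# Fresh budget per page: every page gets what it asked for.
per_page = {
    title: render_pages({title: wanted}, RequestLimiter(limit=5))[title]
    for title, wanted in job.items()
}
```

With a shared budget, the outcome for a page depends on which other pages happened to be rendered earlier in the same request, which is the part that "might not make sense" for job queue renders.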

Change #1199495 had a related patch set uploaded (by SD0001; author: SD0001):

[mediawiki/extensions/PageViewInfo@master] Expose last 30 days page view data to Lua

https://gerrit.wikimedia.org/r/1199495

Regarding "last 30 days": it would be useful to have an option to show a page views graph for at least the last year.

Change #1199736 had a related patch set uploaded (by SD0001; author: SD0001):

[integration/config@master] Zuul: [mediawiki/extensions/PageViewInfo] Add Scribunto phan dependency

https://gerrit.wikimedia.org/r/1199736

Change #1199736 merged by jenkins-bot:

[integration/config@master] Zuul: [mediawiki/extensions/PageViewInfo] Add Scribunto phan dependency

https://gerrit.wikimedia.org/r/1199736

Why get the parser involved? Why not load from the pageviews API on the client side?
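For reference, a sketch of the request a client would make against the public per-article pageviews endpoint. The URL layout follows the Wikimedia Analytics REST API as I understand it; `per_article_url` and its defaults are hypothetical and are not taken from the patch. The function only builds the URL and performs no network call.

```python
from datetime import date, timedelta
from urllib.parse import quote

BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(
    project: str,
    title: str,
    end: date,
    days: int = 30,
    access: str = "all-access",
    agent: str = "user",
) -> str:
    """Build a daily per-article pageviews URL covering `days` days ending at `end`."""
    start = end - timedelta(days=days - 1)
    # Titles use underscores for spaces and must be fully percent-encoded.
    article = quote(title.replace(" ", "_"), safe="")
    return (
        f"{BASE}/{project}/{access}/{agent}/{article}"
        f"/daily/{start:%Y%m%d}/{end:%Y%m%d}"
    )


url = per_article_url("en.wikipedia.org", "Talk:Albert Einstein", date(2025, 1, 30))
```

Passing `days=365` would cover the "last year" case requested earlier in the thread, assuming the backend accepts a range of that length.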


So that the data can be used alongside other parser capabilities, like Scribunto's SVG support to generate graphs. It could also be used with Charts, which I now see has a ticket for this use-case: T393500, which would be solved by this patch, since it notes:

Current plan is to remedy this by exposing the page views data to Lua code which can fill out the data set in a transform, either:

-> use ExternalData extension with allow-listing of rest API targets
-> something like ExternalData but tuned to our specific needs


Can't you use Apache ECharts directly? What value is Charts adding? What can you do in Lua that you can't do in JavaScript? How does the value of Charts and/or Scribunto outweigh the performance cost of continually purging the cache of every talk page?

Tim and I had a talk about some possibilities here. The main reason we need access to the data server-side is for pre-rendering the chart to SVG when client-side JS isn't available (disabled, old client, or we're actually in a mobile app). Here's what we're recommending together to support this kind of use case without becoming a parser choke point:

  • refactor Charts to indirect the HTML/SVG payload into a REST API endpoint, loaded via an <iframe>
  • now, instead of having to invalidate an entire wiki page when the underlying data expires, only that SVG rendering expires, and it can be re-rendered on demand, which is much cheaper
  • the iframe can be quite minimal in its JS payload, loading only Apache ECharts and surrounding render code.
  • the iframe can also be restricted for security for defense in depth

In this scenario, a {{#chart:}} could either take the page view data in via a Lua transform, or we could consider a separate way to reference page view data or similar data from the chart setup JSON. In either case, the actual fetch of data wouldn't have to happen during the wiki page render; it would happen during the chart iframe render.
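The cache-identity argument can be sketched with two independent caches. This is only an illustration under assumptions: `TtlCache`, the TTL values, and the `/chart/...` path are hypothetical and do not describe how Charts or the parser cache are implemented.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TtlCache:
    """Tiny time-to-live cache that counts how often it has to rebuild."""

    ttl: float
    build: Callable[[str], str]
    entries: dict = field(default_factory=dict)
    builds: int = 0

    def get(self, key: str, now: float) -> str:
        hit = self.entries.get(key)
        if hit is None or now - hit[0] >= self.ttl:
            self.builds += 1
            hit = (now, self.build(key))
            self.entries[key] = hit
        return hit[1]


DAY = 86400.0

# The page HTML only embeds a reference to the chart, so it can stay cached.
pages = TtlCache(ttl=30 * DAY, build=lambda t: f'<iframe src="/chart/{t}"></iframe>')
# The chart payload depends on page view data, so it expires daily.
charts = TtlCache(ttl=DAY, build=lambda t: f"<svg><!-- views for {t} --></svg>")

# Simulate one view per day for a week.
for day in range(7):
    now = day * DAY
    pages.get("Talk:Example", now)
    charts.get("Talk:Example", now)
```

Over the simulated week the page is parsed once while the chart is rendered seven times, so the expensive full-page parse no longer tracks the freshness of the data.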

If folks are also thinking of things like generating (sanitized) SVG directly via Lua, we'd probably want a similar infrastructure for that for the same reasons of cache identity for the item vs the surrounding page. So we might want to think briefly about how to generalize this just enough to get it done right with a little help in core, a little help in parser/content land, and a little help in Reader Growth where we have maintenance of Charts.
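As a rough idea of what server-side pre-rendering to SVG involves, here is a minimal sketch that turns a list of daily view counts into a polyline. It is written in Python rather than Lua, `sparkline_svg` is a hypothetical name, and it is not how Charts or Scribunto produce SVG; real output would also need to pass the wiki's sanitizer.

```python
from xml.sax.saxutils import escape


def sparkline_svg(views: list[int], title: str, width: int = 300, height: int = 60) -> str:
    """Render daily view counts as a single SVG polyline."""
    if not views:
        raise ValueError("views must not be empty")
    peak = max(views) or 1  # avoid division by zero for all-zero data
    step = width / max(len(views) - 1, 1)
    # SVG's y axis points down, so larger counts get smaller y values.
    points = " ".join(
        f"{i * step:.1f},{height - v / peak * height:.1f}"
        for i, v in enumerate(views)
    )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {width} {height}">'
        f"<title>{escape(title)}</title>"
        f'<polyline fill="none" stroke="currentColor" points="{points}"/>'
        "</svg>"
    )


svg = sparkline_svg([0, 50, 100], "Views: A & B", width=100, height=10)
```

Because the markup is a pure function of the data, it can be cached and expired on its own schedule, independently of the page that embeds it.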

I know folks want to rush ahead and get stuff done, but I'd rather we do it right and make it sustainable!