
Architecture: We should track size of default gadgets loaded on site and present this to users
Open, Needs Triage, Public

Description

Currently anyone with the correct rights can edit [[MediaWiki:Gadgets-definition]] to turn gadgets on for anonymous users. Left unchecked, this has led to performance problems and potential SEO problems (related to slow loading) on many of our smaller sites, leading to T340705.
We should be doing more to guide site admins on when they are introducing performance/SEO problems to the site.

In MediaWiki we've been investing a lot of effort in limiting the amount of CSS and JS we ship to our end users, but any improvements here are meaningless if the same is not happening within gadgets.

I think we should enforce a performance budget for all gadgets that have been marked as "default".

To do this we need

  • Some way to calculate how many kB of default gadget code are shipped to anonymous users
  • A number that represents an upper bound for this code.
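As a rough illustration of the first item, here is a minimal Python sketch that sums the gzipped size of every page belonging to a gadget marked "default". It assumes definition lines of the form `* name[option|option]|page.js|page.css`, and it takes page sources from a hypothetical `page_sources` dict instead of fetching them. Compressing each page separately is a simplification; what ResourceLoader actually delivers may differ.

```python
import gzip
import re

# Assumed line shape: "* name[option|option|...]|page1.js|page2.css"
GADGET_LINE = re.compile(r"^\*\s*([^\[\|\s]+)\s*(?:\[([^\]]*)\])?\s*\|(.*)$")


def default_gadget_pages(definition_text):
    """Map each gadget marked "default" to the list of pages it loads."""
    result = {}
    for line in definition_text.splitlines():
        match = GADGET_LINE.match(line.strip())
        if not match:
            continue  # section headings, blank lines, anything else
        name, options, pages = match.groups()
        flags = [opt.strip() for opt in (options or "").split("|")]
        if "default" in flags:
            result[name] = [p.strip() for p in pages.split("|") if p.strip()]
    return result


def gzipped_kb(source):
    """Size of one page after gzip, in kB."""
    return len(gzip.compress(source.encode("utf-8"))) / 1024


def default_payload_kb(definition_text, page_sources):
    """Total gzipped kB of all default gadget pages.

    page_sources is a hypothetical dict of page name -> source text;
    pages missing from it are skipped.
    """
    total = 0.0
    for pages in default_gadget_pages(definition_text).values():
        for page in pages:
            if page in page_sources:
                total += gzipped_kb(page_sources[page])
    return total
```

The second item, the upper bound itself, is a policy decision and is deliberately not part of this sketch.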

Future usages

Once we have such a number this could then be checked and either:

  • Printed on the MediaWiki:Gadgets-definition page itself as a warning box (with guidelines on what counts as a large or small amount of code)
  • Used to send Echo notifications to site admins when
  • Used to disable all (or some of) the default gadgets when the "budget" has been surpassed.
  • If it's possible to check this quickly on edit, we might refuse to allow edits to the page that introduce gadgets that exceed the performance budget.
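The warning-box option could be sketched roughly as follows in Python. The function names, the 80% warning threshold and the message wording are all assumptions made for illustration; the task proposes no concrete budget value.

```python
def budget_status(payload_kb, budget_kb, warn_fraction=0.8):
    """Classify a payload against a budget as "ok", "warning" or "over".

    warn_fraction is a hypothetical threshold: at or above this share of
    the budget, a warning is shown even though the budget is not exceeded.
    """
    if budget_kb <= 0:
        raise ValueError("budget_kb must be positive")
    if payload_kb > budget_kb:
        return "over"
    if payload_kb >= warn_fraction * budget_kb:
        return "warning"
    return "ok"


def warning_box_text(payload_kb, budget_kb):
    """Return text for a warning box, or None when no warning is needed."""
    if budget_status(payload_kb, budget_kb) == "ok":
        return None
    percent = round(100 * payload_kb / budget_kb)
    return (
        f"Default gadgets ship {payload_kb:.1f} kB to anonymous users, "
        f"{percent}% of the {budget_kb:.1f} kB budget."
    )
```

This covers only the informational use. Disabling gadgets or refusing edits would need a separate decision and is not modelled here.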

Event Timeline

If it's possible to check this quickly on edit

I guarantee it's not. Case in point: https://en.wikipedia.org/w/index.php?title=MediaWiki:Gadget-CommentsInLocalTime.js&oldid=872865591

If the amount of gadgets really slows down a project, the community will typically start investigating it themselves. By enforcing some arbitrary budget, you're setting yourself up for a massive community backlash and dumpster fire.

Somehow visualizing the performance impact of gadgets is a great idea (pie chart?? everybody likes pie!), enforcing a budget is not.

If any particular project is actually draining resources in an extremely disproportionate manner you should kindly ask that community to please investigate. If they don't have the know-how or resources to do so, assign a paid developer for a while to help them optimize their code.

This task does not look like a task requesting support from CommRel-Specialists-Support

This task does not look like a task requesting support from CommRel-Specialists-Support

The developers are in for a world of hurt if CommRel doesn't get involved in this.

@Quiddity, what do you think?

The developers are in for a world of hurt if CommRel doesn't get involved in this.

I think you're mischaracterizing this task. It's only to know and provide info on, not to do anything listed in "Future uses". Some of the items in that list likely have some communications desirability and working through with communities, but being able to provide numbers to users seems like a reasonable thing to do without any particularly adverse impact.

The developers are in for a world of hurt if CommRel doesn't get involved in this.

I think you're mischaracterizing this task. It's only to know and provide info on, not to do anything listed in "Future uses". Some of the items in that list likely have some communications desirability and working through with communities, but being able to provide numbers to users seems like a reasonable thing to do without any particularly adverse impact.

The two empty checkboxes/goals for this task are preceded by this text:

I think we should enforce a performance budget for all gadgets that have been marked as "default".
To do this we need

This is above the "future uses" section. Providing these numbers has only one very specific goal: to be able to enforce a performance budget. So don't expect nice pie charts or anything optimized to be human-readable, the only goal here is to provide a singular number to make a budget enforceable. My prediction: once this task is completed, sunk cost will dictate the performance budget must be implemented. Or if this ultimately doesn't happen, whoever wasted their time on the code to provide this singular number will be sour.

But please do tell me where I'm wrong. Providing human-readable data aimed at finding low hanging fruit would be great. Calculating a number to enforce a performance budget, not so much.

[snip]

My opinion is that the only things that decide the completion of this task are the little boxes themselves, neither of which is to enforce the actual limit. Especially with the separate section indicating future effort. I don't see it as sunk cost at all to be able to alert interface admins and others that they have done a bad job (for the unregistered user) of ensuring a minimum of resources ends up at the other end of the series of pipes. As in, I think there is merit alone in knowing that I'm sending a lot of other Stuff to the end reader. (Obviously, this task doesn't care about the registered user, who mostly gets to opt in to whatever downloads he wants.)

But I'm sure Jon can clarify what the point of this task is, and that you probably should wait for him to get to work tomorrow morning to do so. :)

The developers are in for a world of hurt if CommRel doesn't get involved in this.

If an engineering team wanted support from CommRel-Specialists, that would require creating a dedicated subtask by filling out the CommRel-Specialists form for that. Let's not overload this engineering task; adding team project tags won't get the team involved.

The key bit of information in this task description is We should be doing more to guide site admins on when they are introducing performance/SEO problems to the site. (key word "guide" not prevent :-))

To echo what @Izno says above my expectation with this task was to share the idea that it would be useful if a measurement/visualization/guidance was in place as a starting point. I've been talking to various communities relating to T340705 and most communities didn't even know they had a problem and would seemingly benefit from this information being presented to them so they can find solutions (FWIW English Wikipedia does an amazing job at managing performance of its gadgets, but most projects do not have the manpower and expertise that English Wikipedia does).

@AlexisJazz it seems like the idea of a performance "budget" feels threatening to you, so I'd be keen to understand this more. Reading into this a little, perhaps you've had negative experiences with budgets, and I'd like to understand those better. The number in my experience is really just an artifact for a social contract meant to generate conversation and find solutions, rather than impede things. FWIW WMF does have a basic performance budget in place for skins, but it is very forgiving and can be redefined within reason at any time. For example, we have been running over our budget for 2 weeks now due to a recent deployment, and that has prompted some really helpful discussion: T345414.

I should note, for now this is just an idea. The gadgets extension currently does not have any maintainers or a strategy (T171577), so implementation (which would not be trivial) very much depends on a team at WMF taking this on and thinking more deeply about our strategy here. I'm just trying to capture some of the issues we currently have in this ecosystem (T344062 being another).

The key bit of information in this task description is We should be doing more to guide site admins on when they are introducing performance/SEO problems to the site. (key word "guide" not prevent :-))

To me, the task description as a whole very much reads like "we must police this wild west", not "we should provide the community with the data they need to make informed decisions". It's talking about refusing edits by interface admins and randomly disabling gadgets!

To echo what @Izno says above my expectation with this task was to share the idea that it would be useful if a measurement/visualization/guidance was in place as a starting point. I've been talking to various communities relating to T340705 and most communities didn't even know they had a problem and would seemingly benefit from this information being presented to them so they can find solutions.

Knowledge is useful, we agree on that. However, thought ought to be given to the best approach. If sometime in the future it is concluded that another approach is better, for example a service on ToolForge that provides pie charts and the like, the work for this task to generate a raw number could become obsolete.

Personally I don't see a singular raw number that is aimed at enabling automated warnings and/or sanctions as a great help. And given how (even if it's warn-only at first) it could evolve into something that ends up sabotaging communities, I see much potential for a shining example of "the road to hell is paved with good intentions".

I actually do believe that for a commercial enterprise that provides wiki hosting this might be a useful tool. They can't collaborate with their customers and poor SEO may actually impact them. I could see how this might benefit for example Fandom.

While Wikimedia and Fandom share a founder, they are not the same. On Wikimedia, sanctions are a last resort. Informing and collaborating are the tools of choice.

As such, I see considerable potential value in providing detailed human-readable stats on ToolForge. Those could also be linked in an edit notice on gadgets-definition. Or even an abusefilter-warning (but no deny) whenever someone adds a default gadget, to make interface admins aware that there's a cost.
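A warn-only check of that kind could be sketched in Python as below: compare the old and new text of the definitions page and list gadgets that have newly become default. The line format `* name[option|option]|page.js` is an assumption, the function names are hypothetical, and nothing here blocks an edit.

```python
import re

# Assumed line shape: "* name[option|option|...]|page.js"
DEFAULT_GADGET = re.compile(r"^\*\s*([^\[\|\s]+)\s*\[([^\]]*)\]")


def default_gadgets(definition_text):
    """Return the set of gadget names whose options include "default"."""
    names = set()
    for line in definition_text.splitlines():
        match = DEFAULT_GADGET.match(line.strip())
        if not match:
            continue
        options = [opt.strip() for opt in match.group(2).split("|")]
        if "default" in options:
            names.add(match.group(1))
    return names


def newly_default(old_text, new_text):
    """Gadgets that are default after the edit but were not before it."""
    return sorted(default_gadgets(new_text) - default_gadgets(old_text))
```

A non-empty result would be the cue to show the notice and link to the stats; the edit itself would still go through.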

@AlexisJazz it seems like the idea of a performance "budget" feels threatening to you, so I'd be keen to understand this more. Reading into this a little, perhaps you've had negative experiences with budgets, and I'd like to understand those better.

Part of it is the power imbalance. It sometimes feels like the WMF doesn't think about the impact of something on the community. The WMF can legally do whatever it wants, but in the end the WMF and the communities need each other.

And I'll be darned if I don't at least say something. The only thing necessary for evil to win is for good men to do nothing. (and that includes evil with good intentions)

And specifically on performance budgets, there's actually an example of that as well. Rate limit is at 90 edits per minute. No doubt the developers had the best of intentions, but to cite Fæ:

At this point nobody seems to actually understand the impact of this untested global major change that had zero consultation with the community and was done in secret discussions for secret reasons by unknown participants

At least this task isn't secret, but it's not well advertised either.

@AlexisJazz: Please stay on-topic. Random past board or ban stuff has nothing to do with guiding site admins about technical issues. This is not a forum. Thanks!

@AlexisJazz: Please stay on-topic. Random past board or ban stuff has nothing to do with guiding site admins about technical issues. This is not a forum. Thanks!

If you think my examples are "random" that's too bad. They weren't. I'd explain why, but don't count on that being appreciated.

@Jdlrobson, would you consider T346115 to be a possible alternative/substitute for this task for Wikimedia? Or do you think it wouldn't be enough? One advantage it has is that it could be done without gadgets extension maintainers, so maybe it could happen sooner.

@AlexisJazz I can't do much about any perceived opinion of the WMF, so I agree with AKlapper that it seems unrelated. I was more interested in keeping the discussion technical and discussing experiences with performance budgets rather than getting sucked into a conversation about the WMF/volunteer relationship. I'm well aware there are challenges here and am trying to do my bit to help change that. Performance budgets can be controversial too, and I know people who've had bad experiences with poorly defined ones, so if we did go down this route it's important we define it correctly.

I think T346115 is a great idea, and it sounds like it would be very useful, but my worry would be around discoverability (is every project expected to know about it and update interface messages to link to it?). There are tradeoffs between having budget guidelines as a core part of the software and having them as a separate tool. I don't see any reason why the two tasks can't coexist. I'll comment on your ticket.

Small comment: if this becomes a thing, this should also become a thing for WMF developers, since (from the looks of it for me) most of the JS payload on Wikimedia websites comes not from gadgets, but from various default JS modules loaded without any possibility to opt-out. Focusing on gadgets then becomes trying to push the blame onto the gadget developers even though they are not the main contributors to JS payload.

Small comment: if this becomes a thing, this should also become a thing for WMF developers, since (from the looks of it for me) most of the JS payload on Wikimedia websites comes not from gadgets, but from various default JS modules loaded without any possibility to opt-out. Focusing on gadgets then becomes trying to push the blame onto the gadget developers even though they are not the main contributors to JS payload.

It's already a thing for WMF developers: https://phabricator.wikimedia.org/T360590

It's already a thing for WMF developers: https://phabricator.wikimedia.org/T360590

Are there any metrics showing what contributes the most to the JS payload? I tried to look into that issue on Russian Wikipedia as a result of the T340705: [performance budgeting] Improve JS payload for projects with gadgets that lead to a 30%+ increase after gzip task, but we ended up with the impression that default gadgets contribute far less to the JS size than the default modules do.
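For reference, the "30%+ increase after gzip" criterion and a per-source breakdown come down to simple arithmetic. The Python sketch below uses hypothetical function names and source labels, and it assumes the gzipped byte counts have already been measured by some other means.

```python
def payload_increase_percent(base_bytes, gadget_bytes):
    """Percentage by which default gadgets increase the base JS payload.

    Both arguments are gzipped sizes in bytes.
    """
    if base_bytes <= 0:
        raise ValueError("base_bytes must be positive")
    return 100.0 * gadget_bytes / base_bytes


def payload_shares(sizes):
    """Share of the total payload per source, in percent, largest first.

    sizes is a hypothetical dict such as
    {"default modules": ..., "default gadgets": ...} with gzipped bytes.
    """
    total = sum(sizes.values())
    if total <= 0:
        raise ValueError("total payload must be positive")
    ranked = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
    return {name: 100.0 * size / total for name, size in ranked}
```

With figures like these per project, it would be possible to see whether gadgets or default modules dominate the payload before deciding where a budget should apply.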