CentralNotice should catch banner errors and log them to its own channel
Open, High, Public, Feature

Description

Feature summary (what you would like to be able to do and where):
The web team frequently gets alerts for errors caused by newly deployed banners. To save the web team from investigating these error spikes unnecessarily, it would be helpful for banner errors to be logged to a separate channel.

Use case(s) (list the steps that you performed to discover that problem, and describe the actual underlying problem which you want to solve. Do not describe only a solution):

  • I received an email alert today.
  • I dropped everything to investigate.
  • From several stack traces I realized the cause was a banner.

Benefits (why should this be implemented?):

  • It's low effort to do this.
  • The web team doesn't unnecessarily get notified about potential errors in software.
  • The fundraising tech team does not get pinged about banners that do not relate to fundraising campaigns.
  • The fundraising team can monitor errors themselves without needing the web team.

Event Timeline

Change #1016759 had a related patch set uploaded (by Jdlrobson; author: Jdlrobson):

[mediawiki/extensions/CentralNotice@master] Log central notice banner errors to their own channel

https://gerrit.wikimedia.org/r/1016759

@Pcoombe looks like a duplicate to me. Feel free to merge the two descriptions as you see fit. Patch above shows what this entails.

Is it possible to log the banner name as well? That would be really helpful

Yes. I replied on the Gerrit patch with details on how to do it - but you'll need to do this on my behalf :)

Jdlrobson added a subscriber: AKanji-WMF.

@AKanji-WMF could somebody in fundraising tech please review the associated patch? I see you moved it into "Blocked or not fr-tech" but I am waiting for fundraising tech to give guidance on next steps.

It's important for me that we get this fixed to clarify team responsibilities around what to do with broken banners.

@AKanji-WMF I would be more comfortable with someone from Fundraising Tech reviewing the actual patch, since I don't tend to touch CentralNotice internals and know nothing about our error logging.

@Jdlrobson You say banner errors would go to "their own channel" - where is that, how would we view them? Would it just be a page in logstash.wikimedia.org?

Once the patch is merged and deployed, the errors will automatically show up in Logstash at https://logstash.wikimedia.org/app/dashboards#/view/AXDBY8Qhh3Uj6x1zCF56?_g=h@218e8f9&_a=h@8f07421 (query error_context.component:"error.centralnotice"). You would then be able to create your own dashboards based on the data logged there without any additional setup.
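
For anyone following along who hasn't looked at the patch, here is a rough sketch of the general idea (catch whatever a banner throws, then log it under a dedicated channel together with the banner name). This is not the code from the Gerrit change. `ErrorSink`, `MemorySink`, `runBanner` and the banner names are hypothetical, made up for illustration, and using "error.centralnotice" as the channel string is an assumption based on the Logstash query above.

```typescript
// Illustrative sketch only. These names are hypothetical and are NOT taken
// from the CentralNotice patch.
type LogEntry = {
  channel: string;
  message: string;
  context: Record<string, string>;
};

interface ErrorSink {
  log(entry: LogEntry): void;
}

// Assumption: channel name copied from the Logstash query in this thread.
const BANNER_CHANNEL = "error.centralnotice";

// Runs banner code, catching any error and sending it to the banner channel
// with the banner name attached. Returns true if the banner ran cleanly.
function runBanner(
  bannerName: string,
  bannerCode: () => void,
  sink: ErrorSink
): boolean {
  try {
    bannerCode();
    return true;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    sink.log({
      channel: BANNER_CHANNEL,
      message,
      context: { banner: bannerName },
    });
    return false;
  }
}

// In-memory sink standing in for the real logging backend.
class MemorySink implements ErrorSink {
  entries: LogEntry[] = [];
  log(entry: LogEntry): void {
    this.entries.push(entry);
  }
}

const sink = new MemorySink();
const okResult = runBanner("working_banner", () => {}, sink);
const failedResult = runBanner(
  "broken_banner",
  () => {
    throw new Error("boom");
  },
  sink
);
```

The point of keeping the banner name in the log context is that whoever watches the channel can filter by banner without reading stack traces first.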

I am talking to Sam about this on Thursday - to work out the correct technical approach here.