Fri, Nov 8
Thu, Nov 7
@Krinkle T180051 IMHO implies a different solution. That task, as well as speeding up Kibana, would be accomplished with the work intended here. The last comment from @Eevans lines up with the intent of this task.
Wed, Nov 6
If the statsd-exporter sidecar approach is appropriate for ORES, there are quite a few metrics with unclear type and meaning. I've constructed a tree to assist us in defining them.
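As a rough illustration of the kind of tree mentioned above, here is a minimal sketch that groups dotted statsd-style metric names into a nested structure so each branch can be reviewed for type and meaning. The function names and the sample metric names in the usage note are hypothetical, not taken from ORES.

```python
def build_metric_tree(names):
    """Group dotted metric names into a nested dict keyed by name component."""
    tree = {}
    for name in names:
        node = tree
        for part in name.split("."):
            # Descend into the child for this component, creating it if missing.
            node = node.setdefault(part, {})
    return tree


def render_tree(tree, indent=0):
    """Return the tree as a list of indented lines, sorted at each level."""
    lines = []
    for key in sorted(tree):
        lines.append("  " * indent + key)
        lines.extend(render_tree(tree[key], indent + 1))
    return lines
```

For example, `print("\n".join(render_tree(build_metric_tree(["svc.requests.count", "svc.requests.latency"]))))` prints one indented line per name component, with both leaves under a shared `svc` / `requests` branch.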
Wed, Oct 30
Tue, Oct 29
It looks like reCAPTCHA support was built in recently and is available in buster: https://bugs.launchpad.net/mailman/+bug/1774826
Mon, Oct 28
Fri, Oct 25
Historically, outbound queue monitoring has been noisy. One idea for less noisy outbound monitoring is to take the queue depth and estimate how long it will take to send that queue based on the average send time.
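The estimate described above can be sketched as follows. This is only an illustration: the function names, the `parallel_senders` parameter, and the alert threshold are assumptions, not part of any existing check.

```python
def estimate_drain_seconds(queue_depth, avg_send_seconds, parallel_senders=1):
    """Estimate how long the current outbound queue would take to send.

    parallel_senders is a hypothetical knob for concurrent deliveries;
    leave it at 1 to model strictly sequential sending.
    """
    if queue_depth < 0 or avg_send_seconds < 0:
        raise ValueError("queue_depth and avg_send_seconds must be non-negative")
    if parallel_senders < 1:
        raise ValueError("parallel_senders must be at least 1")
    return queue_depth * avg_send_seconds / parallel_senders


def should_alert(queue_depth, avg_send_seconds, max_drain_seconds, parallel_senders=1):
    """Alert only when the estimated drain time exceeds the allowed threshold."""
    drain = estimate_drain_seconds(queue_depth, avg_send_seconds, parallel_senders)
    return drain > max_drain_seconds
```

Alerting on estimated drain time rather than raw depth means a deep queue that is being sent quickly stays quiet, while a shallow queue that is being sent slowly can still trigger.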
@mepps The list has been created and the password emailed to you. You may need to share it with your co-admin(s). The admin interface can be found here: https://lists.wikimedia.org/mailman/admin/sustainability/
The necessary changes have been deployed. Please let me know if you encounter any related issue.
Thu, Oct 24
This issue has been mitigated as of this morning (UTC), and I can confirm I am no longer seeing long delays in list email.
Wed, Oct 23
Account deployed, and I checked that Grant is in the wmf LDAP group. Please let me know if you encounter any related issue.
It looks like your email address in wikitech has not been updated to your staff address. Would you please correct this so we can proceed?
The place to change it is here: https://wikitech.wikimedia.org/wiki/Special:Preferences
Looks like a short disconnection. Icinga has shown everything OK for the past 7 hours.
This issue is easily replicated by requesting pages 7 and 14. It first returns a 500 and then a 429.
Tue, Oct 22
I've been monitoring this for the past couple of days. Since yesterday we've gone from over 20k messages in the queue to fewer than 6k. The backlog seems to be coming from a particular provider's rate limiting. Samples from my inbox indicate delays both when Google relays the message to the list and between the list server and the outbound mail relay. The combined effect produces the delay metric you're seeing.
Of interest: all have user agent FortiGate (FortiOS 5.0) and have appeared nearly simultaneously from a number of sources globally, starting 2019/10/21 at 0900 UTC.
Mon, Oct 21
Oct 3 2019
Sep 23 2019
There are a few options to consider.
- Mitigation in MediaWiki -- Force all logging to be UTF-8 compliant.
- Mitigation in the logging pipeline
  - Rsyslog with mmutf8fix
  - Deploy logstash-filter-mutate plugin >= 3.5.0 (appears to be backportable into the current Logstash version, 5.6.15)
  - Upgrade Logstash to >= 7.3 (gemfile)
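Whichever layer does the fix, the underlying operation is the same: replace byte sequences that are not valid UTF-8 before the message reaches a consumer that rejects them. A minimal sketch of that idea follows. The function name is hypothetical, and this is not how mmutf8fix or the Logstash plugin is implemented.

```python
def to_valid_utf8(raw: bytes) -> str:
    """Decode a log line, substituting U+FFFD for any invalid byte sequence."""
    # errors="replace" keeps the valid parts of the message intact instead
    # of raising UnicodeDecodeError on the first bad byte.
    return raw.decode("utf-8", errors="replace")
```

Valid input passes through unchanged, so a filter like this could be applied to every message without first checking whether it is well-formed.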
Sep 19 2019
Sep 18 2019
Sep 12 2019
Sep 10 2019
I cannot reproduce this anymore. Resolving.
Sep 3 2019
Aug 30 2019
Aug 26 2019
Aug 13 2019
Aug 9 2019
@cchen is now in the wmf LDAP group. Resolving task.
Access to Superset and Turnilo is managed by the 'wmf' LDAP group. Since it is beyond the scope of this task, your new access request can be found here: T230242
Based on what I could find about your position, you may need more access than indicated here. @MarkTraceur could you help?