User Details
- User Since
- Nov 22 2021, 10:00 PM (228 w, 6 d)
- Availability
- Available
- LDAP User
- JHathaway
- MediaWiki User
- JHathaway (WMF)
Fri, Apr 10
@Peachey88 awesome, thanks
Wed, Mar 25
instances have been deleted.
Fri, Mar 20
Added a few alternatives after discussing with other folks; happy to hear of others.
Tue, Mar 17
Mon, Mar 16
Mar 13 2026
Mar 12 2026
Mar 10 2026
Okay, the posting flow is a bit different than I understood it at first glance. A user can only use the web form if they are logged in with an account. Otherwise the buttons still appear, but they are mailto links which open the user's email client.
Mar 9 2026
Ok we have two options now:
Mar 6 2026
if /message/new is the correct route, here is the count of usage from 03-05:
@bd808 based on the User-Agent header (User-Agent: HyperKitty on https://lists.wikimedia.org/) and on the logs, this message appears to have been posted from the web UI. Messages posted from the web UI are sent via SMTP to exim4, but the source IP is localhost, so our exim4 config skips spam checking. As briefly discussed, I think the only users of the web UI for posting are spammers (see https://gitlab.com/mailman/hyperkitty/-/issues/264), so let's disable it.
great, thanks again for bringing this to my attention @Xaosflux
Mar 5 2026
@Xaosflux I have added a DKIM key and tested that Microsoft is able to verify the key correctly. If you could resend the failed email and confirm that it is sent successfully, that would be appreciated.
sounds good, please let me know if you need help in any way
Mar 4 2026
@Mbch331 thanks for reporting this issue. What is the expected mail flow? Does appel.wikimedia.nl originate the emails?
@Xaosflux thanks for reporting this issue. I did some tests with the info-en queue, but I wasn't able to reproduce the issue.
Feb 27 2026
Feb 26 2026
@DamianZaremba I tried with a couple of my test accounts, but I was unable to duplicate your results. The list manager confirmation emails all passed DKIM. Would it be possible to forward me the entire email, ideally as an eml attachment? jhathaway@wikimedia.org
Feb 24 2026
@Elli though this is technically possible, outside of some existing corner cases, I am not sure the added confusion of having multiple domains is worth the occasional bounced email.
Feb 23 2026
We only run rspamd in combination with postfix now
@jcrespo is this still ongoing, or are you okay with closing?
Thanks @taavi
Jan 24 2026
great, I'll resolve then, let me know if new problems arise.
Jan 23 2026
@Nardog Microsoft appears to have resolved the issue, are you receiving ticket updates now?
Jan 22 2026
@Nardog this appears to be an issue with Microsoft's email platform; they are throttling our mail server. I have opened a ticket with the support desk to try to get the issue resolved.
@Nardog we changed our DMARC policy on wikimedia.org to quarantine on 2026-01-20, so that is probably the cause of the deliverability issue. I'll look into why the emails are getting rejected and revert the change if necessary.
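For readers unfamiliar with how the policy is expressed: a DMARC record is a DNS TXT record made of semicolon-separated tag=value pairs, and the `p` tag carries the policy (`none`, `quarantine`, or `reject`). A minimal sketch of reading that tag follows. The record string and the `parse_dmarc` helper are illustrative assumptions, not the actual wikimedia.org record or any tooling used here.

```python
def parse_dmarc(record: str) -> dict[str, str]:
    """Split a DMARC TXT record into a dict of tag -> value."""
    tags: dict[str, str] = {}
    for part in record.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        key, sep, value = part.partition("=")
        if sep:  # ignore fragments without an "=" sign
            tags[key.strip().lower()] = value.strip()
    return tags


# Hypothetical record for illustration only.
example_record = "v=DMARC1; p=quarantine; rua=mailto:dmarc-reports@example.org"
print(parse_dmarc(example_record).get("p"))
```

With `p=quarantine`, receivers are asked to treat mail that fails DMARC as suspicious (typically by filing it as spam) rather than rejecting it outright.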
Jan 21 2026
Strangely, I re-imaged both servers from cumin2002 and ran into no issues. Perhaps when you ran the first re-images @MatthewVernon, though they failed, they set up the conditions for the following re-images to succeed? Was there any interesting output from the move-vlan cookbook? I ran the following commands:
@MatthewVernon looking...
Jan 9 2026
Jan 8 2026
Dec 15 2025
Dec 8 2025
Dec 1 2025
@JKelsoteel-WMF the addresses no-reply or noreply are used to indicate that the sender does not expect replies to be sent to that address, and any replies will be discarded. Why is using no-reply@wikimedia.org necessary for this use case?
Nov 17 2025
Nov 5 2025
Nov 4 2025
I'm not sure how to remedy this issue. I see we switched to StaticDB in T355979, perhaps we need to rebuild the StaticDB index?
@Krd we are still receiving bounces for that user as their email rate is still too high. Do they need to subscribe to the 77 remaining queues? Could we perhaps unsubscribe them from all, and pop them a note to resubscribe?
Is this still occurring?
After some analysis today, I think the causes of the bounces were as follows:
@Xaosflux the outbound queue has now been cleared of all backscatter bounce emails, so delivery times should be back to normal.
Nov 3 2025
@Xaosflux I assume it is related, but I have not been able to confirm it yet.
@Krd I see the junk mail queue is now at 600k. How can I help clear it out? I saw that some of the scheduled jobs were run, but that does not seem to be enough. Also, feel free to contact me on IRC for some real-time triaging.
Oct 28 2025
@Krd how else can I help?
@Krd thanks, I'm investigating, not sure of the cause either.
Oct 20 2025
From a brief look, most of these conntrack entries are from an-coord1003.eqiad.wmnet, along with log entries of the form:
Oct 17 2025
debug1: Remote protocol version 2.0, remote software version GerritCodeReview_3.10.6 (APACHE-SSHD-2.12.0)
debug2: peer server KEXINIT proposal
debug2: KEX algorithms: curve25519-sha256,curve25519-sha256@libssh.org,curve448-sha512,ecdh-sha2-nistp521,ecdh-sha2-nistp384,ecdh-sha2-nistp256,diffie-hellman-group-exchange-sha256,diffie-hellman-group18-sha512,diffie-hellman-group17-sha512,diffie-hellman-group16-sha512,diffie-hellman-group15-sha512,diffie-hellman-group14-sha256,ext-info-s,kex-strict-s-v00@openssh.com
deploy2002 is running bullseye, which has ssh 1:8.4p1-5+deb11u5, so it does not have any of the post-quantum algorithms that were first added in 9.0.
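One way to check a KEXINIT proposal for post-quantum key exchange methods is to compare its comma-separated algorithm list against the known hybrid method names. The sketch below does that for the server proposal quoted in the debug output. The `POST_QUANTUM_KEX` set is my own short list of OpenSSH hybrid method names and is an assumption, not an exhaustive registry; `post_quantum_kex` is a hypothetical helper.

```python
# Assumed, non-exhaustive list of OpenSSH hybrid post-quantum KEX names.
POST_QUANTUM_KEX = {
    "sntrup761x25519-sha512@openssh.com",
    "sntrup761x25519-sha512",
    "mlkem768x25519-sha256",
}


def post_quantum_kex(proposal: str) -> list[str]:
    """Return the post-quantum KEX names found in a comma-separated proposal."""
    names = [name.strip() for name in proposal.split(",")]
    return [name for name in names if name in POST_QUANTUM_KEX]


# Server proposal copied from the "KEX algorithms" debug line.
server_proposal = (
    "curve25519-sha256,curve25519-sha256@libssh.org,curve448-sha512,"
    "ecdh-sha2-nistp521,ecdh-sha2-nistp384,ecdh-sha2-nistp256,"
    "diffie-hellman-group-exchange-sha256,diffie-hellman-group18-sha512,"
    "diffie-hellman-group17-sha512,diffie-hellman-group16-sha512,"
    "diffie-hellman-group15-sha512,diffie-hellman-group14-sha256,"
    "ext-info-s,kex-strict-s-v00@openssh.com"
)
print(post_quantum_kex(server_proposal))
```

Against this particular proposal the helper finds no match, since every offered method is a classical ECDH or Diffie-Hellman variant.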
Oct 6 2025
Oct 3 2025
@jcrespo it took me a bit of time to coerce the box back into BIOS mode. I then tried reimaging with bookworm, but the RAID step failed due to the existence of the raid6 volume. After a couple of failed attempts, I booted off a rescue image and removed the raid6 volume with storcli.
Oct 2 2025
As @jcrespo pointed out on IRC, there is also quite a bit of puppet 5 documentation which needs to be removed or updated as part of this task.
Oct 1 2025
- Working hacks!
Sep 29 2025
Thanks @CDanis, I happened upon that post as well, and I don't think their approach is unreasonable. I think there are different trade-offs between complexity and adherence to spec. My preference is to try the sync script, but if that fails I'm happy to look at their approach.
Sep 25 2025
Sep 18 2025
Would it be acceptable to store the data from the parsed DMARC reports in OpenSearch? My initial estimate is that it would require about 50MiB of data per day.
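To put the 50MiB per day estimate in context, the total depends on how long the parsed reports are kept. A rough back-of-the-envelope sketch follows. The retention periods are assumptions chosen for illustration, and the figures ignore index overhead and replica copies.

```python
MIB_PER_DAY = 50  # initial estimate quoted above


def storage_gib(days: int, mib_per_day: float = MIB_PER_DAY) -> float:
    """Raw data volume in GiB after the given number of days."""
    return days * mib_per_day / 1024


# Hypothetical retention periods, in days.
for days in (30, 90, 365):
    print(f"{days:>3} days: {storage_gib(days):.1f} GiB")
```

At this rate a full year of reports comes to under 18 GiB of raw data.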
Sep 17 2025
Sep 15 2025
Sep 9 2025
I was able to reproduce on dse-k8s-worker1014 by flipping VT on and then re-running the cookbook. How does this patch look? https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/1186619 It tests okay for me on dse-k8s-worker1014.
because of the error on line 22?