
bd808 (Bryan Davis)
Principal Software Engineer, Administrator

User Details

User Since
Oct 3 2014, 2:36 PM (601 w, 1 h)
Roles
Administrator
Availability
Busy at E1967: No Fridays in FY26Q4 until Apr 12.
IRC Nick
bd808
LDAP User
BryanDavis
MediaWiki User
BDavis (WMF) [ Global Accounts ]

I'm BDavis (WMF) on wiki, bd808 on IRC & GitLab, and BryanDavis on Gerrit & Wikitech.

I've got a thing for 🦄s. Don't judge.

I work for or provide services to the Wikimedia Foundation, but this is my only Phabricator account. Edits, statements, or other contributions made from this account are my own, and may not reflect the views of the Foundation.

Recent Activity

Wed, Apr 8

bd808 added a project to T422734: gitlab-webhooks build fails on "The runtime.txt file isn't supported": Toolforge.

Adding the Toolforge tag in case someone searches for this general problem only within that tag.

Wed, Apr 8, 10:13 PM · cloud-services-team, Toolforge, GitLab (Integrations), Release-Engineering-Team, User-brennen
bd808 added a comment to T422722: Use php strict types in wikimedia maintained php libraries.

Skipped libraries are:

  • php-session-serializer, maybe obsolete, but reference not found
Wed, Apr 8, 9:39 PM · Patch-For-Review, WMF-General-or-Unknown, PHP-strong-typing, Librarization
bd808 added a comment to T405262: Support pre-built images on components-api.

It would be quite useful to be able to execute these images via components and not have to execute them on NFS workers (which appear to have the worst availability).

Wed, Apr 8, 8:47 PM · tools-platform-team, cloud-services-team (FY2025/2026-Q3-Q4), Toolforge

Tue, Apr 7

bd808 updated subscribers of T422555: [EPIC] Create Automated Supervision test automation for CI.

Is there a tag that can be added to link the two together so we can keep aligned as work progresses?

Tue, Apr 7, 8:47 PM · Epic, Release-Engineering-Team (Radar), QS-Test-Automation
bd808 added a comment to T422555: [EPIC] Create Automated Supervision test automation for CI.

The Pretrain project and follow-up work to increase deployment frequency that we are planning for FY27 (July 2026-June 2027) would really benefit from critical user journey tests that can be targeted at "canary" servers during a scap deployment cycle in the WikiKube Kubernetes cluster. The general intent would be to validate that a newly prepared MediaWiki deployment is meeting real-time service level objectives when exercising the code via critical user journey tests as a gate in the deployment automation. If the SLO check succeeds, we would move forward with promoting the image to handle all traffic for the active deployment. If the check fails, we would roll back completely to the pre-deployment state and raise a signal for investigation of the SLO check failure.
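
The gate described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `JourneyResult`, `slo_met`, `gate_deployment`, and the threshold values are hypothetical names and numbers, not part of scap or any existing deployment tooling.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class JourneyResult:
    """Outcome of one critical user journey run against a canary (hypothetical type)."""
    name: str
    succeeded: bool
    latency_ms: float


def slo_met(results: Iterable[JourneyResult],
            max_error_rate: float = 0.01,
            max_latency_ms: float = 500.0) -> bool:
    """Return True when the journey results stay within the assumed thresholds."""
    results = list(results)
    if not results:
        # No evidence at all is treated as a failed check.
        return False
    failures = sum(1 for r in results if not r.succeeded)
    error_rate = failures / len(results)
    worst_latency = max(r.latency_ms for r in results)
    return error_rate <= max_error_rate and worst_latency <= max_latency_ms


def gate_deployment(run_journeys: Callable[[], Iterable[JourneyResult]],
                    promote: Callable[[], None],
                    rollback: Callable[[], None]) -> str:
    """Promote the new image if the SLO check passes, otherwise roll back."""
    if slo_met(run_journeys()):
        promote()
        return "promoted"
    rollback()
    return "rolled back"
```

The promote and rollback steps are passed in as callables so the decision logic stays separate from whatever the deployment automation actually does at each step.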

Tue, Apr 7, 8:35 PM · Epic, Release-Engineering-Team (Radar), QS-Test-Automation
bd808 added a parent task for T422234: Remove the "blubber" label from Jenkins agents: T422507: Wikifeeds CI is not working.
Tue, Apr 7, 4:18 PM · Release-Engineering-Team, Release Pipeline (Blubber), Continuous-Integration-Infrastructure
bd808 added a subtask for T422507: Wikifeeds CI is not working: T422234: Remove the "blubber" label from Jenkins agents.
Tue, Apr 7, 4:18 PM · Essential-Work, Release-Engineering-Team (Priority Backlog 📥), Wikifeeds
bd808 added a comment to T422224: Running dotnet job fails on Toolforge because "24" builder stack changed the compiled binary output path.

Permission problem?

If that question is about [step-export] 2026-04-03T00:02:55.903753996Z 2026/04/03 00:02:55 warning: unsuccessful cred copy: ".docker" from "/tekton/creds" to "/tekton/home": unable to open destination: open /tekton/home/.docker/config.json: permission denied, the answer is that that error is benign. @dcaro looked into it a bit in T347401: [lima-kilo,builds] Fix local development setup. We probably should have a FAQ somewhere to point to about it.

Tue, Apr 7, 4:06 PM · tools-platform-team, cloud-services-team, Toolforge
bd808 added a comment to T399415: Unable to generate family for wpbeta:zh with github action (ClientError: (403) Request forbidden).

This class of errors likely all got worse when I blocked all of Microsoft's AS8075 network ranges on 2026-04-03 (T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies).

Tue, Apr 7, 3:39 PM · Beta-Cluster-Infrastructure, Pywikibot-tests, Pywikibot

Mon, Apr 6

bd808 added a comment to T421941: SUL session and password issues for AMastilovic-WMF.

I think @bd808 is on to something here.

Mon, Apr 6, 11:32 PM · WMF-General-or-Unknown
bd808 added subtasks for T380127: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble: T422224: Running dotnet job fails on Toolforge because "24" builder stack changed the compiled binary output path, T422384: [builds-builder] incompatibility of fagiani/apt and builder stack "24", T387141: [builds-builder,apt] migrate from apt buildpack to Heroku's .deb packages buildpack.
Mon, Apr 6, 10:56 PM · Toolforge, tools-platform-team, cloud-services-team (FY2025/2026-Q3-Q4), Patch-For-Review
bd808 added a parent task for T422384: [builds-builder] incompatibility of fagiani/apt and builder stack "24": T380127: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble.
Mon, Apr 6, 10:56 PM · tools-platform-team, cloud-services-team, Toolforge
bd808 added a parent task for T422224: Running dotnet job fails on Toolforge because "24" builder stack changed the compiled binary output path: T380127: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble.
Mon, Apr 6, 10:56 PM · tools-platform-team, cloud-services-team, Toolforge
bd808 added a parent task for T387141: [builds-builder,apt] migrate from apt buildpack to Heroku's .deb packages buildpack: T380127: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble.
Mon, Apr 6, 10:56 PM · cloud-services-team, Toolforge
bd808 renamed T422224: Running dotnet job fails on Toolforge because "24" builder stack changed the compiled binary output path from Building/Running dotnet job fails on Toolforge to Running dotnet job fails on Toolforge because "24" builder stack changed the compiled binary output path.
Mon, Apr 6, 10:54 PM · tools-platform-team, cloud-services-team, Toolforge
bd808 added a comment to T422224: Running dotnet job fails on Toolforge because "24" builder stack changed the compiled binary output path.

This always worked before and is currently working on the runs that have not been rebuilt - there must have been a recent change.

Mon, Apr 6, 10:54 PM · tools-platform-team, cloud-services-team, Toolforge
bd808 renamed T422384: [builds-builder] incompatibility of fagiani/apt and builder stack "24" from Buildservice for Rust fails to Buildservice for Rust fails due to fagiani/apt and builder stack "24" mismatch.
Mon, Apr 6, 10:45 PM · tools-platform-team, cloud-services-team, Toolforge
bd808 updated subscribers of T422384: [builds-builder] incompatibility of fagiani/apt and builder stack "24".

It looks to me like the failure is:

[step-build] 2026-04-06T14:04:59.855343648Z -----> Fetching .debs for php
[step-build] 2026-04-06T14:04:59.987474659Z ERROR: failed to build: exit status 1

My guess is that the Buildpack environment version upgrade has caused this problem. As I reported in T394466: [build-service] remove legacy fagiani/apt 0.2.5 builder from `--use-latest-versions` stack, the fagiani/apt builder dies with this message when applied within the new "24" builder stack. I had suggested in T394466 that the legacy fagiani/apt builder be removed. It seems that did not happen before or after my bug report was closed as a duplicate of @fnegri's T387141: [builds-builder,apt] migrate from apt buildpack to Heroku's .deb packages buildpack.

Mon, Apr 6, 10:44 PM · tools-platform-team, cloud-services-team, Toolforge
bd808 added a comment to T387141: [builds-builder,apt] migrate from apt buildpack to Heroku's .deb packages buildpack.

T422384: [builds-builder] incompatibility of fagiani/apt and builder stack "24" is reporting the bug I filed as T394466: [build-service] remove legacy fagiani/apt 0.2.5 builder from `--use-latest-versions` stack, which was merged into this task but not actually fixed before the Buildpack environment version upgrade happened.

Mon, Apr 6, 10:42 PM · cloud-services-team, Toolforge
bd808 added a comment to T387141: [builds-builder,apt] migrate from apt buildpack to Heroku's .deb packages buildpack.

I think this happened in T380127: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble

Mon, Apr 6, 10:36 PM · cloud-services-team, Toolforge
bd808 added a comment to T422224: Running dotnet job fails on Toolforge because "24" builder stack changed the compiled binary output path.

2026-04-03T00:03:52Z [autoreport3-2wsls] [job] : line 1: heroku_output/AutoReport: No such file or directory

Mon, Apr 6, 10:15 PM · tools-platform-team, cloud-services-team, Toolforge
bd808 updated subscribers of T422224: Running dotnet job fails on Toolforge because "24" builder stack changed the compiled binary output path.

Permission problem?

Mon, Apr 6, 9:58 PM · tools-platform-team, cloud-services-team, Toolforge

Fri, Apr 3

bd808 added a comment to T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies.

Let's block all of AS8075 Microsoft Corporation. A number of ranges from this ASN are currently showing as the most active /24 networks. I will be using the list of networks from https://github.com/ipverse/as-ip-blocks/blob/ba0a0415a3d35c44367fd347f5106f7dc101ed5f/as/8075/ipv4-aggregated.txt to do this.
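
As a rough illustration of turning a plain list of CIDR ranges into a block list, here is a minimal sketch. It assumes the input has one IPv4 network per line (with optional `#` comments) and that the target is the `abuse_networks` / `blocked_nets` / `networks` YAML layout seen in output quoted elsewhere in this feed. The function name `blocklist_yaml` is hypothetical.

```python
import ipaddress
from typing import Iterable


def blocklist_yaml(lines: Iterable[str]) -> str:
    """Render CIDR lines as a YAML fragment, merging adjacent ranges first."""
    nets = []
    for line in lines:
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Raises ValueError on malformed input or networks with host bits set.
        nets.append(ipaddress.ip_network(line))
    # collapse_addresses merges adjacent and overlapping networks.
    merged = ipaddress.collapse_addresses(nets)
    out = ["abuse_networks:", "  blocked_nets:", "    networks:"]
    out.extend(f"    - {net}" for net in merged)
    return "\n".join(out)
```

Collapsing first keeps the resulting list as short as possible, which matters when the same file is edited by hand later.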

Fri, Apr 3, 9:02 PM · Beta-Cluster-Infrastructure
bd808 created T422271: Implement dependency cooldown logic (minimum age for new versions) where possible.
Fri, Apr 3, 6:07 PM · LibUp
bd808 added a comment to T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies.

We are still seeing quite a bit of what is likely residential proxy traffic with 12,699 distinct /24 networks each sending 2 requests in the last ~17.5 hours (with 23,607 distinct /24 networks seen in total).

Fri, Apr 3, 5:39 PM · Beta-Cluster-Infrastructure
bd808 added a comment to T422105: Can't login to gitlab.wikimedia.org.

@MBH Your developer account is https://ldap.toolforge.org/user/mbh. You have the "shell name" mbh and the "username" Maxbiohazard. When you go to https://gitlab.wikimedia.org/users/sign_in?redirect_to_referer=yes to log in to gitlab.wikimedia.org you should be redirected to https://idp.wikimedia.org/login where you log in with the Developer account username Maxbiohazard and the same password you use at https://toolsadmin.wikimedia.org (this is the same Developer account).

Fri, Apr 3, 3:24 PM · GitLab (Account Approval)

Wed, Apr 1

bd808 added a comment to T421929: `toolforge jobs logs` misplaces my logs.

If that's the case, then the problem is a different one :). Please share if you find an instance of that. Judging by the post time in Phabricator, I'm not certain that the command was run in the 1 hour window (it could have been; I'm not certain it was not either). Do you remember if you did it in that window?

Wed, Apr 1, 11:36 PM · tools-platform-team, cloud-services-team, Toolforge
bd808 added a comment to T130748: Add Content-Security-Policy header enforcing 3rd party web interaction restrictions to proxy responses.

Is it possible to evaluate a gradual introduction of this Content-Security-Policy, for example just for new tools?

Wed, Apr 1, 9:10 PM · cloud-services-team, Toolforge, Privacy Engineering
bd808 added a comment to T421929: `toolforge jobs logs` misplaces my logs.

The issue here is probably that we are getting by default only the last 1h of logs (with follow we were not limiting it before; now we are setting it to the same value as the regular, non-follow case).

Wed, Apr 1, 5:25 PM · tools-platform-team, cloud-services-team, Toolforge

Tue, Mar 31

bd808 renamed T421941: SUL session and password issues for AMastilovic-WMF from SUL (WikiTech) account password issues to SUL session and password issues for AMastilovic-WMF.
Tue, Mar 31, 11:03 PM · WMF-General-or-Unknown
bd808 updated the task description for T421941: SUL session and password issues for AMastilovic-WMF.
Tue, Mar 31, 11:01 PM · WMF-General-or-Unknown
bd808 added a comment to T421941: SUL session and password issues for AMastilovic-WMF.

New data point from a Slack conversation: It looks to me like it may not have been clear to @amastilovic that their officewiki account and their SUL account share a username, but are in fact completely separate accounts with separate password storage. This doesn't make anything about the abrupt session loss different, but it seems like it could explain the troubles they have had when attempting to log back in.

Tue, Mar 31, 11:00 PM · WMF-General-or-Unknown
bd808 added a comment to T421929: `toolforge jobs logs` misplaces my logs.
tools.link-dispenser@tools-bastion-14:~$ toolforge jobs show crawljob
+---------------+-----------------------------------------------------------------+
| Job name:     | crawljob                                                        |
+---------------+-----------------------------------------------------------------+
| Command:      | crawljob                                                        |
+---------------+-----------------------------------------------------------------+
| Job type:     | continuous                                                      |
+---------------+-----------------------------------------------------------------+
| Image:        | tool-link-dispenser/tool-link-dispenser:latest                  |
+---------------+-----------------------------------------------------------------+
| Port:         | none                                                            |
+---------------+-----------------------------------------------------------------+
| File log:     | no                                                              |
+---------------+-----------------------------------------------------------------+
| Output log:   |                                                                 |
+---------------+-----------------------------------------------------------------+
| Error log:    |                                                                 |
+---------------+-----------------------------------------------------------------+
| Emails:       | all                                                             |
+---------------+-----------------------------------------------------------------+
| Resources:    | mem: 3.0Gi, cpu: 2.0                                            |
+---------------+-----------------------------------------------------------------+
| Replicas:     | 1                                                               |
+---------------+-----------------------------------------------------------------+
| Mounts:       | none                                                            |
+---------------+-----------------------------------------------------------------+
| Retry:        | no                                                              |
+---------------+-----------------------------------------------------------------+
| Timeout:      | no                                                              |
+---------------+-----------------------------------------------------------------+
| Health check: | script: launcher ping                                           |
+---------------+-----------------------------------------------------------------+
| Status:       | Running                                                         |
+---------------+-----------------------------------------------------------------+
| Hints:        | Last run at 2026-03-20T05:32:11Z. Pod in 'Running' phase. State |
|               | 'running'. Started at '2026-03-20T05:32:20Z'.                   |
+---------------+-----------------------------------------------------------------+
tools.link-dispenser@tools-bastion-14:~$ toolforge jobs logs crawljob
ERROR: Job 'crawljob' does not have any logs available
tools.link-dispenser@tools-bastion-14:~$ kubectl get po
NAME                              READY   STATUS    RESTARTS   AGE
crawljob-6cd85dbdfb-kqwq8         1/1     Running   0          11d
link-dispenser-86cf4d9c96-fqj6s   1/1     Running   0          11d
redis-55c68578f-v5k72             1/1     Running   0          12d
tools.link-dispenser@tools-bastion-14:~$ kubectl logs crawljob-6cd85dbdfb-kqwq8
/layers/heroku_python/dependencies/lib/python3.12/site-packages/celery/platforms.py:799: SecurityWarning: An entry for the specified gid or egid was not found.
We're assuming this is a potential security issue.
...
[2026-03-31 13:52:04,233: INFO/ForkPoolWorker-12] Task jobs.crawl_page[e903c0dc-0557-415e-9e36-63e3705b63bb] succeeded in 18.486338403075933s: None

There are logs going to stdout/stderr in the container and being seen by Kubernetes, but it looks like they are not being collected by Loki.

Tue, Mar 31, 4:25 PM · tools-platform-team, cloud-services-team, Toolforge

Mon, Mar 30

bd808 is attending E1976: No Fridays in FY26Q4.
Mon, Mar 30, 11:37 PM
bd808 declined E1975: No Fridays in FY26Q4.
Mon, Mar 30, 11:37 PM
bd808 declined E1974: No Fridays in FY26Q4.
Mon, Mar 30, 11:36 PM
bd808 is attending E1973: No Fridays in FY26Q4.
Mon, Mar 30, 11:36 PM
bd808 is attending E1972: No Fridays in FY26Q4.
Mon, Mar 30, 11:36 PM
bd808 is attending E1971: No Fridays in FY26Q4.
Mon, Mar 30, 11:36 PM
bd808 is attending E1970: No Fridays in FY26Q4.
Mon, Mar 30, 11:36 PM
bd808 is attending E1969: No Fridays in FY26Q4.
Mon, Mar 30, 11:35 PM
bd808 is attending E1966: No Fridays in FY26Q4.
Mon, Mar 30, 11:35 PM
bd808 is attending E1968: No Fridays in FY26Q4.
Mon, Mar 30, 11:35 PM
bd808 is attending E1967: No Fridays in FY26Q4.
Mon, Mar 30, 11:34 PM
bd808 declined E1965: No Fridays in FY26Q4.
Mon, Mar 30, 11:33 PM
bd808 set E1964: No Fridays in FY26Q4 to repeat weekly.
Mon, Mar 30, 11:31 PM · events
bd808 created E1964: No Fridays in FY26Q4.
Mon, Mar 30, 11:31 PM · events
bd808 created E1961: Wikimedia Hackathon 2026 + travel.
Mon, Mar 30, 10:30 PM · Wikimania-Hackathon-2026, events
bd808 created E1960: Sabbatical.
Mon, Mar 30, 10:27 PM · events
bd808 closed T421654: Use HTTP Cookie to legitimate BETA cluster access as Declined.

I agree that IP range blocks are a crude implement. I don't see any viable use of an undocumented cookie however as a partial workaround.

Mon, Mar 30, 9:43 PM · Beta-Cluster-Infrastructure
bd808 updated the task description for T421713: Build new Speechoid MVP.
Mon, Mar 30, 5:33 PM · User-Viktoria_Hillerud_WMSE, Wikispeech-Jobrunner (Sprint), Wikispeech-Text-to-Speech, Wikispeech-WMSE
bd808 added a comment to T386559: X-spam-score header missing on obvious spam delivered to multiple Mailman3 lists via HyperKitty web ui.

I'd say let's just disable it :D

Mon, Mar 30, 5:18 PM · Patch-For-Review, collaboration-services, SRE, Wikimedia-Mailing-lists

Fri, Mar 27

bd808 added a comment to T420299: Maps don't display on the beta cluster; tile server offline on deployment-maps-master02.deployment-prep.eqiad1.wikimedia.cloud.

I can clean up the stale instances.

Fri, Mar 27, 3:39 PM · Essential-Work, Content-Transform-Team (Work In Progress), Beta-Cluster-Infrastructure
bd808 added a comment to T226688: Block web crawlers from accessing Cloud Services.

But the real solution should be elsewhere, at a level with a visible IP-address.

Fri, Mar 27, 3:07 PM · Cloud-VPS, cloud-services-team, Toolforge
bd808 added a comment to T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies.
Fri, Mar 27, 2:59 PM · Beta-Cluster-Infrastructure
bd808 updated subscribers of T421345: QuickStatements not working since a couple of days (Item not available).

Adding @Magnus as a subscriber per the instructions at https://www.wikidata.org/wiki/Help:QuickStatements#FAQ

Fri, Mar 27, 1:12 AM · Tools
bd808 renamed T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies from Beta cluster is slow as sludge / serves 503 and 504 to Beta cluster is being crawled to death by bot traffic coming from residential proxies.
Fri, Mar 27, 12:40 AM · Beta-Cluster-Infrastructure
bd808 triaged T421334: Wikibugs should ignore V+1 from Cindy The Browser Test Bot as Medium priority.

https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/blob/c95720bd6ec4dd9de94aa2072343ca62a0f807e8/src/wikibugs2/gerrit.py#L36 would be the place to implement this by adding the bot's name to IGNORED_POSITIVE_VOTES.

IGNORED_POSITIVE_VOTES = ["jenkins-bot", "PipelineBot", "SonarQube Bot"]
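
A minimal sketch of how such a list might be consulted, assuming the filter only suppresses positive votes and that the bot's account name matches the task title. Both are assumptions, and `should_report_vote` is a hypothetical helper, not the actual wikibugs2 function.

```python
# Existing entries from the quoted line, plus the bot named in the task title.
# The exact Gerrit account name is an assumption.
IGNORED_POSITIVE_VOTES = [
    "jenkins-bot",
    "PipelineBot",
    "SonarQube Bot",
    "Cindy The Browser Test Bot",
]


def should_report_vote(voter_name: str, value: int) -> bool:
    """Return False for positive votes cast by accounts on the ignore list."""
    if value > 0 and voter_name in IGNORED_POSITIVE_VOTES:
        return False
    # Negative votes from bots and all votes from other accounts still get reported.
    return True
```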
Fri, Mar 27, 12:38 AM · Wikibugs

Thu, Mar 26

bd808 moved T421449: Clean up broken legacy maps stack in Beta Cluster from To Triage to Future on the Beta-Cluster-Infrastructure board.
Thu, Mar 26, 11:56 PM · Beta-Cluster-Infrastructure
bd808 added a subtask for T401839: Migrate deployment-prep away from Debian Bullseye to Bookworm/Trixie: T421449: Clean up broken legacy maps stack in Beta Cluster.
Thu, Mar 26, 11:56 PM · Epic, Release-Engineering-Team (Priority Backlog 📥), Cloud-VPS (Debian Bullseye Deprecation), Beta-Cluster-Infrastructure
bd808 added a parent task for T421449: Clean up broken legacy maps stack in Beta Cluster: T401839: Migrate deployment-prep away from Debian Bullseye to Bookworm/Trixie.
Thu, Mar 26, 11:56 PM · Beta-Cluster-Infrastructure
bd808 updated the task description for T421449: Clean up broken legacy maps stack in Beta Cluster.
Thu, Mar 26, 11:54 PM · Beta-Cluster-Infrastructure
bd808 created T421449: Clean up broken legacy maps stack in Beta Cluster.
Thu, Mar 26, 11:53 PM · Beta-Cluster-Infrastructure
bd808 added a comment to T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies.

Another unblock request:

root@deployment-mediawiki14:~# ./subtractNetworks.py 82.0.0.0/8 82.64.0.0/14
abuse_networks:
  blocked_nets:
    networks:
    - 82.0.0.0/10
    - 82.68.0.0/14
    - 82.72.0.0/13
    - 82.80.0.0/12
    - 82.96.0.0/11
    - 82.128.0.0/9
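
The source of subtractNetworks.py is not shown here. A minimal sketch that reproduces the same result with Python's standard `ipaddress` module, assuming the script takes a blocked network and a sub-range to carve out of it, could look like this (`subtract_network` is a hypothetical name):

```python
import ipaddress


def subtract_network(block: str, allow: str) -> list:
    """Return the networks covering `block` minus `allow`, in address order."""
    outer = ipaddress.ip_network(block)
    inner = ipaddress.ip_network(allow)
    # address_exclude raises ValueError if `inner` is not contained in `outer`.
    return sorted(outer.address_exclude(inner))


# Print the remaining ranges as YAML list items.
for net in subtract_network("82.0.0.0/8", "82.64.0.0/14"):
    print(f"    - {net}")
```

For these inputs the sketch yields the same six networks as the output above.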

https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/a99b8763804b48dbeea6e0ded874282f898da321%5E%21/#F0

diff --git a/deployment-prep/_.yaml b/deployment-prep/_.yaml
index 7ee6a7d..965a63b 100644
--- a/deployment-prep/_.yaml
+++ b/deployment-prep/_.yaml
Thu, Mar 26, 10:23 PM · Beta-Cluster-Infrastructure
bd808 added a comment to T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies.

81.0.0.0/8 caught some folks who PM'd me. I removed that block since it was part of the "close the gate after the horses left" wild blocks anyway.
https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/4fefbcd424cb77043c209c07026b8feb13e1bfeb%5E%21/#F0

diff --git a/deployment-prep/_.yaml b/deployment-prep/_.yaml
index 2024a41..7ee6a7d 100644
--- a/deployment-prep/_.yaml
+++ b/deployment-prep/_.yaml
Thu, Mar 26, 9:53 PM · Beta-Cluster-Infrastructure
bd808 moved T383514: Define how vanishing requests are processed on Wikimedia beta cluster from To Triage to Backlog on the Beta-Cluster-Infrastructure board.
Thu, Mar 26, 9:47 PM · Account-Vanishing, Beta-Cluster-Infrastructure
bd808 added a comment to T383514: Define how vanishing requests are processed on Wikimedia beta cluster.

Where would you expect to find the documentation you are hoping to see added?

My suggestion would be:

  1. In the UI. Maybe https://wikitech.wikimedia.org/wiki/MediaWiki:Globalvanishrequest-pretext ?
Thu, Mar 26, 9:46 PM · Account-Vanishing, Beta-Cluster-Infrastructure
bd808 moved T420068: Delete orphaned host-specific and unused prefix hiera settings to reduce confusion about valid and active config from To Triage to Future on the Beta-Cluster-Infrastructure board.
Thu, Mar 26, 9:29 PM · Beta-Cluster-Infrastructure
bd808 moved T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies from To Triage to Backlog on the Beta-Cluster-Infrastructure board.
Thu, Mar 26, 9:28 PM · Beta-Cluster-Infrastructure
bd808 closed T420642: Project deployment-prep instance deployment-cirrussearch12 is down as Invalid.
Thu, Mar 26, 9:28 PM · Beta-Cluster-Infrastructure
bd808 closed T420299: Maps don't display on the beta cluster; tile server offline on deployment-maps-master02.deployment-prep.eqiad1.wikimedia.cloud as Resolved.

Thanks for the help figuring out what to do @Jgiannelos and @MSantos. And thanks for making the patch and getting it deployed @ABreault-WMF and @cscott. It really does take a village to keep Beta working. :)

Thu, Mar 26, 9:20 PM · Essential-Work, Content-Transform-Team (Work In Progress), Beta-Cluster-Infrastructure
bd808 added a comment to T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies.

Throw a big pile of wide blocks and hope it works for a bit...

root@deployment-mediawiki14:~# ./big-ban-hammer.sh 10000 +1 | sort -n -k 4
    - 78.0.0.0/8        # 10072 hits
    - 2.0.0.0/8         # 10702 hits
    - 76.0.0.0/8        # 11241 hits
    - 93.0.0.0/8        # 11314 hits
    - 31.0.0.0/8        # 11415 hits
    - 109.0.0.0/8       # 11994 hits
    - 81.0.0.0/8        # 12270 hits
    - 24.0.0.0/8        # 12360 hits
    - 82.0.0.0/8        # 12519 hits
    - 73.0.0.0/8        # 12960 hits
    - 79.0.0.0/8        # 14021 hits
    - 49.0.0.0/8        # 14181 hits
    - 84.0.0.0/8        # 14544 hits
    - 90.0.0.0/8        # 15004 hits
    - 88.0.0.0/8        # 15370 hits
    - 95.0.0.0/8        # 15467 hits
    - 172.0.0.0/8       # 18911 hits
    - 176.0.0.0/8       # 18996 hits
    - 92.0.0.0/8        # 19534 hits
    - 86.0.0.0/8        # 21956 hits

https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/d021846bdfc9f5c03ed7b5e73546cd964c2e2bef%5E%21/#F0

diff --git a/deployment-prep/_.yaml b/deployment-prep/_.yaml
index e6d7a84..4108f94 100644
--- a/deployment-prep/_.yaml
+++ b/deployment-prep/_.yaml
Thu, Mar 26, 7:21 PM · Beta-Cluster-Infrastructure
bd808 changed the status of T420299: Maps don't display on the beta cluster; tile server offline on deployment-maps-master02.deployment-prep.eqiad1.wikimedia.cloud from Open to In Progress.
Thu, Mar 26, 6:10 PM · Essential-Work, Content-Transform-Team (Work In Progress), Beta-Cluster-Infrastructure
bd808 added a comment to T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies.

IP range blocks are cooked. The residential proxy users have found us. In the last 14 hours (since midnight UTC when logs roll over) Beta handled 571,849 requests. When I bucket traffic into /24 network blocks (x.y.z.* ranges) we have seen traffic from 91,229 separate networks. 33,440 of those networks sent 2 requests. No IP based rate limit or range block is going to touch this.
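
The /24 bucketing described above can be sketched as follows. This assumes the client IPv4 addresses have already been extracted from the access logs. The function names are hypothetical and this is not the script that produced the numbers above.

```python
import ipaddress
from collections import Counter
from typing import Iterable


def bucket_by_slash24(client_ips: Iterable[str]) -> Counter:
    """Count requests per /24 network (x.y.z.* range) for IPv4 client addresses."""
    counts = Counter()
    for ip in client_ips:
        # strict=False masks off the host bits, e.g. 192.0.2.77 -> 192.0.2.0/24.
        net = ipaddress.ip_network(f"{ip}/24", strict=False)
        counts[net] += 1
    return counts


def summarize(counts: Counter, exact_hits: int = 2) -> dict:
    """Report total requests, distinct networks, and networks with exactly N requests."""
    return {
        "requests": sum(counts.values()),
        "networks": len(counts),
        f"networks_with_{exact_hits}_requests": sum(
            1 for n in counts.values() if n == exact_hits
        ),
    }
```

When most networks send only a handful of requests each, as in the figures above, any per-network threshold low enough to catch them would also catch ordinary visitors.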

Thu, Mar 26, 2:51 PM · Beta-Cluster-Infrastructure

Wed, Mar 25

bd808 closed T267435: Beta cluster seems to be extremely slow for logged in user during page navigation as Declined.

Let's stop chasing this one.

Wed, Mar 25, 11:12 PM · Release-Engineering-Team (Radar), Traffic, SRE, Beta-Cluster-Infrastructure
bd808 updated subscribers of T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies.

I nerd sniped @thcipriani into looking at this and he has spotted a pattern in the traffic that looks like something we could block on, but unfortunately it is not IP ranges and that's pretty much the only blocking tool we have at the moment. The traffic pattern that Tyler found is spread across 112,094 /24 networks, most of which only send a few hundred requests, but in total making 691,076 requests to the Beta servers in the last day. We need the https://wikitech.wikimedia.org/wiki/Requestctl stack in Beta.

Wed, Mar 25, 10:59 PM · Beta-Cluster-Infrastructure
bd808 added a comment to T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies.

212.237.120.0/22 is KurdEstanNetSubnet7
https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/5f0c3aa390e9d81602be7a5a9d16543e6ac18f82%5E%21/#F0

diff --git a/deployment-prep/_.yaml b/deployment-prep/_.yaml
index b40fc94..e6d7a84 100644
--- a/deployment-prep/_.yaml
+++ b/deployment-prep/_.yaml
Wed, Mar 25, 10:36 PM · Beta-Cluster-Infrastructure
bd808 added a comment to T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies.

74.243.224.0/19 is still more Microsoft
https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/d4477039a0a591c5cdc3c7531e24c2baff6ff9cb%5E%21/#F0

diff --git a/deployment-prep/_.yaml b/deployment-prep/_.yaml
index 3c69e1c..b40fc94 100644
--- a/deployment-prep/_.yaml
+++ b/deployment-prep/_.yaml
Wed, Mar 25, 10:28 PM · Beta-Cluster-Infrastructure
bd808 added a comment to T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies.

2.59.220.0/22 is CZ-ONEHOSTPLANET-20190327
https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/a88b7a7c74f328f6637a06c0561f2e91a8904ae8%5E%21/#F0

diff --git a/deployment-prep/_.yaml b/deployment-prep/_.yaml
index e70f408..3c69e1c 100644
--- a/deployment-prep/_.yaml
+++ b/deployment-prep/_.yaml
Wed, Mar 25, 10:23 PM · Beta-Cluster-Infrastructure
bd808 added a comment to T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies.

188.166.0.0/17 is EU-DIGITALOCEAN-NL1
https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/62cb0aea5eed0d0aa3e19e085b338b56ad9b670d%5E%21/#F0

diff --git a/deployment-prep/_.yaml b/deployment-prep/_.yaml
index 6e5b5ed..e70f408 100644
--- a/deployment-prep/_.yaml
+++ b/deployment-prep/_.yaml
Wed, Mar 25, 9:42 PM · Beta-Cluster-Infrastructure
bd808 added a comment to T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies.

74.235.192.0/18 is still more Microsoft.
https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/1f1f276801415e198730cac0cda7cd2b2f039b28%5E%21/#F0

diff --git a/deployment-prep/_.yaml b/deployment-prep/_.yaml
index 326a730..6e5b5ed 100644
--- a/deployment-prep/_.yaml
+++ b/deployment-prep/_.yaml
Wed, Mar 25, 9:26 PM · Beta-Cluster-Infrastructure
bd808 added a comment to T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies.

Let's block 57.141.0.0/16 which is Facebook and has sent 98,072 requests today. That is the highest volume from a /16 today by an order of magnitude.

Wed, Mar 25, 9:21 PM · Beta-Cluster-Infrastructure
bd808 added a comment to T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies.

I have been experiencing the same issues for a month. This has been happening on nl.wikipedia.beta too (which is normal since it uses the same cache instance).

Wed, Mar 25, 4:48 PM · Beta-Cluster-Infrastructure
bd808 added a comment to T421244: Replace deployment-deploy04 with a Bookworm instance with Java 21.

In a quick chat with @hashar I asked about the cut-over from the Java 17 compatible Jenkins to Java 21 compatible Jenkins. On a Bookworm host we have the ability to choose between Java 17 and Java 21 via hiera settings. So we can build a Bookworm + Java 17 instance, connect it to the Java 17 Jenkins, and then when the Java 21 Jenkins is ready we can change the hiera to get Java 21 and connect with the Java 21 Jenkins.

Wed, Mar 25, 4:10 PM · User-bd808, Release-Engineering-Team, Beta-Cluster-Infrastructure
bd808 changed the subtype of T421244: Replace deployment-deploy04 with a Bookworm instance with Java 21 from "Task" to "Feature Request".
Wed, Mar 25, 4:02 PM · User-bd808, Release-Engineering-Team, Beta-Cluster-Infrastructure
bd808 renamed T421244: Replace deployment-deploy04 with a Bookworm instance with Java 21 from Rebuild deployment-deploy04 with Bookworm to get Java 21 to Replace deployment-deploy04 with a Bookworm instance with Java 21.
Wed, Mar 25, 4:02 PM · User-bd808, Release-Engineering-Team, Beta-Cluster-Infrastructure
bd808 claimed T421244: Replace deployment-deploy04 with a Bookworm instance with Java 21.
Wed, Mar 25, 4:01 PM · User-bd808, Release-Engineering-Team, Beta-Cluster-Infrastructure
bd808 added a comment to T421244: Replace deployment-deploy04 with a Bookworm instance with Java 21.

We need to build a new instance for T401839: Migrate deployment-prep away from Debian Bullseye to Bookworm/Trixie too, so the quicker thing is probably to do that now and also unblock the Jenkins stuff. We can revisit T256168: Move beta cluster automatic deployment to a dedicated infrastructure at another time. I do think this is a good opportunity to start on T394316: Use infrastructure as code techniques to rebuild the Beta Cluster.

Wed, Mar 25, 4:00 PM · User-bd808, Release-Engineering-Team, Beta-Cluster-Infrastructure
bd808 added a comment to T256168: Move beta cluster automatic deployment to a dedicated infrastructure.

Having something like Jenkins to make it easier to see job status and trends would be nice, but we could probably do everything on the deployment server itself too.

Wed, Mar 25, 3:44 PM · Continuous-Integration-Infrastructure, Quality-and-Test-Engineering-Team (Test Infrastructure), Jenkins, Continuous-Integration-Config, Beta-Cluster-Infrastructure
bd808 raised the priority of T256168: Move beta cluster automatic deployment to a dedicated infrastructure from Low to Medium.
Wed, Mar 25, 3:40 PM · Continuous-Integration-Infrastructure, Quality-and-Test-Engineering-Team (Test Infrastructure), Jenkins, Continuous-Integration-Config, Beta-Cluster-Infrastructure
bd808 added a subtask for T401839: Migrate deployment-prep away from Debian Bullseye to Bookworm/Trixie: T421244: Replace deployment-deploy04 with a Bookworm instance with Java 21.
Wed, Mar 25, 3:26 PM · Epic, Release-Engineering-Team (Priority Backlog 📥), Cloud-VPS (Debian Bullseye Deprecation), Beta-Cluster-Infrastructure
bd808 added a parent task for T421244: Replace deployment-deploy04 with a Bookworm instance with Java 21: T401839: Migrate deployment-prep away from Debian Bullseye to Bookworm/Trixie.
Wed, Mar 25, 3:26 PM · User-bd808, Release-Engineering-Team, Beta-Cluster-Infrastructure
bd808 added a comment to T420599: Inconsistent OAuth endpoint for Wikimedia Global Account (SUL) across services.

My question is: should Bitu and Striker also be reconfigured to use MediaWiki to enable this feature there? Or are blocks only useful as an anti-abuse measure for Phabricator, with no real usefulness for Striker?

Wed, Mar 25, 3:13 PM · Wikimedia-Phabricator-Extensions, cloud-services-team, Infrastructure-Foundations, Striker, Bitu, Phabricator
bd808 added a comment to T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies.
In T420833, @AlexisJazz wrote:

Unsure if related, but these ping results are strange? Why are different domains answering?

Wed, Mar 25, 12:02 AM · Beta-Cluster-Infrastructure

Tue, Mar 24

bd808 added a comment to T420833: Beta cluster is being crawled to death by bot traffic coming from residential proxies.

Still more Microsoft ranges found in the last 24 hours of logs:

Tue, Mar 24, 11:38 PM · Beta-Cluster-Infrastructure
bd808 changed the subtype of T17583: Enable importing across all Wikimedia projects from "Task" to "Feature Request".

I sent an email to wikitech-l to see if folks are still interested in this functionality and if anyone is excited to take on the remaining work. Let's see what happens.

Tue, Mar 24, 11:27 PM · Patch-Needs-Improvement, User-notice, MW-1.27-release-notes, MW-1.27-release (WMF-deploy-2015-09-29_(1.27.0-wmf.1)), MediaWiki-Core-Snapshots, Wikimedia-Site-requests
bd808 added a comment to T17583: Enable importing across all Wikimedia projects.

I poked @TTO about T410109 because I was wondering if the easy fix was going to be to roll back the config changes that have been in Beta Cluster for this task since 2015. They said:

Tue, Mar 24, 10:29 PM · Patch-Needs-Improvement, User-notice, MW-1.27-release-notes, MW-1.27-release (WMF-deploy-2015-09-29_(1.27.0-wmf.1)), MediaWiki-Core-Snapshots, Wikimedia-Site-requests
bd808 added a comment to T17583: Enable importing across all Wikimedia projects.

T410109: Special:Import form loses configuration data when $wgImportSources contains duplicate subprojects has fixed a bug related to the past work on this by @TTO that would have been a production deployment blocker.

Tue, Mar 24, 10:14 PM · Patch-Needs-Improvement, User-notice, MW-1.27-release-notes, MW-1.27-release (WMF-deploy-2015-09-29_(1.27.0-wmf.1)), MediaWiki-Core-Snapshots, Wikimedia-Site-requests
bd808 moved T410109: Special:Import form loses configuration data when $wgImportSources contains duplicate subprojects from To Do to Needs Review/Feedback on the User-bd808 board.
Tue, Mar 24, 10:12 PM · MW-1.46-notes (1.46.0-wmf.22; 2026-03-31), MediaWiki-Core-Snapshots, MediaWiki-HTMLForm, Beta-Cluster-reproducible, User-bd808, Beta-Cluster-Infrastructure
bd808 added a comment to T410109: Special:Import form loses configuration data when $wgImportSources contains duplicate subprojects.

What happens now?:

Screenshot 2026-03-24 at 15.56.09.png (1×1 px, 179 KB)

Tue, Mar 24, 9:58 PM · MW-1.46-notes (1.46.0-wmf.22; 2026-03-31), MediaWiki-Core-Snapshots, MediaWiki-HTMLForm, Beta-Cluster-reproducible, User-bd808, Beta-Cluster-Infrastructure