I need to test against the Gerrit API and have noticed that the test instances have been deleted. I'd like a new instance so I can test, pretty please!
Related Objects
- Mentioned In
  - T257317: scap deploy --init on deployment server fails on first puppet run
  - T360964: replace buster machines in devtools project
- Mentioned Here
  - T330312: Address Gerrit WMCS instance authenticating against LDAP (breaching WMCS policy)
  - T257317: scap deploy --init on deployment server fails on first puppet run
  - rCLIP0817761ddbf3: Delete data for devtools gerrit-bullseye-test.devtools.eqiad1.wikimedia.cloud
  - T360964: replace buster machines in devtools project
Event Timeline
Recently both the old buster Gerrit instance in devtools AND the newer bullseye test instance were deleted in T360964#9718714.
Deleting the buster instance was a positive for me; deleting the bullseye instance as well wasn't originally intended.
@BCornwall what specific features do you need as it may influence how quickly this can happen?
Mentioned in SAL (#wikimedia-cloud) [2024-05-01T23:59:03Z] <mutante> creating instance gerrit-bullseye (T363196)
Good news: We didn't lose all the Hiera data because most of it is in the repo (as opposed to web Hiera which gets deleted when an instance gets deleted if it's applied on role level).
Needs a few adjustments only.
~/repos/puppet/hieradata/cloud/eqiad1/devtools$ grep gerrit common.yaml
gerrit:
gerrit/gerrit:
repository: operations/software/gerrit
repository: operations/software/gerrit/tools/gervert/deploy
profile::gerrit::config: 'gerrit.config.erb'
profile::gerrit::host: 'gerrit.devtools.wmcloud.org'
profile::gerrit::mask_service: false
profile::gerrit::ssh_allowed_hosts:
- gerrit-prod-1001.devtools.eqiad.wmflabs
profile::gerrit::replica_hosts: []
profile::gerrit::ipv6: ~
profile::gerrit::replication: {}
profile::gerrit::git_dir: /srv/gerrit/git
profile::gerrit::ssh_host_key: ssh_host_key
profile::gerrit::bacula: gerrit-repo-data
profile::gerrit::java_version: 8
profile::gerrit::daemon_user: 'gerrit2'
profile::gerrit::gerrit_site: "/var/lib/gerrit2/review_site"
profile::gerrit::scap_user: 'gerrit-deploy'
profile::gerrit::manage_scap_user: true
profile::gerrit::scap_key_name: 'gerrit'
- '/etc/ssh/userkeys/%u.d/gerrit-scap'
profile::gerrit::use_acmechief: false
profile::gerrit::backups_enabled: false
profile::gerrit::backup_set: 'gerrit-repo-data'
profile::gerrit::active_host: 'gerrit-prod-1001.devtools.eqiad.wmflabs'
profile::gerrit::migration::data_dir: /srv/gerrit
- 'gerrit-prod-1001.devtools.eqiad.wmflabs'
And the remaining keys on web Hiera can be retrieved from:
https://phabricator.wikimedia.org/rCLIP0817761ddbf3b926f8992d43a21210f82c9ec247
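To narrow down which Hiera lines need adjusting, one option is to list every line that still names the deleted instance. The following is only a minimal sketch: the helper name find_stale_refs is hypothetical, the sample lines are copied from the grep output above, and it assumes the adjustments are mostly references to the old host name gerrit-prod-1001.

```python
from typing import List, Tuple


def find_stale_refs(lines: List[str], stale_host: str) -> List[Tuple[int, str]]:
    """Return (1-based line number, stripped line) for each line mentioning stale_host."""
    return [
        (number, line.strip())
        for number, line in enumerate(lines, start=1)
        if stale_host in line
    ]


# Sample input: a few lines copied from the grep output above.
sample = [
    "profile::gerrit::host: 'gerrit.devtools.wmcloud.org'",
    "profile::gerrit::ssh_allowed_hosts:",
    "- gerrit-prod-1001.devtools.eqiad.wmflabs",
    "profile::gerrit::active_host: 'gerrit-prod-1001.devtools.eqiad.wmflabs'",
]

for number, line in find_stale_refs(sample, "gerrit-prod-1001"):
    print(f"{number}: {line}")
```

In practice the lines would come from reading common.yaml directly, so the reported numbers match the file rather than the grep output.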
Change #1026195 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/puppet@production] gerrit: set java_home and migration user in repo Hiera
Change #1026197 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/puppet@production] devtools: update gerrit and phab instance names in default Hiera
Change #1026195 merged by Dzahn:
[operations/puppet@production] gerrit: set java_home and migration user in repo Hiera
I'm working on automated CR submission and didn't want to spam the active gerrit/channels with tests. For now I can just use the WIP feature to avoid notifications, I imagine. Thanks!
Change #1026197 merged by Dzahn:
[operations/puppet@production] devtools: update gerrit and phab instance names in default Hiera
Change #1036764 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/puppet@production] devtools: rename hieradata host data, match new instance name
Change #1036764 merged by Dzahn:
[operations/puppet@production] devtools: rename hieradata host data, match new instance name
Change #1036765 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/puppet@production] devtools: update IP for gerrit test instance
Change #1036765 merged by Dzahn:
[operations/puppet@production] devtools: update IP for gerrit test instance
Change #1036767 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/puppet@production] devtools: update host name for new gerrit test instance
Change #1036767 merged by Dzahn:
[operations/puppet@production] devtools: update host name for new gerrit test instance
Change #1036771 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/puppet@production] gerrit: add parameter to toggle lfs_replica_sync
Change #1037574 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/puppet@production] gerrit/test: set lfs sync dest host to itself
Change #1036771 merged by Dzahn:
[operations/puppet@production] gerrit: add parameter to toggle lfs_replica_sync
Deploying the change above fixed a major puppet issue related to lfs data sync on the re-created test instance, while being a no-op on prod machines.
We now skip lfs_replica_sync when in testing. It caused errors because we don't have multiple machines there like we do in production.
A good step forward.
Now we are on to the (expected) scap deploy problem on hosts that didn't have a deploy yet.
Error: Execution of '/usr/bin/scap deploy-local --repo gerrit/gerrit -D log_json:False' returned 70:
Error: /Stage[main]/Gerrit/Scap::Target[gerrit/gerrit]/Package[gerrit/gerrit]/ensure: change from 'absent' to 'present' failed: Execution of '/usr/bin/scap deploy-local --repo gerrit/gerrit -D log_json:False' returned 70:
Error: Execution of '/usr/bin/scap deploy-local --repo gervert/deploy -D log_json:False' returned 70:
Error: /Stage[main]/Gerrit/Scap::Target[gervert/deploy]/Package[gervert/deploy]/ensure: change from 'absent' to 'present' failed: Execution of '/usr/bin/scap deploy-local --repo gervert/deploy -D log_json:False' returned 70:
It's probably T257317.
The manual fix should have been sudo -u gerrit-deploy scap deploy --init -Dblock_deployments:False in /srv/deployment/gerrit (and gervert) as we found last time (T257317#9762601).
But:
ERROR:deploy:deploy failed: <FileNotFoundError> [Errno 2] No such file or directory: '/srv/deployment/gerrit/.git/config-files'
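Before retrying the manual fix, it may help to confirm which parts of the checkout are actually missing. This is a rough sketch, not part of scap: the helper missing_paths is hypothetical, and the list of paths to check is an assumption based only on the error message above.

```python
import os
from typing import List


def missing_paths(repo_dir: str, required: List[str]) -> List[str]:
    """Return the entries of `required` (relative to repo_dir) that do not exist."""
    return [
        rel for rel in required
        if not os.path.exists(os.path.join(repo_dir, rel))
    ]


if __name__ == "__main__":
    # Directory and paths taken from the error above; the list itself is an assumption.
    repo = "/srv/deployment/gerrit"
    for rel in missing_paths(repo, [".git", ".git/config-files"]):
        print(f"missing: {os.path.join(repo, rel)}")
```

If .git itself is reported missing, the repository was probably never cloned on this host, which would be a different problem from a missing config-files directory inside an existing checkout.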
Change #1037574 abandoned by Dzahn:
[operations/puppet@production] gerrit/test: set lfs sync dest host to itself
Reason:
solved by https://gerrit.wikimedia.org/r/c/operations/puppet/+/1036771 instead
The previous test instance (gerrit-prod-1001) was deleted because Gerrit authenticates users against the WMCS LDAP and that is a breach of the WMCS policy (I have documented it at T330312). I have shut it down for now.
We had already changed the authentication settings to local auth and solved that issue.
Unfortunately the instance wasn't just shut down but actually deleted too, which is why we had to start from scratch.
FYI, the test instance isn't a priority for me any more, though it would be nice to have eventually!
Change #1081225 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/puppet@production] cloud/devtools: do NOT bind service IP on gerrit test instances
Change #1081225 merged by Dzahn:
[operations/puppet@production] cloud/devtools: do NOT bind service IP on gerrit test instances
Change #1081244 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/puppet@production] cloud/devtools: disable lfs data syncing on gerrit test instance
Change #1081244 merged by Dzahn:
[operations/puppet@production] cloud/devtools: disable lfs data syncing on gerrit test instance
Change #1081257 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/puppet@production] gerrit: make parameter lfs_sync_dest optional
Change #1081257 merged by Dzahn:
[operations/puppet@production] gerrit: make parameter lfs_sync_dest optional
Change #1081270 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/puppet@production] cloud/devtools: set a non-existing lfs data sync target
Change #1081270 merged by Dzahn:
[operations/puppet@production] cloud/devtools: set a non-existing lfs data sync target
Change #1081273 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/puppet@production] cloud/devtools: turn gerrit lfs_sync_dest into an array
Change #1081273 merged by Dzahn:
[operations/puppet@production] cloud/devtools: turn gerrit lfs_sync_dest into an array
Change #1081277 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/puppet@production] cloud/devtools/gerrit-bullseye: mask service, no monitoring
Change #1081277 merged by Dzahn:
[operations/puppet@production] cloud/devtools/gerrit-bullseye: mask service, no monitoring
Made some progress here and fixed a couple of things. Next is to fix this, though:
-- Journal begins at Tue 2024-07-02 14:47:32 UTC, ends at Thu 2024-10-17 22:57:37 UTC. --
Oct 17 22:57:37 gerrit-bullseye systemd[1]: Starting The Apache HTTP Server...
Oct 17 22:57:37 gerrit-bullseye apachectl[34461]: AH00526: Syntax error on line 45 of /etc/apache2/sites-enabled/50-gerrit-devtools-wmcloud-org.conf:
Oct 17 22:57:37 gerrit-bullseye apachectl[34461]: SSLCertificateFile: file '/etc/letsencrypt/live/gerrit.devtools.wmflabs.org/fullchain.pem' does not exist or is empty
Oct 17 22:57:37 gerrit-bullseye apachectl[34458]: Action 'start' failed.
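Apache refuses to start when a referenced certificate file is missing or empty, so it can be useful to list every such file in a vhost config up front. Note that the path in the log is under gerrit.devtools.wmflabs.org while the config file is named for wmcloud.org. Below is a minimal sketch with hypothetical helper names (cert_paths, unusable). It only handles the simple "directive path" form and ignores includes, macros, and paths containing spaces.

```python
import os
import re
from typing import List

# Matches lines such as: SSLCertificateFile /path/to/fullchain.pem
# Commented-out directives (lines starting with '#') are not matched.
CERT_DIRECTIVE = re.compile(r"^\s*SSLCertificate(?:Key)?File\s+(\S+)", re.IGNORECASE)


def cert_paths(conf_text: str) -> List[str]:
    """Extract paths from SSLCertificateFile / SSLCertificateKeyFile directives."""
    paths = []
    for line in conf_text.splitlines():
        match = CERT_DIRECTIVE.match(line)
        if match:
            paths.append(match.group(1).strip('"'))
    return paths


def unusable(paths: List[str]) -> List[str]:
    """Return the paths that are missing or empty, mirroring Apache's complaint."""
    return [p for p in paths if not os.path.isfile(p) or os.path.getsize(p) == 0]


if __name__ == "__main__":
    # Sample directive using the path from the journal output above.
    sample = (
        "SSLEngine on\n"
        "SSLCertificateFile /etc/letsencrypt/live/gerrit.devtools.wmflabs.org/fullchain.pem\n"
    )
    for path in unusable(cert_paths(sample)):
        print(f"missing or empty: {path}")
```

On the instance, the text would be read from /etc/apache2/sites-enabled/50-gerrit-devtools-wmcloud-org.conf instead of the inline sample.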
Change #1081286 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/puppet@production] cloud/devtools: set service IP to existing gerrit.devtools.wmcloud.org.
In Horizon, we already have a floating IP and reverse DNS for gerrit from the past, gerrit.devtools.wmcloud.org.
Connected that to the bullseye instance.
Attempted to use certbot (which was installed by puppet) to fetch a cert for it.
Couldn't connect yet with the "temp webserver" method.
Added and applied a security rule to allow access; no luck yet. To be continued.
Instance shut down again until I continue.
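Assuming the "temp webserver" method is certbot's standalone mode, the HTTP-01 challenge needs inbound TCP port 80 to reach the instance through the floating IP. A quick way to test that from outside is a plain TCP connect, run while something is listening on the port. This is a minimal sketch and the helper name port_open is hypothetical.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Host name from the comment above; ports 80/443 are the usual HTTP/HTTPS ports.
    for port in (80, 443):
        print(port, port_open("gerrit.devtools.wmcloud.org", port))
```

A False result for port 80 from an external machine would point at the security group, the floating IP association, or nothing listening, rather than at certbot itself.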
Change #1081286 merged by Dzahn:
[operations/puppet@production] cloud/devtools: set service IP to existing gerrit.devtools.wmcloud.org.
In production, we have 2 IPs per server: one for the host name and one for the service name. Puppet binds the service name's IP as an additional IP on the network interface.
In cloud, we would have to prevent that from happening and request a second IP on the interface (one that is NOT a floating IP; this doesn't seem to be self-service but is supposed to be trivial in OpenStack). Then a floating IP would have to be associated with that secondary IP.
Then puppet would have to learn that there are 3 different IPs if in cloud and this would have to be done while avoiding a realm check.
Finally DNS/reverse DNS and security groups have to be adjusted and the cert has to be issued by certbot.
Also, I couldn't get this to work even with a simple webproxy and without any floating IPs (for http/https it should just work but didn't). And for the ssh service we would definitely still need the setup above.
Change #1087963 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/puppet@production] devtools: update gerrit user from gerrit2 to gerrit
Change #1087963 merged by Dzahn:
[operations/puppet@production] devtools: update gerrit user from gerrit2 to gerrit