
Create test Gerrit instance
Open, Stalled, Low, Public

Description

I need to test against the Gerrit API and have noticed that the test instances have been deleted. I'd like a new instance so I can test, pretty please!

Event Timeline

Recently both the old buster Gerrit instance in devtools AND the newer bullseye test instance were deleted in T360964#9718714.

Deleting the buster instance was a positive to me; deleting the bullseye instance as well wasn't originally intended.

LSobanski moved this task from Incoming to Backlog on the collaboration-services board.

@BCornwall what specific features do you need as it may influence how quickly this can happen?

Mentioned in SAL (#wikimedia-cloud) [2024-05-01T23:59:03Z] <mutante> creating instance gerrit-bullseye (T363196)

Good news: We didn't lose all the Hiera data because most of it is in the repo (as opposed to web Hiera which gets deleted when an instance gets deleted if it's applied on role level).

Needs a few adjustments only.

~/repos/puppet/hieradata/cloud/eqiad1/devtools$ grep gerrit common.yaml 
  gerrit:
  gerrit/gerrit:
    repository: operations/software/gerrit
    repository: operations/software/gerrit/tools/gervert/deploy
profile::gerrit::config: 'gerrit.config.erb'
profile::gerrit::host: 'gerrit.devtools.wmcloud.org'
profile::gerrit::mask_service: false
profile::gerrit::ssh_allowed_hosts:
- gerrit-prod-1001.devtools.eqiad.wmflabs
profile::gerrit::replica_hosts: []
profile::gerrit::ipv6: ~
profile::gerrit::replication: {}
profile::gerrit::git_dir: /srv/gerrit/git
profile::gerrit::ssh_host_key: ssh_host_key
profile::gerrit::bacula: gerrit-repo-data
profile::gerrit::java_version: 8
profile::gerrit::daemon_user: 'gerrit2'
profile::gerrit::gerrit_site: "/var/lib/gerrit2/review_site"
profile::gerrit::scap_user: 'gerrit-deploy'
profile::gerrit::manage_scap_user: true
profile::gerrit::scap_key_name: 'gerrit'
  - '/etc/ssh/userkeys/%u.d/gerrit-scap'
profile::gerrit::use_acmechief: false
profile::gerrit::backups_enabled: false
profile::gerrit::backup_set: 'gerrit-repo-data'
profile::gerrit::active_host: 'gerrit-prod-1001.devtools.eqiad.wmflabs'
profile::gerrit::migration::data_dir: /srv/gerrit
          - 'gerrit-prod-1001.devtools.eqiad.wmflabs'

And the remaining keys on web Hiera can be retrieved from:

https://phabricator.wikimedia.org/rCLIP0817761ddbf3b926f8992d43a21210f82c9ec247

Change #1026195 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] gerrit: set java_home and migration user in repo Hiera

https://gerrit.wikimedia.org/r/1026195

Change #1026197 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] devtools: update gerrit and phab instance names in default Hiera

https://gerrit.wikimedia.org/r/1026197

Change #1026195 merged by Dzahn:

[operations/puppet@production] gerrit: set java_home and migration user in repo Hiera

https://gerrit.wikimedia.org/r/1026195

@BCornwall what specific features do you need as it may influence how quickly this can happen?

I'm working on automated CR submission and didn't want to spam the active gerrit/channels with tests. For now I can just use the WIP feature to avoid notifications, I imagine. Thanks!
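For reference, Gerrit accepts a `%wip` option on the push refspec, which creates the change as work-in-progress; the comment above expects that to keep notifications down. A minimal Python sketch of how an automated submitter might build such a push command follows. The helper name `build_push_command`, the remote `origin`, and the branch `master` are assumptions for illustration and are not taken from this task.

```python
from typing import List


def build_push_command(branch: str, wip: bool = True, remote: str = "origin") -> List[str]:
    """Return a git command that pushes HEAD for review on `branch`.

    With wip=True the refspec carries Gerrit's `%wip` push option, so the
    change is created as work-in-progress.
    """
    refspec = f"HEAD:refs/for/{branch}"
    if wip:
        refspec += "%wip"
    return ["git", "push", remote, refspec]


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run() to actually push (assumes a configured remote).
    print(" ".join(build_push_command("master")))
```

Returning the command as a list keeps the sketch free of side effects, so it can be exercised without touching any Gerrit server.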

Change #1026197 merged by Dzahn:

[operations/puppet@production] devtools: update gerrit and phab instance names in default Hiera

https://gerrit.wikimedia.org/r/1026197

Change #1036764 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] devtools: rename hieradata host data, match new instance name

https://gerrit.wikimedia.org/r/1036764

Change #1036764 merged by Dzahn:

[operations/puppet@production] devtools: rename hieradata host data, match new instance name

https://gerrit.wikimedia.org/r/1036764

Change #1036765 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] devtools: update IP for gerrit test instance

https://gerrit.wikimedia.org/r/1036765

Change #1036765 merged by Dzahn:

[operations/puppet@production] devtools: update IP for gerrit test instance

https://gerrit.wikimedia.org/r/1036765

Change #1036767 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] devtools: update host name for new gerrit test instance

https://gerrit.wikimedia.org/r/1036767

Change #1036767 merged by Dzahn:

[operations/puppet@production] devtools: update host name for new gerrit test instance

https://gerrit.wikimedia.org/r/1036767

Change #1036771 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] gerrit: add parameter to toggle lfs_replica_sync

https://gerrit.wikimedia.org/r/1036771

Change #1037574 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] gerrit/test: set lfs sync dest host to itself

https://gerrit.wikimedia.org/r/1037574

Change #1036771 merged by Dzahn:

[operations/puppet@production] gerrit: add parameter to toggle lfs_replica_sync

https://gerrit.wikimedia.org/r/1036771

Deploying the change above fixed a major puppet issue related to lfs data sync on the re-created test instance, while it was a noop on prod machines.

We are now skipping the lfs_replica_sync if in testing. It caused errors because we don't have multiple machines there like in production.

A good step forward.

Now we are on to the (expected) scap deploy problem on hosts that didn't have a deploy yet.

Error: Execution of '/usr/bin/scap deploy-local --repo gerrit/gerrit -D log_json:False' returned 70: 
Error: /Stage[main]/Gerrit/Scap::Target[gerrit/gerrit]/Package[gerrit/gerrit]/ensure: change from 'absent' to 'present' failed: Execution of '/usr/bin/scap deploy-local --repo gerrit/gerrit -D log_json:False' returned 70: 

Error: Execution of '/usr/bin/scap deploy-local --repo gervert/deploy -D log_json:False' returned 70: 
Error: /Stage[main]/Gerrit/Scap::Target[gervert/deploy]/Package[gervert/deploy]/ensure: change from 'absent' to 'present' failed: Execution of '/usr/bin/scap deploy-local --repo gervert/deploy -D log_json:False' returned 70:

It's probably T257317.

The manual fix should have been sudo -u gerrit-deploy scap deploy --init -Dblock_deployments:False in /srv/deployment/gerrit (and gervert) as we found last time (T257317#9762601).

But:

ERROR:deploy:deploy failed: <FileNotFoundError> [Errno 2] No such file or directory: '/srv/deployment/gerrit/.git/config-files'

Change #1037574 abandoned by Dzahn:

[operations/puppet@production] gerrit/test: set lfs sync dest host to itself

Reason:

solved by https://gerrit.wikimedia.org/r/c/operations/puppet/+/1036771 instead

https://gerrit.wikimedia.org/r/1037574

Dzahn changed the task status from Open to In Progress. May 31 2024, 10:21 PM

The previous test instance (gerrit-prod-1001) was deleted because Gerrit authenticates users against the WMCS LDAP and that is a breach of the WMCS policy (I have documented it at T330312). I have shut it down for now.

We had already changed the authentication settings to local auth and solved that issue.

Unfortunately, the instance wasn't just shut down but actually deleted too, which is why we had to start from scratch.

FYI, the test instance isn't a priority for me any more, though it would be nice to have eventually!

Change #1081225 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] cloud/devtools: do NOT bind service IP on gerrit test instances

https://gerrit.wikimedia.org/r/1081225

Change #1081225 merged by Dzahn:

[operations/puppet@production] cloud/devtools: do NOT bind service IP on gerrit test instances

https://gerrit.wikimedia.org/r/1081225

Change #1081244 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] cloud/devtools: disable lfs data syncing on gerrit test instance

https://gerrit.wikimedia.org/r/1081244

Change #1081244 merged by Dzahn:

[operations/puppet@production] cloud/devtools: disable lfs data syncing on gerrit test instance

https://gerrit.wikimedia.org/r/1081244

Change #1081257 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] gerrit: make parameter lfs_sync_dest optional

https://gerrit.wikimedia.org/r/1081257

Change #1081257 merged by Dzahn:

[operations/puppet@production] gerrit: make parameter lfs_sync_dest optional

https://gerrit.wikimedia.org/r/1081257

Change #1081270 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] cloud/devtools: set a non-existing lfs data sync target

https://gerrit.wikimedia.org/r/1081270

Change #1081270 merged by Dzahn:

[operations/puppet@production] cloud/devtools: set a non-existing lfs data sync target

https://gerrit.wikimedia.org/r/1081270

Change #1081273 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] cloud/devtools: turn gerrit lfs_sync_dest into an array

https://gerrit.wikimedia.org/r/1081273

Change #1081273 merged by Dzahn:

[operations/puppet@production] cloud/devtools: turn gerrit lfs_sync_dest into an array

https://gerrit.wikimedia.org/r/1081273

Change #1081277 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] cloud/devtools/gerrit-bullseye: mask service, no monitoring

https://gerrit.wikimedia.org/r/1081277

Change #1081277 merged by Dzahn:

[operations/puppet@production] cloud/devtools/gerrit-bullseye: mask service, no monitoring

https://gerrit.wikimedia.org/r/1081277

Made some progress here and fixed a couple of things. Next is to fix this, though:

-- Journal begins at Tue 2024-07-02 14:47:32 UTC, ends at Thu 2024-10-17 22:57:37 UTC. --
Oct 17 22:57:37 gerrit-bullseye systemd[1]: Starting The Apache HTTP Server...
Oct 17 22:57:37 gerrit-bullseye apachectl[34461]: AH00526: Syntax error on line 45 of /etc/apache2/sites-enabled/50-gerrit-devtools-wmcloud-org.conf:
Oct 17 22:57:37 gerrit-bullseye apachectl[34461]: SSLCertificateFile: file '/etc/letsencrypt/live/gerrit.devtools.wmflabs.org/fullchain.pem' does not exist or is empty
Oct 17 22:57:37 gerrit-bullseye apachectl[34458]: Action 'start' failed.
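The failure above comes from an `SSLCertificateFile` directive pointing at a file that is missing or empty. A rough Python sketch of a pre-flight check that lists such directives before Apache is started follows. The function name and the set of directives inspected are assumptions, and the parsing is deliberately simple (no `Include` handling, no paths containing spaces).

```python
import os
import re
from typing import List, Tuple

# Directives whose argument is a file Apache must be able to read.
# (Illustrative subset, not an exhaustive list.)
CERT_DIRECTIVES = ("SSLCertificateFile", "SSLCertificateKeyFile")

_DIRECTIVE_RE = re.compile(
    r"^\s*(%s)\s+\"?([^\"\s]+)\"?\s*$" % "|".join(CERT_DIRECTIVES),
    re.IGNORECASE,
)


def find_unusable_cert_files(conf_text: str) -> List[Tuple[int, str, str]]:
    """Return (line_number, directive, path) for each missing or empty file."""
    problems = []
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        match = _DIRECTIVE_RE.match(line)
        if not match:
            continue  # comments and unrelated directives do not match
        directive, path = match.group(1), match.group(2)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            problems.append((lineno, directive, path))
    return problems
```

Feeding it the text of the site config named in the journal would report the line number and path of each unusable file, which also makes it easy to spot a path that was written for a different host name than the one the site serves.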

Change #1081286 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] cloud/devtools: set service IP to existing gerrit.devtools.wmcloud.org.

https://gerrit.wikimedia.org/r/1081286

In Horizon, we already have a floating IP and reverse DNS for gerrit from the past, gerrit.devtools.wmcloud.org.

Connected that to the bullseye instance.

Attempted to use certbot (which was installed by puppet) to fetch a cert for it.

Couldn't connect yet with the "temp webserver"-method.

Added and applied security rule to allow access, no luck yet. To be continued.

Instance shut down again until I continue.

Change #1081286 merged by Dzahn:

[operations/puppet@production] cloud/devtools: set service IP to existing gerrit.devtools.wmcloud.org.

https://gerrit.wikimedia.org/r/1081286

Dzahn changed the task status from In Progress to Stalled. Edited Oct 21 2024, 7:07 PM

In production, we have 2 IPs per server: one for the host name and one for the service name. Puppet binds the service IP as an additional IP on the network interface.

In cloud, we would have to prevent that from happening and request a second IP on the interface (one that is NOT a floating IP; this doesn't seem to be self-service but is supposed to be trivial in OpenStack). Then a floating IP would have to be associated with that secondary IP.

Then puppet would have to learn that there are 3 different IPs if in cloud and this would have to be done while avoiding a realm check.

Finally DNS/reverse DNS and security groups have to be adjusted and the cert has to be issued by certbot.

Also, I couldn't get this to work even with a simple webproxy and without any floating IPs (for http/https it should just work but didn't). And for the ssh service we would definitely still need the setup above.

Change #1087963 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] devtools: update gerrit user from gerrit2 to gerrit

https://gerrit.wikimedia.org/r/1087963

Change #1087963 merged by Dzahn:

[operations/puppet@production] devtools: update gerrit user from gerrit2 to gerrit

https://gerrit.wikimedia.org/r/1087963