Details
Status | Assigned | Task
---|---|---
Resolved | chasemp | T144494 New instances carnt connect to remote mysql setup by another instance
Resolved | Paladox | T141803 fix puppet issues when applying role::gerrit::server in labs
Resolved | chasemp | T142440 Request increased quota (floating-IP) for git labs project
Resolved | Andrew | T142528 please associate gerrit-01.wmflabs.org with 208.80.155.149
Event Timeline
Getting error
Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Must pass ipv4 to Class[Role::Gerrit::Server] on node gerrit-test3.git.eqiad.wmflabs
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
hosts/lead.yaml:role::gerrit::server::ipv4: '208.80.154.85'
You'll have to set a value for ipv4 in Hiera. In Labs you can either do that in the repo or on the special wiki page.
I now get
Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Must pass ipv6 to Class[Role::Gerrit::Server] on node gerrit-test3.git.eqiad.wmflabs
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
Now I get
Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find data item gerrit::host in any Hiera data file and no default supplied at /etc/puppet/modules/role/manifests/gerrit/server.pp:7 on node gerrit-test3.git.eqiad.wmflabs
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
Alright, that's the same thing, just IPv6 instead of IPv4
hosts/lead.yaml:role::gerrit::server::ipv6: '2620:0:861:3:208:80:154:85'
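The catalog errors above each name a Hiera key that had no value. A minimal sketch of generating those entries follows. The helper name `gerrit_hiera_fragment` is hypothetical, and the example values in the usage note are placeholders from documentation-only ranges; only the key names come from the errors and the `hosts/lead.yaml` lines quoted in this thread.

```shell
# Hypothetical helper: print the Hiera keys named in the errors above.
# Only the key names come from this thread; callers supply the values.
gerrit_hiera_fragment() {
    ipv4=$1
    ipv6=$2
    host=$3
    # The role demands all three, so refuse to emit a fragment with a blank value.
    if [ -z "$ipv4" ] || [ -z "$ipv6" ] || [ -z "$host" ]; then
        echo "usage: gerrit_hiera_fragment IPV4 IPV6 HOST" >&2
        return 1
    fi
    printf "role::gerrit::server::ipv4: '%s'\n" "$ipv4"
    printf "role::gerrit::server::ipv6: '%s'\n" "$ipv6"
    printf "gerrit::host: '%s'\n" "$host"
}
```

For example, `gerrit_hiera_fragment 192.0.2.10 2001:db8::10 gerrit.example.org` prints three lines that could be pasted into the instance's Hiera data, either in the repo or on the wiki page.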
Now I get
Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Must pass replication to Class[Gerrit::Jetty] at /etc/puppet/modules/gerrit/manifests/init.pp:9 on node gerrit-test3.git.eqiad.wmflabs
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
I copied some, but not all, of https://github.com/wikimedia/operations-puppet/blob/854e04b8f21502cbcbf3043ac0b264b26d5b1076/hieradata/role/common/gerrit/server.yaml to https://github.com/wikimedia/operations-puppet/blob/854e04b8f21502cbcbf3043ac0b264b26d5b1076/hieradata/role/common/gerrit/server.yaml.
I now get error
Notice: /Stage[main]/Role::Gerrit::Server/Interface::Ip[role::gerrit::server_ipv6]/Exec[ip addr add /128 dev eth0]/returns: Error: an inet prefix is expected rather than "/128".
Error: ip addr add /128 dev eth0 returned 1 instead of one of [0,2]
Error: /Stage[main]/Role::Gerrit::Server/Interface::Ip[role::gerrit::server_ipv6]/Exec[ip addr add /128 dev eth0]/returns: change from notrun to 0 2 failed: ip addr add /128 dev eth0 returned 1 instead of one of [0,2]
It does not let you get away with setting a blank value for IPv6. It wants to see a real IP there.
You can try using one from 2001:0DB8::/32, which RFC 3849 reserves for documentation.
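The failing `ip addr add /128 dev eth0` is what the command looks like when the address is empty. A minimal sketch of a guard that builds the same command line only when an address is present; `build_ip_addr_cmd` is a hypothetical name, and the sketch prints the command instead of running it, so it needs no root access.

```shell
# Hypothetical guard: print an "ip addr add" command, or fail on a blank address.
build_ip_addr_cmd() {
    addr=$1
    prefixlen=$2
    dev=$3
    if [ -z "$addr" ]; then
        # An empty address would produce "ip addr add /128 dev eth0".
        echo "refusing to build 'ip addr add' with an empty address" >&2
        return 1
    fi
    echo "ip addr add ${addr}/${prefixlen} dev ${dev}"
}
```

Called as `build_ip_addr_cmd 2001:db8::1 128 eth0` it prints a well-formed command; called with `""` as the address it returns non-zero instead.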
Now I get
Error: Could not set uid on user[gerrit2]: Execution of '/usr/sbin/usermod -u 444 gerrit2' returned 6: usermod: user 'gerrit2' does not exist in /etc/passwd
The user gerrit2 probably cannot be created locally in Labs because it already exists in LDAP.
According to this error
Error: Could not set uid on user[gerrit2]: Execution of '/usr/sbin/usermod -u 444 gerrit2' returned 6: usermod: user 'gerrit2' does not exist in /etc/passwd
root@gerrit-test3:/home/paladox# id gerrit2
uid=2069(gerrit2) gid=1002(nda) groups=1005(labsadminbots),1002(nda)
@demon, how do we handle puppetized system users in Labs when they conflict with LDAP users?
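The `usermod` error and the `id gerrit2` output above disagree because `usermod` only edits `/etc/passwd`, while `id` also resolves users through other name services such as LDAP. A small sketch for telling the two cases apart; `user_source` is a hypothetical helper, and the optional second argument exists only so that it can be tried against a copy of the passwd file.

```shell
# Hypothetical helper: report where a user name is defined.
#   local     - present in the passwd file (usermod can change it)
#   directory - resolvable by id, but not in the passwd file (e.g. LDAP)
#   missing   - not known at all
user_source() {
    name=$1
    passwd_file=${2:-/etc/passwd}
    # Note: the name is used as a regex, which is fine for plain user names.
    if grep -q "^${name}:" "$passwd_file" 2>/dev/null; then
        echo local
    elif id -u "$name" >/dev/null 2>&1; then
        echo directory
    else
        echo missing
    fi
}
```

On gerrit-test3, `user_source gerrit2` would be expected to report `directory`, given the two outputs quoted above.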
Change 302356 had a related patch set uploaded (by Chad):
Gerrit: Default to no replication
Change 302491 had a related patch set uploaded (by Dzahn):
gerrit: ensure symlink /etc/default/gerritcodereview
Hitting error
root@gerrit-test3:/var/lib/gerrit2/review_site/logs# journalctl -xn
-- Logs begin at Mon 2016-08-01 18:08:54 UTC, end at Tue 2016-08-02 18:49:43 UTC. --
Aug 02 18:49:25 gerrit-test3 puppet-agent[1386]: (/Stage[main]/Apache/Service[apache2]) Dependency Service[gerrit] has failures: true
Aug 02 18:49:25 gerrit-test3 puppet-agent[1386]: (/Stage[main]/Apache/Service[apache2]) Skipping because of failed dependencies
Aug 02 18:49:25 gerrit-test3 puppet-agent[1386]: (/Stage[main]/Gerrit::Proxy/Letsencrypt::Cert::Integrated[gerrit]/Exec[acme-setup-acme-gerrit]) Dependency Service[gerrit] has failures: true
Aug 02 18:49:25 gerrit-test3 puppet-agent[1386]: (/Stage[main]/Gerrit::Proxy/Letsencrypt::Cert::Integrated[gerrit]/Exec[acme-setup-acme-gerrit]) Skipping because of failed dependencies
Aug 02 18:49:25 gerrit-test3 puppet-agent[1386]: Finished catalog run in 11.10 seconds
Aug 02 18:49:26 gerrit-test3 sudo[1385]: pam_unix(sudo:session): session closed for user root
Aug 02 18:49:42 gerrit-test3 sudo[2153]: diamond : TTY=unknown ; PWD=/ ; USER=puppet ; COMMAND=list /bin/cat /var/lib/puppet/state/last_run_summary.yaml
Aug 02 18:49:43 gerrit-test3 sudo[2154]: diamond : TTY=unknown ; PWD=/ ; USER=puppet ; COMMAND=/bin/cat /var/lib/puppet/state/last_run_summary.yaml
Aug 02 18:49:43 gerrit-test3 sudo[2154]: pam_unix(sudo:session): session opened for user puppet by (uid=0)
Aug 02 18:49:43 gerrit-test3 sudo[2154]: pam_unix(sudo:session): session closed for user puppe
root@gerrit-test3:/var/lib/gerrit2/review_site/logs# bash -x /etc/init.d/gerrit start
+ test 1 -gt 0
+ ACTION=start
+ shift
+ test 0 -gt 0
+ test -z ''
+ NO_START=0
+ test -z ''
+ START_STOP_DAEMON=1
+ test -f /etc/default/gerritcodereview
+ . /etc/default/gerritcodereview
++ GERRIT_SITE=/var/lib/gerrit2/review_site
++ GERRIT_WAR=/var/lib/gerrit2/review_site/bin/gerrit.war
+ test -z ''
+ TMP=/tmp
+ TMPJ=/tmp/j3426
+ GERRIT_INSTALL_TRACE_FILE=etc/gerrit.config
+ type git
+ : OK
+ test -z /var/lib/gerrit2/review_site
+ test -z /var/lib/gerrit2/review_site
++ pwd
+ INITIAL_DIR=/var/lib/gerrit2/review_site/logs
+ cd /var/lib/gerrit2/review_site
++ pwd
+ GERRIT_SITE=/var/lib/gerrit2/review_site
+ GERRIT_CONFIG=/var/lib/gerrit2/review_site/etc/gerrit.config
+ test -f /var/lib/gerrit2/review_site/etc/gerrit.config
+ test -r /var/lib/gerrit2/review_site/etc/gerrit.config
+ GERRIT_PID=/var/lib/gerrit2/review_site/logs/gerrit.pid
+ GERRIT_RUN=/var/lib/gerrit2/review_site/logs/gerrit.run
+ GERRIT_TMP=/var/lib/gerrit2/review_site/tmp
+ export GERRIT_TMP
+ JAVA_HOME_OLD=
++ get_config --get container.javaHome
++ test -f /var/lib/gerrit2/review_site/etc/gerrit.config
++ test x--get = x--int
++ git config --file /var/lib/gerrit2/review_site/etc/gerrit.config --get container.javaHome
+ JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64/jre
+ test -z /usr/lib/jvm/java-7-openjdk-amd64/jre
+ test -z /usr/lib/jvm/java-7-openjdk-amd64/jre
+ test -z '' -a -n /usr/lib/jvm/java-7-openjdk-amd64/jre -a -x /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java -a '!' -d /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
+ JAVA=/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
+ test -z /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
+ test -z ''
+ JSTACK=/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/jstack
++ get_config --get-all container.javaOptions
++ test -f /var/lib/gerrit2/review_site/etc/gerrit.config
++ test x--get-all = x--int
++ git config --file /var/lib/gerrit2/review_site/etc/gerrit.config --get-all container.javaOptions
+ GERRIT_OPTIONS=
+ test -n ''
++ get_config --get container.heapLimit
++ test -f /var/lib/gerrit2/review_site/etc/gerrit.config
++ test x--get = x--int
++ git config --file /var/lib/gerrit2/review_site/etc/gerrit.config --get container.heapLimit
+ GERRIT_MEMORY=28g
+ test -n 28g
+ JAVA_OPTIONS=' -Xmx28g'
++ get_config --int core.packedGitOpenFiles
++ test -f /var/lib/gerrit2/review_site/etc/gerrit.config
++ test x--int = x--int
+++ git config --file /var/lib/gerrit2/review_site/etc/gerrit.config --int core.packedGitOpenFiles
++ n=4096
++ test x0 = x4096
++ echo 4096
+ GERRIT_FDS=4096
+ test -z 4096
++ expr 4096 + 4096
+ GERRIT_FDS=8192
+ test 8192 -lt 1024
++ get_config --get container.user
++ test -f /var/lib/gerrit2/review_site/etc/gerrit.config
++ test x--get = x--int
++ git config --file /var/lib/gerrit2/review_site/etc/gerrit.config --get container.user
+ GERRIT_USER=gerrit2
+ ulimit -c 0
+ ulimit -d unlimited
+ ulimit -f unlimited
+ ulimit -m
+ ulimit -m unlimited
+ ulimit -n 8192
+ ulimit -t unlimited
+ ulimit -v unlimited
+ ulimit -x
+ ulimit -x unlimited
+ test -z /var/lib/gerrit2/review_site/bin/gerrit.war
+ test -z /var/lib/gerrit2/review_site/bin/gerrit.war
+ test -z /var/lib/gerrit2/review_site/bin/gerrit.war -a -n gerrit2
+ test -z /var/lib/gerrit2/review_site/bin/gerrit.war
+ test -z gerrit2
+ RUN_ARGS='-jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site'
++ get_config --bool container.slave
++ test -f /var/lib/gerrit2/review_site/etc/gerrit.config
++ test x--bool = x--int
++ git config --file /var/lib/gerrit2/review_site/etc/gerrit.config --bool container.slave
+ test '' = true
++ get_config --get-all container.daemonOpt
++ test -f /var/lib/gerrit2/review_site/etc/gerrit.config
++ test x--get-all = x--int
++ git config --file /var/lib/gerrit2/review_site/etc/gerrit.config --get-all container.daemonOpt
+ DAEMON_OPTS=
+ test -n ''
+ test -n ' -Xmx28g'
+ RUN_ARGS=' -Xmx28g -jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site'
+ test -x /usr/bin/perl
+ export JAVA
+ RUN_EXEC=/usr/bin/perl
+ RUN_Arg1=-e
+ RUN_Arg2='$x=$ENV{JAVA};exec $x @ARGV;die $!'
+ RUN_Arg3='-- GerritCodeReview'
+ case "$ACTION" in
+ printf %s 'Starting Gerrit Code Review: '
Starting Gerrit Code Review: + test 1 = 0
+ test -z 0
++ date +%s
+ RUN_ID=1470165935.3426
+ RUN_ARGS=' -Xmx28g -jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site --run-id=1470165935.3426'
+ test 1 = 1
+ type start-stop-daemon
+ test 0 = 0
+ CH_USER='-c gerrit2'
+ start-stop-daemon -S -b -c gerrit2 -p /var/lib/gerrit2/review_site/logs/gerrit.pid -m -d /var/lib/gerrit2/review_site -a /usr/bin/perl -- -e '$x=$ENV{JAVA};exec $x @ARGV;die $!' -- GerritCodeReview -Xmx28g -jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site --run-id=1470165935.3426
+ : OK
+ test 0 = 0
++ cat /var/lib/gerrit2/review_site/logs/gerrit.pid
+ PID=3449
+ test -f /proc/3449/oom_score_adj
+ echo -1000
+ TIMEOUT=90
+ sleep 1
+ running /var/lib/gerrit2/review_site/logs/gerrit.pid
+ test -f /var/lib/gerrit2/review_site/logs/gerrit.pid
++ cat /var/lib/gerrit2/review_site/logs/gerrit.pid
+ PID=3449
+ ps -p 3449
+ return 1
+ echo FAILED
FAILED
+ exit 1
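The trace shows the init script itself working: `start-stop-daemon` launches the process and the PID file is written, but one second later the liveness check returns 1, so the Java process exited almost immediately. A sketch of that check, reconstructed from the trace and not copied from the actual init script; it uses `kill -0` in place of the script's `ps -p`, so that it does not depend on `ps` being installed.

```shell
# Sketch of the init script's liveness check, reconstructed from the trace.
# Returns 0 if the PID file exists and names a live process, 1 otherwise.
running() {
    pidfile=$1
    test -f "$pidfile" || return 1
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only tests whether the process can be signalled.
    kill -0 "$pid" 2>/dev/null
}
```

When `running /var/lib/gerrit2/review_site/logs/gerrit.pid` fails right after a start, the init script has nothing more to report, and the cause has to be found by running Gerrit by hand.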
Getting error
root@gerrit-test3:/var/lib/gerrit2# /usr/bin/java -jar gerrit.war reindex -d review_site --threads 4
fatal: DbInjector failed
fatal: Unable to determine SqlDialect
fatal: caused by com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
fatal:
fatal: The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
fatal: caused by java.net.ConnectException: Connection refuse
Gerrit generates a random DB password for you.
You will find the password in /var/lib/gerrit2/review_site/etc/secure.config
You will also need to install MySQL beforehand
and follow https://gerrit-review.googlesource.com/Documentation/install.html#createdb_mysql
Remember to use gerrit, not gerrit2, as the DB user for now.
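A minimal sketch of the database setup described above, assuming the statements from the linked Gerrit install page. The helper name `gerrit_db_sql` is hypothetical, and the database name `reviewdb` is an assumption (it is the name used in the Gerrit documentation, not one stated in this thread). The sketch only prints SQL and does not connect to any server.

```shell
# Hypothetical helper: print the SQL needed to create the Gerrit DB and user.
# Arguments: DB user, password, client host (default localhost), DB name
# (default reviewdb, an assumption taken from the Gerrit install docs).
gerrit_db_sql() {
    user=$1
    pass=$2
    host=${3:-localhost}
    db=${4:-reviewdb}
    # Values are interpolated as-is; a password containing a single quote
    # would need escaping first.
    cat <<EOF
CREATE USER '${user}'@'${host}' IDENTIFIED BY '${pass}';
CREATE DATABASE ${db};
GRANT ALL ON ${db}.* TO '${user}'@'${host}';
FLUSH PRIVILEGES;
EOF
}
```

The output could be reviewed and then piped into `mysql -u root -p`, with the password argument taken from `secure.config` and `gerrit` as the user.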
Change 303146 had a related patch set (by Paladox) published:
Gerrit: Support http only, configured by a config
Change 303435 had a related patch set uploaded (by Paladox):
Gerrit: Support labs https
Yay, we now have
https://gerrit.git.wmflabs.org/r/#/q/status:open
working. It should not need any manual hacks anymore.
But we want to prove this by deleting the instance one more time and recreating it by applying the puppet role and nothing else.
Change 303146 abandoned by Chad:
Gerrit: Support http only, configured by a config
Reason:
Gerrit needs HTTPS in all environments, per IRC and other discussions. Our puppet manifests are written for letsencrypt support out of the box
Nothing much now.
We just need to go through it again, i.e. retest it by deleting the test instance and recreating it.
So we did this and re-created it one more time to prove everything is actually fixed now.
Also we split the DB part of it into a separate instance as we talked about before with Chad,
letting us easily re-create fully puppetized gerrits while leaving the DB backend the same.
- created new instance gerrit-mysql, following docs at P3939
- deleted instance gerrit-test3
- re-created gerrit-test3, following docs at P3637, configured to use role::gerrit::server
- ran puppet and got errors on P3957
- added security group to let gerrit-test3 connect to gerrit-mysql on mysql port 3306
- let the db server listen on its LAN IP instead of just 127.0.0.1
- adjusted mysql GRANTs
- ran puppet again ..
Blocked: one instance can't talk MySQL to the other, which looks like it's caused by T142165.
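One of the steps above is making the DB server listen on its LAN IP and not only on 127.0.0.1. A small sketch for checking that step from the server's config file; the helper names are hypothetical, the config path in the usage note is an assumption, and the parser handles only plain `bind-address = value` lines.

```shell
# Hypothetical helper: print the last bind-address value in a my.cnf-style file.
mysql_bind_address() {
    sed -n 's/^[[:space:]]*bind-address[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# Returns 0 if the configured address is reachable from other hosts.
# A file with no bind-address line is treated as listening remotely,
# since the server then typically binds to all interfaces.
listens_remotely() {
    addr=$(mysql_bind_address "$1")
    case "$addr" in
        127.*|localhost|::1) return 1 ;;
        *) return 0 ;;
    esac
}
```

For instance, `listens_remotely /etc/mysql/my.cnf || echo "still bound to loopback"` on gerrit-mysql, adjusting the path to wherever the server's config actually lives.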
I'm not convinced that it's T142165:
- gerrit-test.git.eqiad.wmflabs and jenkins-slave-01.git.eqiad.wmflabs can traceroute gerrit-mysql and telnet gerrit-mysql 3306
- gerrit-test3.git.eqiad.wmflabs and alex-test.git.eqiad.wmflabs cannot
- All can sudo traceroute gerrit-mysql -T and sudo traceroute gerrit-mysql -I
gerrit-test3 uses kernel
Linux gerrit-test3 4.4.0-1-amd64 #1 SMP Debian 4.4.2-3+wmf3 (2016-07-28) x86_64 Debian GNU/Linux 8.5 (jessie)
and gerrit-mysql uses
Linux gerrit-mysql 4.4.0-1-amd64 #1 SMP Debian 4.4.2-3+wmf2 (2016-05-11) x86_64 Debian GNU/Linux 8.5 (jessie)
Running telnet gerrit-mysql 3306 works on gerrit-mysql itself but fails on gerrit-test3, which has the newer kernel build.
root@gerrit-mysql:/home/paladox# telnet gerrit-mysql 3306
Trying 10.68.23.211...
Connected to gerrit-mysql.git.eqiad.wmflabs.
Escape character is '^]'.
GHost '10.68.23.211' is not allowed to connect to this MariaDB serverConnection closed by foreign host.
but on gerrit-test3 it just hangs on Trying 10.68.23.211...
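The two telnet results point at different layers: on gerrit-mysql the TCP connection succeeds and MariaDB then rejects the host (a GRANT issue), whereas a hang at `Trying ...` means the connection attempt is never answered at all. A sketch that separates those outcomes without waiting on telnet; `probe_tcp` is a hypothetical helper that assumes `bash` (for its `/dev/tcp` feature) and coreutils `timeout` are installed.

```shell
# Hypothetical helper: classify a TCP connection attempt.
#   open    - the TCP handshake completed (MySQL may still refuse the host)
#   timeout - no answer within the limit, e.g. packets silently dropped
#   closed  - actively refused, or the name could not be resolved
probe_tcp() {
    host=$1
    port=$2
    secs=${3:-5}
    # bash opens the socket via /dev/tcp; timeout exits 124 if it gives up.
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
    case $? in
        0)   echo open ;;
        124) echo timeout ;;
        *)   echo closed ;;
    esac
}
```

Running `probe_tcp gerrit-mysql 3306` on each instance would show whether the failing ones time out (suggesting filtering between instances) or are refused (suggesting the server is not listening on that address).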
But it doesn't seem to be the kernel that is the problem.
Maybe it's the image, but I'm not sure.
I set up gerrit-mysql only two days ago and that works, but gerrit-test3, set up today, won't connect to MySQL on gerrit-mysql. That is strange and probably a bug in the image.
We went through the instructions one more time and edited them slightly.
It's done now: we have repeatable instructions for getting Gerrit up and running in Labs with just puppet.
We will now just dump the pastebin content on wikitech and we'll have docs to follow in the future.
This is all fixed now, yay.
Link to the guide on the wiki is https://wikitech.wikimedia.org/wiki/How_to_setup_Gerrit_in_Labs