This is done!
The old identity has been cleaned up, closing.
So the consensus is that access to stat1006 is not needed, and that access to the DB testing host should be requested instead (following https://wikitech.wikimedia.org/wiki/MariaDB#Testing_servers)?
helm-diff also needs to be built for buster-wikimedia
This task evolved from an access request for analytics-privatedata-users towards a wider discussion about testing DB queries.
You have an updated NDA in our records, so that's covered.
contint2001 has been reimaged with buster using the wmf-auto-reimage cookbook.
Puppet fails on the first run though, and therefore so does the reimage cookbook.
Among the current issues:
- Error: Could not find command '/usr/bin/helm'
Exactly, the OpenBLAS available in Debian Buster (which stat1008 uses) provides CPU-optimized computation kernels. I've just installed apache2-utils on the host; feel free to also ping me on IRC (moritzm) if you need other packages installed (we'll be reimaging the server anyway after the tests are done).
Closing the task, please reopen if there are any issues.
Tue, Apr 7
idp-test2001.wikimedia.org has been created, rest of the setup is handled via T233930
Superset needs an internal user created, this is handled by the analytics team, reassigning to Luca for further setup
Confirmed, firstname.lastname@example.org is the correct address for Janis.
@Aklapper I have just created your Kerberos account. You will have received a mail to your wikimedia.org address with instructions on how to log in and change the initial/temporary password.
Mon, Apr 6
@fgiunchedi Is there anything left for this ticket? Can it be closed?
Fri, Apr 3
@elukey: To help SREs on Clinic Duty figure out whether adding someone to a group also needs a Kerberos account, let's annotate the headers in data.yaml?
I successfully tested import and removals of a dummy build (hello) and fixed up the sending of status mails.
Thu, Apr 2
Wed, Apr 1
This is an m1.small. Maybe the instance size is just too low for recent versions?
It looks like, for the hosts that will require TokuDB (tendril and analytics dbstore), we need to find a way to use jemalloc on start if we want to keep using TokuDB.
Tue, Mar 31
BTW, nothing in puppet.git was specifically meant to compile only on puppet 5/facter 3; we don't use any specific new features yet. It's more likely a bug that is fixed in the new versions but still broken in f2/p4.
I pushed a patch yesterday which used structured facts, i.e. facts['networking']['interfaces'], which depends on facter 3. But otherwise I think this is correct.
Now that the puppet repo /requires/ puppet5.5/facter3 to compile catalogs, those hosts will be broken until puppet is manually upgraded.
Yeah, what Paladox wrote, on hosts which had Puppet broken before the facter3/puppet5 Hiera setting was applied, you can add this to sources.list:
Fri, Mar 27
This is done
Thu, Mar 26
I think I see the issue here, @Jpita was added with the jpita-ctr@ account before (uid=josepita) while the current one is uid=jpita.
I think we just need to replace the LDAP membership of the wmf group from uid=josepita to uid=jpita and update the reference in data.yaml in https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/production/modules/admin/data/data.yaml#4045 for the username and the email.
Adding @MoritzMuehlenhoff for confirmation that this is the right course of action in this specific case.
The Cumin update is something that will happen next Q, but if you need it earlier, we can backport it as well?
Wed, Mar 25
Nothing can be blocked on 566383, it's entirely a cleanup change.
Tue, Mar 24
This is complete
Mon, Mar 23
Fri, Mar 20
I had a closer look into a rebuild of OpenBlas for Skylake/avx512, but it turns out this isn't actually needed thanks to our distro of choice :-)
Is it possible to leave it open for now? I will notify you as soon as I know. Because of the current situation we needed to adapt and be flexible, so I cannot give you a fixed date anymore.
Thu, Mar 19
Wed, Mar 18
Another low-hanging fruit, I think, is to reduce the SSH check for the mgmts: it currently runs every minute, but unavailability of the mgmt sshd has no end-user impact, so checking them hourly should be good enough? That would slash the 30087 checks from above by a lot.
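As a rough illustration of the per-host saving from moving a check from every minute to hourly, here is a minimal sketch. The helper name `runs_per_day` is made up for this example, and how the result maps onto the 30087 figure is not stated here, since the number of mgmt hosts is not given.

```python
MINUTES_PER_DAY = 24 * 60


def runs_per_day(interval_minutes: int) -> int:
    """Return how often a check with the given interval executes per day."""
    return MINUTES_PER_DAY // interval_minutes


# Hypothetical comparison for a single mgmt SSH check.
per_minute = runs_per_day(1)    # check scheduled every minute
hourly = runs_per_day(60)       # check scheduled once per hour
reduction_factor = per_minute // hourly

print(per_minute, hourly, reduction_factor)
```

Per host, the hourly schedule executes the check a factor of `reduction_factor` less often; the total saving scales with the number of mgmt interfaces being monitored.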
Mon, Mar 16
Can the idp redirect to https? What happens when this is configured?
I'll revert 576921 (that was a leftover of testing), but with the service ID pointing to 443 (and CASRootProxiedAs set to https://cas-logstash.wikimedia.org, as Envoy only goes one way and otherwise it would report the http URI as the service ID), it still fails within the bundled Bootstrap copy:
I think deploying it on Buster will be unproblematic, the current host is already on Stretch, so the big incompatibilities between PHP 5 and 7 are already addressed. Racktables is also still maintained (last maintenance release in November 2019)
Fri, Mar 13
@santhosh When you've set up your test environment and want to test OpenBLAS optimised for the CPU architecture of stat1008, let me know; I can help create a custom build and deploy it.
Thu, Mar 12
@elukey: I don't even think we need additional changes? E.g. an-launcher1001 is on Buster and uses profile::hadoop::common with OpenJDK 8, so this is already all implemented.
In addition to what Faidon said: Test results from Cloud VPS are not useful here; OpenBlas contains assembly-optimised functions for specific CPU models and through virtualisation these are no longer fully effective (the virt hosts don't expose all the CPU features by default compared to baremetal). Also, on Cloud VPS the test VMs are not the only tenant, so it highly depends on the load of the virt host.
profile::java::analytics uses the Java 8 forward port on Buster, so that part should be fine. Wrt Kerberos there should also be no real issues I can think of.
Wed, Mar 11
Note that there's a number of scripts blocked by OS runtime dependencies, e.g. various LDAP scripts are blocked until mwmaint* and cumin* are reimaged to Buster (no python3-ldap on Stretch).