Fri, Feb 21
One option would be to manually create /srv/druid symlinks on the existing installed base and then switch Puppet to use it; with buster reimages and hardware refreshes, the remaining underlying uses of /var/lib/druid would vanish over time.
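A minimal sketch of that interim step on an existing host, assuming /var/lib/druid is the current data directory:

```
# Hypothetical interim state on an already-installed host: /srv/druid is
# just a symlink to the old location, so Puppet can reference /srv/druid
# everywhere; reimaged hosts would get a real /srv/druid directory instead.
ln -s /var/lib/druid /srv/druid
```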
In fact, I think this bug can be closed.
Thu, Feb 20
sretest100 is fine with me, but given that these are meant for various tests let's rather use an internal IP, unless @akosiaris has specific needs.
Thanks for opening a task. I've removed him from the nda and wmde groups.
I had missed the followup, sorry. These two spare hosts would be fine as test hosts!
The server went down with the following error today:
Wed, Feb 19
tungsten is currently running XHGui; once https://phabricator.wikimedia.org/T180761 is resolved it can be decommissioned.
Mon, Feb 17
Looking at "hadoop checknative" is opens/usr/lib/x86_64-linux-gnu/libcrypto.so which is a symlink to /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.2. But EVP_CIPHER_CTX_encrypting was only introduced in OpenSSL 1.1.0
Fri, Feb 14
With package_from_component() I don't think we need this any longer; it serves a similar purpose and proper dependencies can be defined.
Thu, Feb 13
@HMarcus Sure, we can do that. Let's do Thursday (2/20) - 7am PST, 4pm CET
How did the OVMF setting come about? Some snowflake setting from earlier tests, or something we might run into again?
I think we can rule out a change in the upstream repository. Initially I had a hunch that this could be caused by a stale mirror after the latest Stretch point release, but nothing has changed in that deb (libgtk-3-common) since 24 Mar 2017, when it was initially uploaded to Debian; the expected hashes also match what's currently found on the mirrors.
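For reference, a sketch of how such a cross-check can be done, assuming a plain apt setup with the package available in the configured indices:

```
# Download the binary package exactly as apt sees it and hash it:
apt-get download libgtk-3-common
sha256sum libgtk-3-common_*.deb
# Compare against the SHA256 recorded in apt's package metadata:
apt-cache show libgtk-3-common | grep -i '^SHA256'
```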
There are two different angles to consider:
Wed, Feb 12
Given that Luca also had an error during initial setup related to name resolution, this sounds like some error related to the DNS records for the new host?
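A quick sanity check along those lines, with hostname and IP as placeholders:

```
# Forward lookup for the new host (name is a placeholder):
dig +short new-host.eqiad.wmnet
# Reverse lookup for the address it should have (IP is a placeholder):
dig +short -x 10.64.0.1
# The two should agree; a mismatch or NXDOMAIN would explain setup errors.
```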
Does this really need 8 GB RAM and 8 CPUs? The machine that this will replace (kraz) uses a single CPU (and hardly uses it) and has an average memory usage of 0.25G. I'm all for adding some headroom, but that seems a little excessive :-)
Tue, Feb 11
The LDAP replicas are critical to the wikimedia.org mail servers: We currently have a replication setup between two OpenLDAP servers in the production realm (ldap-corp1001.wikimedia.org in Virginia and ldap-corp2001.wikimedia.org in Texas) and ldap1.corp.wikimedia.org in the OIT network. The mail servers then query the ldap-corp* systems in production to determine whether a given @wikimedia.org address is legitimate or not.
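Roughly the kind of lookup involved, as a sketch; the base DN, bind options, and filter here are assumptions for illustration, not taken from the actual mail server configuration:

```
# Ask one of the production replicas whether an address exists
# (anonymous simple bind assumed for illustration):
ldapsearch -x -H ldap://ldap-corp1001.wikimedia.org \
  -b 'dc=wikimedia,dc=org' '(mail=someone@wikimedia.org)' mail
```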
This needs an entry in data.yaml, reopening
Mon, Feb 10
JFTR, this got fixed in tmpreaper via the 9.12 point release: https://packages.qa.debian.org/t/tmpreaper/news/20200130T211747Z.html and the 10.2 point release: https://packages.qa.debian.org/t/tmpreaper/news/20191013T191724Z.html
Yeah, a separate VM is the idea. The specs look good; we could probably even lower them a bit, but it also won't hurt to have some head-room for further JunOS files etc. These will need a public IP.
ferm has been fixed in stretch-wikimedia and buster-wikimedia to properly resolve AAAA records with a fallback. Once all jessie instances are gone from deployment-prep, this patch can be removed (provided all stretch/buster hosts are running ferm 2.4-1+wmf2+deb10u1 or 2.4-1+wmf2+deb9u1).
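To verify a host is on one of the fixed versions before removing the patch, a simple per-host check (orchestration tooling aside):

```
# Print the installed ferm version on a given host:
dpkg-query -W -f='${Version}\n' ferm
```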
Fri, Feb 7
This is confirmed working by Papaul when using the stretch-bootif tftpboot environment, closing.
Thu, Feb 6
The ethernet adapter is slightly different from the BCM5720 we already run elsewhere on stretch. E.g. on ms-be2050 it reports as
This has been done for quite a while, closing.
Wed, Feb 5
Why the "Networking Requirements: public"? With the repository split off, those should be fine with an internal IP.
Mon, Feb 3
I noticed that we have rc24 on mx1001, which is flagged for downgrade; should the remaining hosts running rc24 also be downgraded to rc19?
Jan 24 2020
Jan 23 2020
Both debmonitor instances now have two CPUs.
This task is from 2018; is this still an issue?
Jan 22 2020
torrelay1001 is being reclaimed to the spare pool via https://phabricator.wikimedia.org/T243390 (only pending DC ops steps like disk wipe)
@dpifke I've added you to cn=wmf, let me know if you run into any issues.
Jan 21 2020
Jan 20 2020
This is complete. The new Buster instances are urldownloader00 and the old jessie systems have been removed.
Jan 17 2020
We have role::webserver_misc_static (bromine/vega) for this.
Good catch! I'll review the difference between the logrotate config shipped in the Debian package and our Puppet one; maybe we can simply stick with the Debian default entirely.
@jwang Your access is now enabled, let me know if you run into any issues logging in with SSH. If you have any specific questions wrt Hadoop access, best to ask the #wikimedia-analytics channel on IRC.
I created your Kerberos account. You should have received a mail with further instructions for your Kerberos account (which is required to access Hadoop).
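For reference, once the account is set up, access typically starts with obtaining a ticket; the principal and realm below are assumptions, so follow whatever the instruction mail specifies:

```
# Obtain and inspect a Kerberos ticket (replace principal/realm as needed):
kinit jwang@WIKIMEDIA
klist
```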
Thanks Papaul! I think we don't need to pursue the "let's disable the unused port" option further, the current solution within the debian-installer addresses this just fine (plus if we disable the 10G port, it'll cause further effort down the road to re-enable this once we have a 10G rack).
Jan 16 2020
@jwang : I already enabled your LDAP access via the "wmf" group, the services listed at https://wikitech.wikimedia.org/wiki/LDAP/Groups#wmf_group can now be accessed.