Thanks for your help, Riccardo! Given current time constraints, I'm afraid most of this work will take multiple months, but nevertheless, to see whether kea_python still works with the Kea packages provided by Debian, I felt it was time to bite the bullet and build the bindings manually.
Fri, May 10
Wed, Apr 17
Thank you for your reply! My comments:
In T351418#9703216, @Volans wrote:[...]
- The current setup that by default doesn't offer an IP to DHCP requests is by design, in response to data loss on servers that rebooted into PXE by error (the force PXE bit didn't get cleared by the BIOS or similar), see T251416 for context. This is easily fixable having a Netbox field that is empty by default and the reimage/dhcp cookbooks will set it to the PXE image to be installed next and will reset it after the DHCP step is done. The automation should surely not give PXE fields if that field is not set, probably not even an IP for safety reasons (to be discussed).
As I understand it, no server in production VLANs (that is: starting with {analytics,private,public} - excluding frack infrastructure?) should rely on DHCP for any purpose other than reimaging, because the IPv4 address will be set statically in d-i. For that reason, I can see why we would like to refuse DHCP requests if no syslinux path is provided by NetBox. I wouldn't classify it as a security measure against malevolent administrators, but rather as a failsafe to mitigate the impact of operator error.
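To make the failsafe concrete, here is a minimal sketch of the decision, assuming a hypothetical NetBox field (called `pxe_image` here) that is empty by default and only set by the reimage/dhcp cookbooks. The function name and the reply layout are illustrative and not part of any existing automation; whether to withhold the IP as well is still to be discussed, this sketch simply assumes we do.

```python
from typing import Optional


def build_dhcp_reply(ip_address: str, pxe_image: Optional[str]) -> Optional[dict]:
    """Return the DHCP reply for a host, or None to ignore the request.

    pxe_image is the (hypothetical) NetBox field that the reimage cookbook
    sets before the DHCP step and resets afterwards.
    """
    if not pxe_image:
        # Field unset or empty: the host is not being reimaged, so do not
        # offer an IP or any PXE options (failsafe against operator error).
        return None
    return {"ip_address": ip_address, "boot_file": pxe_image}
```

With this shape, a server that reboots into PXE by mistake gets no answer at all, so it cannot start the installer.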
Apr 9 2024
@ayounsi and I have discussed my first findings, and we thought it made sense to share them here.
Mar 28 2024
In T127717#9671526, @Andrew wrote:
In T127717#9671489, @Southparkfan wrote:
In T127717#9671034, @Andrew wrote:
@Southparkfan We're trying to reduce use of Buster in cloud-vps, and two servers in 'auditlogging' are running Buster: syslog-server-04 and syslog-client04. My recollection is that they're redundant now that server-05 and client05 exist (and are running bookworm) -- is that right? Can the 04 VMs be removed?
The purpose of having syslog servers on multiple operating systems is to verify compatibility. As you might have seen, sometimes, rsyslog requires OS-specific changes to work properly.
If you don't mind potentially breaking Buster compatibility in the future, or if we should remove support right away, then these servers are OK to go.
That's a good point. We'll save these for a bit later in the Buster deprecation cycle. Thanks!
In T127717#9671034, @Andrew wrote:@Southparkfan We're trying to reduce use of Buster in cloud-vps, and two servers in 'auditlogging' are running Buster: syslog-server-04 and syslog-client04. My recollection is that they're redundant now that server-05 and client05 exist (and are running bookworm) -- is that right? Can the 04 VMs be removed?
Mar 27 2024
Haven't made a lot of progress on this, unfortunately. Scheduled for April.
Nov 20 2023
I'll work on this.
Nov 15 2023
Production migration from the gnutls driver to the openssl driver can be tracked in T324623.
Nov 3 2023
Oct 13 2023
Alternative to consider: injecting REDIRECTs for traffic meant for a VIP. See the second section at http://www.linuxvirtualserver.org/docs/arp.html. I haven't tested it and it requires some sort of Netfilter implementation on the realservers, but it avoids MTU-related issues (when tunneling traffic). Never mind, the ARP problem is solved at Wikimedia by not announcing ARP. MTU is a challenge when using any type of encapsulation (in this case IPIP), but that's a different issue :)
Oct 3 2023
Aug 5 2023
May 25 2023
May 12 2023
In T336428#8844099, @cmooney wrote:Ok I think I see what the issue is. Looking at the kernel docs they state that "the max value from conf/{all,interface}/rp_filter is used when doing source validation on the {interface}."
This effectively means that this setting:
net.ipv4.conf.all.rp_filter = 1
Nullifies the per-interface setting on eno12399np0:
net.ipv4.conf.eno12399np0.rp_filter = 0
(...)
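The "max value" rule quoted from the kernel docs can be sketched as follows. This is only an illustration of the rule, not kernel code; the function name is made up for the example.

```python
def effective_rp_filter(all_value: int, interface_value: int) -> int:
    """Return the rp_filter mode applied on an interface.

    Per the kernel docs, the max of conf/all/rp_filter and
    conf/<interface>/rp_filter is used for source validation.
    """
    return max(all_value, interface_value)


# all = 1 (strict) combined with eno12399np0 = 0 still yields strict mode.
print(effective_rp_filter(1, 0))
```

So setting the per-interface value to 0 has no effect while the `all` value stays at 1.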
Feb 1 2023
I have expanded https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Auth_logging. The 'known limitations' section shows there is enough work to do, but to avoid a never ending task, I am fine with resolving this task when T127717#8505600 has been applied to Cloud VPS. I find the lack of monitoring to be a blocker too, though.
Dec 14 2022
Standalone puppetmasters are also affected by this Git update:
$ git push -f project_puppetmaster HEAD:production
Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
remote: fatal: detected dubious ownership in repository at '/var/lib/git/operations/puppet'
remote: To add an exception for this directory, call:
remote:
remote: git config --global --add safe.directory /var/lib/git/operations/puppet
Dec 7 2022
I have tested https://gerrit.wikimedia.org/r/c/operations/puppet/+/865731 by using rsyslog-openssl on one syslog client and one syslog server running buster + one syslog client and one syslog server running bullseye. All works as expected.
Status: we chose #3 (Let's Encrypt via acme-chief). We've gotten stuck on a bug in the gnutls driver for rsyslog: T324623
Background: for T127717, we went with Let's Encrypt certificates. Unlike the rather simple chain of trust for the Puppet CA (leaf certificate -> root certificate (Puppet CA)), Let's Encrypt certificates have an intermediate certificate in between. 'Because TLS' (certificates are terrible, I know), the clients need to receive all certificates but the root certificate (because that is in /etc/ssl/certs/ca-certificates.crt).
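A minimal sketch of the "everything but the root" rule, assuming certificates are modelled as plain (subject, issuer) pairs ordered leaf first. The names are placeholders and no real TLS library is involved; a self-signed root is recognised here simply by subject == issuer.

```python
def chain_to_send(chain):
    """Return the certificates a peer should present, leaf first.

    chain is a list of (subject, issuer) tuples. The self-signed root
    (subject == issuer) is dropped because the other side already has it
    in its local trust store.
    """
    return [cert for cert in chain if cert[0] != cert[1]]


# Puppet CA style: leaf signed directly by the root.
puppet_chain = [("leaf", "root"), ("root", "root")]
# Let's Encrypt style: an intermediate sits between leaf and root.
le_chain = [("leaf", "intermediate"), ("intermediate", "root"), ("root", "root")]

print(chain_to_send(puppet_chain))
print(chain_to_send(le_chain))
```

In the second case the intermediate has to be sent along with the leaf, which is the part that differs from the Puppet CA setup.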
Dec 5 2022
@Andrew and I have spent this evening on the initial setup of two WMCS-wide syslog servers. Those work fine. However, this setup is broken for all Cloud VPS instances that do not use the central puppetmaster.
Jul 27 2022
In T36738#8106918, @Krinkle wrote:@Southparkfan Thanks. Your patch avoids use of $wgRequest which unblocks removal of that variable from core. However, this task represents the work for T32956 and would not be resolved unless other forms of global state are avoided as well. Whether we obtain WebRequest from $wgRequest or RequestContext::getMain() is a minor detail from this task's perspective.
I understand, eliminating $wgRequest was low-hanging fruit here. DI is still better than using either $wgRequest or ::getMain().
I'm happy to accept it, but keep the task and checkbox open after this lands.
If you'd like to resolve the use of global WebRequest in ::inDebugMode(), I can recommend two approaches to try.
Now that we're at it, I'll try to get a durable solution; after all, I would like to reduce technical debt, not move it somewhere else. Assistance is needed to get me started here, though :-). Your advice is welcome.
- Perhaps create a setDebugMode() method marked @internal that we'd call from load.php and possibly OutputPage.php. It could take WebRequest and Config as parameter. Doing this would actually highlight an issue which is that we appear to be reading the cookie even when on load.php which is potentially a problem even today. The cookie should instruct OutputPage to make load.php?debug=true requests, there is no need for it to read it directly. If it does, it might actually poison the cache.
Assumptions
- load.php (RL\Context) only needs to know if ?debug=<value> exists; cookies and config don't matter here.
- index.php emits HTML elements (sourcing resources from load.php) that may or may not contain ?debug, depending on the sixth argument of ResourceLoader::makeLoaderQuery(). This entry point is not interested in the presence of a ?debug parameter - inverse of load.php.
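To illustrate the two assumptions, here is a rough sketch in Python rather than PHP, purely to show the split in responsibilities. Both function names and the `/w/load.php` path are made up for the example and do not correspond to existing ResourceLoader methods.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def load_php_debug_mode(url: str) -> bool:
    """load.php side: debug mode depends only on the query string."""
    query = parse_qs(urlparse(url).query, keep_blank_values=True)
    return "debug" in query


def make_loader_url(modules, debug: bool) -> str:
    """index.php side: decide whether the emitted URL carries ?debug."""
    params = {"modules": "|".join(modules)}
    if debug:
        params["debug"] = "true"
    return "/w/load.php?" + urlencode(params)


url = make_loader_url(["site", "startup"], debug=True)
print(url, load_php_debug_mode(url))
```

Under these assumptions the cookie and config are only consulted when index.php builds the URL; load.php never reads them, so its output depends on the URL alone.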
Jul 21 2022
In T127717#8092122, @Andrew wrote:Sorry for the slow response here. I also don't see a clear way to provision those certs, so I think relying on source IP is probably good for this pass. It's already the case that we can't fully 'trust' log messages originating from within cloud-vps projects; I suspect that the risk of a ddos attack is already present even if we have certified logs.
Please lmk if I'm missing a more obvious threat.
Cool! Not sure what the relationship with a DDoS is, though :). If you have time, you can review the patch above. I wasn't too sure about the locations of the default hiera: some things have to be defined in cloud.yaml, others in common/, ...
Jul 20 2022
(wrong task)
https://gerrit.wikimedia.org/r/c/mediawiki/core/+/815776/ would alleviate the 'usage of $wgRequest global' concern, but a second pair of eyes is needed:
- [\MediaWiki\ResourceLoader\ResourceLoader]::inDebugMode() is a static function, with neither a WebRequest exposed via $this->getRequest() nor [\MediaWiki\ResourceLoader\Context]->getRequest() available. Unless extensions can/should fetch a 'ResourceLoader object' (and this function is therefore converted into a non-static one), I'm not sure how to refrain from this "last resort".
- Apart from OutputPage and OutputPageTest, [\MediaWiki\ResourceLoader\ResourceLoader]::makeCombinedStyles() is the only caller for OutputPage::transformCssMedia(), hence I couldn't refrain from using RequestContext::getMain() here either.
- According to T165176, RequestContext::getMain() can cause side-effects? Will that affect the ResourceLoader Context too?
Jul 18 2022
Never mind, missed one occurrence :P
Jul 17 2022
Jul 16 2022
Jul 15 2022
Regarding the comment on PS1:
Jul 14 2022
Given that not every extension calls the functions that were mentioned in the description of this task, 99 of the subtasks have been closed as invalid. All tasks that have been left open make at least one call to one of these functions.
Affected functions not present.