This is something I've been doing with the help of @akosiaris recently, based on discussions in https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/498796/ and some other commits, but it essentially boils down to this:
Let's abolish the concept of a single special_hosts variable covering multiple different types of host, and have every use of it receive the data as a parameter instead (parameters -> profiles -> hiera). It's data that we should be able to customise, but its current location in network::constants makes that difficult.
There's also the secondary matter of the special hosts macros in `modules/base/templates/firewall/defs.erb`, which are being dealt with in some of these patches where convenient.
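As a rough illustration of the parameters -> profiles -> hiera pattern, here is a minimal sketch. All names in it (`example::server`, `profile::example`, and the hiera key `cumin_masters`) are assumptions made up for the example, not what the patches actually use, and it uses `lookup()` where the repository may still be calling `hiera()`:

```puppet
# Hypothetical names throughout: example::server, profile::example and the
# hiera key 'cumin_masters' are illustrative assumptions, not the real patches.

# Module class: takes the host list as a plain parameter and no longer reads
# $network::constants::special_hosts itself.
class example::server (
    Array[String] $allowed_hosts = [],
) {
    # Stand-in for a real consumer such as a ferm rule or a config template.
    notice("allowing access from: ${allowed_hosts}")
}

# Profile: the only place that looks the data up, so the list can be
# overridden in hiera (for example per labs project).
class profile::example (
    Array[String] $cumin_masters = lookup('cumin_masters'),
) {
    class { 'example::server':
        allowed_hosts => $cumin_masters,
    }
}
```

With something like this, a labs project would only need to set its own value for the key in hiera, and nothing under network::constants would have to change.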
It was rather easy for the first few:
[*] cumin_masters - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/499355/
[*] maintenance_hosts - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/502499/
[*] bastion_hosts - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/502607/ - also https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/502612/ for moving it out of ferm macros
[x] caches - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/502630/
But as Alexandros pointed out, it gets more difficult when it comes to the remaining ones:
[] monitoring_hosts - ~~monitoring_hosts gets used in standard::ntp, which is included in ::standard, which seems to be included in over 200 different places. What hosts are not using standard? Is there a good reason for this to be under standard? Does standard make sense given profile::base? @aborrero may solve this in https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/506614/~~ https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/506672/
[] deployment_hosts - ~~gets used under ssh::server, which is included in ::ssh but also profile::base, and profile::base creates it like this: `create_resources('class', {'ssh::server' => $ssh_server_settings})`, with `$ssh_server_settings = hiera('profile::base::ssh_server_settings', {}),` - we could do this by duplicating the list inside profile::base::ssh_server_settings, but doing it without duplicating the list might get tricky.~~ https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/507116/
[] mysql_root_clients - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/505407/ - unfortunately there is a profile that is a define, which makes it difficult to change.
labs only:
[] cumin_real_masters - this one might go away naturally with T219421, though we could probably get rid of it now given the cumin_masters patch above. In theory I could make it unused myself, but I couldn't test that directly on bastion-restricted because of the login restriction there, which I'm obviously not going to remove.
prod only, mostly analytics or things that haven't been necessary within labs yet:
[*] puppet_frontends - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/505356/
[*] kafka_brokers_main - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/505373/
[*] kafka_brokers_analytics - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/505373/
[*] kafka_brokers_jumbo - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/505373/
[*] kafka_brokers_logging - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/505373/
[*] zookeeper_hosts_main - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/505373/
[*] hadoop_masters - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/505373/
[*] druid_analytics_hosts - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/505366/
[*] druid_public_hosts - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/505373/