We experienced an incident with the aqs_next cluster before putting it into production.
When one of the 12 instances crashed, the AQS endpoint checks in Icinga failed for all hosts.
The hypothesis is that the system_auth table is insufficiently replicated and therefore the aqs user could not log into the other servers
system_auth | True | {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '1'}