Keeping this brief on purpose, because WP:BEANS, but basically we should write a query that tells us:
- of all webrequests to our EventLogging endpoints
- how many are from hostnames that look like IP addresses
- how many are from hostnames that match those on the sitematrix
- how many "others" are there
That last one is the interesting one: if it's unexpectedly high, we can dig deeper to see whether any of those events validate. We can also dig deeper into the IP-looking ones to see if the User Agent is one of our apps.
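A rough sketch of the three-way bucketing described above, in Python rather than as the actual query. The names `classify_host`, `tally` and `known_hosts` are made up for illustration; `known_hosts` stands in for whatever set of hostnames we pull from the sitematrix, and the sketch assumes the hostname field carries no port.

```python
import ipaddress
from collections import Counter


def classify_host(hostname, known_hosts):
    """Bucket a request hostname as 'ip', 'sitematrix' or 'other'."""
    # Normalise case and drop a trailing dot (fully qualified form).
    host = hostname.strip().lower().rstrip(".")
    # IPv6 literals may arrive wrapped in brackets, e.g. "[2001:db8::1]".
    candidate = host[1:-1] if host.startswith("[") and host.endswith("]") else host
    try:
        ipaddress.ip_address(candidate)
        return "ip"
    except ValueError:
        pass  # not an IP literal, fall through to the hostname checks
    if host in known_hosts:
        return "sitematrix"
    return "other"


def tally(hostnames, known_hosts):
    """Count how many hostnames fall into each bucket."""
    return Counter(classify_host(h, known_hosts) for h in hostnames)
```

Something like `tally(hosts, sitematrix_hosts)` then gives the three counts in one pass; the "other" count is the one to look at first.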
Once quantified, we should remove this data from EventLogging, probably at refine time (with a filter function?).
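To make the filter idea concrete, here is a minimal Python sketch, not the actual refine code. It assumes events are dicts and that the hostname lives in a field called `webHost`; both the field name and the `filter_events` helper are assumptions for illustration, and which hosts end up in `allowed_hosts` is exactly what the quantification should tell us.

```python
def filter_events(events, allowed_hosts, host_field="webHost"):
    """Yield only the events whose host is in the allowed set."""
    # Compare case-insensitively, since hostnames are not case sensitive.
    allowed = {h.lower() for h in allowed_hosts}
    for event in events:
        # Events with no host field at all are treated as not allowed.
        host = str(event.get(host_field, "")).lower()
        if host in allowed:
            yield event
```

Note that this drops IP-looking hosts too, so if the app traffic turns out to be legitimate it would need its own allowance.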
Putting this on the kanban to get it done by Q4.