AbuseFilter can make editing Wikibases with large entities slow; see T205252.
Some investigation was done together with the community in T205254, and it was decided that statement GUIDs do not need to appear in the AbuseFilter output.
Statement GUIDs currently use the "id" key in the JSON output, and that JSON output is what the AbuseFilter text is currently generated from and filtered on.
This "id" field is very generic and will also match a bunch of other stuff in the JSON that we are not sure about.
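To illustrate how generic the key is, here is a rough Python sketch (not the actual Wikibase PHP code) that walks a hand-written, heavily trimmed entity and lists every path at which an "id" key occurs. The sample entity and the helper name `find_key_paths` are made up for illustration; real entity JSON is much larger and may contain further "id" keys not shown here.

```python
def find_key_paths(node, key, path=()):
    """Yield the path to every occurrence of `key` in nested dicts/lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield path + (k,)
            yield from find_key_paths(v, key, path + (k,))
    elif isinstance(node, list):
        for item in node:
            # List positions are collapsed to "*" so paths stay comparable.
            yield from find_key_paths(item, key, path + ("*",))


# Hand-written sample for illustration only (assumption, not real data).
sample_entity = {
    "id": "Q42",
    "claims": {
        "P31": [
            {
                "id": "Q42$EXAMPLE-GUID",
                "mainsnak": {
                    "property": "P31",
                    "datavalue": {
                        "value": {"entity-type": "item", "id": "Q5"},
                        "type": "wikibase-entityid",
                    },
                },
            }
        ]
    },
}

id_paths = ["/".join(p) for p in find_key_paths(sample_entity, "id")]
for p in id_paths:
    print(p)
```

In this sample, only the second of the three paths is a statement GUID; the other two are the entity ID and an entity ID used as a statement value, which filter authors may well still want to match.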
Thus something needs to change regarding how item and property AbuseFilter text is collected and/or filtered.
- Rather than starting with everything and then filtering down, we selectively pick out things to add to the text? (This might end up removing more than we intend, but that might not be a bad thing; we could just add back what the community wants.)
- Improve our method of filtering, possibly allowing filtering of different keys at different levels / down different paths in the JSON tree?
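As a minimal sketch of what the second option could look like, the Python below flattens a JSON-like structure into text values while skipping subtrees whose path matches an ignore pattern. Everything here is hypothetical: `matches`, `collect_text`, the `IGNORED` patterns and the sample entity are illustration only and do not reflect the existing Wikibase implementation, and the sketch makes no attempt to be fast.

```python
def matches(path, pattern):
    """True if `path` equals `pattern`, where "*" matches any one segment."""
    return len(path) == len(pattern) and all(
        pat == "*" or pat == seg for seg, pat in zip(path, pattern)
    )


def collect_text(node, ignored, path=()):
    """Flatten nested data into a list of strings, skipping ignored paths."""
    if any(matches(path, pat) for pat in ignored):
        return []
    if isinstance(node, dict):
        children = node.items()
    elif isinstance(node, list):
        children = enumerate(node)
    else:
        return [str(node)]  # leaf value
    lines = []
    for key, child in children:
        lines.extend(collect_text(child, ignored, path + (str(key),)))
    return lines


# Hypothetical pattern: claims -> any property -> any statement -> "id".
IGNORED = [("claims", "*", "*", "id")]

# Hand-written, trimmed sample entity (assumption, not real data).
entity = {
    "id": "Q42",
    "claims": {
        "P31": [
            {
                "id": "Q42$EXAMPLE-GUID",
                "mainsnak": {"datavalue": {"value": {"id": "Q5"}}},
            }
        ]
    },
}

print(collect_text(entity, IGNORED))  # statement GUID is dropped
print(collect_text(entity, []))       # nothing is dropped
```

With the path pattern, the statement GUID is removed while the entity ID and the value's "id" are kept, which a plain "ignore every key named id" rule cannot do. Checking every node against every pattern is the naive approach; a real implementation would need something cheaper given the performance constraints of this code path.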
Whatever filtering and collection of values occurs, it must be very performant.
We have had speed issues in this bit of code before (previously fixed up in T204109).
This is technically a breaking change, so this should be announced.
- Add the "id" key to the ignored keys for filters in item, property and lexeme content
- Check if we can drop the "hash" key as well
- Announce the breaking change
- Check if a breaking change announcement and buffer time are needed (per comment https://phabricator.wikimedia.org/T226216#5379111)
- Deploy/Merge the change
An example of lines that will be removed: