
AbuseFilter API "aflprop=details" should allow querying finer grained variables
Open, In Progress, Needs Triage · Public · Feature Request

Description

This would make it easier to analyze past hits like this without returning the old and new wikitext, which makes queries less efficient.

Event Timeline

It might be useful to provide some examples of what you mean by "querying finer grained variables" 🙂

TheresNoTime changed the subtype of this task from "Task" to "Feature Request". Jul 16 2022, 4:28 PM

So for the software that analyzes past hits, I only needed specific variables (other fields are ignored when parsing the JSON response). It would be nice if there were some sort of afldetailsprop parameter that allowed selecting specific variables, such as {add,remov}ed_lines, to reduce bandwidth usage.

So, something like the iiextmetadatafilter parameter for prop=imageinfo? I agree this would be useful. A typical filter hit is around 100 KB, and some filters on enwiki get tens of thousands of hits in a year. It would be nice to find out which parts of the filter are actually matching anything, without downloading gigabytes of data.
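To make the current situation concrete, here is a minimal sketch of the client-side workaround a consumer has to do today: fetch the full `details` object via `list=abuselog&aflprop=details`, then discard everything but the variables of interest. The helper name and the sample payload shape are illustrative assumptions, not part of the API.

```python
# Sketch of the client-side workaround: the API returns every variable
# (including the heavyweight old_wikitext/new_wikitext), so the script
# has to strip the response down after downloading it in full.

def filter_details(entries, wanted):
    """Keep only the requested variables in each hit's 'details' map."""
    return [
        {**entry,
         "details": {k: v for k, v in entry.get("details", {}).items()
                     if k in wanted}}
        for entry in entries
    ]

# Example payload shaped roughly like an abuselog entry (illustrative).
hits = [{"id": 1, "filter_id": "123",
         "details": {"added_lines": ["spam link"], "removed_lines": [],
                     "old_wikitext": "x" * 100_000,
                     "new_wikitext": "y" * 100_000}}]

slim = filter_details(hits, {"added_lines", "removed_lines"})
```

A server-side filter parameter would make this post-processing unnecessary and, more importantly, avoid transferring the large wikitext fields at all.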

Supporting brotli compression (T137979), at least for API, might also help here.

+1 for this. Working with the API to inspect past hits, I often encounter the following warning, even with very low limits (less than 50):

This result was truncated because it would otherwise be larger than the limit of 12,582,912 bytes.

Because of this, I have to implement a check for this warning in my script and, when required, rerun the query with an even lower limit.

And it makes checking thousands of hits quite slow, whereas I could leverage the limit of 500 (and even 5,000 as a sysop).

Change 915406 had a related patch set uploaded (by Matěj Suchánek; author: Matěj Suchánek):

[mediawiki/extensions/AbuseFilter@master] Support vars selection via list=abuselog

https://gerrit.wikimedia.org/r/915406

matej_suchanek changed the task status from Open to In Progress. May 6 2023, 10:20 AM
matej_suchanek claimed this task.

Alternatively, could a "negative mode" also be implemented?

Namely, the variables responsible for the huge API responses are old_wikitext and new_wikitext. A negative mode, i.e. excluding named variables rather than listing the wanted ones, would be an additional user-friendly way of drastically reducing the size of the API responses.
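As a client-side illustration of what such a "negative mode" would achieve (any server-side parameter syntax for it is purely hypothetical at this point), excluding just the two heavyweight variables is enough to shrink a typical hit dramatically:

```python
# Client-side equivalent of the proposed "negative mode": drop only the
# two variables known to dominate response size. A server-side syntax
# for this is hypothetical; nothing like it exists in the API today.

HEAVY_VARS = {"old_wikitext", "new_wikitext"}

def drop_heavy(entry):
    """Return a copy of an abuselog entry without the wikitext blobs."""
    entry = dict(entry)
    entry["details"] = {k: v for k, v in entry.get("details", {}).items()
                        if k not in HEAVY_VARS}
    return entry

# Illustrative entry: the wikitext fields dwarf everything else.
sample = {"id": 7,
          "details": {"added_lines": ["spam"],
                      "old_wikitext": "x" * 50_000,
                      "new_wikitext": "y" * 50_000}}
slim = drop_heavy(sample)
```

The advantage over a positive allow-list is that callers do not need to know every variable name in advance; they only name the ones they want excluded.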