
AutoEdits: Wrong classification: "Generic rollback" is shown instead of "Huggle" for Huggle 3 contributions
Open, Stalled, Needs Triage | Public | 5 Estimated Story Points

Event Timeline

This has been partially fixed. The issue is that tags overlap on some semi-automated edits. For instance, Twinkle reverts are also tagged as Undo by MediaWiki, so it's difficult for us to show one but not the other. The same is true of Rollback. I managed to filter out the Huggle edits (which have their own tag, "Huggle"), but STiki edits do not have their own tag, so they aren't filtered out from "Generic rollback" and "Undo".

So, improvement at least... But we still need to do more brainstorming on how to fix this for all semi-automated tools in an efficient and sane way.

:D

Thank you very much. My naively straightforward through-the-wall approach here would be adding tags for STiki, Twinkle, Huggle, whichever are not yet being tagged by MediaWiki on en.wikipedia.org. Lack of tags is a problem? Then let's add tags! :)

That still won't fix the problem for other scripts, but you're right that STiki should tag, especially given its popularity. I've proposed this at WT:STiki.

MusikAnimal moved this task from Working to General / other on the XTools board.
MusikAnimal subscribed.

I probably won't be working on this for a while, moving back to Ready for the time being.

Thank you very much for the detailed explanation and the request at WT:STiki :)

1234qwer1234qwer4 changed the task status from Open to Stalled. Edited Sep 8 2020, 8:41 PM
1234qwer1234qwer4 subscribed.

Another update; the STiki tag is now excluded from both undo and rollback.

The only possible way seems to be a regex_excludes parameter, similar to the aforementioned tag_excludes, that would ignore any edit matching the given regex, whether that edit was found through the regex parameter or through tag. I've already mentioned this at T262147; I suggest creating a separate task for it.
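To make the proposed behaviour concrete, here is a minimal sketch in Python (not XTools code). The config keys mirror the parameter names discussed above, but the regex values, the `is_match` helper and the overall structure are assumptions made up for illustration.

```python
import re

# Hypothetical tool definition. The patterns are illustrative only and are
# not copied from the real AutoEdits configuration.
TOOLS = {
    "Generic rollback": {
        "regex": r"^Reverted edits by",
        "tags": ["mw-rollback"],
        "regex_excludes": [r"WP:STiki"],  # the proposed new parameter
    },
}

def is_match(tool, summary, tags):
    """Return True if the edit counts for this tool.

    An edit is first found through the summary regex or through a tag.
    It is then dropped if its summary satisfies any regex_excludes
    pattern, no matter which of the two ways found it.
    """
    found_by_regex = "regex" in tool and re.search(tool["regex"], summary) is not None
    found_by_tag = any(t in tags for t in tool.get("tags", []))
    if not (found_by_regex or found_by_tag):
        return False
    return not any(re.search(p, summary) for p in tool.get("regex_excludes", []))
```

With this, an untagged STiki revert whose summary matches the rollback regex would no longer be counted as "Generic rollback", and neither would one that was only found through the rollback tag.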

Well, seven years later: the reported issue in autoedits-contributions is fixed for tagged edits by my patches for T382773: "tag_excludes" in AutoEdits configuration does not behave as expected (as in, the Huggle 3 edits are not shown in ?tool=Generic rollback). The problem still persists for older edits (e.g. if you look at the table).

On regex_excludes: it could help. If we wanted to do that, IMO it would be better to unify it under tool_excludes to avoid regex redundancy. And, again to avoid redundancy, if we did that I think we also ought to restructure: get all the data and do unified processing in one go (as we'd be using the results of "does edit X match tool Y" in other tools' tests).
Possible workflow:

  1. get all DB info we want on all edits
  2. inject tags and summary regexes to add to each edit a matched_tools or similar column
  3. inject tool_excludes logic to change these matched_tools (well, replace by a new one)
  4. also select counts of edits for each tool and of edits with one/no tools. Perhaps also add a has_tools column or similar to preprocess the "does this edit have at least one tool?" checks
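The four steps above can be sketched roughly as follows. This is an in-memory Python illustration under stated assumptions, not the actual XTools implementation: the tool definitions, the sample patterns and the function names are all hypothetical, and a real version would start from a database query rather than a list of dicts.

```python
import re
from collections import Counter

# Hypothetical tool definitions; tool_excludes names other tools whose
# presence on the same edit removes this tool from the result.
TOOLS = {
    "Huggle": {"tags": ["huggle"], "regex": r"\(HG\)"},
    "Generic rollback": {"tags": ["mw-rollback"], "tool_excludes": ["Huggle"]},
    "Undo": {"tags": ["mw-undo"]},
}

def match_tools(edit):
    """Step 2: every tool whose tags or summary regex match the edit."""
    matched = set()
    for name, tool in TOOLS.items():
        by_tag = bool(set(tool.get("tags", [])) & set(edit["tags"]))
        by_regex = "regex" in tool and re.search(tool["regex"], edit["summary"]) is not None
        if by_tag or by_regex:
            matched.add(name)
    return matched

def apply_excludes(matched):
    """Step 3: build a new set, dropping tools excluded by another match."""
    return {
        name for name in matched
        if not (set(TOOLS[name].get("tool_excludes", [])) & matched)
    }

def process(edits):
    """Steps 1 to 4 on edits that have already been fetched (step 1)."""
    rows = []
    for edit in edits:
        tools = apply_excludes(match_tools(edit))
        rows.append({**edit, "matched_tools": sorted(tools), "has_tools": bool(tools)})
    per_tool = Counter(t for row in rows for t in row["matched_tools"])
    automated = sum(row["has_tools"] for row in rows)
    return rows, per_tool, automated
```

Note that the excludes are evaluated against the raw matches from step 2, so the outcome does not depend on the order in which tools are defined.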

I don't know, though, whether our current practice of doing the filtering in SQL is faster than PHP post-processing. If it is, then the way to go would, I suppose, be a four-level nested query. This would leave us with zero PHP post-processing, I believe. The autoedits-contributions and non-automated lists would use the basic edit data we would keep in all layers, along with matched_tools; the table would use the COUNTs of edits per tool; the pie chart would use has_tools if we use that, or just something like count($edit['matched_tools']) > 0. Otherwise, we could just get the DB info in SQL and do everything else in PHP.
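As a rough illustration of how each view could read from the same processed rows, here is a small Python sketch. The sample rows and function names are hypothetical; only the matched_tools column name comes from the plan above.

```python
from collections import Counter

# Hypothetical rows, as they might look after the unified processing step.
rows = [
    {"rev_id": 1, "matched_tools": ["Huggle"]},
    {"rev_id": 2, "matched_tools": ["Generic rollback"]},
    {"rev_id": 3, "matched_tools": []},
    {"rev_id": 4, "matched_tools": ["Twinkle"]},
]

def contributions_for(tool, rows):
    """autoedits-contributions: edits attributed to one tool."""
    return [r["rev_id"] for r in rows if tool in r["matched_tools"]]

def non_automated(rows):
    """Non-automated list: edits with no matched tool."""
    return [r["rev_id"] for r in rows if not r["matched_tools"]]

def tool_counts(rows):
    """Table: number of edits per tool."""
    return Counter(t for r in rows for t in r["matched_tools"])

def pie_split(rows):
    """Pie chart: (automated, non-automated) edit counts."""
    automated = sum(1 for r in rows if len(r["matched_tools"]) > 0)
    return automated, len(rows) - automated
```

Every view here derives from one pass over the data, which is the point of the restructure; whether that beats the current per-tool SQL filtering would still need measuring.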

That's a draft plan. On time: I believe we're already applying all regexes at least once to every tool, so this shouldn't slow us down much. Depending on how it all combines, we might even gain time: the current bunch of UNIONs for the table makes many separate queries on the tables, and notably we probably check the tags of an individual edit 30ish times; though again, it's possibly faster that way. Only one way to find out.

(I don't know if this should necessarily be in a separate task, as it's the fix for issues like this one, precisely.)