Newcomer tasks: exclude maintenance templates that are too old
Open, Needs Triage, Public

Description

This is an idea that came from our team's offsite that sounds simple, but is technically challenging enough that it needs its own Phabricator task.

While attempting the edits suggested by maintenance templates, our team found that templates applied a long time ago (years ago, for example) are harder to work with: the article has often changed a lot since the template was applied, but the template was never removed.

So the idea is to exclude templates older than a certain age from suggested edits.

Event Timeline

This would probably require us to push custom data to Elasticsearch, a capability that a number of other features need as well (e.g. ORES-based topic filtering will probably require it too). Might be worth its own task.

Actually, this is a bit more complicated, as it would require us not to overwrite existing data (when a maintenance template is added, push the date to ES; on subsequent edits, don't delete it).
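The "don't overwrite" rule above could be sketched roughly as follows. This is only an illustration of the merge logic, not actual CirrusSearch or Elasticsearch code: the function name `merge_template_dates` and the shape of the stored data (a dict of template name to date string) are assumptions made up for this example.

```python
def merge_template_dates(existing, current_templates, edit_date):
    """Return the template -> date mapping to index after an edit.

    existing:          dict of template name -> date already stored in the
                       search index (hypothetical field, assumed for this sketch)
    current_templates: set of maintenance template names on the page after the edit
    edit_date:         date string of the edit being processed
    """
    merged = {}
    for name in current_templates:
        # Keep the date recorded when the template was first seen;
        # only newly added templates get the date of this edit.
        merged[name] = existing.get(name, edit_date)
    # Templates no longer on the page are dropped from the mapping.
    return merged


stored = {"Citation needed": "2015-03-01", "Orphan": "2016-07-12"}
after_edit = merge_template_dates(
    stored, {"Citation needed", "Copy edit"}, "2019-11-20"
)
# "Citation needed" keeps its 2015 date, "Copy edit" gets the new date,
# and "Orphan" is gone because the template was removed.
```

Note that this sketch forgets the date as soon as a template is removed, so a remove-and-revert cycle would reset it; keeping removed entries around for some time would be one way to avoid that.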

Would it help if the templates were changed to expose the timestamp in a more structured way? For example, wrapping it in an element with a CSS class, which is a transparent change: <span class="timestamp">XX</span>
This has been done for the mobile view, to allow templates to be displayed properly by the skin, no matter the language or the local style.
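If templates did emit such a marker, reading it back from the rendered HTML would be straightforward. A minimal sketch using Python's standard `html.parser`, assuming the hypothetical `timestamp` class from the example above and no nested spans inside it:

```python
from html.parser import HTMLParser


class TimestampExtractor(HTMLParser):
    """Collect the text of every <span class="timestamp"> element.

    The "timestamp" class name is the hypothetical one suggested above;
    nested spans inside the marker are not handled by this sketch.
    """

    def __init__(self):
        super().__init__()
        self._inside = False
        self.timestamps = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "timestamp" in classes:
            self._inside = True
            self.timestamps.append("")

    def handle_endtag(self, tag):
        if tag == "span":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.timestamps[-1] += data


def extract_timestamps(html):
    parser = TimestampExtractor()
    parser.feed(html)
    return parser.timestamps


rendered = '<div class="ambox">Needs sources <span class="timestamp">2014-06</span></div>'
found = extract_timestamps(rendered)  # ['2014-06']
```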

Do they have the timestamp in the first place? I'd imagine most don't.

Not directly, but every template was inserted at some point. Yes, that's next to impossible to determine in real time... However, Wikiblame's (http://wikipedia.ramselehof.de/wikiblame.php) algorithm seems to be reasonably fast and might be worth a look.

Wikiblame does a binary search on the page history, which means it's not guaranteed to find the latest instance of the template being inserted. (Sometimes that's a good thing: maintenance templates sometimes get removed by offended authors, who then get reverted. Contentious articles, which often have maintenance templates, occasionally get page-blanking vandalism. In these cases, it would be better not to reset the template date. But the page might also get the same template multiple times for unrelated reasons, and in that case we don't care about the date of the earlier one.) It can use linear search instead, but that will be slow on large pages (even binary search can be, since you need to fetch the full list of revisions first).
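To illustrate the difference, here is a simplified sketch (not Wikiblame's actual code). The page history is modelled as a list of booleans, oldest revision first, saying whether the template is present in each revision; in practice each lookup would be a revision fetch. The function names are made up for this example.

```python
def latest_insertion_linear(has_template):
    """Walk back from the newest revision to the start of the current
    uninterrupted run of revisions containing the template."""
    if not has_template or not has_template[-1]:
        return None  # template is not on the current revision
    i = len(has_template) - 1
    while i > 0 and has_template[i - 1]:
        i -= 1
    return i


def insertion_binary(has_template):
    """Binary search for *an* absent -> present transition.

    Only guaranteed to find the latest insertion if the template was
    never removed and re-added in between.
    """
    if not has_template or not has_template[-1]:
        return None
    if has_template[0]:
        return 0  # present since the first revision
    lo, hi = 0, len(has_template) - 1  # invariant: absent at lo, present at hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if has_template[mid]:
            hi = mid
        else:
            lo = mid
    return hi


# Template added in revision 1, removed in revision 4, re-added in revision 5.
history = [False, True, True, True, False, True, True]
linear_result = latest_insertion_linear(history)  # 5 (the re-addition)
binary_result = insertion_binary(history)         # 1 (the original addition)
```

With this history the binary search lands on the original insertion, which is the desired answer if revision 4 was vandalism or a reverted removal, and the wrong one if the template was re-added for an unrelated reason.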