Add search filter for time since last edit
Closed, Resolved (Public)

Description

Background

Newcomers can find Suggested Edits on their Special:Homepage to get them started with editing Wikipedia. Ideally, we do not want to direct newcomers into ongoing edit wars. For our existing structured tasks, like Add a Link, we therefore have systems in place to ensure we do not surface pages that have been edited in the last 24 hours.

The Growth team is now working with the Machine Learning and Editing teams on a new structured task that helps newcomers revise promotional tone in articles.

The ML team and the Data Persistence team are working towards a pipeline that enables this new structured task at scale (T401021). However, that pipeline would become much more complex if we introduced the requirement that the page must not have been edited in the last 24 hours.

To that end, it would be great if we could make this a filter in the CirrusSearch request that surfaces candidate pages in the first place. That request would leverage the weighted tags added by the pipeline above, akin to hasrecommendation:link.

If that is possible, it would unlock several additional simplifications in GrowthExperiments and could be used in more circumstances than this one use case.

Acceptance criteria:

  • CirrusSearch supports another filter, like age:1d, that limits results to pages whose latest revision is older than 24 hours.
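
To make the request concrete, here is a rough sketch of how a client might combine the weighted-tag keyword with such an age filter in a single search request. It only builds the URL for the MediaWiki Action API search module and does not send it. The keyword age:1d is the placeholder syntax from the criterion above, not necessarily the keyword that ends up being implemented, and the function name and defaults are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Action API endpoint of English Wikipedia; any wiki running CirrusSearch would do.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_candidate_search_url(tag_keyword, age_keyword, limit=10):
    """Build a search URL that combines a weighted-tag keyword with an age filter.

    Both keywords are passed through as-is. "age:1d" is only the placeholder
    syntax from this task; the real keyword may differ.
    """
    params = {
        "action": "query",
        "list": "search",
        # CirrusSearch keywords are combined by separating them with a space.
        "srsearch": f"{tag_keyword} {age_keyword}",
        "srlimit": limit,
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


# Example: candidate pages with a link recommendation, not edited recently.
url = build_candidate_search_url("hasrecommendation:link", "age:1d")
query = parse_qs(urlsplit(url).query)
print(query["srsearch"][0])
```

With a filter like this, the "not edited in the last 24 hours" requirement would live in the search query itself rather than in the data pipeline.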

Event Timeline

@Michael sorry I completely missed you created this ticket before mine, should I merge T403593 into this one or would you prefer to keep one to track the work on your side?

I don't have strong feelings either way. I think the current granularity of these tasks lends itself to keeping both open separately. Mine describes the need and the reasons behind it at a high level; yours is more detailed and focused on implementation and solution. I think both can stay open in their own right.
But I don't mind. And if at some point you want to take mine over as an epic or hypothesis task, then that's also fine with me!

It's not super clear if Search Platform needs to do this work, and what the priority and timeline are. We'll review this in a week if we have more input from Growth team.

@Gehel - The Growth and Machine Learning teams hope the Search Platform team can take ownership of this work, if possible.

From our perspective, this is a high-priority task, as the underlying logic is required before the Revise Tone Structured Task can be released to production (tracked under WE1.1 in the current annual plan). While alternative implementations are possible, the options we explored would add significant complexity to the related data pipeline.

Ideally, the subtask T403593 ("CirrusSearch should allow filtering on page creation and last edit timestamps") can be completed by the end of October 2025.

The T403593 subtask is now deployed to production and ready for use. The feature is documented in Help:CirrusSearch.

Thank you for the fast work here! I'm looking forward to using this in the Revise Tone work and other places 😊