
WE1.11.3 (WE1.3.5) Article similarity model
Open, Needs Triage, Public

Description

As part of the Recommender System for patrollers (T398071), this hypothesis tests how to create more comprehensive recommendations.

Hypothesis: If we build an article similarity model we can provide better personalized recommendations to editors based on their topics of interest.

Event Timeline

Progress:

  • This project was started this week
  • We are researching the potential to reuse intermediate outputs from the Language-agnostic Link-based Article Topic Model. By representing article topics as vectors, we can perform similarity measurements while building on a robust, proven pipeline, ultimately saving a significant amount of computational resources.

Any updates on metrics related to this hypothesis?

  • No

Any emerging blockers or risks?

  • No

Any unresolved dependencies?

  • No

Have there been any new lessons from the hypothesis?

  • Not yet

Have there been any changes to the hypothesis scope or timeline?

  • No

Progress (Second week)

  • Currently we are testing two options for finding similar articles:
    • Option 1 (Outlink Topic Model): With the help of @fkaelin, we have been exploring the technical feasibility of creating an article similarity model based on the Language-agnostic Link-based Article Topic Model. Our approach is to pre-compute, for each existing Wikipedia article, a set of similar articles (e.g., 10 similar articles). First results are promising in terms of computing time, suggesting that it would be possible to generate a monthly set of recommendations.
    • Option 2 (Search API, CirrusSearch): In parallel, I'm researching other ways to find similar articles. Specifically, I'm exploring the use of the "morelike" feature of the search API (CirrusSearch). So far, the main advantage of this approach is that it doesn't require new computations and is based on a solid, well-established service.
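As a rough illustration of Option 2, the sketch below builds a MediaWiki search API request that uses the CirrusSearch `morelike:` keyword and extracts page titles from the JSON response. The helper names (`morelike_url`, `titles_from_response`), the default host, and the limit of 10 are assumptions made for this example; they are not taken from the project code.

```python
from urllib.parse import urlencode


def morelike_url(title, limit=10, host="en.wikipedia.org"):
    """Build a search API URL asking CirrusSearch for pages similar to `title`.

    `host` and `limit` are illustrative defaults, not project settings.
    """
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f"morelike:{title}",  # CirrusSearch "morelike" keyword
        "srnamespace": 0,                 # restrict to the article namespace
        "srlimit": limit,
        "format": "json",
    }
    return f"https://{host}/w/api.php?{urlencode(params)}"


def titles_from_response(payload):
    """Extract result titles from a decoded `list=search` JSON response."""
    return [hit["title"] for hit in payload.get("query", {}).get("search", [])]


url = morelike_url("Alan Turing")
```

The URL can be fetched with any HTTP client; the decoded JSON body is then passed to `titles_from_response`. Keeping URL construction separate from the network call makes the request easy to inspect and test offline.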

Any updates on metrics related to this hypothesis?

  • Not yet.

Any emerging blockers or risks?

  • No.

Any unresolved dependencies?

  • No.

Have there been any new lessons from the hypothesis?

  • Not yet.

Have there been any changes to the hypothesis scope or timeline?

  • No.

Next Steps

  • First, we need to run a larger test of method (i). Our current experiments have used medium-size samples, and we need to test how this scales to the full dataset.
  • After that, we are going to design an off-line evaluation framework to compare the two proposed methods. This work should take around 2-3 weeks.

Progress

  • Technical feasibility: Confirmed the technical feasibility of generating "Top 10" similar articles for the top 47 Wikipedias using the Language-agnostic Link-based model. The code developed by @fkaelin shows that it would be possible to update this data monthly.
  • Baseline definition: Established the List Building tool and the CirrusSearch "morelike" feature as our two primary baselines for performance comparison. We are going to use the Jaccard index (set overlap) and Kendall tau (ranking similarity) to understand the differences across these three models.
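The two comparison metrics named above can be sketched as follows. This is a minimal version under stated assumptions: each model returns a ranked list of unique titles, and Kendall tau is computed only over the titles that appear in both lists (top-k lists from different models rarely contain exactly the same items, so some convention like this is needed; the one used in the actual evaluation may differ).

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets are identical
    return len(a & b) / len(a | b)


def kendall_tau(rank_a, rank_b):
    """Kendall tau-a restricted to items present in both rankings.

    Returns None when fewer than two items are shared, since no pair
    can be compared in that case.
    """
    pos_b = {item: i for i, item in enumerate(rank_b)}
    common = [item for item in rank_a if item in pos_b]
    if len(common) < 2:
        return None
    concordant = discordant = 0
    for x, y in combinations(common, 2):
        # x precedes y in rank_a by construction; check the order in rank_b.
        if pos_b[x] < pos_b[y]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical top-4 lists from two models for the same seed article.
model_1 = ["A", "B", "C", "D"]
model_2 = ["B", "A", "C", "E"]
overlap = jaccard(model_1, model_2)
agreement = kendall_tau(model_1, model_2)
```

Jaccard ignores order and only measures how many recommendations two models share, while Kendall tau measures whether the shared items are ranked in the same order, so the two numbers answer different questions.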

Any updates on metrics related to this hypothesis?

  • No

Any emerging blockers or risks?

  • No

Any unresolved dependencies?

  • No

Have there been any new lessons from the hypothesis?

  • Using a Spark+Python framework, we can efficiently run Nearest Neighbors Search (NNS) for all Wikipedia articles. NNS is a key building block for most personalization and recommender tasks, so this technical capacity opens several product opportunities.
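To make the nearest-neighbors step concrete, here is a small brute-force sketch in plain Python. It is not the Spark pipeline mentioned above: it compares every pair of articles, which is only practical for small samples. Cosine similarity and the function names are assumptions made for illustration.

```python
import heapq
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k_neighbors(embeddings, k=10):
    """Return, for each title, the k most similar other titles.

    `embeddings` maps article title -> topic vector (list of floats).
    Brute force: O(n^2) comparisons, suitable only for small inputs.
    """
    unit = {title: normalize(vec) for title, vec in embeddings.items()}
    neighbors = {}
    for title, vec in unit.items():
        scored = (
            (sum(a * b for a, b in zip(vec, other_vec)), other_title)
            for other_title, other_vec in unit.items()
            if other_title != title
        )
        # nlargest keeps only the k best (similarity, title) pairs.
        neighbors[title] = [t for _, t in heapq.nlargest(k, scored)]
    return neighbors


# Toy two-dimensional vectors, invented for the example.
toy = {"Cat": [1.0, 0.0], "Dog": [0.9, 0.1], "Car": [0.0, 1.0]}
result = top_k_neighbors(toy, k=2)
```

At the scale of all Wikipedia articles, the pairwise loop would need to be replaced by a distributed or approximate search, which is the role the Spark+Python framework plays in the work described here.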

Have there been any changes to the hypothesis scope or timeline?

  • No

Summary

  • Hypothesis: If we build an article similarity model we can provide better personalized recommendations to editors based on their topics of interest.

Progress

  • We ran an offline evaluation for a recommender system (RecSys) based on users' topic interests.
  • The RecSys works as follows:
    • For the target user, we take their last 200 edits.
    • For each edit, we obtain the topic embedding for that article (a vector representing the article's topic).
    • We obtain a vector representation of the user's edit interests (the centroid of the edited articles).
    • Then, we consider new edits on that wiki and try to predict which pages will be patrolled by the user, considering their interests.
    • We compared this topic-based approach with a simple baseline of previously edited pages.
  • We found that while around 30% of reverts can be explained by previous edits, just 2% of edits are explained by topic interest. This means that editors tend to patrol the content they have previously edited, and they rarely edit new content. However, the 2% explained by topic interest is useful for predicting the patrolling of new pages; therefore, topic-based recommendations show potential to help patrollers discover new content.
  • We presented the results to the Moderation Tools team, showed examples, and discussed the importance of evaluating the quality of the "new page discovery" recommendations.
  • We decided to contact the User Experience (Research) team to ask for help with running a user evaluation.
  • We deployed a demo (running on Simple Wikipedia): https://patrollers-recsys-demo.wmcloud.org/
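The RecSys steps listed above (last 200 edits, per-article topic embeddings, centroid as the user profile, scoring of new pages) can be sketched roughly as follows. Cosine similarity as the scoring function, the assumption that the edit history is ordered oldest to newest, and all names in the code are illustrative choices, not details confirmed by the evaluation.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(edited_titles, candidate_titles, embeddings, max_history=200):
    """Rank candidate pages by similarity to the user's topic profile.

    `edited_titles` is assumed to be ordered oldest to newest, so the
    slice keeps the most recent `max_history` edits. Titles without an
    embedding are skipped.
    """
    history = [embeddings[t] for t in edited_titles[-max_history:] if t in embeddings]
    if not history:
        return []
    profile = centroid(history)
    scored = [
        (cosine(profile, embeddings[t]), t)
        for t in candidate_titles
        if t in embeddings
    ]
    return [title for _, title in sorted(scored, reverse=True)]


# Toy vectors, invented for the example.
toy = {"Cat": [1.0, 0.0], "Dog": [0.9, 0.1], "Lion": [0.8, 0.2], "Car": [0.0, 1.0]}
ranking = rank_candidates(["Cat", "Dog"], ["Car", "Lion"], toy)
```

The "previously edited pages" baseline corresponds to checking whether a candidate title already appears in `edited_titles`, whereas this topic-based ranking can also score pages the user has never touched.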

Any updates on metrics related to this hypothesis?

  • 30% of reverts can be explained by previous edits to the same article.
  • Topic-based edits only account for 2% when looking at past data.

Any emerging blockers or risks?

  • No

Any unresolved dependencies?

  • No

Have there been any new lessons from the hypothesis?

  • The offline evaluation (based on historical data) has limitations when evaluating the potential of this RecSys because we are trying to introduce a new behavior. In these scenarios, qualitative evaluations are helpful.

Have there been any changes to the hypothesis scope or timeline?

  • TBD

Next Steps:

Samwalton9-WMF renamed this task from WE1.3.5 Article similarity model to WE1.3.5 / WE1.11.3 Article similarity model. Apr 17 2026, 10:19 AM
Samwalton9-WMF renamed this task from WE1.3.5 / WE1.11.3 Article similarity model to WE1.11.3 (WE1.3.5) Article similarity model. Tue, Apr 21, 1:34 PM