Orphan articles as reading recommendations
Open, Needs Triage, Public

Description

In T341846, we built a proof of concept for surfacing orphan articles as recommendations to readers in order to increase their visibility. While this works well for some articles (example 1), the current approach times out if the article exists in many languages (example 2) because of the large number of API calls needed to find the recommendations. This means that the current implementation is not ready to be used in practice. However, discussions with the Web Team around annual planning indicate that recommending orphans to readers would be very relevant, which highlights the need to improve the current prototype.

Therefore, in this task we want to find a better way of generating the recommendations, specifically by reducing the time needed to serve them, in order to make them more useful in practice.

Event Timeline

weekly update:

  • spent some time trying to figure out what the bottleneck is in the current setup
  • started brainstorming options for improvement, such as i) pre-computing look-up tables, ii) narrowing down candidates earlier in the pipeline, iii) using embeddings to take advantage of fast approximate nearest-neighbor lookup, etc.
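
To make option iii) concrete, here is a minimal sketch of a nearest-neighbor lookup over article embeddings, restricted to orphan candidates. It is only an illustration, not the approach used in the prototype: the embeddings are made-up toy vectors, the names `cosine`, `nearest_orphans` and `ORPHAN_VECS` are hypothetical, and the search is an exact brute-force scan. An approximate nearest-neighbor index would replace the scan in practice.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_orphans(query_vec, orphan_vecs, k=3):
    """Return the titles of the k orphans most similar to query_vec.

    Exact brute-force scan over all candidates; an approximate
    nearest-neighbor index would take its place at scale.
    """
    scored = [(cosine(query_vec, vec), title) for title, vec in orphan_vecs.items()]
    # Highest similarity first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:k]]


# Toy, made-up 3-dimensional embeddings (assumption, not real data).
ORPHAN_VECS = {
    "Orphan A": [1.0, 0.0, 0.0],
    "Orphan B": [0.0, 1.0, 0.0],
    "Orphan C": [0.9, 0.1, 0.0],
}
```

Because only orphans are stored in the lookup table, the candidate set is narrowed before any similarity is computed, which also touches on option ii).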

weekly update:

  • no update this week

weekly update:

  • no update this week because I didn't manage to free up time for this task (shorter week, annual planning deadlines, Wiki Workshop submission deadline)

weekly update:

  • investigated how we could take advantage of existing recommendations from the RelatedArticles feature. The feature obtains its recommendations from queries to CirrusSearch's morelike. Among the options available in the API, two seem interesting:
    • filtering or boosting by the number of inlinks (i.e. prioritizing articles with low indegree), e.g., via `srsort=incoming_links_asc`
    • using the `morelikethis` option, which can be combined with a filter on specific templates (`hastemplate`). In our case, we could then require recommendations to carry the Orphan template. The query then yields related articles that are marked as orphans via the corresponding template. Example query: https://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=morelikethis:Tiwanaku%20hastemplate:%22orphan%22&srlimit=10
    • the latter approach seems especially promising (as it could also be adapted to other languages) but would require some evaluation of the quality of the recommendations. While the recommendations are orphans, it is not clear how well they are related to the query article. However, in contrast to link translation, the queries via morelike/morelikethis are much faster.
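
As a rough sketch of the `morelikethis` plus `hastemplate` query above, the parameters can be assembled programmatically so that the same request can be issued for other articles and language editions. The function names, the `lang` argument and the added `format=json` parameter are assumptions made for illustration. The template name would also need to be adapted per wiki, and multi-word titles may need quoting in the search string, which has not been verified here.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_orphan_search_params(title, template="orphan", limit=10, sort_by_inlinks=False):
    """Build search API parameters for orphan articles related to `title`."""
    params = {
        "action": "query",
        "list": "search",
        # Same search string as in the example query above.
        "srsearch": f'morelikethis:{title} hastemplate:"{template}"',
        "srlimit": str(limit),
        "format": "json",  # assumption: not part of the original example query
    }
    if sort_by_inlinks:
        # Optional: order results by ascending number of incoming links.
        params["srsort"] = "incoming_links_asc"
    return params


def build_orphan_search_url(title, lang="en", **kwargs):
    """Return the full API URL for a given language edition of Wikipedia."""
    base = f"https://{lang}.wikipedia.org/w/api.php"
    return f"{base}?{urlencode(build_orphan_search_params(title, **kwargs))}"


url = build_orphan_search_url("Tiwanaku")
```

Since each recommendation needs only a single search request per article, this avoids the many per-language API calls that cause the timeouts in the current prototype. How well the returned orphans relate to the query article still needs to be evaluated.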

weekly update:

  • no update since my main focus was on Wiki Workshop T352543

weekly update:

  • no update. main focus was preparing for attending ICWSM T362416