
[SPIKE] Research which sections and articles on Wikipedia should not have topics generated from bluelinks
Open, Needs Triage, Public

Description

As part of the work on section topics (Epic https://phabricator.wikimedia.org/T275882), we are excluding certain sections from having topics, as per https://phabricator.wikimedia.org/T318092 and https://phabricator.wikimedia.org/T279519#7368694. These requirements are based on similar requirements defined for the 'Add a link' newcomer task, https://phabricator.wikimedia.org/T279519.

The goal of this spike is to research which other sections, and which articles in general, should not have section topics, to capture those observations, and to decide which exclusions, if any, to implement.

Usage Note:
Note that section topics will be used for section-level image suggestions, and certain sections are excluded from receiving image suggestions as per https://phabricator.wikimedia.org/T311730. The sections excluded from having topics are a subset of the sections excluded from having images recommended.

Note
The Growth team has done work on which types of articles to exclude for structured tasks.

Event Timeline

AUgolnikova-WMF renamed this task from [SPIKE] Research which sections of articles on Wikipedia should not have topics generated from bluelinks to [SPIKE] Research which sections and articles on Wikipedia should not have topics generated from bluelinks. Oct 17 2022, 2:11 PM
AUgolnikova-WMF updated the task description.

As a side note, we should filter out category links of the form Category:Something. These are attached to the last section, but we don't want to skip that section entirely; see the update in the description of T318092: [M] Exclude certain sections from having topics in the section topics pipeline.
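
To illustrate the category filtering suggested in this comment, here is a minimal sketch. It is not the section topics pipeline's actual code. It assumes a section's bluelinks are available as plain title strings, and the function name `drop_category_links` and its `category_prefixes` parameter are hypothetical. Only the English `Category:` prefix is handled by default; localized namespace names on other wikis would have to be passed in.

```python
def drop_category_links(link_titles, category_prefixes=("Category:",)):
    """Return the link titles that do not point to a category page.

    Hypothetical helper: assumes each link is a plain title string such as
    "Paris" or "Category:Cities in France".
    """
    lowered_prefixes = tuple(prefix.lower() for prefix in category_prefixes)
    kept = []
    for title in link_titles:
        # A category can also be linked as ":Category:Foo", so drop a
        # leading colon and surrounding whitespace before comparing.
        normalized = title.strip().lstrip(":").lower()
        if normalized.startswith(lowered_prefixes):
            continue  # skip the category link, keep the rest of the section
        kept.append(title)
    return kept


last_section_links = ["Paris", "Category:Cities in France", "France"]
filtered_links = drop_category_links(last_section_links)
```

Filtering at the link level, rather than skipping the last section, keeps that section's regular bluelinks available for topic generation.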

@mfossati @AUgolnikova-WMF is this ticket still valid, or can it be closed? I think we have more specific tickets on this topic now.

@CBogen This task is to research, more generally and for the long term, which sections and articles should not have section topics, beyond the exclusions we have identified for our initial implementation, as per our discussion with Marco. I would suggest leaving it in the backlog as a reminder that we want to research this more broadly.