
Deal with 'anchored' labels
Closed, Resolved, Public

Description

Continued from T104707#1444155

The slurpInterwiki.js gadget can add labels together with sitelinks when requested, but it could also add labels containing anchors (#) when such interwiki links were present in the article.

As such labels still exist, I suggest removing them all and/or finding a replacement (e.g. from existing aliases).

Event Timeline

matej_suchanek assigned this task to Ladsgroup.
matej_suchanek raised the priority of this task from to Needs Triage.
matej_suchanek updated the task description.
matej_suchanek added a project: Wikidata.
matej_suchanek set Security to None.

It seems that about 1000 items have incorrectly anchored labels. But fixing them is very tricky and can't be done by bots (e.g. if we remove everything after the #, it makes a mess and leaves the item in a worse state than before).

So I created a list of them and put it in a Google Sheet so you can fix them:
https://docs.google.com/spreadsheets/d/16kKXKUcfSCSwIXT9jkOrIleTStX2e13pebHQhqYaHtI

I am closing this bug as resolved, but if you have an idea for fixing the labels automatically, I would be more than happy to reopen it.
Best

Thanks for the list. I think that we could reduce the number of labels by excluding those with a space before or after '#'. These are certainly false positives.
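
To illustrate the filter, here is a minimal Python sketch. It assumes the labels are available as plain strings (for example, exported from the sheet); the function names `has_spaced_hash` and `likely_anchored` are made up for this example and are not part of any existing tool.

```python
import re

# Matches a '#' that has whitespace directly before or after it.
SPACED_HASH = re.compile(r"\s#|#\s")


def has_spaced_hash(label: str) -> bool:
    """Return True if a '#' in the label is adjacent to whitespace."""
    return SPACED_HASH.search(label) is not None


def likely_anchored(labels):
    """Keep only labels that contain '#' with no whitespace around it."""
    return [
        label for label in labels
        if "#" in label and not has_spaced_hash(label)
    ]
```

Labels such as "C#" would still pass this filter, so the result remains a candidate list rather than a list of confirmed errors.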

First idea: strip the first part of the label (before #) if it begins with "List of ", "Seznam ", "Liste der ", "Lijst van ", "Lista över ", "Anexo:" etc.

There are 118 labels like that and we have several false positives like "List of number-one singles in Australia during the 1940s#1940". I'm in favor of fixing them manually or semi-automatically.
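
A semi-automatic pass could look roughly like the following Python sketch. It only proposes a replacement and leaves the decision to a human. The prefix tuple contains just the examples named above (the "etc." is not filled in), and `propose_label` is a hypothetical helper name, not an existing bot function.

```python
# Prefixes taken from the examples above; the real list would be longer.
LIST_PREFIXES = (
    "List of ", "Seznam ", "Liste der ", "Lijst van ", "Lista över ", "Anexo:",
)


def propose_label(label: str):
    """Suggest the part after '#' when the part before it looks like a list title.

    Returns None when there is no usable suggestion. The result is only a
    proposal and still needs manual review.
    """
    before, sep, after = label.partition("#")
    if not sep or not after.strip():
        return None  # no anchor, or nothing after it
    if before.startswith(LIST_PREFIXES):
        return after.strip()
    return None


# The false positive mentioned above shows why review is needed:
# the proposal "1940" is not a sensible label for the item.
suggestion = propose_label(
    "List of number-one singles in Australia during the 1940s#1940"
)
```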