Add CW#557: Detect missing space before internal link
Closed, Resolved, Public

Description

I noticed that in some articles, when text such as goalkeeper[[1910 Copa del Rey]] appears, the program does not suggest adding a space so that it would read "goalkeeper 1910 Copa del Rey". Is there a way to have the program do this, or some code I can use to do it?

@NicoV responded: “Hi Bakertheacre. Currently, there's nothing for such cases. One way would be to add a dedicated algorithm to detect text before an internal link. If you're interested, please open a Phabricator task. --NicoV (Talk on frwiki) 1:36 am, Today (UTC−7)”
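As a rough illustration of what "detect text before an internal link" could mean, here is a minimal Python sketch. It is not WPCleaner's code (WPCleaner is a separate tool with its own implementation); the pattern and the function name `find_missing_space` are assumptions made up for this example.

```python
import re

# Hypothetical pattern: a run of word characters immediately followed by an
# internal link "[[...]]", with no whitespace in between.
MISSING_SPACE = re.compile(r"(\w+)(\[\[[^\[\]]+\]\])")


def find_missing_space(wikitext):
    """Return (preceding_text, link) pairs where no space separates them."""
    return [(m.group(1), m.group(2)) for m in MISSING_SPACE.finditer(wikitext)]


print(find_missing_space("He was the goalkeeper[[1910 Copa del Rey]] winner."))
```

A pattern this naive would also flag legitimate cases such as chemical formulas, unit prefixes, or redirects, so some way of filtering the results would be needed.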

Event Timeline

NicoV renamed this task from "Spacing" to "Add CW#5xx: Detect missing space before internal link". Jul 24 2020, 3:21 PM
NicoV updated the task description.
NicoV moved this task from Backlog to CheckWiki on the WPCleaner board.
NicoV renamed this task from "Add CW#5xx: Detect missing space before internal link" to "Add CW#557: Detect missing space before internal link". Jul 24 2020, 6:13 PM

@Bakertheacre It will be implemented as error #557.

Currently, it generates too many false positives to be activated by default for everyone. If you want to test it, you need to activate it for yourself only, by configuring it in User:Bakertheacre/WikiCleanerConfiguration. There is an example at User:WikiCleanerBot/WikiCleanerConfiguration: the line error_557_bot_enwiki=true END.

The version with this new error is not released yet. Once it's released, you can try it to see what false positives you get.

Hi @NicoV, I did what you said and copied the code over to my page. When I run WPCleaner, I don't see any of the numbers listed. Is there something special I need to do so that it comes up? Thanks.

@Bakertheacre
For these errors, there are no lists, except for the ones I've generated (see the subpages of WPC all). You only see them when you edit a page with such an error: try creating an error in a page to see if it's detected.

Hi @NicoV, I did a test on a mainspace article and the error was found. Here is the link to the image file: {F31947955}. It looks like this would be a manual edit at this point, rather than a right-click suggestion to add the whitespace.

There are many false positives, so we need to find ways to ignore them: a list of possible prefixes, #REDIRECT...

  • Alkali metal: n[[ohm|Ω]], Li[[methyl group|Me]], Li[[fluorine|F]], Li[[hydroxide|OH]]
  • Arsenic: As[[sulfur|S]]
  • American Sign Language: thesm[[chereme]]
  • Amino acid: anti[[Fouling|scaling]]
  • Antimatter: anti[[baryon]]
  • Air Pollution: #REDIRECT[[Air pollution]]
  • Absolute magnitude: milli[[minute of arc|arcseconds]]
  • Anomalous operation: #REDIRECT[[parapsychology]]
  • List of Anglo-Saxon monarchs and kingdoms: #REDIRECT[[Heptarchy#List_of_Anglo-Saxon_kingdoms]]
  • Analgesic: Di[[nicotinic acid]]
  • Accelerated Graphics Port: i[[440LX]]
  • Antibody: k[[Dalton (unit)|Da]]

@Bakertheacre
I've modified WPCleaner to use a configurable list of allowed prefixes.
I've done an initial configuration, feel free to add other prefixes.
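To illustrate the idea of a configurable list of allowed prefixes, here is a small Python sketch. It is only an assumption about how such filtering could work, not WPCleaner's actual logic or configuration format. The names `ALLOWED_PREFIXES`, `is_reportable`, and `detect` are hypothetical, and the prefixes are simply taken from the false positives listed above.

```python
import re

# Illustrative only: prefixes taken from the false positives reported in this
# task, not from WPCleaner's real configuration.
ALLOWED_PREFIXES = {"anti", "milli", "k", "n", "i", "Li", "As", "Di"}

# Text (optionally starting with "#") immediately followed by an internal link.
CANDIDATE = re.compile(r"(#?\w+)(\[\[[^\[\]]+\]\])")


def is_reportable(prefix):
    """Decide whether text glued to a link should be reported as an error."""
    if prefix.upper() == "#REDIRECT":
        return False  # redirect syntax is valid without a space
    return prefix not in ALLOWED_PREFIXES


def detect(wikitext):
    """Return the suspicious 'text[[link]]' fragments found in wikitext."""
    return [m.group(0) for m in CANDIDATE.finditer(wikitext)
            if is_reportable(m.group(1))]


print(detect("anti[[baryon]] and goalkeeper[[1910 Copa del Rey]]"))
```

Keeping the prefixes in a plain set makes it easy to extend per wiki, which matches the spirit of "feel free to add other prefixes".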

I'm currently testing automatic fixing of some cases on frwiki:

  • ab[[Abcde|cde]] replaced by [[abcde]]
  • xx[[yyyy| zzzz]] replaced by xx [[yyyy|zzzz]]
  • xx[[ yyyy]] replaced by xx [[yyyy]]
  • xx[[yyyy|'zzzz]] replaced by xx'[[yyyy|zzzz]]
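The four replacements above can be sketched in Python as follows. This is a hedged reconstruction from the examples only, not the code being tested on frwiki. The names `auto_fix`, `_fix`, and `_same_title` are hypothetical, and the sketch assumes page titles are case-insensitive on their first letter, which holds on many wikis but not all.

```python
import re

# prefix[[target]] or prefix[[target|label]], with the prefix glued to the link.
LINK = re.compile(r"(\w+)\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def _same_title(a, b):
    # Assumption: titles differ at most in the case of their first letter.
    return a[:1].lower() == b[:1].lower() and a[1:] == b[1:]


def _fix(match):
    prefix, target, label = match.group(1), match.group(2), match.group(3)
    if label is None:
        if target.startswith(" "):
            # xx[[ yyyy]] -> xx [[yyyy]]
            return f"{prefix} [[{target.lstrip()}]]"
        return match.group(0)
    if _same_title(prefix + label, target):
        # ab[[Abcde|cde]] -> [[abcde]]
        return f"[[{prefix}{label}]]"
    if label.startswith(" "):
        # xx[[yyyy| zzzz]] -> xx [[yyyy|zzzz]]
        return f"{prefix} [[{target}|{label.lstrip()}]]"
    if label.startswith("'") and not label.startswith("''"):
        # xx[[yyyy|'zzzz]] -> xx'[[yyyy|zzzz]]; '' (italics) is left alone
        return f"{prefix}'[[{target}|{label[1:]}]]"
    return match.group(0)  # anything else needs a manual decision


def auto_fix(wikitext):
    """Apply the four illustrative rewrite rules to every glued link."""
    return LINK.sub(_fix, wikitext)


for sample in ("ab[[Abcde|cde]]", "xx[[yyyy| zzzz]]",
               "xx[[ yyyy]]", "xx[[yyyy|'zzzz]]"):
    print(sample, "->", auto_fix(sample))
```

Cases that match none of the rules, such as goalkeeper[[1910 Copa del Rey]], are deliberately returned unchanged, since inserting a space there automatically could be wrong for legitimate prefixes.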
NicoV moved this task from CheckWiki to Done on the WPCleaner board.