[IABot] URLs with periods after them are recorded with the periods in the externallinks_global table
Closed, Resolved, Public

Description

If a URL appears on Wikipedia with a period after it, for example...

<ref>http://www.statistics.sk/mosmis/eng/run.html.</ref>

...it is recorded in the externallinks_global table with the period included. MediaWiki ignores periods at the end of URLs, so IABot should as well.

kaldari created this task. Aug 5 2016, 2:02 AM
Restricted Application added a subscriber: Aklapper. Aug 5 2016, 2:02 AM
Cyberpower678 triaged this task as "High" priority. Aug 5 2016, 6:29 AM

This is done quite easily, but it would be nice to know which characters are ignored at the end of a URL that is not enclosed in brackets. I've been trying to research that, but haven't had any luck.

It looks like these are the characters that are excluded from a URL if they appear at the end of it:

.,:;?!)”<>[]\
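
To make the proposed normalization concrete, here is a minimal Python sketch that strips any run of the characters listed above from the end of a bare URL before it is stored. It is not IABot's actual code: the names `TRAILING_CHARS` and `strip_trailing_chars` are hypothetical, and the sketch takes the list literally without trying to reproduce any special cases MediaWiki's parser may apply.

```python
# Characters quoted in this task as excluded from the end of a bare URL.
# Assumption: the list is applied literally, with no special cases.
TRAILING_CHARS = '.,:;?!)”<>[]\\'


def strip_trailing_chars(url: str) -> str:
    """Remove any run of the listed characters from the end of a bare URL."""
    return url.rstrip(TRAILING_CHARS)


if __name__ == "__main__":
    # The example URL from the task description, with its trailing period.
    print(strip_trailing_chars("http://www.statistics.sk/mosmis/eng/run.html."))
```

With this kind of normalization, variants of the same link that differ only in trailing punctuation reduce to one string, so they can be recorded as a single entry.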

Do you have a page I can test this on?

I'll create one...

kaldari added a comment. Edited Aug 9 2016, 6:21 PM

Here's a page you can test against a fresh database: https://test.wikipedia.org/wiki/Links

All the links at the top should get recorded as a single entry: https://test.wikipedia.org/wiki/Main_Page

All the links at the bottom should get their own separate entries.

Cyberpower678 closed this task as "Resolved". Aug 10 2016, 5:30 PM