
Replace https://tools.wmflabs.org/wikidata-externalid-url by providing improved handling for external id formatter urls
Open, Normal, Public

Description

It should be possible to specify formatter URLs that are more complex than what is currently supported.

There is probably already a bug about this, but I'm filing it in case it has been forgotten since we first discussed it. Probably one for T150179.

Event Timeline

Esc3300 created this task. Nov 17 2016, 10:12 AM
Restricted Application added a subscriber: Aklapper. Nov 17 2016, 10:12 AM
Stigmj added a subscriber: Stigmj. Nov 17 2016, 10:25 AM
Nikki added a subscriber: Nikki. Nov 17 2016, 10:44 AM

Can you please add a link to the source code of the tool so we can evaluate what kind of operations it does?

Esc3300 updated the task description. Nov 17 2016, 11:49 AM

Sure, added to the summary above.

hoo added a subscriber: hoo. Nov 17 2016, 12:50 PM
matej_suchanek renamed this task from "Replace https://tools.wmflabs.org/wikidata-externalid-url by providing improved handeling for external id formatter urls" to "Replace https://tools.wmflabs.org/wikidata-externalid-url by providing improved handling for external id formatter urls". Mar 22 2017, 5:59 PM

As background, I'm seeing about 2000 "hits" per day on this service right now, with about a dozen properties linking through it to their databases.

ArthurPSmith added a comment. Edited Mar 23 2017, 3:37 PM

I believe one way to do this would be to allow regular expressions to be attached to the formatter URL, and to have the external id URL conversion code understand them. For example, if there were a qualifier property that specified a "regex substitution", the ISNI problem (additional spaces within the id that must be removed for the formatter URL) could be handled by a value something like "s/\s+//g" (remove all spaces). Some of the others might need a "regex match" on the id that allows specifying a $1, $2, $3 grouping pattern, with a formatter URL that then looks something like http://...../$1/$2/$3 (or that could possibly also be handled by a substitution, as in the ISNI case). The IMDb case is more difficult because it's essentially four different formatter URLs, chosen by the first characters of the id, so it might need a "regex filter" that limits the scope of each formatter URL based on the id; Wikibase would then need to look through the filter regexes to find a matching formatter URL and use that.
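To make the three ideas above (substitution, match groups, and filter) more concrete, here is a rough Python sketch of how the conversion code might combine them. Everything in it is an assumption for illustration only: the `FormatterRule` class, its field names, the `build_url` function, and the example.org URLs are hypothetical and are not part of Wikibase or of the existing tool.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class FormatterRule:
    """Hypothetical formatter URL plus its optional regex qualifiers."""
    url_template: str                              # contains $1, $2, ... placeholders
    filter_regex: Optional[str] = None             # rule applies only if the id starts with a match
    substitution: Optional[Tuple[str, str]] = None # (pattern, replacement) applied to the id
    match_regex: Optional[str] = None              # capture groups feed $1, $2, ...


def build_url(external_id: str, rules: Sequence[FormatterRule]) -> Optional[str]:
    """Return the URL from the first applicable rule, or None if no rule applies."""
    for rule in rules:
        # "regex filter": skip rules whose filter does not match the start of the id
        if rule.filter_regex and not re.match(rule.filter_regex, external_id):
            continue

        value = external_id
        # "regex substitution": e.g. strip all whitespace, like s/\s+//g
        if rule.substitution:
            pattern, replacement = rule.substitution
            value = re.sub(pattern, replacement, value)

        # "regex match": split the id into groups; without one, the whole id is $1
        if rule.match_regex:
            match = re.fullmatch(rule.match_regex, value)
            if match is None:
                continue
            groups = match.groups()
        else:
            groups = (value,)

        def fill(placeholder: "re.Match[str]") -> str:
            index = int(placeholder.group(1))
            if 1 <= index <= len(groups):
                return groups[index - 1] or ""
            return placeholder.group(0)  # leave unknown placeholders untouched

        return re.sub(r"\$(\d+)", fill, rule.url_template)
    return None


# Illustrative rule sets; the example.org URLs are placeholders, not real formatter URLs.
ISNI_RULES = [FormatterRule("https://example.org/isni/$1", substitution=(r"\s+", ""))]
DATE_RULES = [FormatterRule("https://example.org/$1/$2/$3",
                            match_regex=r"(\d{4})-(\d{2})-(\d+)")]
IMDB_RULES = [
    FormatterRule("https://example.org/title/$1", filter_regex=r"tt"),
    FormatterRule("https://example.org/name/$1", filter_regex=r"nm"),
]
```

With these rules, `build_url("0000 0001 2103 2683", ISNI_RULES)` would drop the spaces before filling in `$1`, and `build_url("nm0000001", IMDB_RULES)` would skip the "tt" rule and use the "nm" one. The sketch does not percent-encode the id, which a real implementation would probably need to do.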

thiemowmde triaged this task as Normal priority. Mar 26 2017, 1:54 PM
Salgo60 added a subscriber: Salgo60. Mar 5 2019, 6:34 AM