Some top websites do not get publisher information when creating cites with citoid
Closed, Duplicate · Public

Description

IMDB, YouTube, and BBC News are all top-cited sites (per older data from here). Feeding the following sample URLs from those sites to Citoid generates citations that are missing the work/publication/website title:

http://www.imdb.com/title/tt0100403/?ref_=nv_sr_3
https://www.youtube.com/watch?v=CGcWTIWYDMQ&feature=youtube_gdata_player
http://www.bbc.com/news/uk-england-tyne-31552307

For example, the BBC link generates:
{{Cite web|title = 'Five-legged sheep' Quinto gives birth to twins in Morpeth|url = http://www.bbc.com/news/uk-england-tyne-31552307}}

This should probably have a publisher=BBC field. (BBC may differ from the others in that it should probably go to cite news, with BBC as the newspaper rather than the publisher; in fact, this is what happens for at least some news.bbc.co.uk URLs.)

Event Timeline

LuisVilla raised the priority of this task from to Needs Triage.
LuisVilla updated the task description.
LuisVilla added a subscriber: LuisVilla.
Restricted Application added a subscriber: Aklapper. · Apr 12 2015, 12:22 AM
LuisVilla set Security to None.
Mvolz added a subscriber: Mvolz. · Apr 12 2015, 6:25 AM

This is due to limitations of type casting. We currently map the Open Graph type "article" to the type "blogPost", which uses the field "blogTitle" for the publisher. Depending on the wiki, this type is then cast to Article (fr wiki) or to website (it wiki).

Templates don't recognize this field and thus don't use it.

There are some possible solutions to this; we might go back to calling these "websites", but it's not obvious what direction we should go in here.
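
To make the type-casting problem in this comment concrete, here is a minimal sketch in Python. Every table and function name in it is hypothetical and made up for illustration; it is not Citoid's code or any wiki's real template configuration. It assumes only what the comment states: the Open Graph type "article" becomes "blogPost", the site name then lands in "blogTitle", and a template that has no mapping for that field silently drops it.

```python
# Hypothetical illustration only; none of these tables are Citoid's real configuration.

# Open Graph type -> citation item type (the mapping described in the comment).
OG_TYPE_TO_ITEM_TYPE = {"article": "blogPost", "website": "webpage"}

# Field that carries the site/publication name for each item type.
# "blogTitle" comes from the comment; "websiteTitle" is an assumed counterpart.
SITE_FIELD_BY_ITEM_TYPE = {"blogPost": "blogTitle", "webpage": "websiteTitle"}

# Assumed parameter map for a Cite web style template: only the citation
# fields listed here are passed on to the template.
CITE_WEB_PARAMS = {"title": "title", "url": "url", "websiteTitle": "website"}


def build_citation(og_type, title, url, site_name):
    """Build a citation dict, storing the site name under the type's own field."""
    item_type = OG_TYPE_TO_ITEM_TYPE.get(og_type, "webpage")
    return {
        "itemType": item_type,
        "title": title,
        "url": url,
        SITE_FIELD_BY_ITEM_TYPE[item_type]: site_name,
    }


def to_template_params(citation, param_map):
    """Keep only the fields the template knows about; everything else is dropped."""
    return {param_map[field]: value
            for field, value in citation.items()
            if field in param_map}


# Typed as "article": the site name sits in blogTitle, which the template ignores.
as_article = to_template_params(
    build_citation("article", "Example title", "http://example.org/a", "BBC"),
    CITE_WEB_PARAMS,
)

# Typed as "website": the site name sits in a field the template does map.
as_website = to_template_params(
    build_citation("website", "Example title", "http://example.org/a", "BBC"),
    CITE_WEB_PARAMS,
)
```

Under these assumptions, `as_article` contains only the title and URL, while `as_website` also carries the site name. This corresponds to the "go back to calling these websites" option; teaching the templates about "blogTitle" would be the other way to close the gap.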