Rewrite Zotero translator to treat imdb as a webpage, not a movie.
Open, Low, Public

Description

Citing http://www.imdb.com/title/tt0407650/ gives |date=6 Apr 1983, which is wrong: the film was released then, not the website. So citoid is not yet compatible with citing IMDb links.

Josve05a created this task. May 9 2015, 2:33 PM
Josve05a updated the task description.
Josve05a raised the priority of this task from to Needs Triage.
Josve05a added projects: Citoid, VisualEditor.
Josve05a added a subscriber: Josve05a.
Restricted Application added a subscriber: Aklapper. May 9 2015, 2:33 PM
Mvolz moved this task from Backlog to Site specific issues on the Citoid board. May 9 2015, 4:03 PM
Mvolz added a subscriber: Mvolz. May 9 2015, 4:08 PM

Diff? I thought the desired results of citing imdb would be to cite the video, not the website.

99 out of 100 times when people use IMDb to cite something, they are referring to the website, not the video.

And even if they are citing the video they should use cite A&V, and not cite web.

So, with citoid, the template is automatically selected. With imdb, the citoid type is indeed video, but this gets cast to cite web depending on how the wiki is configured. This bit is easily fixed by changing the config message.
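
The casting described above can be sketched as a lookup in a type map. This is a minimal illustration in Python, not citoid's actual code; the map contents and the `template_for` helper are assumptions made for the example.

```python
# Hypothetical type map, loosely modelled on a wiki's citoid config message.
# Keys are citoid item types; values are template names.
TYPE_MAP = {
    "webpage": "Cite web",
    "book": "Cite book",
    "videoRecording": "Cite web",  # video is cast to Cite web on this wiki
}


def template_for(item_type, type_map):
    """Return the template configured for an item type, or None if unmapped."""
    return type_map.get(item_type)


# Changing one config entry is enough to send video items to another template.
fixed_map = dict(TYPE_MAP, videoRecording="Cite AV media")
```

With the original map a video item resolves to "Cite web"; with `fixed_map` the same item resolves to "Cite AV media". The underlying metadata is identical in both cases.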

The problem is that the metadata read from imdb declares item type video, in virtually every kind of metadata on the page. So it will be difficult to remedy in a reasonable manner, since the site itself is telling us the item type is video.

Mvolz set Security to None.

Probably the best solution is to 1) add cite av to the config message, and then 2) allow the user to change template if they really meant cite web, which is another phab task: T97936

Unfortunately, the date field will still be "wrong", as this won't change the underlying data we have.

How about not adding date fields for websites earlier than say 1990 or something? When was the first website created?

That's something that template data can't handle. It can pretty much just match fields to fields and that's it.

The only possible solution I can think of is for that kind of thing to be done in the templates themselves.
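
For illustration only, the cutoff suggested above amounts to a rule like the following Python sketch. The `drop_implausible_date` helper, the field names, and the 1990 threshold are assumptions taken from the suggestion, and as noted this logic would have to live somewhere other than TemplateData.

```python
import re

CUTOFF_YEAR = 1990  # threshold from the suggestion above; an assumption


def drop_implausible_date(citation):
    """Return a copy of a webpage citation without a date before the cutoff."""
    result = dict(citation)
    if result.get("itemType") != "webpage":
        return result
    # Look for a four-digit year anywhere in the date string.
    match = re.search(r"\b(\d{4})\b", result.get("date", ""))
    if match and int(match.group(1)) < CUTOFF_YEAR:
        del result["date"]
    return result
```

Non-webpage items are returned unchanged, so a genuine film citation would keep its release date.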

When referencing a book, we tend to use ISBN and similar identifiers. Those may come from a pasted link to a page where the book can be read, or to a page holding metadata from the entity through which the paper was published.

The metadata about books may be incorrect on third-party websites, but the book itself is the same (with some exceptions). With a citation, one is most likely citing information inside the book, something that can be verified without original research: the fact or statement resides in the content of that book or paper.

Books and films can both be either fiction or non-fiction. However, when referencing IMDb it is probably more likely that one is referencing factual information about the film, not a fact or statement from within the film.

When stating that a particular actor plays a role in a movie, the movie itself cannot be the source. For it would require research to identify that person. And of course things can be digitally constructed (which may or may not be disclosed). The textual credits before/after a film could be used as reference (e.g. a first party source), but I'll ignore that for now.

I don't know if there's a de facto standard source on the web for citing documentaries, news broadcasts, etc., but IMDb most certainly is not that. It is not an official platform for publishing the films (or their metadata). It's more an encyclopedia or blog, not a digital representation of the subject.

Perhaps it makes sense to change indexing of IMDb to unconditionally result in a "website reference" about the page, not the content described on that page. E.g. treat it more like how we would index a blog about a film, or other wikis that describe films. If IMDb does not have a "last update" field, we'll have to leave it out. It's not uncommon for websites not to have such a field.

Mvolz added a comment. Jun 25 2015, 2:01 AM

I agree that citing AV media from content within a website about them is relatively unusual on a wiki. However, the implementation of this is a bit funny.

If we want to make special cases for particular websites, then the obvious way to do this is to write our own translator for imdb. We already have our own translator fork from zotero's upstream so it is not at all a stretch. This would fix all citations of imdb as websites, not video.
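
The effect such a translator would have can be sketched as a site-specific override. Real Zotero translators are JavaScript written against Zotero's translator framework, so the Python below only illustrates the logic; the `as_webpage_if_imdb` helper and the field names are hypothetical.

```python
from urllib.parse import urlparse


def as_webpage_if_imdb(citation):
    """Recast an IMDb citation as a webpage and drop the film's release date."""
    host = (urlparse(citation.get("url", "")).hostname or "").lower()
    if host != "imdb.com" and not host.endswith(".imdb.com"):
        return dict(citation)
    result = dict(citation, itemType="webpage")
    # The scraped date is the film's release date, not a page date.
    result.pop("date", None)
    return result
```

Citations from any other host pass through untouched, which keeps the special case limited to the one site.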

Unfortunately, citoid on its own doesn't know anything about particular websites. It just reads metadata, and imdb's metadata is telling us it is a Video.tv_show. So unless we want to make citoid ignore all websites purporting to be any kind of video (which I think sets a dangerous precedent), we should not do that. Note that we sometimes have the opposite problem as well; for instance, Barnes & Noble tells us its book pages are type "product", which isn't even a valid og type, when a user pasting that link in probably wants to cite a book.
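
To make "just reads metadata" concrete, here is a rough Python sketch of pulling the declared og type out of a page using only the standard library. It is not how citoid is implemented; the `OgTypeParser` class, the `read_og_type` helper, and the sample markup are all made up for the example.

```python
from html.parser import HTMLParser


class OgTypeParser(HTMLParser):
    """Collect the content of the first <meta property="og:type"> tag."""

    def __init__(self):
        super().__init__()
        self.og_type = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        is_og_type = tag == "meta" and attributes.get("property") == "og:type"
        if is_og_type and self.og_type is None:
            self.og_type = attributes.get("content")


def read_og_type(html):
    """Return the page's declared og:type, or None if it declares none."""
    parser = OgTypeParser()
    parser.feed(html)
    return parser.og_type


# Illustrative markup only, not copied from any real page.
SAMPLE = '<html><head><meta property="og:type" content="video.tv_show"></head></html>'
```

A reader like this has no way to tell that the user wanted the page rather than the show; it can only report what the page declares.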

The other option, which I've considered in the past, is for the api to allow the user to request a particular type as output. If the type mismatches the og type, for example, the request could ignore all type related og information. It might be particularly useful for schema.org data, for instance, which allows multiple entities to be in a page.
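
A minimal sketch of that idea, assuming a hypothetical `build_citation` function and made-up mappings (none of this is an existing citoid API): when the requested type disagrees with the detected one, type-specific fields are dropped and the requested type wins.

```python
# Hypothetical mapping from Open Graph types to citoid item types.
OG_TO_ITEM_TYPE = {
    "video.tv_show": "tvBroadcast",
    "video.movie": "film",
    "book": "book",
}

# Fields treated as type-specific in this sketch (an assumption).
TYPE_SPECIFIC_FIELDS = {"itemType", "date", "runningTime"}


def build_citation(metadata, og_type, requested_type=None):
    """Build a citation, letting an explicit requested type override og:type."""
    detected = OG_TO_ITEM_TYPE.get(og_type, "webpage")
    if requested_type is None or requested_type == detected:
        return dict(metadata, itemType=detected)
    # Mismatch: keep only type-neutral fields and use the requested type.
    neutral = {k: v for k, v in metadata.items() if k not in TYPE_SPECIFIC_FIELDS}
    return dict(neutral, itemType=requested_type)
```

Which fields count as type-specific is the real design question; the set above is only a placeholder.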

But we'd have to restructure the current UI. One possibility is to let users pick a citoid type from a drop-down or something, which is pretty cluttered and imo a bad idea. The other possibility, which I like better, is that if we do end up doing T97936 we could re-request the data, use ANOTHER typemap that is just written in the reverse to figure out what citoid type the user means when they request that type (we can't use the original Mediawiki:citoid-template-type-map.json because there are often multiple keys with the same value), and then recreate the citation.
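
The reason the original map can't simply be inverted is easy to see in a small Python sketch. The map contents here are illustrative, not the contents of any wiki's actual config message, and `invert` is a helper invented for the example.

```python
from collections import defaultdict

# Illustrative forward map: several citoid types share one template.
TYPE_MAP = {
    "webpage": "Cite web",
    "blogPost": "Cite web",
    "film": "Cite AV media",
    "videoRecording": "Cite AV media",
    "book": "Cite book",
}


def invert(type_map):
    """Group citoid types by template, exposing the ambiguous reverse lookups."""
    reverse = defaultdict(list)
    for item_type, template in type_map.items():
        reverse[template].append(item_type)
    return dict(reverse)


# A separate, hand-written reverse map picks one citoid type per template.
REVERSE_TYPE_MAP = {
    "Cite web": "webpage",
    "Cite AV media": "videoRecording",
    "Cite book": "book",
}
```

Inverting the forward map gives two candidate types for "Cite web", so someone has to decide which one a user means; the hand-written reverse map records that decision.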

There are a lot of things to do before we'd add a feature like that, though, since it requires a lot of work on both the front end and the back end; I'd want to do T97936 first, and then there are a lot of things we need to do to improve citoid's metadata retrieval before new api features become a priority.

So to sum up, the most expedient thing to do is write our own imdb translator for the WMF translator fork. The long term, more nuanced solution is to allow users to change the default type and then re-request the citation asking for a particular type.

czar added a subscriber: czar. Aug 12 2015, 1:59 AM

Didn't see it brought up so I thought I'd mention that IMDB, as a user-contributed site, should largely not be used as a source (at least so established by enwp). It would save us a lot of trouble of removing the link later to tell the user this when they attempt to use it as a source.

Mvolz renamed this task from Citing http://www.imdb.com/title/tt0407650/ gives |date=6 Apr 1983 to Rewrite Zotero translator to treat imdb is a webpage, not a movie. Aug 12 2015, 9:05 AM
Mvolz triaged this task as Low priority.
Mvolz added a comment. Aug 12 2015, 9:08 AM

This is something that should be done in the citation template maybe? The extension and back-end are for all wikis, so it doesn't make sense to encode wiki-specific guidelines there. Or maybe in a gadget? (Of course the gadget would have to be enabled site-wide to catch the users who don't know about that rule)

Mvolz renamed this task from Rewrite Zotero translator to treat imdb is a webpage, not a movie. to Rewrite Zotero translator to treat imdb as a webpage, not a movie. Aug 12 2015, 9:23 AM
czar added a comment. Aug 13 2015, 12:18 AM

Should I propose it as a feature req somewhere else? I think it would make sense for Citoid to have some interaction with each wiki's actual content policies.

What is there to do with TemplateData here?

czar added a comment. Oct 28 2016, 5:53 PM

Another thought on this: Zotero is primarily a citation manager, so when Zotero sees an IMDB or Google Books page, it thinks you want the citation for the film or book listed, not that of the webpage. In general, I think this is the right mindset for Wikipedia too: we should not be citing IMDB or Google Books pages, but instead citing the medium directly if it's being used as a primary source. (Plus see what I wrote above about using IMDB as a source.) In this case, I'm inclined to call this feature "not-a-bug".