We get good results for US-based news articles because we're using Zotero, and it has better coverage of English-language sources than of Italian ones.
You can see a list of all the news outlets and publishers covered here:
When Zotero doesn't have a translator, we fall back to a web scraper that doesn't do as good a job at recognizing whether something is a newspaper or a website. Right now almost everything will be declared a website. We have a bunch of tickets for improving this, so hopefully it will get better with time. However, even if we get good metadata back from a news site, it is very hard to tell whether a website is a news institution or just a blog based only on metadata; most metadata will just call either one an "article".
Our default is to call everything a website, because we can really be guaranteed that it is... it just also happens to be a news article :).
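To illustrate the default described above, here is a minimal sketch, not the actual fallback scraper code. The function name, the `KNOWN_NEWS_DOMAINS` set, and the shape of the `metadata` dict are all assumptions made for illustration; only the item type names (`webpage`, `newspaperArticle`) are Zotero's own.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical, hand-maintained set of domains known to be news outlets.
# Nothing like this is confirmed to exist in the real scraper.
KNOWN_NEWS_DOMAINS = {"www.dn.se", "www.svd.se"}


def guess_item_type(url: str, metadata: Optional[dict] = None) -> str:
    """Pick a Zotero item type for a page that has no dedicated translator."""
    host = (urlsplit(url).hostname or "").lower()
    if host in KNOWN_NEWS_DOMAINS:
        return "newspaperArticle"
    # The metadata is deliberately not consulted here: a value such as
    # "article" is used by news sites and blogs alike, so it cannot tell
    # them apart. "webpage" is the only type that is always correct.
    return "webpage"
```

With this sketch, a page on an unlisted domain stays a `webpage` even when its metadata says "article", which mirrors the behaviour described in this thread.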
If you would like to file tickets in https://github.com/zotero/translators asking for translators specifically for each of these major Italian newspapers, you can post the tickets here as well.
Not really, https://github.com/zotero/translators/issues/new and just type what you want.
Unfortunately it looks like there's quite a backlog though...
File a bug about unrecognized quotes though (here, not in zotero). It looks like some encoding issues with the « quote mark. Or we could just convert that thread to that issue :).
@Elitre, Zotero is a separate open source project, and no WMF members have +2 on the translator repo. It would probably be weird, or against some rules somewhere, for one organisation to organise contributions to another organisation. I think the proper way to do this would probably be to encourage Zotero to join such initiatives and maybe offer to co-mentor?
We do have our own translator repo, so we could technically just encourage contributions to that one, but I think it's preferable if these changes go upstream, particularly new translators, which would be useful to the project at large.
@Elitre, looking at https://www.zotero.org/support/dev/translators#metadata , a possible Google Code-in task could be, e.g., to write five translators from a list that someone at Wikimedia would maintain. This should be done after discussing this collaboration with Zotero. The next GCI edition is expected to start in November, so you have time to plan. :)
FWIW: the Swedish Wikipedia has a list of "most frequent domains" at https://sv.wikipedia.org/wiki/Anv%C3%A4ndare:Edgars2007/Most_frequent_domains (https://phabricator.wikimedia.org/P691 is the source). It doesn't specify which ones are news sites though.
News sites extracted from that link, in order of use:
www.dn.se
www.svd.se
sverigesradio.se
www.aftonbladet.se
news.bbc.co.uk
arkivet.dn.se
www.expressen.se
www.svt.se
www.bbc.co.uk
www.sr.se (Also a radio station, so some may be news, some not)
www.nytimes.com
www.sydsvenskan.se
www.gp.se
www.tv4.se (Also a TV channel, so some may be news, some not)
www.telegraph.co.uk
wwwc.aftonbladet.se
www.theguardian.com
www.bbc.com
svenska.yle.fi
www.dailymail.co.uk
news.google.com
www.reuters.com
www.dagensmedia.se
query.nytimes.com
www.huffingtonpost.com
www.corren.se
www.skanskan.se
www.di.se
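For other language Wikipedias, a ranking like this could be rebuilt from a plain list of cited URLs. The following is only a sketch of that idea, not the query behind P691; `rank_domains` is a hypothetical helper, and deciding which domains are news sites would still be a manual step.

```python
from collections import Counter
from urllib.parse import urlsplit


def rank_domains(urls):
    """Return (domain, count) pairs, most frequently cited first."""
    counts = Counter()
    for url in urls:
        host = urlsplit(url).hostname  # None when the string has no host part
        if host:
            counts[host] += 1
    return counts.most_common()


if __name__ == "__main__":
    # Made-up sample input, for illustration only.
    sample = [
        "http://www.dn.se/a",
        "http://www.dn.se/b",
        "http://www.svd.se/c",
    ]
    for domain, count in rank_domains(sample):
        print(domain, count)
```

Subdomains such as `arkivet.dn.se` and `www.dn.se` are counted separately here, as they are in the list above.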
I did a test of the five most cited news sites on sv.wp here: https://sv.wikipedia.org/w/index.php?title=Anv%C3%A4ndare:Josve05a/Citoid&oldid=36543513 and I did a manual fill of the links as well to compare what the difference was.