
Citoid should load citation data from local authority sources
Open, Needs Triage, Public

Description

Citations generated automatically by Citoid (looked up e.g. by ISBN, ISSN or URL) currently contain many errors. In several languages, such as Czech and Japanese, the loaded fields (title, author, publisher, etc.) contain an English transcription instead of the local script or, worse, a ? in place of each non-English character.

Errors like these suggest that the data is loaded from a non-local database or source. WorldCat is used for ISBN lookups; it would make sense to prefer results from local authorities for ISBNs of locally published books. The EBSCO database could probably help Citoid disambiguate between sources.

URL searches should also use a better algorithm for generating an entry from the website's content and metadata.

Event Timeline

The Zotero developers have given the green light, and the Polish National Library is now available as a source:
https://github.com/zotero/translators/pull/3036

Quick steps for creating new Zotero sources (for other National Libraries):

  1. Fork repo: https://github.com/zotero/translators/
  2. Download Zotero with built-in IDE: https://www.zotero.org/download/
  3. Open the Scaffold IDE:
    1. Open Zotero.
    2. Go to menu Tools → Developers → Translator Editor (or something similar; the label depends on the language version).
  4. Open your working folder: menu File → Set translations directory (choose the local directory of the repo you forked from GitHub).
  5. Create a translator in this IDE.
  6. Add detectSearch and doSearch.
  7. Create and run a test (the testCases output variable is generated for you; you don't have to craft it by hand).

The detectSearch function is something like this:

function detectSearch(item) {
	// for now only using ISBN
	if (typeof item.ISBN === 'string') {
		// filter by country code (83)
		const isbn = item.ISBN.replace(/[ -]/g, '');
		return isbn.search(/^(97[8-9]83|83)/) === 0;
	}
	return false;
}

You only need to change the number 83, which appears in 3 places above. It is the ISBN registration group (country code); you can read it off any recent book from the given country or look it up on Wikipedia: https://en.wikipedia.org/wiki/List_of_ISBN_registration_groups
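
To avoid editing the number by hand, the check above could be wrapped in a small factory. This is only a sketch: makeDetectSearch is a hypothetical helper (not part of Zotero), it mirrors the prefix logic of the function above, and it assumes the registration group is a string of digits. The sample ISBNs are made up and their check digits are not validated.

```javascript
// Hypothetical helper (not a Zotero API): builds a detectSearch-style
// predicate for one ISBN registration group, e.g. '83' for Poland.
function makeDetectSearch(group) {
	// Same idea as the regex above: the group either starts the ISBN
	// (ISBN-10) or follows a 978/979 prefix (ISBN-13).
	// Assumption: `group` contains digits only.
	const pattern = new RegExp('^(97[89])?' + group);
	return function detectSearch(item) {
		// for now only using ISBN
		if (typeof item.ISBN !== 'string') {
			return false;
		}
		// strip spaces and hyphens before matching
		const isbn = item.ISBN.replace(/[ -]/g, '');
		return pattern.test(isbn);
	};
}

const detectPolish = makeDetectSearch('83');
// 80 is the group shared by the Czech Republic and Slovakia
const detectCzech = makeDetectSearch('80');

console.log(detectPolish({ ISBN: '978-83-01-00000-1' })); // true
console.log(detectPolish({ ISBN: '978-0-306-40615-7' })); // false
console.log(detectCzech({ ISBN: '80-85190-00-0' }));      // true
```

In a real translator you would still expose a top-level function named detectSearch, for example by pasting in the body with your own group number.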

The doSearch function is a bit more complex, but Zotero has a built-in MARC XML parser, which helps. Most large libraries should have an API that returns MARC XML (one of the standard forms of the MARC format).
You can find an example doSearch function here:
https://github.com/zotero/translators/pull/3036/files#diff-b6354adf6dafc988bbced332440045a0dc9684c390e595ce68b371775a4f70d4R69

Note that ZU.doGet is basically just an AJAX call. You can test this in a browser in DevTools by replacing ZU.doGet with a fetch function.
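
As a rough illustration of that replacement, here is a minimal stand-in built on fetch. It is a sketch, not Zotero's implementation: makeDoGet and doGet are hypothetical names, and it assumes your callback only needs the response body as text (the real ZU.doGet accepts more arguments). The URL in the usage comment is a placeholder.

```javascript
// Hypothetical stand-in for ZU.doGet, for trying request code outside Zotero.
// The fetch implementation is passed in so the helper can also be exercised
// with a fake fetch, without touching the network.
function makeDoGet(fetchImpl) {
	return function doGet(url, onDone) {
		return fetchImpl(url)
			.then(function (response) {
				// fetch does not reject on HTTP errors, so check explicitly
				if (!response.ok) {
					throw new Error('HTTP ' + response.status + ' for ' + url);
				}
				return response.text();
			})
			// Assumption: the callback only receives the body text
			.then(onDone);
	};
}

// In browser DevTools, pass the built-in fetch (placeholder URL):
// const doGet = makeDoGet(fetch);
// doGet('https://example.org/placeholder', (text) => console.log(text));
```

The returned promise rejects on network or HTTP errors, so append .catch(console.error) while experimenting.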

PS: You can use ES5 (the older JavaScript standard) or ES6+ (const/let variables, etc.).