
Restructure so that citoid can be run without Zotero
Closed, Declined · Public

Description

  • Make disabling of Zotero configurable (easy)
  • Add ability to use Zotero translators natively (epic)

See @ssastry's comment T93579#2156135 describing which interfaces would need to be implemented to make zotero translators work in a regular node environment.

Event Timeline

Mvolz created this task. · Mar 23 2015, 12:08 PM
Mvolz raised the priority of this task from to Needs Triage.
Mvolz updated the task description.
Mvolz added a project: Citoid.
Mvolz moved this task to Service on the Citoid board.
Mvolz added a subscriber: Mvolz.
Restricted Application added a subscriber: Aklapper. · Mar 23 2015, 12:08 PM
Mvolz set Security to None.
Mvolz triaged this task as Low priority. · Aug 13 2015, 11:48 AM
GWicke raised the priority of this task from Low to Normal. · Dec 19 2015, 10:11 PM
GWicke added a subscriber: GWicke.

Bumping priority, as zotero is a major liability for security, stability and maintainability.

Mvolz claimed this task. · Dec 20 2015, 10:45 AM
Mvolz added a comment. · Edited · Dec 20 2015, 10:48 AM

For clarity, this task was just to section it out for people who wanted to
use citoid without installing Zotero... Not to eliminate it entirely.

The metadata support is still pretty weak without the translator paradigm;
and I'm afraid such a paradigm might be a critical part of citoid's
usefulness. The biggest issue is item type determination; most metadata on
the web can't reliably distinguish between magazine articles, journal
articles, and blog posts. The little metadata that determines type that we
have (like open graph) calls these all articles.
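
To make the item type problem concrete, here is a minimal sketch. The mapping table and the `guessItemType` helper are hypothetical and are not citoid's actual code; the point is only that two very different pages can carry the same `og:type`.

```javascript
// Hypothetical mapping from Open Graph `og:type` values to Zotero item types.
// Illustrative only; this is not citoid's real mapping table.
const OG_TYPE_TO_ITEM_TYPE = {
    article: 'webpage', // could just as well be journalArticle, magazineArticle or blogPost
    book: 'book',
    'video.movie': 'film'
};

// Hypothetical helper: pick an item type from scraped metadata, defaulting to webpage.
function guessItemType(metadata) {
    return OG_TYPE_TO_ITEM_TYPE[metadata['og:type']] || 'webpage';
}

// A journal article and a blog post look identical at this level.
const journal = { 'og:type': 'article', 'og:title': 'A study of X' };
const blog = { 'og:type': 'article', 'og:title': 'My weekend' };
console.log(guessItemType(journal), guessItemType(blog));
```

Both calls return the same type, which is the gap a site-specific translator fills.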

Content mine has software (thresher, T112260) that also uses translators and runs on
node. Unfortunately it also uses a (C?) library, phantom, and again we have
dependency issues. Unlike Zotero however, their translators are designed to
be usable by different consumers so we could potentially directly support
them. Unfortunately, they have very few translators and a small community.

@Mvolz: Did we ever try to implement / port the JS libraries needed to run zotero translators to node?

GWicke added a subscriber: mobrovac. · Edited · Jan 11 2016, 7:33 PM

@mobrovac, @ssastry and myself just brainstormed ways to remove the dependency on XULRunner in the office. We agreed that it would be useful to spend a day or so investigating the feasibility of a node-only interface for the Zotero translators soon.

We are in the process of migrating all misc node services to Jessie, and XULRunner is not available on Jessie. To avoid blocking the move to Jessie & the conversion of sca nodes, we'll need to find a way forward in the next 1-2 months.

Here are the classes Zotero uses that are very likely provided by XULRunner:

Components.classes["@mozilla.org/binaryinputstream;1"]
Components.classes["@mozilla.org/binaryinputstream;1"].
Components.classes["@mozilla.org/charset-converter-manager;1"]
Components.classes["@mozilla.org/consoleservice;1"]
Components.classes["@mozilla.org/consoleservice;1"].
Components.classes["@mozilla.org/file/local;1"].
Components.classes["@mozilla.org/intl/converter-input-stream;1"]
Components.classes["@mozilla.org/intl/converter-output-stream;1"]
Components.classes['@mozilla.org/intl/nslocaleservice;1'].
Components.classes["@mozilla.org/intl/scriptableunicodeconverter"].
Components.classes["@mozilla.org/intl/stringbundle;1"]
Components.classes["@mozilla.org/io/string-input-stream;1"]
Components.classes["@mozilla.org/mime;1"]
Components.classes["@mozilla.org/network/file-input-stream;1"]
Components.classes["@mozilla.org/network/file-input-stream;1"].
Components.classes["@mozilla.org/network/file-output-stream;1"]
Components.classes["@mozilla.org/network/file-output-stream;1"].
Components.classes["@mozilla.org/network/input-stream-pump;1"]
Components.classes['@mozilla.org/network/io-service;1']
Components.classes["@mozilla.org/network/io-service;1"]
Components.classes["@mozilla.org/network/io-service;1"]
Components.classes["@mozilla.org/network/io-service;1"].
Components.classes["@mozilla.org/network/protocol;1?name=file"]
Components.classes["@mozilla.org/network/server-socket;1"]
Components.classes["@mozilla.org/observer-service;1"]
Components.classes["@mozilla.org/observer-service;1"].
Components.classes["@mozilla.org/preferences-service;1"]
Components.classes["@mozilla.org/privatebrowsing;1"]
Components.classes["@mozilla.org/privatebrowsing;1"]) {
Components.classes["@mozilla.org/process/util;1"].
Components.classes["@mozilla.org/scriptableinputstream;1"]
Components.classes["@mozilla.org/scripterror;1"]
Components.classes["@mozilla.org/scriptsecuritymanager;1"]
Components.classes["@mozilla.org/thread-manager;1"]
Components.classes["@mozilla.org/timer;1"].
Components.classes['@mozilla.org/toolkit/app-startup;1']
Components.classes["@mozilla.org/xmlextras/domparser;1"]
Components.classes["@mozilla.org/xmlextras/xmlhttprequest;1"]
Components.classes["@mozilla.org/xmlextras/xmlhttprequest;1"].createInstance();
Components.classes["@mozilla.org/xmlextras/xmlserializer;1"]
Cc["@mozilla.org/moz/jssubscript-loader;1"]

Here are the interfaces it uses:

Components.interfaces.nsIAppStartup
Components.interfaces.nsIBinaryInputStream
Components.interfaces.nsICharsetConverterManager
Components.interfaces.nsIConsoleMessage
Components.interfaces.nsIConsoleService
Components.interfaces.nsIConverterInputStream
Components.interfaces.nsIConverterOutputStream
Components.interfaces.nsIDOMParser
Components.interfaces.nsIDOMSerializer
Components.interfaces.nsIDOMXPathResult
Components.interfaces.nsIFile
Components.interfaces.nsIFileChannel
Components.interfaces.nsIFileInputStream
Components.interfaces.nsIFileOutputStream
Components.interfaces.nsIFileProtocolHandler
Components.interfaces.nsIHttpChannel
Components.interfaces.nsIHttpChannelInternal
Components.interfaces.nsIInputStream
Components.interfaces.nsIInputStreamPump
Components.interfaces.nsIInterfaceRequestor
Components.interfaces.nsIIOService
Components.interfaces.nsILocaleService
Components.interfaces.nsILocalFile
Components.interfaces.nsIMIMEService
Components.interfaces.nsIObserverService
Components.interfaces.nsIPrefBranch
Components.interfaces.nsIPrefLocalizedString
Components.interfaces.nsIPrefService
Components.interfaces.nsIPrivateBrowsingService
Components.interfaces.nsIProcess
Components.interfaces.nsIRequest
Components.interfaces.nsIScriptableInputStream
Components.interfaces.nsIScriptableUnicodeConverter
Components.interfaces.nsIScriptError
Components.interfaces.nsIScriptSecurityManager
Components.interfaces.nsISeekableStream
Components.interfaces.nsIServerSocket
Components.interfaces.nsIStringBundleService
Components.interfaces.nsIStringInputStream
Components.interfaces.nsISupports
Components.interfaces.nsIThreadManager
Components.interfaces.nsITimer
Components.interfaces.nsITransport
Components.interfaces.nsIUnicharInputStream
Components.interfaces.nsIUnicharLineInputStream
Components.interfaces.nsIURI
Components.interfaces.nsIURL
Components.interfaces.nsIWebBrowserPersist
Components.interfaces.nsIXMLHttpRequest
Restricted Application added a project: VisualEditor. · Mar 28 2016, 4:21 PM

Based on an IRC discussion between @GWicke, @mobrovac and me, there seems to be agreement that if we are comfortable betting that the Zotero translator interface won't change drastically (which seems a reasonable bet), citoid should ditch Zotero altogether instead of trying to port the XULRunner interfaces. Porting is a much bigger undertaking because of all the references to internal components documented in the previous comment. Citoid doesn't need the browser context that Zotero works with.

The translators provide doWeb, doSearch, doImport and doExport functions, which Zotero runs. The translators also use a whole bunch of Zotero utilities, which seem mostly standalone. The utilities code only uses 2 XULRunner components, and both seem easy to deal with.

  • uses a tree builder / dom parser and can be replaced with a node.js module -- should not be difficult.
  • vardump, a debug utility, uses a typeof check against a component and can be fixed / removed.

I haven't looked at what other Zotero functionality the translators use, but these seem to be the bulk of what needs to be migrated over to node.js.
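
As a rough illustration of what running a translator outside Zotero could look like, here is a minimal sketch using node's built-in `vm` module. It assumes a made-up translator body and a hand-written stand-in for the `Zotero` global that covers only what this example touches; `runTranslator` is a hypothetical helper, and the `trimInternal` stub only approximates the real utility.

```javascript
const vm = require('vm');

// A made-up translator body. Real translators also carry a JSON metadata header.
const translatorSource = `
function detectWeb(doc, url) {
    return url.indexOf('/article/') !== -1 ? 'journalArticle' : false;
}
function doWeb(doc, url) {
    var item = new Zotero.Item(detectWeb(doc, url));
    item.title = Zotero.Utilities.trimInternal(doc.title);
    item.url = url;
    item.complete();
}
`;

// Hypothetical helper: evaluate a translator in a sandbox and collect its items.
function runTranslator(source, doc, url) {
    const items = [];
    // Minimal stand-in for Zotero.Item; complete() hands the item back to us.
    function Item(itemType) {
        this.itemType = itemType;
    }
    Item.prototype.complete = function () {
        items.push(Object.assign({}, this));
    };
    const sandbox = {
        Zotero: {
            Item: Item,
            debug: function () {},
            Utilities: {
                // Approximation: collapse runs of whitespace and trim the ends.
                trimInternal: function (s) {
                    return s.replace(/\s+/g, ' ').trim();
                }
            }
        }
    };
    vm.createContext(sandbox);
    // Top-level function declarations become properties of the sandbox object.
    vm.runInContext(source, sandbox);
    if (sandbox.detectWeb(doc, url)) {
        sandbox.doWeb(doc, url);
    }
    return items;
}

const result = runTranslator(
    translatorSource,
    { title: '  An   example  title ' },
    'https://example.org/article/1'
);
console.log(result);
```

A real port would also need a DOM for `doc` and asynchronous HTTP helpers; this only shows the loading and shimming step.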

If we want to go down the route of moving away from Zotero itself, see below for a count of references to Zotero code in the translators (some irrelevant items filtered out). It doesn't look too bad (selctItems looks like a bug/typo in some translator).

$ git grep "Zotero\." | sed 's/{/\n{/g;s/.*Zotero/Zotero/g;s/\(Zotero\.\w*\).*/\1/g;' | sort | uniq -c | sort -nr 
    772 Zotero.Utilities
    366 Zotero.debug
    307 Zotero.Item
    299 Zotero.selectItems
    242 Zotero.RDF
    198 Zotero.loadTranslator
    136 Zotero.wait
     93 Zotero.done
     39 Zotero.write
     37 Zotero.read
     20 Zotero.getOption
     17 Zotero.parentTranslator
     16 Zotero.nextItem
      5 Zotero.setCharacterSet
      5 Zotero.isBookmarklet
      5 Zotero.getXML
      4 Zotero.monitorDOMChanges
      4 Zotero.getHiddenPref
      4 Zotero.Collection
      3 Zotero.setProgress
      3 Zotero.nextCollection
      2 Zotero.addOption
      1 Zotero.selctItems
      1 Zotero.isMLZ
      1 Zotero.doGaleWeb
      1 Zotero.detectGaleWeb
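
For anyone who prefers to do a similar tally from node, here is a small sketch. `countZoteroRefs` is a hypothetical helper, and it counts every `Zotero.<name>` occurrence in a string, so its totals would not match the grep pipeline above exactly (that pipeline keeps at most one match per output line).

```javascript
// Count occurrences of each `Zotero.<name>` reference in a source string,
// sorted by descending count.
function countZoteroRefs(source) {
    const counts = new Map();
    for (const match of source.matchAll(/Zotero\.(\w+)/g)) {
        const name = 'Zotero.' + match[1];
        counts.set(name, (counts.get(name) || 0) + 1);
    }
    return Array.from(counts.entries()).sort((a, b) => b[1] - a[1]);
}

// Made-up sample input standing in for translator source code.
const sample = [
    'var item = new Zotero.Item("book");',
    'Zotero.debug(Zotero.Utilities.trimInternal(title));',
    'Zotero.debug("done");'
].join('\n');
console.log(countZoteroRefs(sample));
```

Running it over the translator files would mean reading each file with `fs` and concatenating the results.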

Update with information from zotero devs:

https://groups.google.com/d/msg/zotero-dev/yy4-q_ZUA4M/UBOP9OHMEAAJ

On 10/4/16 2:49 PM, Marielle Volz wrote:
Do we have any further information about the timescale with this? I know 5.0 is in Beta and 5.1 is next up with changes to the fields. Is the move to Electron up next after that?

That's the general plan, but we'll likely try to migrate as much of the Standalone UI to React as we can while we're still on the Mozilla platform, since otherwise we'd be stuck maintaining two parallel branches while we tried to rewrite the entire program. We'll also probably try to create Electron-compatible shims for some XPCOM interfaces (e.g., nsIPrompt) so that we can share code without rewriting everything at once. So I don't really see Electron happening before the end of 2017 (with various new features coming to the Mozilla-based version in the meantime).

cc @akosiaris

Change 317168 had a related patch set uploaded (by Mvolz):
Restructure so that Zotero can be disabled

https://gerrit.wikimedia.org/r/317168

Mvolz updated the task description. · Oct 24 2016, 1:31 PM

Change 317168 merged by jenkins-bot:
Add config variable to disable Zotero queries

https://gerrit.wikimedia.org/r/317168

Change 319050 had a related patch set uploaded (by Mobrovac):
Deploy config: Add the zotero flag

https://gerrit.wikimedia.org/r/319050

Change 319050 merged by Mobrovac:
Deploy config: Add the zotero flag

https://gerrit.wikimedia.org/r/319050

mobrovac removed a subscriber: gerritbot.

The above patches only allow Citoid to run without Zotero; to achieve functional equivalence we still need it.
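
For readers who have not looked at the patches, the general shape of such a flag can be sketched as follows. All names here (`conf.zotero`, `getCitation`, `scrapeWithZotero`, `scrapeNatively`) are placeholders chosen for illustration, and the fallback on failure is an assumption; none of this is taken from the merged change.

```javascript
// Hypothetical sketch: only query Zotero when the config flag is on,
// and fall back to native scraping otherwise (or when Zotero fails).
async function getCitation(url, conf, services) {
    if (conf.zotero) {
        try {
            return await services.scrapeWithZotero(url);
        } catch (err) {
            // Zotero unavailable or failed: fall through to native scraping.
        }
    }
    return services.scrapeNatively(url);
}

// Stub services so the sketch runs without a network or a Zotero instance.
const stubServices = {
    scrapeWithZotero: async (url) => ({ url: url, source: 'zotero' }),
    scrapeNatively: async (url) => ({ url: url, source: 'native' })
};

getCitation('https://example.org', { zotero: false }, stubServices)
    .then((citation) => console.log(citation.source));
```

With the flag off, the Zotero path is never touched, which is all the task's "easy" bullet asks for.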

Mvolz removed Mvolz as the assignee of this task. · Nov 28 2016, 1:44 PM
Mvolz updated the task description.
GWicke updated the task description. · Nov 28 2016, 3:16 PM
Mvolz added a comment. · May 15 2017, 3:06 PM

@mobrovac, @ssastry and myself just brainstormed ways to remove the dependency on XULRunner in the office. We agreed that it would be useful to spend a day or so investigating the feasibility of a node-only interface for the Zotero translators soon.
We are in the process of migrating all misc node services to Jessie, and XULRunner is not available on Jessie. To avoid blocking the move to Jessie & the conversion of sca nodes, we'll need to find a way forward in the next 1-2 months.

It is now more than 1-2 months out... is there any progress on this? TBH if I were to do this I'm not sure exactly where to start. The files are JS, but they are also not node; would we need to load the files maybe and interpret them or something?

GWicke moved this task from Backlog to watching on the Services board. · Jul 12 2017, 5:23 PM
GWicke edited projects, added Services (watching); removed Services.
faidon added a subscriber: faidon. · Aug 29 2017, 12:28 PM

Is there any progress on this not captured here? I saw that on the recent 5.0 announcement someone asked about the timeline of Electron support, only to get a response that there isn't one.

I'm pretty uninformed overall, but it sounds to me that perhaps we should explore @ssastry and @GWicke's ideas of disentangling the translation-server from XUL ourselves?

czar awarded a token. · Apr 2 2018, 2:20 AM
czar added a subscriber: czar.
Mvolz claimed this task. · Jun 4 2018, 11:42 AM
Mvolz added a project: Epic.
Mvolz added a comment. · Jun 14 2018, 3:45 PM

Declining in favour of T197242, which is to use a pure nodejs consumer of the translators (basically like what we considered but Zotero is doing it instead!)

Mvolz closed this task as Declined. · Jun 14 2018, 3:46 PM