Transition citoid to use Zotero's translation-server-v2
Closed, Resolved · Public

Description

Update citoid to use the nodejs version of translation-server, translation-server-v2: https://github.com/zotero/translation-server (old translation-server now at https://github.com/zotero/translation-server-old)

Repo on gerrit: https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/services/zotero

Krenair added a subscriber: Krenair. Edited Aug 9 2018, 8:51 PM

Hopefully this is the right place for my questions (sorry if not): I'd like to get rid of the one remaining deployment-prep trusty host, deployment-zotero01, but I've also been told prod is running it on sca+scb hosts which are (well, at least one is) jessie. (On a related note, what's the deal there: why are there no scb hosts in deployment-prep?) Is setting this up on jessie/stretch just a case of making a new instance and applying role::zotero? Can I then just update references to point at the new host and kill the old one? Is that even the appropriate action to take, or should I put it on an sca host like prod?

Change 451835 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/services/citoid@master] spec.yaml: Remove the bibtex x-ample

https://gerrit.wikimedia.org/r/451835

Change 451835 merged by Mobrovac:
[mediawiki/services/citoid@master] spec.yaml: Remove the bibtex x-ample

https://gerrit.wikimedia.org/r/451835

Mentioned in SAL (#wikimedia-operations) [2018-08-10T09:32:45Z] <mobrovac@deploy1001> Started deploy [citoid/deploy@983d80c]: Remove the bibtex spec.yaml x-ample - T197242

Mentioned in SAL (#wikimedia-operations) [2018-08-10T09:35:49Z] <mobrovac@deploy1001> Finished deploy [citoid/deploy@983d80c]: Remove the bibtex spec.yaml x-ample - T197242 (duration: 03m 04s)

The sca hosts are running Zotero in production, and they are on trusty. Until this ticket is resolved, we have to keep trusty instances around in both production and beta; unfortunately, there is no other way around it. The good news, though, is that we are making progress and expect to be able to get rid of trusty soon(TM).

Change 452933 had a related patch set uploaded (by Mvolz; owner: Mvolz):
[mediawiki/services/citoid@master] Resolve DOIs/URLs all the way to end

https://gerrit.wikimedia.org/r/452933

Change 452933 merged by jenkins-bot:
[mediawiki/services/citoid@master] Resolve DOIs/URLs all the way to end

https://gerrit.wikimedia.org/r/452933

Mentioned in SAL (#wikimedia-operations) [2018-08-27T13:58:31Z] <mobrovac@deploy1001> Started deploy [citoid/deploy@fe96789]: Resolve DOIs/URLs all the way to end - T197242

Mentioned in SAL (#wikimedia-operations) [2018-08-27T14:04:08Z] <mobrovac@deploy1001> Finished deploy [citoid/deploy@fe96789]: Resolve DOIs/URLs all the way to end - T197242 (duration: 05m 37s)

I noticed that a recent translators update was abandoned. From a comment to that patch, it sounded like there will be no more updates from Zotero's translator repo until this task is done. If that's correct, is there any estimate of when that will be? I have just gotten a few more translators for Swedish news sites added to Zotero's repo and would like to get them working with Citoid (T204467).

Mvolz updated the task description. Sep 24 2018, 12:09 PM

I'm afraid I can't give you much better than "soon", but if you wish to follow along, this is dependent on T201611 being done first.

Change 463713 had a related patch set uploaded (by Mvolz; owner: Mvolz):
[mediawiki/services/citoid@master] [WIP] Update citoid to work with new translation-server

https://gerrit.wikimedia.org/r/463713

Change 450565 abandoned by Mvolz:
[WIP] translation-server-v2 update

Reason:
Abandon in favour of Ib3cff5e7eda2a7e61efb9944c86f7cf9d9604ecb

https://gerrit.wikimedia.org/r/450565

Also, the current idea is to start using the upstream translators directly, so I would suggest submitting your changes to the Zotero translators repository directly.

Sounds good. The translators I've written are in the Zotero repo already.

Mvolz updated the task description. Nov 5 2018, 10:10 AM

Change 474451 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[mediawiki/services/zotero@master] Add ca-certificates to apt packages

https://gerrit.wikimedia.org/r/474451

Change 474451 merged by Alexandros Kosiaris:
[mediawiki/services/zotero@master] Add ca-certificates to apt packages

https://gerrit.wikimedia.org/r/474451

Change 474473 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[mediawiki/services/zotero@master] Move ca-certificates to the production variant

https://gerrit.wikimedia.org/r/474473

Change 474473 merged by Alexandros Kosiaris:
[mediawiki/services/zotero@master] Move ca-certificates to the production variant

https://gerrit.wikimedia.org/r/474473

Mvolz added a comment. Mon, Nov 26, 6:58 PM

As per chat, this is scheduled for deploy Monday Dec 03.

Change 477498 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[operations/puppet@production] Citoid: Switch to using Zotero v2

https://gerrit.wikimedia.org/r/477498

Change 463713 merged by Mobrovac:
[mediawiki/services/citoid@master] Update citoid to work with new translation-server

https://gerrit.wikimedia.org/r/463713

Mentioned in SAL (#wikimedia-operations) [2018-12-04T09:46:41Z] <akosiaris> disable puppet on scb for citoid migration to zoterov2 T197242

Change 477498 merged by Alexandros Kosiaris:
[operations/puppet@production] Citoid: Switch to using Zotero v2

https://gerrit.wikimedia.org/r/477498

Mentioned in SAL (#wikimedia-operations) [2018-12-04T09:50:22Z] <akosiaris> enable puppet on scb2001, run puppet T197242

Mentioned in SAL (#wikimedia-operations) [2018-12-04T09:51:41Z] <mobrovac@deploy1001> Started deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb2001 - T197242

Mentioned in SAL (#wikimedia-operations) [2018-12-04T09:52:11Z] <mobrovac@deploy1001> Finished deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb2001 - T197242 (duration: 00m 30s)

Mentioned in SAL (#wikimedia-operations) [2018-12-04T09:54:50Z] <akosiaris> enable puppet on all scb2*, run puppet T197242

Mentioned in SAL (#wikimedia-operations) [2018-12-04T10:01:18Z] <mobrovac@deploy1001> Started deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 in codfw - T197242

Mentioned in SAL (#wikimedia-operations) [2018-12-04T10:03:03Z] <mobrovac@deploy1001> Finished deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 in codfw - T197242 (duration: 01m 45s)

Mentioned in SAL (#wikimedia-operations) [2018-12-04T10:43:01Z] <mobrovac@deploy1001> Started deploy [restbase/deploy@8abcbda]: Disable Citoid test for switching it to Zotero v2 - T211088 T197242

Mentioned in SAL (#wikimedia-operations) [2018-12-04T11:03:59Z] <mobrovac@deploy1001> Finished deploy [restbase/deploy@8abcbda]: Disable Citoid test for switching it to Zotero v2 - T211088 T197242 (duration: 20m 59s)

Mentioned in SAL (#wikimedia-operations) [2018-12-04T11:17:04Z] <akosiaris> enable puppet on scb1001, run puppet T197242

Mentioned in SAL (#wikimedia-operations) [2018-12-04T11:18:26Z] <mobrovac@deploy1001> Started deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb1001 - T197242

Mentioned in SAL (#wikimedia-operations) [2018-12-04T11:18:56Z] <mobrovac@deploy1001> Finished deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb1001 - T197242 (duration: 00m 30s)

Mentioned in SAL (#wikimedia-operations) [2018-12-04T11:31:52Z] <akosiaris> enable puppet on scb1002, run puppet T197242

Mentioned in SAL (#wikimedia-operations) [2018-12-04T11:33:14Z] <mobrovac@deploy1001> Started deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb1002 - T197242

Mentioned in SAL (#wikimedia-operations) [2018-12-04T11:33:42Z] <mobrovac@deploy1001> Finished deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb1002 - T197242 (duration: 00m 28s)

Mentioned in SAL (#wikimedia-operations) [2018-12-04T11:34:11Z] <akosiaris> enable puppet on scb1003, run puppet T197242

Mentioned in SAL (#wikimedia-operations) [2018-12-04T11:35:41Z] <mobrovac@deploy1001> Started deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb1003 - T197242

Mentioned in SAL (#wikimedia-operations) [2018-12-04T11:36:01Z] <mobrovac@deploy1001> Finished deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb1003 - T197242 (duration: 00m 20s)

Mentioned in SAL (#wikimedia-operations) [2018-12-04T11:36:44Z] <akosiaris> enable puppet on scb1004, run puppet T197242

Mentioned in SAL (#wikimedia-operations) [2018-12-04T11:38:20Z] <mobrovac@deploy1001> Started deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb1004 - T197242

Mentioned in SAL (#wikimedia-operations) [2018-12-04T11:38:41Z] <mobrovac@deploy1001> Finished deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb1004 - T197242 (duration: 00m 21s)

mobrovac closed this task as Resolved.

Citoid in production has been switched to use Zotero v2.

Mvolz reopened this task as Open. Tue, Dec 4, 2:31 PM

Re-opening due to T211114

Change 477566 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] scb: Fix zotero config typo

https://gerrit.wikimedia.org/r/477566

Change 477566 merged by Alexandros Kosiaris:
[operations/puppet@production] scb: Fix zotero config typo

https://gerrit.wikimedia.org/r/477566

It seems that after this was done the citation alerts started flapping much more than they used to. Also, the mean latency for the citations endpoint went up from seconds to minutes.

Just for posterity's sake, the graph is at https://grafana.wikimedia.org/dashboard/db/restbase-external-overview?panelId=17&fullscreen&orgId=1&from=1543930211205&to=1543947035600 (courtesy of @mobrovac). The bump is pretty clear. The spikes themselves are probably the result of some OOM kills and timeouts issued by kubernetes because the pods' memory usage violated the limits imposed on them. I've increased those limits and increased the number of pods. The spikes themselves should probably go away, but that remains to be seen.

The overall bump in mean latency however does exist. Comparing the graphs from before and after the switch, the mean latency has gone from 1.3s to 4.0s. That's a bump of about 200%.

Even now, with some changes to memory limits and pod count, the mean has still increased by over 200% (from ~1s to ~3s).
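As a quick check of the percentages quoted above, here is a minimal sketch of the arithmetic. The helper name `pct_increase` is hypothetical; the inputs are the approximate mean latencies mentioned in this thread.

```python
def pct_increase(before: float, after: float) -> float:
    """Return the relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Mean latency right after the switch: 1.3s -> 4.0s
after_switch = pct_increase(1.3, 4.0)

# Mean latency after raising memory limits and pod count: ~1s -> ~3s
after_tuning = pct_increase(1.0, 3.0)

print(f"after switch: {after_switch:.0f}%")
print(f"after tuning: {after_tuning:.0f}%")
```

Both cases come out at roughly a 200% increase, i.e. the mean latency is about three times what it was before the switch.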

FWIW we've had a number of minor outages and alerts resulting in increased latency for results. The corresponding graph can be seen here: https://grafana.wikimedia.org/dashboard/db/restbase-external-overview?panelId=17&fullscreen&orgId=1&from=1544017415835&to=1544026700314

I've increased the zotero pod limits and it seems to have subsided. The current theory is that GC was unable to free up enough memory under the previous limit.

I'm not getting the correct response when using the translators I've written. A few fields, such as "creator", are missing.

E.g.
https://en.wikipedia.org/api/rest_v1/data/citation/mediawiki/https%3A%2F%2Fwww.svt.se%2Fnyheter%2Flokalt%2Fost%2Fkronobranneriet
returns

[{
	"key": "5YDXNRNS",
	"version": 0,
	"itemType": "webpage",
	"tags": [],
	"title": "Arkeologer gräver efter brännvin",
	"websiteTitle": "SVT Nyheter",
	"date": "2018-02-27",
	"url": "https://www.svt.se/nyheter/lokalt/ost/kronobranneriet",
	"abstractNote": "Nu blottläggs den första politiska stridsfrågan i brännvinsbränningens historia.",
	"language": "sv",
	"accessDate": "2018-12-05",
	"source": [
		"Zotero"
	]
}]
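For reference, the request URL above passes the target page as a single percent-encoded path segment after the output format (`mediawiki`). A minimal sketch of building such a URL follows; the names `BASE` and `citation_url` are hypothetical, and only the URL layout is taken from the example above.

```python
from urllib.parse import quote

# Endpoint prefix as it appears in the example request above.
BASE = "https://en.wikipedia.org/api/rest_v1/data/citation"


def citation_url(target: str, fmt: str = "mediawiki") -> str:
    """Build a citation request URL for `target` (hypothetical helper).

    safe="" makes quote() also encode ":" and "/", so the whole target
    URL ends up as one path segment.
    """
    return f"{BASE}/{fmt}/{quote(target, safe='')}"


print(citation_url("https://www.svt.se/nyheter/lokalt/ost/kronobranneriet"))
```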

When I run it locally (with the latest Zotero version) I get:

[{
	"key": "8HKCH8N2",
	"version": 0,
	"itemType": "newspaperArticle",
	"creators": [{
		"firstName": "Lena",
		"lastName": "Liljeborg",
		"creatorType": "author"
	}],
	"tags": [],
	"title": "Arkeologer gräver efter brännvin",
	"date": "2018-02-27",
	"url": "https://www.svt.se/nyheter/lokalt/ost/kronobranneriet",
	"abstractNote": "Nu blottläggs den första politiska stridsfrågan i brännvinsbränningens historia.",
	"language": "sv",
	"libraryCatalog": "www.svt.se",
	"accessDate": "2018-12-05T16:18:19Z",
	"section": "Öst"
}]
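To make the difference between the two responses easier to see, here is a minimal sketch that compares their fields. The helper `compare_items` is hypothetical; the dictionaries copy the two items above, with `title`, `url` and `abstractNote` left out because they are identical in both.

```python
def compare_items(expected: dict, actual: dict):
    """Return (missing, extra, changed) field names, each sorted.

    missing: fields in `expected` but not in `actual`
    extra:   fields in `actual` but not in `expected`
    changed: fields present in both with different values
    """
    missing = sorted(set(expected) - set(actual))
    extra = sorted(set(actual) - set(expected))
    changed = sorted(
        k for k in set(expected) & set(actual) if expected[k] != actual[k]
    )
    return missing, extra, changed


# Item returned by the production endpoint.
production = {
    "key": "5YDXNRNS", "version": 0, "itemType": "webpage", "tags": [],
    "websiteTitle": "SVT Nyheter", "date": "2018-02-27", "language": "sv",
    "accessDate": "2018-12-05", "source": ["Zotero"],
}

# Item returned when running the translator locally.
local = {
    "key": "8HKCH8N2", "version": 0, "itemType": "newspaperArticle",
    "creators": [{"firstName": "Lena", "lastName": "Liljeborg",
                  "creatorType": "author"}],
    "tags": [], "date": "2018-02-27", "language": "sv",
    "libraryCatalog": "www.svt.se", "accessDate": "2018-12-05T16:18:19Z",
    "section": "Öst",
}

missing, extra, changed = compare_items(local, production)
print("missing in production:", missing)
print("only in production:", extra)
print("different values:", changed)
```

The `key` value differs between any two requests, so only the `itemType` and `accessDate` differences among the changed fields are meaningful here.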

@Sebastian_Berlin-WMSE would you mind opening a new ticket for your issue? This ticket is about production-side issues related to the switch.

Re: latency, unfortunately somewhat slower performance is to be expected here, since v1 used Firefox's native DOM parsing and the Node-based v2 uses JSDOM. We have some hacks in place to work around severe performance problems that JSDOM had on some pages, and the JSDOM developers have some upcoming fixes that should address that and hopefully speed things up in general.

We haven't ruled out some alternative approaches using Headless Chromium or Electron, but for the time being this will remain a bit slower than v1. Hopefully performance is still acceptable for most usage.

Have you looked at Domino (https://github.com/fgnass/domino)?

Change 478027 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/services/citoid/deploy@master] Beta: Do not use Zotero

https://gerrit.wikimedia.org/r/478027

Change 478027 merged by Mobrovac:
[mediawiki/services/citoid/deploy@master] Beta: Do not use Zotero

https://gerrit.wikimedia.org/r/478027

We looked at Domino briefly and found some alarming parsing problems, though we didn't investigate further. (That translator uses XPath, so could be some weird interaction with the XPath package we're using.)

In any case, it looks like JSDOM is getting dramatically faster in the next version — some websites went from 77000ms to 350ms — so we'll update to that once it's out.

cscott added a subscriber: cscott. Thu, Dec 6, 10:49 PM

If you could provide more details, I'd certainly be interested in helping debug the XPath library interaction. Domino is pretty heavily performance-optimized at this point.

Mvolz closed this task as Resolved. Tue, Dec 11, 6:00 PM