Citoid returns mangled results for Washington Post articles
Closed, Resolved (Public)

Description

If you use the Citoid API to look up metadata for Washington Post articles, it tends to return mangled results for authors.

For example, https://www.washingtonpost.com/national-security/2020/06/11/pentagons-top-general-apologizes-appearing-alongside-trump-lafayette-square/ returns:

[
  {
    "key": "T3JXGTQX",
    "version": 0,
    "itemType": "webpage",
    "tags": [],
    "title": "Pentagon’s top general apologizes for appearing alongside Trump in Lafayette Square",
    "websiteTitle": "Washington Post",
    "url": "https://www.washingtonpost.com/national-security/2020/06/11/pentagons-top-general-apologizes-appearing-alongside-trump-lafayette-square/",
    "language": "en",
    "accessDate": "2020-06-11",
    "author": [
      [
        "Dan Lamothe closeDan LamotheReporter covering the",
        "Pentagon"
      ],
      [
        "the U. S.",
        "militaryEmailEmailBioBioFollowFollow"
      ]
    ],
    "source": [
      "Zotero"
    ]
  }
]
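To spot this kind of damage programmatically, here is a minimal Python sketch that flags author entries that look like scraped byline widgets rather than names. The two heuristics (a name part with more than four words, or a doubled run such as "EmailEmail") and the names `is_suspicious_part` and `mangled_authors` are assumptions made for illustration; they are not part of Citoid or Zotero and will miss or misflag some cases.

```python
import re

# Assumed heuristics, not part of Citoid: a name part with many words, or a
# doubled run such as "EmailEmail", suggests scraped page chrome.
MAX_WORDS_PER_PART = 4
DOUBLED_RUN = re.compile(r"([A-Za-z]{3,})\1")


def is_suspicious_part(part: str) -> bool:
    """Return True if one name part looks like scraped byline residue."""
    return len(part.split()) > MAX_WORDS_PER_PART or bool(DOUBLED_RUN.search(part))


def mangled_authors(citation: dict) -> list:
    """Return the [first, last] author pairs that trip either heuristic."""
    return [
        pair
        for pair in citation.get("author", [])
        if any(is_suspicious_part(part) for part in pair)
    ]


# Author data copied from the response above.
sample = {
    "author": [
        ["Dan Lamothe closeDan LamotheReporter covering the", "Pentagon"],
        ["the U. S.", "militaryEmailEmailBioBioFollowFollow"],
    ]
}
print(len(mangled_authors(sample)))
```

Both author pairs in the response above trip the check, while a clean pair such as `["Dan", "Lamothe"]` does not.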

The URL https://www.washingtonpost.com/dc-md-va/2020/06/07/dc-black-lives-matter-defund-police/ returns even weirder results and breaks the look-up feature in VisualEditor (giving the error "We couldn't make a citation for you. You can create one manually using the 'Manual' tab above.").
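To reproduce the look-up outside VisualEditor, the request URL can be built as in the sketch below. It assumes the public Citoid REST route on English Wikipedia and the `mediawiki` output format, which matches the shape of the response shown earlier; `CITOID_BASE` and `citoid_request_url` are illustrative names, not part of any library.

```python
from urllib.parse import quote

# Assumed endpoint: the public Citoid REST route on English Wikipedia.
CITOID_BASE = "https://en.wikipedia.org/api/rest_v1/data/citation"


def citoid_request_url(article_url: str, fmt: str = "mediawiki") -> str:
    """Build the look-up URL; the article URL is percent-encoded in full."""
    # safe="" makes quote() encode ":" and "/" as well, so the article URL
    # travels as a single path segment.
    return f"{CITOID_BASE}/{fmt}/{quote(article_url, safe='')}"


print(citoid_request_url(
    "https://www.washingtonpost.com/dc-md-va/2020/06/07/dc-black-lives-matter-defund-police/"
))
```

Fetching the printed URL with any HTTP client should return either a JSON array like the one above or an error response when the look-up fails; the exact status code for the failing case is not confirmed here.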

Not sure if the problem is on our end or Zotero's.

Event Timeline

I have the same problem using VisualEditor, but it seems to come and go. The most common result is "couldn't make a citation for you." Occasionally it's mangled results.

@Mvolz - Is this something we can control on our end or is it from Zotero?

It's either that we have an outdated translator or the translator needs to be fixed upstream.

Mvolz triaged this task as Low priority. (Jul 7 2020, 8:29 AM)

@Mvolz - This has been fixed upstream. Can we update it now on our end or do we have to wait for a package release?

It's a submodule, so we'll need to update the submodule for translation-server upstream, and then update our local repo of translation-server. I'll submit a PR to do it upstream once this is merged, since it makes sense to do it all at once: https://github.com/zotero/translators/pull/2137

@Mvolz - Looks like your patch got merged and the submodules got updated as well. Can we update our local repo now?

Change 618271 had a related patch set uploaded (by Mvolz; owner: Mvolz):
[mediawiki/services/zotero@master] Update Zotero to b0a30f98c

https://gerrit.wikimedia.org/r/618271

Change 618271 merged by jenkins-bot:
[mediawiki/services/zotero@master] Update Zotero to b0a30f98c

https://gerrit.wikimedia.org/r/618271

@Mvolz - Now that the code is merged locally, what happens next? Does someone need to deploy the new version to the production servers?

I'll deploy it in the next deploy window (Thursday).

Hmm, it still seems to be broken.

Change 621482 had a related patch set uploaded (by Mvolz; owner: Mvolz):
[operations/deployment-charts@master] Update Zotero

https://gerrit.wikimedia.org/r/621482

Change 621482 merged by jenkins-bot:
[operations/deployment-charts@master] Update Zotero to b0a30f98c

https://gerrit.wikimedia.org/r/621482

Change 621686 had a related patch set uploaded (by Mvolz; owner: Mvolz):
[operations/deployment-charts@master] Update zotero to 2020-08-07-190051-production

https://gerrit.wikimedia.org/r/621686

Change 621686 merged by jenkins-bot:
[operations/deployment-charts@master] Update zotero to 2020-08-07-190051-production

https://gerrit.wikimedia.org/r/621686

Mvolz removed a project: Patch-For-Review.