
Citoid service creates random citations for whitespace-only queries
Closed, Resolved | Public | Bug Report

Description

When I submit a single space in Citoid's automatic citation feature, it actually generates a seemingly random reference for me:

image.png (211×425 px, 33 KB)

This happens on all wikis where this particular Citoid feature is enabled. The result is different on every wiki, so it is probably random.

The issues in isolation:

[
    {
        "itemType": "journalArticle",
        "DOI": "10.58809/gdnr8407",
        "title": "Empowering Schools and Family",
        "publicationTitle": "John Heinrichs Scholarly & Creative Activities Day",
        "date": "2017",
        "url": "https://doi.org/10.58809/gdnr8407",
        "accessDate": "2025-12-09",
        "author": [
            [ "Jude", "Loste" ],
            [ "null", "null" ],
            [ "null", "null" ],
            [ "null", "null" ],
            [ "null", "null" ],
            [ "null", "null" ]
        ],
        "source": [ "Crossref" ]
    }
]

Event Timeline

Change #1216604 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/services/citoid@master] Fix whitespace-only search returning unexpected results

https://gerrit.wikimedia.org/r/1216604
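
The patch itself is not reproduced in this task. As a rough illustration only, a guard of the kind the patch title describes could look like the sketch below. It is written in Python for brevity, and `normalize_query`, `handle_search`, and `search_backend` are hypothetical names, not the Citoid service's actual API.

```python
def normalize_query(raw):
    """Return the trimmed search string, or None if nothing usable is left."""
    if raw is None:
        return None
    trimmed = raw.strip()
    return trimmed or None


def handle_search(raw, search_backend):
    """Reject empty queries before they reach the (hypothetical) backend."""
    query = normalize_query(raw)
    if query is None:
        # A whitespace-only query should fail fast instead of being
        # forwarded as an effectively empty search.
        raise ValueError("search query must not be empty or whitespace-only")
    return search_backend(query)
```

With this shape, a single space never reaches the search backend, so no arbitrary result can come back for it.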

Change #1216601 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Citoid@master] Don't allow to submit whitespace-only values

https://gerrit.wikimedia.org/r/1216601

Plausibly a regression in the back end surfacing in the front end. FYI, mobrovac hasn't worked for the Foundation in many years and Mooeypoo hasn't worked on the VE team in many years, so there's not much point in tagging either :).

I'll look into it, but I suspect the null values are actually in the Crossref search results (which is where this is coming in). We should definitely make sure we aren't sending "null" literals to them, though.
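
A minimal sketch of filtering out such placeholder creators, assuming the `[first, last]` pair layout shown in the JSON in the description. The helper names `is_placeholder` and `clean_creators` are made up for illustration, and the sketch is in Python rather than the service's own code.

```python
def is_placeholder(value):
    """True for missing values and for the literal string "null"."""
    if value is None:
        return True
    return isinstance(value, str) and value.strip() in ("", "null")


def clean_creators(creators):
    """Drop [first, last] pairs in which both parts are placeholders."""
    return [
        [first, last]
        for first, last in creators
        if not (is_placeholder(first) and is_placeholder(last))
    ]
```

Applied to the author list above, this keeps only `[ "Jude", "Loste" ]`. The match is case-sensitive and requires both parts to be placeholders, so a real person whose surname is "Null" is not dropped.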

I've been considering some sort of Levenshtein distance cut-off for Crossref results, but this should definitely be caught even before that.
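
For illustration, such a cut-off could be sketched as follows. This is only one possible shape: comparing the query against the result title, normalizing by the longer string, and the `max_ratio` threshold of 0.5 are all assumptions, not something decided in this task.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def is_plausible_match(query, title, max_ratio=0.5):
    """Accept a result only if its title is reasonably close to the query."""
    q = query.strip().lower()
    t = title.strip().lower()
    if not q or not t:
        # An empty query can never be a plausible match.
        return False
    return levenshtein(q, t) / max(len(q), len(t)) <= max_ratio
```

A whitespace-only query is rejected outright by the empty check, which is consistent with the point that this case should be caught before any distance comparison runs.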

Mvolz triaged this task as Medium priority. (Dec 9 2025, 11:59 AM)
Mvolz moved this task from Backlog to Next & Doing on the Citoid board.

Change #1216601 merged by jenkins-bot:

[mediawiki/extensions/Citoid@master] Don't allow to submit whitespace-only values

https://gerrit.wikimedia.org/r/1216601

I added some people just to make it visible that they worked on the code. Please feel free to remove them in case this is not helpful.

I had a brief look at the relevant code in the citoid service and my impression so far is that these "null" strings are actually coming from Zotero like that.

Change #1216784 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/services/citoid@master] Dramatically streamline creator names loop

https://gerrit.wikimedia.org/r/1216784

Change #1216604 merged by jenkins-bot:

[mediawiki/services/citoid@master] Fix whitespace-only search returning unexpected results

https://gerrit.wikimedia.org/r/1216604

Change #1216784 merged by jenkins-bot:

[mediawiki/services/citoid@master] Dramatically streamline creator names loop

https://gerrit.wikimedia.org/r/1216784