Page MenuHomePhabricator

Parsoid dies if trying to transform to not-yet-supported language variant
Closed, ResolvedPublic

Description

In case I'm trying to convert to existing, but not yet supported language variant, like zh-hans Parsoid dies with 500 error:

curl -i -H 'Accept-Language: zh-hans' https://zh.wikipedia.beta.wmflabs.org/api/rest_v1/page/html/首页

Parsoid log:

Cannot read property 'getMachine' of undefined
TypeError: Cannot read property 'getMachine' of undefined
at Function.baseToVariant (/srv/deployment/parsoid/deploy-cache/revs/cf83c3580d0286c7e8a752b2fb9543bda0fb9aaf/src/lib/language/LanguageConverter.js:417:13) 
at Function.maybeConvert (/srv/deployment/parsoid/deploy-cache/revs/cf83c3580d0286c7e8a752b2fb9543bda0fb9aaf/src/lib/language/LanguageConverter.js:386:8)
at Object.apiUtils.languageConversion (/srv/deployment/parsoid/deploy-cache/revs/cf83c3580d0286c7e8a752b2fb9543bda0fb9aaf/src/lib/api/apiUtils.js:527:21) ...

Instead of dying it probably should just ignore the transformation.

Event Timeline

Pchelolo triaged this task as High priority.Jun 22 2018, 1:11 PM
Pchelolo created this task.
Restricted Application added subscribers: Cosine02, Aklapper. · View Herald TranscriptJun 22 2018, 1:11 PM
Pchelolo updated the task description. (Show Details)Jun 22 2018, 1:12 PM
Arlolra assigned this task to cscott.Jun 22 2018, 1:24 PM

Instead of dying it probably should just ignore the transformation.

Indeed. I think the best course of action here would be to simply strip the A-L header and proceed. Additionally, perhaps a Warning header could be emitted in the response to indicate that the A-L was ignored.

cscott added a comment.EditedJun 22 2018, 4:29 PM

Crashing is certainly a bug, which I'll fix today.

WRT what to do if the conversion requested is a no-op -- this can be for a number of reasons, including that the variant requested is actually the base variant, the page contains a __NOCONTENTCONVERT__ magic word, the variant requested is unrecognized, etc. My approach has been to keep the A-L header -- so that there aren't migration problems if/when the variant requested is supported, the __NOCONTENTCONVERT__ magic word is removed, etc -- but return the unconverted content. And that's what I'll land for zh-hans too (which is a slightly unusual case in that Parsoid has zh-hans support, it's just on an undeployed branch still).

Probably instead of a warning header, Parsoid should use its existing logging functionality to record the unsupported variant? Would an HTTP header be useful to you, and if so, would you like to suggest one?

Crashing is certainly a bug, which I'll fix today.

Great, thnx!

WRT what to do if the conversion requested is a no-op -- this can be for a number of reasons, including that the variant requested is actually the base variant, the page contains a __NOCONTENTCONVERT__ magic word, the variant requested is unrecognized, etc. My approach has been to keep the A-L header -- so that there aren't migration problems if/when the variant requested is supported, the __NOCONTENTCONVERT__ magic word is removed, etc -- but return the unconverted content. And that's what I'll land for zh-hans too (which is a slightly unusual case in that Parsoid has zh-hans support, it's just on an undeployed branch still).

I do agree that there are cases where the conversion will not / cannot happen. And obviously, Parsoid should ignore the header in that case. I was suggesting to drop it merely because of the assumption that the pre-langvar code could be employed as-is in that case. Maybe I'm wrong, but whichever works for you, works for us as well.

Probably instead of a warning header, Parsoid should use its existing logging functionality to record the unsupported variant? Would an HTTP header be useful to you, and if so, would you like to suggest one?

I think we'd need a Warning header (or similar) precisely because of migration problems and clients' awareness. On the one hand, if the client supplies the A-L but doesn't get back the expected content format, they at least have the Warning header to inform them (a.k.a. let's have a user-friendly API). On the other hand, having the Warning header (possibly logged in both Parsoid and RB) can inform us as to how the API is used and if/how would we or clients need to change things to improve the contract.

Change 441564 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/services/parsoid@master] Ensure Parsoid doesn't crash on unimplemented/invalid language conversions

https://gerrit.wikimedia.org/r/441564

Change 441564 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Ensure Parsoid doesn't crash on unimplemented/invalid language conversions

https://gerrit.wikimedia.org/r/441564

cscott closed this task as Resolved.Jun 22 2018, 10:04 PM

Mentioned in SAL (#wikimedia-operations) [2018-06-25T20:02:37Z] <cscott> Completed deploy of Parsoid to version b068bb51 (T197949, T197702, T196799)

Vvjjkkii renamed this task from Parsoid dies if trying to transform to not-yet-supported language variant to qgaaaaaaaa.Jul 1 2018, 1:02 AM
Vvjjkkii reopened this task as Open.
Vvjjkkii removed cscott as the assignee of this task.
Vvjjkkii updated the task description. (Show Details)
Vvjjkkii removed subscribers: gerritbot, Aklapper.
ssastry renamed this task from qgaaaaaaaa to Parsoid dies if trying to transform to not-yet-supported language variant.Jul 1 2018, 3:35 AM
ssastry closed this task as Resolved.
ssastry assigned this task to cscott.
ssastry updated the task description. (Show Details)