Unexpected extracts greek API response
Closed, Resolved · Public

Description

Hello folks at Wikimedia,
I'm sorry to report that your extracts API seems to have an unexpected response on a particular page.

Keep up the great work 💪
Enrico

Unexpected response:

{"error":{"code":"missingtitle","info":"The page you specified doesn't exist.","docref":"See https://el.wikipedia.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce> for notice of API deprecations and breaking changes."},"servedby":"mw1279"}

Expected response for missing page:

{"batchcomplete":true,"query":{"pages":[{"pageid":0,"missing":true}]}}

Expected response for existing page:

{"batchcomplete":true,"query":{"pages":[{"pageid":18,"ns":0,"title":"Ελληνικός","extract":"Σαν επίθετο ελληνικά/ελληνικός σημαίνει από, ή σχετικά με την Ελλάδα, το λαό της, ή την κουλτούρα της.\n\nΑρχαία ελληνική λογοτεχνία\nΕλληνική γλώσσα\nΕλληνική μυθολογία\nΕλληνική φιλοσοφία\nΕλληνικό αλφάβητο\nΕλληνικός καφές\nΕλληνικός πολιτισμόςΕλληνική: Πρωινή εφημερίδα Αθηνών από του 1925...."}]}}

Event Timeline

Anomie moved this task from Unsorted to Non-core-API stuff on the MediaWiki-Action-API board.
Anomie subscribed.

The extension's API module is throwing an error where it would likely be better to indicate the error with a "missing" boolean or the like so other parts of the query can continue.
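
Until the module reports missing pages inline, a client has to cope with both response shapes quoted in the description. The following is a minimal client-side sketch, not part of TextExtracts or MediaWiki; the function name `classify_extract_response` and its return labels are hypothetical, and it assumes the response has already been decoded from JSON into a dict with `pages` as a list, as in the examples above.

```python
from typing import Any


def classify_extract_response(data: dict[str, Any]) -> str:
    """Classify a decoded extracts response (hypothetical helper).

    Returns "api-error:<code>" for a top-level error object,
    "missing" for a page flagged as missing, "found" for a page
    entry without that flag, and "empty" if no pages were returned.
    """
    # Top-level error: the whole request failed, as in the unexpected response.
    if "error" in data:
        return "api-error:" + data["error"].get("code", "unknown")

    pages = data.get("query", {}).get("pages", [])
    if not pages:
        return "empty"

    # Per-page "missing" flag: the rest of the query can still be used.
    if pages[0].get("missing"):
        return "missing"
    return "found"


# Shapes taken from the task description (the extract text is shortened).
unexpected = {"error": {"code": "missingtitle"}, "servedby": "mw1279"}
expected_missing = {"batchcomplete": True,
                    "query": {"pages": [{"pageid": 0, "missing": True}]}}
expected_found = {"batchcomplete": True,
                  "query": {"pages": [{"pageid": 18, "ns": 0,
                                       "title": "Ελληνικός",
                                       "extract": "..."}]}}
```

Treating the top-level error separately from the per-page flag mirrors the point above: only the first aborts the whole request.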

It may also be that the extension is getting confused by the fact that the page with page_id 298785 on elwiki has title "ΒΠ:WDATA" in namespace 0, but since "ΒΠ" is an alias for the "Βικιπαίδεια" namespace there, the title parses as page "WDATA" in namespace 4 instead. Either a bug allowed creation of the prefixed title in the main namespace, or the page was created before the alias and someone forgot to run namespaceDupes.php after adding the alias. Either way, namespaceDupes.php should fix it; file a Wikimedia-Site-requests bug asking for that to be run.
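
The mismatch can be illustrated with a deliberately simplified model of prefix-based title parsing. This is only a sketch and not MediaWiki's actual title parser: `parse_title` and the alias tables are hypothetical, and the tables contain just the two names mentioned above.

```python
def parse_title(text: str, namespaces: dict[str, int]) -> tuple[int, str]:
    """Split "Prefix:Rest" into (namespace id, title) if the prefix is a
    known namespace name or alias; otherwise treat the whole string as a
    main-namespace (0) title. Simplified model, not MediaWiki code."""
    prefix, sep, rest = text.partition(":")
    if sep and rest and prefix in namespaces:
        return namespaces[prefix], rest
    return 0, text


# Assumed state before the alias existed: only the full namespace name is known.
before_alias = {"Βικιπαίδεια": 4}
# Assumed state after the alias "ΒΠ" was added for the same namespace.
after_alias = {"Βικιπαίδεια": 4, "ΒΠ": 4}

# The row as stored in the database: namespace 0, title "ΒΠ:WDATA".
stored = (0, "ΒΠ:WDATA")

parsed_before = parse_title(stored[1], before_alias)
parsed_after = parse_title(stored[1], after_alias)

# Once the alias exists, re-parsing the stored title no longer points
# back at the stored row, so a lookup by title misses the page.
row_is_reachable_by_title = parsed_after == stored
```

In this model the stored row is reachable by its title only while the alias is absent, which is the kind of inconsistency namespaceDupes.php is meant to clean up.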

It may also be that the extension is getting confused by [...]

Yeah, it looks like the extension is somehow or other re-parsing the title string even though it was passed a page ID. That should be fixed in TextExtracts.

@Ebonetti90 Could you check whether this was fixed after T215036 (the original report, not the TextExtracts bug)?

Yep yep, seems solved on my side too, thanks!

I'll let you know if ever anything arises again on this 👍

Enrico

jcrespo assigned this task to Reedy.

I am going to be bold and close this as fixed, based on the original reporter's response. Any pending tasks can be fixed at the parent task, T109238.

The TextExtracts issue, if it persists (unclear), should probably be reported again in a separate task to avoid confusion with the original report.