
Enable pronunciation URL parsing for non-English wikis
Open, Low, Public

Description

Currently, parsing of pronunciation URLs only works for English wikis because parseProperty.js looks for the literal prefix File:.

Example: http://localhost:6927/ur.wikipedia.org/v1/page/mobile-sections/%D8%A2%D8%A6%DB%8C%D9%88%D8%B1%DB%8C_%DA%A9%D9%88%D8%B3%D9%B9 does not recognize the pronunciation URL.

To make this work for other languages, we would need a list of File namespace prefixes for each language. I'm thinking it would be neat to bring over parts of the make-templates.py script from the Android repo, but it would have to output JSON instead of Java files. The script main-page-names.py is a subset of this, so maybe the main page names could also be part of the JSON output. The JSON object should be keyed by language or language code. We may have to special-case language variants once we support them.
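A minimal sketch of how such a JSON object could be consumed, assuming it is keyed by language code. The shape of `siteData`, the property names `fileNamespaces` and `mainPage`, and the helpers `filePrefixRegex` and `hasFilePrefix` are all hypothetical and only for illustration; the values shown are sample entries, not output from make-templates.py.

```javascript
'use strict';

// Hypothetical shape of the generated JSON, keyed by language code.
// In practice this would be loaded from a generated file.
const siteData = {
    en: { fileNamespaces: ['File', 'Image'], mainPage: 'Main Page' },
    de: { fileNamespaces: ['Datei', 'Bild', 'File', 'Image'], mainPage: 'Wikipedia:Hauptseite' }
};

// Escape characters that have a special meaning in a regular expression.
function escapeRegex(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a regex matching any known File namespace prefix for a language.
// Falls back to the canonical "File" prefix when the language is unknown.
function filePrefixRegex(lang) {
    const entry = siteData[lang];
    const names = entry ? entry.fileNamespaces : ['File'];
    return new RegExp(`^(?:${names.map(escapeRegex).join('|')}):`, 'i');
}

// True if the title starts with a File namespace prefix for that language.
function hasFilePrefix(lang, title) {
    return filePrefixRegex(lang).test(title);
}
```

With this shape, `hasFilePrefix('de', 'Datei:De-Berlin.ogg')` would match, while the current English-only check would not. Language variants would need extra handling on top of this lookup.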

Event Timeline

@Mholloway to look into fleshing out a solution for this before moving it to the backlog

Namespaces for a wiki are available in siteinfo, e.g., https://de.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=namespaces&formatversion=2. If the endpoint is already getting siteinfo, this should be easy to add. Alternatively, we could run a script to fetch the names we need and load them into memory from a file at startup.
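As a rough sketch of the siteinfo approach, the helper below pulls the File namespace names (namespace ID 6) out of an already-fetched siteinfo response. The function name `fileNamespaceNames` is hypothetical, and the sample response is abbreviated and hand-written for illustration. It also reads `namespacealiases`, which assumes the request is extended to `siprop=namespaces|namespacealiases`; the URL above requests only `namespaces`.

```javascript
'use strict';

const FILE_NS_ID = 6; // The File namespace has ID 6 on MediaWiki wikis.

// Collect the local name, canonical name, and aliases of the File namespace
// from a siteinfo response (formatversion=2). Missing parts are skipped.
function fileNamespaceNames(siteinfo) {
    const query = (siteinfo && siteinfo.query) || {};
    const ns = (query.namespaces || {})[FILE_NS_ID] || {};
    const aliases = (query.namespacealiases || [])
        .filter((a) => a.id === FILE_NS_ID)
        .map((a) => a.alias);
    // Remove empty values and duplicates while preserving order.
    return [...new Set([ns.name, ns.canonical, ...aliases].filter(Boolean))];
}

// Abbreviated, illustrative response for a German-language wiki.
const sampleSiteinfo = {
    query: {
        namespaces: {
            0: { id: 0, name: '' },
            6: { id: 6, name: 'Datei', canonical: 'File' }
        },
        namespacealiases: [
            { id: 4, alias: 'WP' },
            { id: 6, alias: 'Bild' },
            { id: 6, alias: 'Image' }
        ]
    }
};

const names = fileNamespaceNames(sampleSiteinfo);
```

The same function could back either option: called on the siteinfo the endpoint already fetches, or run once in a script whose output is written to a file and loaded at startup.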

Aklapper lowered the priority of this task from Medium to Low. Jul 24 2023, 1:37 PM