To avoid duplication in storage and cache fragmentation, we've recently enabled title normalisation in RESTBase. The normalisation mirrors the process used in MediaWiki. Incoming requests with non-normalised titles are redirected to the normalised version, which adds latency for clients.
A normalised title uses the canonical localised namespace name and the title in dbKey format, with underscores instead of spaces. More details in T127144 and https://github.com/wikimedia/mediawiki-title
However, looking at the logs, the Android app and the mobile content service sometimes use non-normalised titles. Here's the list of what I could find:
- Use of a non-localised namespace. Example: Wikipedia:Página_principal is normalised to Wikipédia:Página_principal (note the accent on the e). Seen on pt.wikipedia.org, user agent: WikipediaApp/2.1.142-beta-2016-03-07 (Android 4.4.4; Phone)
- Use of fragments in the URI. Example: List_of_minor_planets:_257001–258000#201 (note the #201 part). Although this is completely legal, the fragment makes no difference to RESTBase: the returned content is the same with or without it, so it's better to strip it out right away. User agent: WMF Mobile Content Service
- Untrimmed whitespace. Example: WSXGA_ -> WSXGA. User agent: WMF Mobile Content Service
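For illustration, the three cases above can be sketched as a single function. This is a simplified sketch, not the mediawiki-title algorithm: `normalise_title`, `redirect_target` and the `LOCAL_NAMESPACES` table are hypothetical names, and the table holds only the one namespace mapping taken from the pt.wikipedia.org example. Real namespace data would have to come from the wiki's site info.

```python
from typing import Optional

# Hypothetical lookup table: domain -> {lower-cased namespace alias: localised name}.
# Only the pt.wikipedia.org example from the list above is included.
LOCAL_NAMESPACES = {
    "pt.wikipedia.org": {"wikipedia": "Wikipédia"},
}


def normalise_title(title: str, domain: str) -> str:
    """Simplified illustration of the three fixes listed above."""
    # Case 2: drop the fragment; it does not change the returned content.
    title = title.split("#", 1)[0]
    # Case 3: dbKey form uses underscores for spaces; trim them at both ends.
    title = title.replace(" ", "_").strip("_")
    # Case 1: swap a known namespace alias for its localised name.
    namespace, sep, rest = title.partition(":")
    local = LOCAL_NAMESPACES.get(domain, {}).get(namespace.lower())
    if sep and local:
        title = f"{local}:{rest}"
    return title


def redirect_target(title: str, domain: str) -> Optional[str]:
    """Return the normalised title if a redirect is needed, else None."""
    normalised = normalise_title(title, domain)
    return normalised if normalised != title else None
```

A client that applies the same steps before building the request URL would hit the normalised title directly and avoid the extra round trip.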
I will monitor the logs for some time and will possibly append to this list. However, this is not a major problem: the rate of incorrect requests is fairly low, and even an erroneous request only costs an extra redirect.
Issues #2 and #3 (fragments and untrimmed whitespace) will get resolved automatically when we switch on actually redirecting in RESTBase, as backend services will then only get pre-normalised titles. For #1 (the non-localised namespace), however, it is still not clear where the Android app gets this title from.