
getFrequentLanguageList: expected behavior for 'redirected' languages
Open, Normal, Public

Description

The ULS method getFrequentLanguageList() uses the getAutonym() method in the step that is meant to "make flat, make unique, and ignore unknown/unsupported languages" in the result it returns, presumably trusting that getAutonym() returns the language code itself if the language "is unknown/unsupported".

Codes of 'redirected' languages, e.g. 'fil' redirecting to 'tl', yield the autonym of the language they redirect to when getAutonym() is called.

Consequently, codes for redirected languages are not filtered out by getFrequentLanguageList(), so its return value may contain e.g. both 'fil' and 'tl'.
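To make the behaviour described above concrete, here is a minimal, self-contained sketch. It is not the ULS implementation: the AUTONYMS and REDIRECTS tables, the sample codes, and the frequentLanguages() helper are all hypothetical stand-ins, and getAutonym() here only imitates the behaviour reported in this ticket.

```javascript
// Hypothetical stand-ins for the language data; NOT the real ULS tables.
const AUTONYMS = new Map([["tl", "Tagalog"], ["de", "Deutsch"], ["en", "English"]]);
const REDIRECTS = new Map([["fil", "tl"]]);

// Imitates the reported behaviour: a redirected code resolves to the autonym
// of its target, and an unknown code is returned unchanged.
function getAutonym(code) {
  const target = REDIRECTS.get(code) || code;
  return AUTONYMS.get(target) || code;
}

// Flatten the sources, keep each code once, and drop codes whose "autonym"
// is just the code itself (i.e. unknown/unsupported codes).
function frequentLanguages(sources) {
  const seen = new Set();
  return sources.flat().filter((code) => {
    if (seen.has(code) || getAutonym(code) === code) {
      return false;
    }
    seen.add(code);
    return true;
  });
}

const result = frequentLanguages([["fil", "de"], ["tl", "xx-unknown", "de"]]);
console.log(result); // [ 'fil', 'de', 'tl' ]
```

The unknown code and the duplicate 'de' are dropped, but 'fil' passes the autonym check because its autonym differs from its code, so it ends up in the result alongside 'tl'.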

Is this expected behavior?

This was found as part of the research for T217770

Event Timeline

Restricted Application added a subscriber: Aklapper. May 8 2019, 12:05 PM

@Nikerabbit I'm uncertain what the correct process is for getting clarification on ULS topics, so I'm taking the liberty of bringing this ticket to your attention. Could you shed some light on this, please, or kindly point me to the right person to ask?

It's tricky. On one hand we want to preserve the original language codes to not mess up expectations. On the other hand the current behavior is not wanted either.

I don't know the answer right now. I'll return hopefully next week with more thoughts.

@Nikerabbit Sorry to poke you as an individual again (please point me to the process, if possible): is there any update on this? Please note that this ticket is not a change request but an inquiry about the expected behavior. In theory, a boolean answer would suffice for us to decide whether we have to compensate for this output or whether it can/will eventually be changed upstream.

Some archaeology in the ULS code bases has led us to https://phabricator.wikimedia.org/T51847 and the related code change https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/UniversalLanguageSelector/+/69613/.
Based on our interpretation of that bug report, and in particular the behaviour of the new code in https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/UniversalLanguageSelector/+/69613/7/resources/js/ext.uls.init.js@123 (implementation details changed later, but the behaviour does not seem to have changed), we believe we have established that getFrequentLanguageList should include the fil language code if it is provided by any of the numerous data sources the method considers.
In cases where both fil and tl are provided by the sources of language codes, both fil and tl should be included in the result of getFrequentLanguageList.
It seems, then, that filtering out language codes which are redirects should be the responsibility of the code using getFrequentLanguageList.
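If the filtering is indeed left to the caller, it could look roughly like the sketch below. Everything in it is an assumption for illustration: the REDIRECT_TARGETS table and the dropRedirects() and canonicalize() helpers are hypothetical names and are not part of ULS.

```javascript
// Hypothetical redirect table; a real caller would take this from the
// language data it already uses.
const REDIRECT_TARGETS = new Map([["fil", "tl"]]);

// Option A: drop every code that is a redirect.
function dropRedirects(codes) {
  return codes.filter((code) => !REDIRECT_TARGETS.has(code));
}

// Option B: replace each redirect with its target, then deduplicate while
// keeping the order of first occurrence.
function canonicalize(codes) {
  const resolved = codes.map((code) => REDIRECT_TARGETS.get(code) || code);
  return [...new Set(resolved)];
}

console.log(dropRedirects(["fil", "de", "tl"])); // [ 'de', 'tl' ]
console.log(canonicalize(["fil", "de", "tl"])); // [ 'tl', 'de' ]
console.log(dropRedirects(["fil", "de"])); // [ 'de' ]
console.log(canonicalize(["fil", "de"])); // [ 'tl', 'de' ]
```

The two options differ when only the redirect code is present: dropping 'fil' loses the language entirely, while canonicalizing keeps it under 'tl'. Which one is appropriate depends on what the caller expects.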

@Nikerabbit and other experts from the UniversalLanguageSelector team, could you please confirm that the above description is not off?

BTW @Arrbee, in this case we've been asking @Nikerabbit personally, which, admittedly, is a rather hostile approach. What would be your team's preferred way to submit questions like this one? I've failed to find this piece of information on your team's wiki page: https://www.mediawiki.org/wiki/Wikimedia_Language_engineering.
Again, I'd like to underline that we have not claimed there is a bug. It has not been clear what the expected behaviour is, and the related documentation, including the tests documenting the code, has not provided a clear answer.

BTW @Arrbee, in this case we've been asking @Nikerabbit personally, which, admittedly, is a rather hostile approach. What would be your team's preferred way to submit questions like this one? I've failed to find this piece of information on your team's wiki page: https://www.mediawiki.org/wiki/Wikimedia_Language_engineering.

Tagging the Language team along with the project code is a good start to get our attention. However, when we can help will depend on our work schedule. From what I have read so far, this query came up as part of your research. Can you let us know how urgent this request is?

Again, I'd like to underline that we have not claimed there is a bug. It has not been clear what the expected behaviour is, and the related documentation, including the tests documenting the code, has not provided a clear answer.

@Pginer-WMF @Amire80 - could you please check if we can help here in any way?

Many thanks for coming back to us, @Arrbee! Apologies for the late response from my end.

Tagging the Language team along with the project code is a good start to get our attention. However, when we can help will depend on our work schedule.

Thanks, we will exercise this next time we come across questions in the domain of your team.

From what I have read so far, this query came up as part of your research. Can you let us know how urgent this request is?

Given the discovery we made by investigating the code and its history, as mentioned in one of the comments above, the request here is of medium urgency.
In our current work at WMDE we are no longer blocked on this question. It still seems useful for current and future ULS users to have more clarity on the intended behaviour of this method, so we would certainly appreciate you having a look into it.