We have a rough diagram of the imbalances in translation flow between languages:
https://commons.wikimedia.org/wiki/File:WikipediaTranslationAssistant_translation-directions_22-07.png
Using these resources:
- the public Content Translation data source
- the example R code and rendered Sankey diagram
- an open source programming language like R or Python (and the relevant libraries you need for the task)
Please do this:
- Refine and extend the illustration to bring out additional details. For example, leave out English to see the relationships between the remaining languages, or create a diagram of a smaller subset of languages that shows intriguing imbalances or balances.
- Experiment with other types of diagram, for example a scatterplot of languages with x = translations from and y = translations to, or diagrams based on the ratio of translations from to translations to.
- Harder task: Try making a scatterplot of each language's translation ratio against its wiki's article count. Note: There might not be a convenient data source for this.
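
As a starting point for the tasks above, here is a minimal Python sketch of the per-language aggregation that a scatterplot or ratio diagram would need. It assumes the Content Translation data has already been reduced to (source, target, count) triples; the real data source may be shaped differently. The names `SAMPLE_FLOWS`, `language_totals` and `from_to_ratio` are hypothetical, and the counts are made-up placeholders, not real figures.

```python
from collections import defaultdict

# Hypothetical toy counts, NOT real Content Translation figures:
# (source language, target language, number of translations).
SAMPLE_FLOWS = [
    ("en", "es", 100),
    ("en", "fr", 80),
    ("es", "en", 20),
    ("es", "ca", 50),
    ("fr", "es", 10),
    ("ca", "es", 5),
]


def language_totals(flows, exclude=()):
    """Sum translations from and to each language.

    Any pair whose source or target is in `exclude` is skipped,
    e.g. exclude={"en"} to look only at non-English relationships.
    """
    totals = defaultdict(lambda: {"from": 0, "to": 0})
    for source, target, count in flows:
        if source in exclude or target in exclude:
            continue
        totals[source]["from"] += count
        totals[target]["to"] += count
    return dict(totals)


def from_to_ratio(totals):
    """Return translations-from divided by translations-to per language.

    The ratio is None when nothing was translated into the language,
    so that the division by zero is visible rather than hidden.
    """
    return {
        lang: (t["from"] / t["to"] if t["to"] else None)
        for lang, t in totals.items()
    }


if __name__ == "__main__":
    totals = language_totals(SAMPLE_FLOWS, exclude={"en"})
    ratios = from_to_ratio(totals)
    for lang in sorted(totals):
        print(lang, totals[lang]["from"], totals[lang]["to"], ratios[lang])
```

The "from" and "to" totals map directly onto the x and y axes of the suggested scatterplot, and the ratio can be plotted against article counts once a source for those is found. Plotting itself is left to whichever library you prefer.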










