Comments from Trello:
Translate the Python script prototype to Spark/Scala to speed up computation, and use the GraphX library to compute connected components
Approach 1: Connected Components
Concept Graph: G(V, E)
V: (project, article) pairs and Wikidata items
E: Wikidata links, redirects, interlanguage links
Find connected components in G that have an article in the source language but not in the target language.
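A minimal Python sketch of Approach 1, in the spirit of the prototype rather than the Spark/GraphX port. It assumes vertices are (project, title) tuples, that Wikidata items are stored under a hypothetical project name "wikidata", and that a simple union-find stands in for the GraphX connected-components step. All function names and the sample data are illustrative, not taken from the actual script.

```python
from collections import defaultdict


def find(parent, node):
    """Return the root of node's set, compressing the path on the way."""
    parent.setdefault(node, node)
    root = node
    while parent[root] != root:
        root = parent[root]
    # Second pass: point every node on the path directly at the root.
    while parent[node] != root:
        parent[node], node = root, parent[node]
    return root


def connected_components(vertices, edges):
    """Group vertices into connected components using union-find."""
    parent = {}
    for v in vertices:
        find(parent, v)  # registers isolated vertices too
    for a, b in edges:
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            parent[root_a] = root_b
    groups = defaultdict(set)
    for v in parent:
        groups[find(parent, v)].add(v)
    return list(groups.values())


def missing_in_target(components, source, target):
    """Keep components with a source-project article and no target-project one."""
    result = []
    for comp in components:
        projects = {project for project, _ in comp}
        if source in projects and target not in projects:
            result.append(comp)
    return result


# Hypothetical sample data: Q1 has articles in both wikis, Q2 only in enwiki.
vertices = [
    ("wikidata", "Q1"), ("enwiki", "Universe"), ("frwiki", "Univers"),
    ("wikidata", "Q2"), ("enwiki", "Earth"), ("enwiki", "Planet Earth"),
]
edges = [
    (("enwiki", "Universe"), ("wikidata", "Q1")),      # Wikidata link
    (("frwiki", "Univers"), ("wikidata", "Q1")),       # Wikidata link
    (("enwiki", "Earth"), ("wikidata", "Q2")),         # Wikidata link
    (("enwiki", "Planet Earth"), ("enwiki", "Earth")), # redirect
]

missing = missing_in_target(
    connected_components(vertices, edges), source="enwiki", target="frwiki"
)
```

Here `missing` holds only the component around Q2, since the Q1 component already contains a frwiki article. In the Spark/Scala version the union-find step would presumably be replaced by GraphX's own connected-components computation, with the same per-component filter applied afterwards.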