- The {maptpx} LDA for topic modeling needs to be replaced in WDCM.
- Main reason: the {maptpx} R package has not been actively maintained for some time.
- The current alternative under consideration: the MALLET LDA implementation in the {SpeedReader} R package.
Also:
- our results indicate that neither perplexity-based nor Bayes factor (BF) based model selection yields human-interpretable topics of Wikidata items (attempts to fix this are tracked in T203238);
- coherence measures will be introduced in model selection; as we change the algorithm we run, the time is right to implement this step too.
If the migration to {SpeedReader} fails for any reason, we either stick with {maptpx} and introduce coherence measures there, or migrate to Spark's MLlib implementation of LDA.
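For reference, a minimal sketch of what coherence-based model selection could look like, independent of which LDA implementation is used. It assumes the UMass coherence measure (one of several possible coherence measures; the task does not specify which one will be adopted) and assumes each model is summarized by the top terms of its topics. The function names `umass_coherence` and `select_k` and the input layout are hypothetical, written in Python only for illustration.

```python
import math


def umass_coherence(top_terms, documents):
    """UMass coherence of one topic (assumed measure, not fixed by the task).

    top_terms: the topic's terms, ordered by decreasing probability.
    documents: iterable of token collections, one per document.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*terms):
        # Number of documents containing all of the given terms.
        return sum(1 for d in doc_sets if all(t in d for t in terms))

    score = 0.0
    for i in range(1, len(top_terms)):
        for j in range(i):
            d_j = doc_freq(top_terms[j])
            if d_j == 0:
                continue  # Term never occurs; skip to avoid division by zero.
            d_ij = doc_freq(top_terms[i], top_terms[j])
            score += math.log((d_ij + 1) / d_j)
    return score


def select_k(models, documents):
    """Pick the number of topics whose model has the highest mean coherence.

    models: dict mapping K to a list of topics, each a list of top terms.
    """
    def mean_coherence(topics):
        return sum(umass_coherence(t, documents) for t in topics) / len(topics)

    return max(models, key=lambda k: mean_coherence(models[k]))
```

Scores are at most slightly above zero and become more negative as a topic's top terms co-occur less often, so the candidate with the highest mean score would be preferred. Whether the mean over topics is the right aggregate for WDCM is an open choice.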