We want to create an Oozie job that generates normalized article scores in Hadoop.
- [[ https://gerrit.wikimedia.org/r/admin/projects/research/article-recommender | PySpark scripts ]]
- [[ https://meta.wikimedia.org/wiki/Research:Technology/Article-Recommendation-Pipeline-Overview | Production Pipeline overview ]]
- [ ] Take the PySpark scripts that work on stat1007 and turn them into an Oozie job.
- [ ] Set up the job to generate new recommendations quarterly, or later, once new Wikidata dumps are available.
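
The scheduling rule in the last item could be sketched roughly as below. This is only an illustration of the intended condition, assuming it means "run once at least a quarter has passed since the last run and a Wikidata dump newer than the one used last time exists". The names `add_months`, `should_run`, `last_dump_used` and `latest_dump` are hypothetical and are not taken from the article-recommender repository. In Oozie itself this would more likely be expressed as a coordinator frequency plus a dataset dependency than as Python code.

```python
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Return the first day of the month that is `months` months after d's month."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def should_run(
    today: date,
    last_run: date,
    last_dump_used: date,
    latest_dump: Optional[date],
) -> bool:
    """Hypothetical trigger check for the quarterly job.

    Assumption: the job runs only when both conditions hold:
      1. at least three calendar months have started since the last run, and
      2. a Wikidata dump newer than the one used by the last run is available.
    """
    quarter_elapsed = today >= add_months(last_run, 3)
    new_dump_available = latest_dump is not None and latest_dump > last_dump_used
    return quarter_elapsed and new_dump_available
```

If the quarter has elapsed but no newer dump has been published, `should_run` returns `False`, so the run is deferred until the dump appears. That matches the "or later" part of the requirement.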