We want to create an Oozie job that generates article normalized scores in Hadoop.
A/C
- Take the PySpark scripts that work on stat1007 and turn them into an Oozie job.
- Set up the job to generate new recommendations quarterly or later when new Wikidata dumps are available. Waiting on T209655: Copy Wikidata dumps to HDFS + parquet.