In T210844: Generate article recommendations in Hadoop for use in production, we'll be generating TSV files that live in HDFS. We want to import these TSV files into the m2-master database.
Analytics is working on a solution for moving data from the analytics cluster to the production servers in T213976: Workflow to be able to move data files computed in jobs from analytics cluster to production.
We have import scripts that take TSVs and import them into MySQL. Where should these scripts be executed from?
Summary of discussion so far
How to import article recommendation scores to MySQL from Hadoop?
- Transfer data to a production node that has access to the MySQL database and run the import script ...
- Write directly from Hadoop to MySQL. SRE, what are the issues here?
- Running import scripts from stats machines or database host machines has been ruled out.
- Decide where and how to execute the import scripts.
- Implement the agreed-upon solution.
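
For reference while discussing where the scripts should run, here is a minimal sketch of the kind of TSV-to-MySQL import step this task is about. It is not the existing import script. The table name, the column names, and the batch size are hypothetical, since the real schema is not described here. The `%s` placeholder is the parameter style used by common Python MySQL drivers; the connection object is assumed to be any DB-API 2.0 connection.

```python
import csv
from itertools import islice

# Hypothetical table and column names; the real schema is not given in this task.
TABLE = "article_recommendation"
COLUMNS = ("wikidata_id", "score")


def read_batches(lines, batch_size=1000):
    """Yield lists of row tuples parsed from an iterable of tab-separated lines."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    while True:
        batch = [tuple(row) for row in islice(reader, batch_size)]
        if not batch:
            return
        yield batch


def build_insert(table=TABLE, columns=COLUMNS, placeholder="%s"):
    """Build a parameterized INSERT statement for a DB-API driver."""
    cols = ", ".join(columns)
    marks = ", ".join([placeholder] * len(columns))
    return f"INSERT INTO {table} ({cols}) VALUES ({marks})"


def import_tsv(lines, connection, batch_size=1000):
    """Insert all rows through a DB-API 2.0 connection, committing once per batch."""
    sql = build_insert()
    cursor = connection.cursor()
    total = 0
    for batch in read_batches(lines, batch_size):
        cursor.executemany(sql, batch)
        connection.commit()
        total += len(batch)
    return total
```

Wherever the script ends up running, it only needs read access to the TSV file and a connection to the database, for example `import_tsv(open(path, newline=""), connection)`. Batching keeps each transaction small, which may matter if the import runs against a shared database host.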