The transfer-to-ES script runs in Hadoop, takes the scores we calculate there, such as the popularity score or page rank, and ships them over to the Elasticsearch cluster. The script currently skips error handling; we need to decide what handling is appropriate.
Description
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | EBernhardson | T121055 Deploy oozie jobs to calculate popularity score and ship it to elasticsearch
Declined | | None | T121056 Decide on proper error handling for the transferToES.py job
Event Timeline
Update on this: the current script does detect and count both failed transfers and failed documents, but it does nothing with those counts (no retries, etc.). We may want to improve it.
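For discussion, here is a rough sketch of what retrying on top of those counters could look like. It is not taken from transferToES.py: `send_batch`, `TransferStats`, and `send_with_retries` are hypothetical names, and the sketch assumes the sender either raises `IOError` when a whole request fails or returns the documents the cluster rejected.

```python
import time
from dataclasses import dataclass


@dataclass
class TransferStats:
    failed_transfers: int = 0   # whole requests that raised
    failed_documents: int = 0   # documents still rejected after all attempts
    retries: int = 0            # number of retry attempts made


def send_with_retries(send_batch, docs, stats, max_attempts=3,
                      base_delay=1.0, sleep=time.sleep):
    """Send docs, retrying failures with exponential backoff.

    send_batch(docs) is a hypothetical callable: it returns the list of
    documents that were rejected, or raises IOError if the whole request
    failed. Returns the documents that could not be delivered.
    """
    pending = list(docs)
    if not pending:
        return []
    for attempt in range(max_attempts):
        if attempt > 0:
            stats.retries += 1
            # Wait base_delay, then 2x, 4x, ... before each retry.
            sleep(base_delay * 2 ** (attempt - 1))
        try:
            pending = send_batch(pending)
        except IOError:
            # The whole request failed; keep the same batch for the next try.
            stats.failed_transfers += 1
            continue
        if not pending:
            return []
    # Out of attempts: whatever is left counts as failed documents.
    stats.failed_documents += len(pending)
    return pending
```

Only the rejected documents are resent on each attempt, so a single bad document does not force the whole batch to be transferred again. Whether three attempts and a one-second base delay are sensible depends on the failure stats from a real run.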
The first full import is currently running. I think we can decide how much effort to put into this based on the stats it collects. Perhaps we can just punt on this until it actually becomes an issue.
@EBernhardson started work on this, but had to put it back into the backlog due to other work being prioritised, so I'm unassigning him. It might be worth pinging him if someone picks this up.