
Decide on proper error handling for the transferToES.py job
Closed, Declined · Public

Description

The transferToES.py script runs in Hadoop. It takes the scores we calculate there, such as the popularity score or page rank, and ships them to the Elasticsearch cluster. The script currently has no error handling, so we need to decide what handling would be appropriate.

Event Timeline

EBernhardson raised the priority of this task from to Needs Triage.
EBernhardson updated the task description.
EBernhardson subscribed.
Restricted Application added subscribers: StudiesWorld, Aklapper.

Update on this: the current script does detect and count both failed transfers and failed documents, but it does not act on those counts (no retries, etc.). We may want to improve that.
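For illustration, one possible shape for such an improvement is a bounded retry with exponential backoff that still keeps the two existing counters. This is only a sketch and not the code in transferToES.py: `send_batch`, `TransferStats`, `send_with_retries`, and the retry parameters are all hypothetical names, and it assumes the sender returns the documents that were rejected and raises `IOError` when the whole request fails.

```python
import time
from dataclasses import dataclass


@dataclass
class TransferStats:
    """Hypothetical counters mirroring the two failure kinds named above."""
    failed_transfers: int = 0   # whole requests that raised an error
    failed_documents: int = 0   # documents still rejected after all attempts


def send_with_retries(batch, send_batch, stats,
                      max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Send a batch, retrying only the documents that have not succeeded yet.

    `send_batch` is an assumed callable: it takes a list of documents,
    returns the documents that were rejected, and raises IOError if the
    whole request failed.
    """
    pending = list(batch)
    for attempt in range(max_attempts):
        if not pending:
            break
        try:
            # Keep only the documents the cluster rejected this time.
            pending = list(send_batch(pending))
        except IOError:
            # The whole request failed; every pending document is retried.
            stats.failed_transfers += 1
        if pending and attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    # Whatever is left after the last attempt counts as failed.
    stats.failed_documents += len(pending)
    return pending
```

The returned list holds the documents that never succeeded, so a caller could log them or write them out for a later run instead of only counting them.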

Deskana subscribed.

The first full import is currently running. I think we can decide how much effort to put into this based on the stats it collects. Perhaps we can just punt on this until it actually becomes an issue.

@EBernhardson started work on this, but had to put it back into the backlog due to other work being prioritised, so I'm unassigning them. It might be worth pinging him if someone picks this up.

We decided this is not really worth the effort. Marking as declined.