Oozie job for merging click data with DBN relevance scores
Closed, Duplicate · Public

Description

Result table should include:

  • wiki the search was performed on
  • normalized search query
  • page id of the result
  • relevance label
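
A minimal sketch of the join that could produce such a table, in plain Python rather than Spark so it runs standalone. The field names (`wiki`, `norm_query`, `page_id`, `relevance`), the helper `merge_clicks_with_relevance`, and the sample rows are all hypothetical and are not taken from MjoLniR; it also assumes inner-join semantics, which the task does not specify.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical join key: (wiki, normalized query, page id of the result).
Key = Tuple[str, str, int]


def merge_clicks_with_relevance(
    clicks: Iterable[dict],
    relevance: Dict[Key, float],
) -> List[dict]:
    """Emit one row per distinct (wiki, query, page) that has a relevance label.

    Click rows with no matching label are dropped (assumed inner join).
    """
    seen = set()
    rows = []
    for click in clicks:
        key = (click["wiki"], click["norm_query"], click["page_id"])
        if key in seen or key not in relevance:
            continue
        seen.add(key)
        rows.append({
            "wiki": key[0],
            "norm_query": key[1],
            "page_id": key[2],
            "relevance": relevance[key],
        })
    return rows


# Made-up sample data, for illustration only.
clicks = [
    {"wiki": "enwiki", "norm_query": "thor hammer", "page_id": 42, "clicked": True},
    {"wiki": "enwiki", "norm_query": "thor hammer", "page_id": 42, "clicked": False},
    {"wiki": "enwiki", "norm_query": "thor hammer", "page_id": 7, "clicked": False},
]
relevance = {("enwiki", "thor hammer", 42): 0.8}
result = merge_clicks_with_relevance(clicks, relevance)
```

Here the duplicate click row collapses into a single output row, and page 7 is dropped because it has no label.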

Event Timeline

Perhaps this should simply be the output of the DBN job (T162056)? I'm not sure, but on review it seems we have quite a few intermediate data steps that might not be necessary. On the other hand, we likely want some intermediate steps, so that if one of them has a problem we only have to rerun the pipeline from that step forward rather than from the beginning.
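
To illustrate the trade-off, here is a rough sketch of a pipeline that persists each intermediate output and skips completed steps on rerun. The step names (`sessions`, `dbn`, `merge`), the `run_pipeline` helper, and the file layout are hypothetical; this is not how MjoLniR or Oozie implement it, only the general idea.

```python
import os
import tempfile
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[str], str]]


def run_pipeline(steps: List[Step], workdir: str, seed: str) -> List[str]:
    """Run steps in order, saving each output; return the steps actually executed."""
    executed = []
    data = seed
    for name, func in steps:
        path = os.path.join(workdir, name + ".out")
        if os.path.exists(path):
            # Step already completed on an earlier run: reuse its stored output.
            with open(path) as f:
                data = f.read()
            continue
        data = func(data)
        with open(path, "w") as f:
            f.write(data)
        executed.append(name)
    return executed


attempts = {"dbn": 0}


def flaky_dbn(data: str) -> str:
    """Hypothetical step that fails on its first attempt only."""
    attempts["dbn"] += 1
    if attempts["dbn"] == 1:
        raise RuntimeError("simulated failure")
    return data + "|dbn"


steps = [
    ("sessions", lambda d: d + "|sessions"),
    ("dbn", flaky_dbn),
    ("merge", lambda d: d + "|merge"),
]

workdir = tempfile.mkdtemp()
try:
    run_pipeline(steps, workdir, "clicks")  # fails at the "dbn" step
except RuntimeError:
    pass
rerun = run_pipeline(steps, workdir, "clicks")  # resumes from "dbn"
```

On the second run the `sessions` step is skipped because its output already exists, so only the failed step and the steps after it are executed.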

Change 347038 had a related patch set uploaded (by EBernhardson):
[search/MjoLniR@master] Add DBN training

https://gerrit.wikimedia.org/r/347038

The attached patch is only half the work, though: it adds the DBN training to MjoLniR, but it doesn't set up the Oozie half of the pipeline. I wonder if we should have separate tasks for these, as the Oozie pipelines will only make sense once most of the initial code in MjoLniR is ready to run a complete pipeline.

Change 347038 merged by Tjones:
[search/MjoLniR@master] Add DBN training

https://gerrit.wikimedia.org/r/347038