Understand TeaHouse desires of Newcomer-Quality predictions
Closed, Resolved, Public

Authored By

	notconfusing
	Oct 30 2018, 10:28 PM

Description

What does TeaHouse want in terms of AI metrics? What should such an AI optimize for? For instance, is there a maximum false-positive rate that TeaHouse participants would tolerate? Would 1% or 5% of invitees being bad-faith be acceptable?
In terms of invitee inclusion criteria, we currently predict for any newcomer who has made at least 1 edit. Would that be acceptable, given that HostBot currently relies on other heuristic criteria?
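To make the tolerance question concrete, here is a minimal sketch of what 1% and 5% would mean per batch. It assumes a batch of 300 invitees (the top-300 figure from the experiment notes in this task) and a hypothetical list of hand-labeled invitees; the function names are illustrative only and not part of HostBot. Note that "share of invitees who are bad-faith" is strictly 1 minus precision rather than the textbook false-positive rate.

```python
def max_bad_faith_invitees(batch_size, tolerated_percent):
    """Largest count of bad-faith invitees that stays within the tolerated percent.

    Integer arithmetic is used to avoid floating-point rounding surprises.
    """
    return batch_size * tolerated_percent // 100


def bad_faith_share(invited_is_bad_faith):
    """Fraction of invitees labeled bad-faith (True = bad-faith).

    `invited_is_bad_faith` is a hypothetical list of hand-labeled booleans,
    one per invitee.
    """
    if not invited_is_bad_faith:
        raise ValueError("no invitees to evaluate")
    return sum(invited_is_bad_faith) / len(invited_is_bad_faith)


if __name__ == "__main__":
    for percent in (1, 5):
        allowed = max_bad_faith_invitees(300, percent)
        print(f"{percent}% of 300 invitees = at most {allowed} bad-faith invitees")
```

Under these assumptions, a 1% tolerance allows at most 3 bad-faith invitees in a batch of 300, and 5% allows at most 15.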

Jmo and I chatted at CSCW 2018 about deploying a session-based newcomer-quality ML model (NCM) for use in TeaHouse.
Jmo sounded encouraged by early results and was willing to try using NCM in HostBot instead of the current heuristics in HostBot1 (HB1), but there are a few obstacles to overcome:

Community Discussion and consent
- Need to get consent from the TeaHouse community
    - TODO: Max to post to the TeaHouse discussion about the possibility of using NCM, and to push for conducting an experiment. The experiment would be blind, so that hosts would not know whether NCM or HB1 was producing the recommendations.
- TODO: The post should include an example list of newcomer predictions for hosts to sanity-check. Potentially ask for input on experiment details.
Experiment details
- A/B test between (A) current heuristics and (B) NCM
  - Day-randomized or alternating-day assignment.
  - Outcome variable is user retention (HostBot's previously used metric).
  - TODO: Find the exact measurement and determine the statistical test.
  - NCM to subset based on current heuristics (not blocked, no level 4 warnings),
  - but importantly NOT the minimum of 5 edits; it would use a minimum of 1 session instead.
- HB1 takes the top 300 by its heuristics to prevent crowding.
  - NCM to mimic the top-300 recommendation.
  - Train to maximize precision at k, with k=300.
- Duration: approximately 4 weeks (maybe do a power analysis).
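A rough sketch of how the NCM arm could mimic the top-300 recommendation and how precision at k would be scored. The `Newcomer` fields, the `score` attribute, and the function names are all hypothetical stand-ins, not HostBot's or the model's actual interface; only the filters (not blocked, no level 4 warnings, minimum 1 session) and k=300 come from the notes above.

```python
from dataclasses import dataclass


@dataclass
class Newcomer:
    name: str
    score: float          # hypothetical NCM score; higher = more likely good-faith
    blocked: bool
    level4_warning: bool
    sessions: int


def eligible(n):
    """Filters from the notes: not blocked, no level 4 warnings, at least 1 session."""
    return not n.blocked and not n.level4_warning and n.sessions >= 1


def top_k(newcomers, k=300):
    """Return the k highest-scoring eligible newcomers, best first."""
    pool = [n for n in newcomers if eligible(n)]
    return sorted(pool, key=lambda n: n.score, reverse=True)[:k]


def precision_at_k(ranked_is_good, k=300):
    """Share of good newcomers among the first k of a score-ordered label list.

    `ranked_is_good` holds booleans ordered by descending score (True = good).
    Assumption: if fewer than k items exist, divide by the number available
    rather than by k.
    """
    top = ranked_is_good[:k]
    if not top:
        return 0.0
    return sum(top) / len(top)


if __name__ == "__main__":
    candidates = [
        Newcomer("a", 0.9, False, False, 1),
        Newcomer("b", 0.8, True, False, 2),    # blocked, so filtered out
        Newcomer("c", 0.7, False, False, 3),
        Newcomer("d", 0.6, False, True, 1),    # level 4 warning, so filtered out
        Newcomer("e", 0.5, False, False, 0),   # no completed session, so filtered out
    ]
    print([n.name for n in top_k(candidates, k=2)])
    print(precision_at_k([True, True, False, True], k=3))
```

In practice the labels passed to `precision_at_k` would have to come from held-out, hand-labeled newcomers, which this task does not yet specify.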
Further thoughts
- Could software engineering make HostBot run more frequently than the current cron schedule, for instance to detect when a newcomer's first session finishes? Max to be responsible for all engineering.
- May be best presented to the community as a CivilServant project

notconfusing triaged this task as Medium priority. Nov 15 2018, 5:22 PM