
Prototype a bot framework that utilizes newcomerquality
Open, Lowest, Public

Description

Notes to remember

  1. thresholds to use
    1. excess to be in a control group
  2. mobile database to use
  3. design to be API only, so not reliant on DB replicas (more portable)
  4. how to handle login in the prototype phase
  5. new-user check frequency
  6. run a test that logs but does not edit for a week
  7. actions-would-take table (Decisions table)
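A minimal sketch of how the "logs but not edits" test run and the Decisions table could fit together, using an in-memory SQLite table as a stand-in for the real database. The table layout, the `decide` helper, and the `send_invite` callback signature are all assumptions for illustration, not the bot's actual code.

```python
import sqlite3
from datetime import datetime, timezone


def open_decisions_db(path=":memory:"):
    """Open (or create) a hypothetical 'decisions' table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS decisions (
               user_id    INTEGER NOT NULL,
               score      REAL    NOT NULL,
               action     TEXT    NOT NULL,  -- what the bot would do
               executed   INTEGER NOT NULL,  -- 1 only if an edit was made
               decided_at TEXT    NOT NULL
           )"""
    )
    return conn


def decide(conn, user_id, score, threshold, dry_run=True, send_invite=None):
    """Record the action the bot would take; only act when dry_run is False."""
    action = "invite" if score >= threshold else "skip"
    executed = False
    if action == "invite" and not dry_run and send_invite is not None:
        send_invite(user_id)  # hypothetical callback that posts to the wiki
        executed = True
    conn.execute(
        "INSERT INTO decisions VALUES (?, ?, ?, ?, ?)",
        (user_id, score, action, int(executed),
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    return action
```

With `dry_run=True` (the default here) every decision is logged and nothing is posted, so a week of rows can be reviewed before any edits are enabled.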

NB: https://phabricator.wikimedia.org/T211160

Event Timeline

Threshold analysis from: http://localhost:8888/notebooks/determine%20bot%20thresholds.ipynb

# to get 150 per day/ 165 to have some buffer and for control group
dfss.groupby('user_registration_day').agg(lambda df: threshold_to_get_top_n(df,160,'pred_min'))['user_id']

user_registration_day
2018-11-30 Fri   0.884070
2018-12-01 Sat   0.842765
2018-12-02 Sun   0.865676
2018-12-03 Mon   0.906331
2018-12-04 Tue   0.907539
2018-12-05 Wed   0.886038

How did I miss Thursday? OK...Tired

At worst, cut everything at 0.84; otherwise use these per-day 'pred_min' thresholds.

Using min instead of mean is better because it's stricter, and only 30% of users end up going multi-session.
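A small illustration of why min is the stricter aggregate, assuming that 'pred_min' is the minimum over a user's per-session predictions (that reading is an assumption; the task does not define it). The function names and scores are made up.

```python
from statistics import mean


def passes_min(session_scores, threshold):
    """User qualifies only if every session scores at or above the threshold."""
    return min(session_scores) >= threshold


def passes_mean(session_scores, threshold):
    """User qualifies if the average session score reaches the threshold."""
    return mean(session_scores) >= threshold


multi_session = [0.95, 0.70]   # one strong session, one weak session
single_session = [0.90]        # min and mean coincide for one session

results = {
    "multi_min": passes_min(multi_session, 0.80),
    "multi_mean": passes_mean(multi_session, 0.80),
    "single_min": passes_min(single_session, 0.80),
    "single_mean": passes_mean(single_session, 0.80),
}
```

For single-session users the two aggregates agree, so the choice only changes the outcome for the multi-session minority.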

updates:

  • using a local mysql database
  • using a (dev with ssh tunnel)

still todo:

  • exponential backoff for each stage in hostbot_ai.py - https://github.com/litl/backoff
  • wednesday - write send_invite
  • how does host-bot actually post to wiki, can it be done with mwapi?
  • thursday - run with RQ
  • friday - set up on wikidump parse
  • friday - optimize new user query to utilize user-index
  • set up mysql with files on actual disk, not NFS
  • saturday - launch 2-week experiment that runs but only prints to test
  • set up airbrake?
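For the exponential backoff item, the linked backoff library provides decorator-based retries. The standard-library sketch below only shows the underlying idea and is not that library's API; `retry_with_backoff` and its parameters are hypothetical names, and the stages of hostbot_ai.py are not shown here.

```python
import random
import time


def retry_with_backoff(func, max_tries=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep):
    """Call func(), retrying on any exception with exponential backoff.

    The delay cap doubles after each failure (base_delay, 2x, 4x, ...),
    is limited to max_delay, and is jittered to avoid synchronized retries.
    """
    for attempt in range(max_tries):
        try:
            return func()
        except Exception:
            if attempt == max_tries - 1:
                raise  # out of attempts: surface the last error
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))  # "full jitter" wait
```

Each stage (fetching new users, scoring, sending the invite) could be wrapped separately, so a failure in one stage does not repeat the earlier ones. The `sleep` parameter exists so that a test or dry run can pass a stub instead of actually waiting.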

I've only had very general conversations about this project so far, and didn't realize there was working code :-)

@notconfusing Would you mind dropping a link to the bot repo, and letting us know whether it's something you might pursue, or if we should pick up the thread?

@notconfusing, do you have a repo for this that you could link us to? I think we could take it from there if you are busy with other stuff.