Adjust invitations from HostBot to editors whose accounts are at least 3 days old
Closed, Resolved, Public

Description

We've been noticing a lot of mentorships have not been getting off the ground, largely because learners drop off quite soon. Even when they are matched, and their mentors attempt to engage them, they do not respond. For example:
*https://en.wikipedia.org/wiki/User_talk:Harvey1257
*https://en.wikipedia.org/wiki/User_talk:Jackheart314
*https://en.wikipedia.org/wiki/Wikipedia_talk:Co-op/Shiteshsachan
*https://en.wikipedia.org/wiki/User_talk:JamusDoore

In total, I've documented 15 cases where a mentor initiated discussion with a learner, but the learner did not reply. Many of these cases seem to involve editors who are very new to Wikipedia. Conversely, mentorships that have resulted in interactions appear to be with learners who are a little more established:

*https://en.wikipedia.org/wiki/Wikipedia_talk:Co-op/Komchi
*https://en.wikipedia.org/wiki/User_talk:Negative24
*https://en.wikipedia.org/wiki/User_talk:Christopher2625649908
*https://en.wikipedia.org/wiki/User_talk:Acad1989

@Soni and I suspect that mentorship is perhaps better suited for editors who have done a little editing and had their accounts longer than a day. I also think that mentors should not be investing their time in editors who do not appear likely to reply.

For these reasons, we think it would be best to have HostBot restrict invitations to editors whose accounts are at least three days old and who have edited recently (however HostBot currently defines that). I'm open to other suggestions about edit counts, as these were somewhat arbitrary numbers that Soni and I developed, but we do want to better target editors who are likely to benefit from mentorship; many in our current pool do not, due to attrition.

@Halfaker - I'm CCing you on this as well, as I suspect this will reduce the number of invitations we send out, and likely, the number of incoming editors for any studies that are being done with Co-op invitees in mind.

Event Timeline

I_JethroBT assigned this task to Capt_Swing.
I_JethroBT raised the priority of this task to Low.
I_JethroBT updated the task description.
I_JethroBT added a project: Co-op.
Restricted Application added a subscriber: Aklapper. Apr 2 2015, 12:22 AM
I_JethroBT updated the task description. Apr 2 2015, 12:23 AM
I_JethroBT set Security to None.
I_JethroBT updated the task description.
I_JethroBT moved this task from Backlog to Front End on the Co-op board. Apr 2 2015, 12:28 AM
I_JethroBT moved this task from Front End to HostBot on the Co-op board.
Soni added a comment. Apr 2 2015, 12:45 AM

The choice of the "3 days" number is pretty arbitrary for now, but I'm hoping Aaron can weigh in with some drop-off statistics to help decide exactly how old newcomers' accounts have to be before they are best suited for mentorship. (One possible parameter I have in mind is that after a certain time period, the rate of drop-off of newcomers should be marginally lower.)

Alternatively, we could also consider browsing through Snuggle, comparing newcomers based on how old their accounts are, and trying to judge their suitability for the Co-op.

Overall, the idea is that if we send invites reasonably around the right time period for newcomers, the effectiveness of the Co-op for mentorship should be significantly increased.

Halfak added a comment. Apr 2 2015, 1:44 PM

Hey guys. It seems like we have a nice problem here. We'd like to maximize the number of newcomers we can invite to the Co-op, but we'd like the newcomers we invite to be substantially invested.

Right now, the proposal is ambiguous. Would an editor who only made one edit, but has an account that is 3 days old, be invited? Most likely not. It seems like we'd want editors who made at least one edit 3 days after registering. Is that right?

I think we can do better than the proposal. There's some fun modeling work that could be done here, but without going through that much trouble, we can use past research to set some good thresholds (see https://meta.wikimedia.org/wiki/Research:Newcomer_survival_models ). I'm going to assume that we want an ~80% retention rate (20% hazard of churn). We should invite newcomers who have:

  • 4+ edits or
  • 2+ edit sessions or
  • 30+ minutes of time spent editing or
  • at least one edit 1+ hours after registration
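
As a rough illustration of how the four thresholds above combine, here is a minimal Python sketch. It is not HostBot code: the `NewcomerActivity` record, its field names, and the `should_invite` function are hypothetical, and it assumes the per-editor statistics (edit sessions, minutes spent editing) have already been computed elsewhere. The criteria are joined with "or", so meeting any one of them is enough.

```python
from dataclasses import dataclass


@dataclass
class NewcomerActivity:
    """Hypothetical per-editor summary; field names are illustrative only."""
    edit_count: int              # total edits made so far
    edit_sessions: int           # number of distinct editing sessions
    minutes_editing: float       # estimated time spent editing, in minutes
    hours_to_latest_edit: float  # hours between registration and the most recent edit


def should_invite(activity: NewcomerActivity) -> bool:
    """Return True if the newcomer meets at least one of the proposed thresholds."""
    return (
        activity.edit_count >= 4
        or activity.edit_sessions >= 2
        or activity.minutes_editing >= 30
        # "at least one edit 1+ hours after registration" holds exactly when
        # the most recent edit is at least one hour after registration.
        or activity.hours_to_latest_edit >= 1
    )


# One edit, one short session, made 12 minutes after registering: not invited.
print(should_invite(NewcomerActivity(1, 1, 5.0, 0.2)))
# Same editor, but came back the next day for a second session: invited.
print(should_invite(NewcomerActivity(2, 2, 8.0, 26.0)))
```

Because the criteria are alternatives rather than joint requirements, tightening the sample later would mean raising individual thresholds or switching some of the "or" conditions to "and".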

It would be pretty easy for me to build something to calculate this and submit a pull request to HostBot if that sounds acceptable.

Soni added a comment. Apr 2 2015, 2:25 PM

+1

J-Mo, can you please confirm the current criteria HostBot uses for sending invites? Ideally, the new thresholds should be more rigorous than the current ones (the general idea being that we should invite editors who are more invested in Wikipedia than those we invite currently), so we might want to revisit the retention rate percentage, but overall the idea looks solid.

@Halfak Thanks for these more specific suggestions and providing that prior research as a basis for them. I support them. I also agree with @Soni that it would be helpful to know what criteria HostBot is currently using (which, as I recall, are the same as invitees for the Teahouse)

I should also note that the other factor that likely prevents mentorships from taking off is the wait. This has happened with a number of editors, largely because, during profile creation, they select several categories instead of just one, or write in topical interests not specified by the instructions. This prevents HostBot from making a match. For example:

I have had to manually edit these parameters to initiate the match, but I and others can only spend so much time patrolling for these cases.

@I_JethroBT, @Soni I can update the Co-op invites so that they are sent to recently autoconfirmed editors. I have some old code to do this, which can be repurposed. It would take me about a day to set up. Would that work for you?

I've updated the bot, and as soon as we re-start the Co-op invites they will go out to recently autoconfirmed editors.

Capt_Swing closed this task as Resolved. Apr 15 2015, 9:55 PM

@Capt_Swing - Thanks for the update (and sorry I didn't get back to you yesterday). I think this will be an improvement, but @Soni and I were concerned that many editors who are recently autoconfirmed may have made a bunch of edits all at once and then dropped off. We're also a little concerned that a threshold of 10 edits is quite low. Having done some evaluation of our mentorships so far, mentorship appears better suited for editors who are much more invested than this. Many new editors seem to be dropping off right away, even when a mentor contacts them.

We were wondering whether there might be a way for parameters like edit count, account age, and recency of last edit to be easily adjusted in HostBot. I can make another task for this, but I wanted to ask you first whether creating these parameters in HostBot is actually doable.

It's already very configurable. Here's the query that's currently being used: https://github.com/jtmorgan/hostbot/blob/master/hb_queries.py#L35

If there's someone on your team that knows SQL, you can play with the parameters. If you'd like access to the HostBot db, I can make you a member of the project and walk you through the codebase so that you can experiment with different parameters. But if any changes have to go through me, your ability to experiment will be limited by my time, skill, and will as a volunteer.
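
To make the idea of adjustable parameters concrete, here is a minimal Python sketch of an eligibility check driven by a small settings object. This is an assumption-laden illustration, not the query linked above: `InviteCriteria`, `is_eligible`, and all field names are hypothetical. The 3-day and 10-edit defaults come from the numbers discussed in this task, while the 2-day recency window is a placeholder, since "edited recently" is not defined here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class InviteCriteria:
    """Hypothetical tunable thresholds; defaults are illustrative only."""
    min_account_age_days: int = 3      # from the original proposal
    min_edit_count: int = 10           # edit count mentioned for autoconfirmed editors
    max_days_since_last_edit: int = 2  # placeholder for "edited recently"


def is_eligible(registered: datetime, last_edit: datetime, edit_count: int,
                now: datetime, criteria: InviteCriteria = InviteCriteria()) -> bool:
    """Return True only if the editor satisfies every threshold in `criteria`."""
    account_age = now - registered
    since_last_edit = now - last_edit
    return (
        account_age >= timedelta(days=criteria.min_account_age_days)
        and edit_count >= criteria.min_edit_count
        and since_last_edit <= timedelta(days=criteria.max_days_since_last_edit)
    )


now = datetime(2015, 4, 16)
registered = now - timedelta(days=5)
last_edit = now - timedelta(days=1)

# Passes with the default thresholds.
print(is_eligible(registered, last_edit, 12, now))
# Fails once the required edit count is raised to 50.
print(is_eligible(registered, last_edit, 12, now, InviteCriteria(min_edit_count=50)))
```

Keeping the thresholds in one object means an experiment only changes the values passed in, not the selection logic itself.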

BTW, you can also test different 'sampling' strategies via http://quarry.wmflabs.org/.

With the current sampling (autoconfirmed users), there will be about 20 people per day invited to the Co-op. That's taking into account the fact that about 80% of the people who meet the criteria in the query I linked to above will have received a Teahouse invitation already. If you are okay with inviting people who have previously been invited to the Teahouse to the Co-op as well, you can potentially make your sampling criteria more strict (by increasing the required edit count, say) and still probably find 20-50 people per day to invite.
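
The arithmetic behind those figures can be sketched as follows. This is one reading of the numbers above, not an official estimate: if about 20 people per day remain after excluding the roughly 80% who already received a Teahouse invitation, then about 100 people per day meet the criteria in total. The function name is hypothetical.

```python
def estimated_eligible_per_day(invited_per_day: float,
                               share_already_invited: float) -> float:
    """Back out the full eligible pool from the number left after exclusions.

    invited_per_day: editors still invitable after removing prior invitees.
    share_already_invited: fraction of eligible editors already invited elsewhere.
    """
    if not 0 <= share_already_invited < 1:
        raise ValueError("share_already_invited must be in [0, 1)")
    return invited_per_day / (1.0 - share_already_invited)


# About 20 invitable per day with about 80% already invited to the Teahouse.
pool = estimated_eligible_per_day(20, 0.80)
print(round(pool))
```

On this reading, allowing previously invited editors back in leaves room to tighten the criteria substantially while still reaching the 20-50 invitations per day mentioned above.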

However, having received a Teahouse invite would be a confounding factor in any analysis you want to perform around retention from Co-op mentorship. Also, I would have to start tracking Co-op invites in a different database than Teahouse invites, which would require some effort to set up, probably 8 hours all told when you factor in testing.