
Handling multiple affiliations (at once; like work vs spare time) in tech community metrics
Closed, Declined (Public)

Description

One contributor may have multiple affiliations at once, e.g. Wikimedia Foundation during working hours and Independent in their personal time. This is currently not reflected in our tech community metrics, where only one affiliation is reported at a time.

The question is, how can the software compute when a person is contributing under affiliation A and when under affiliation B (C, D...)?

The cleanest and most accurate solution would be to tie different affiliations to different accounts / email addresses. This rational argument is sometimes met with social resistance. However, if we want to implement multiple affiliations, we will need to either resolve this discussion or find an alternative method.

Event Timeline

Qgil raised the priority of this task from to Low.
Qgil updated the task description.
Qgil added a project: wikimedia.biterg.io.
Qgil added subscribers: Qgil, Krenair, Dicortazar.

Currently multiple affiliations can be assigned to a user, but only for different time periods (e.g. until January 2015 Mary was Independent, from February 2015 Mary is affiliated with WMDE).
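
To make the current model concrete, here is a minimal sketch of a per-period lookup. It is not the MetricsGrimoire implementation; the `Enrollment` type, the `affiliation_on` helper, and the start date of Mary's Independent period are assumptions made up for illustration.

```python
from datetime import date
from typing import NamedTuple, Optional


class Enrollment(NamedTuple):
    """One affiliation period (hypothetical structure, not the real schema)."""
    organization: str
    start: date            # inclusive
    end: Optional[date]    # inclusive; None means "still affiliated"


# Hypothetical data mirroring the Mary example; the 2014 start date is assumed.
MARY = [
    Enrollment("Independent", date(2014, 1, 1), date(2015, 1, 31)),
    Enrollment("WMDE", date(2015, 2, 1), None),
]


def affiliation_on(enrollments, day):
    """Return the organization whose period covers `day`, or None."""
    for e in enrollments:
        if e.start <= day and (e.end is None or day <= e.end):
            return e.organization
    return None
```

With this model every contribution is attributed purely by its date, so two contributions made on the same day can never end up with different affiliations.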

As long as it can handle multiple affiliations intersecting, that's fine.

Multiple affiliations at the same time should probably be tied to different usernames / email addresses, otherwise there is no way to tell.

That's no good, WMF contributions often have generic usernames/email addresses associated with them.

However, this discussion is only weakly related to this task. If you want to improve the current situation with affiliations, please create a new task in order to discuss the details properly.

You can't really do any useful statistics involving user affiliations while this problem is unsolved, because any metric assuming that (for example) "all of @Krenair's contributions made since March last year are for WMF" is going to be simply invalid.

Qgil raised the priority of this task from Low to Medium. Jul 3 2015, 11:50 AM

The WMF Annual Plan 2015-16 includes a goal related to this task:

Set and monitor code review KPIs for all community-sourced contributions

Solving this task will help identify what is "community-sourced".

Do we have an idea of how many affected users we are talking about here (probably not, so "potentially everybody whom we also assume to work for some company")?

While a single person might differentiate in Phabricator by having different accounts, I have a hard time imagining that a single person sets up separate Gerrit/wikitech/LDAP accounts and switches their local Git/Gerrit configuration (email address) every time in order to intentionally differentiate when pushing a proposed patch to Gerrit.

And at least for WMF I'm not aware of a social requirement to change their email address in Gerrit from a private to an "official" one when people start to get paid for (some of) the patches they create.

So the technical effort required within MetricsGrimoire to differentiate which contributions by a single person were made in a staff capacity and which in a volunteer capacity might be out of proportion to the benefit of making the data more reliable, I'm afraid.

I see three issues here:

  • (1) Defining (in the real world) which activity of a person is considered "affiliated activity" (that is, activity as a member of a given organization), and which is "independent activity" (that is, activity that she performs on her own).
  • (2) Finding which traces in the repositories allow us to link that real world definition to recorded activity. This makes it possible to tell whether a given activity (e.g., a commit) is "affiliated" or "independent".
  • (3) Deciding how to represent that activity in charts and stats. For example, is it reasonable to count that person as "two persons" while she is contributing both as affiliated and independent, or as one? Should she appear twice in the listings, eg, in the list of people contributing?

In other cases we have been working with, for (1) it has been enough to consider time periods. That is, at any given time period (spanning at least several days), a given person is either affiliated with some organization, or independent. And they cannot be at more than one organization at the same time. This usually satisfies companies that want to account for the activity of their employees. And the criterion is basically "is hired by".
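
The "one organization at a time" rule can be checked mechanically. The following is a sketch under assumed conventions: periods are `(organization, start, end)` tuples with inclusive dates, and `overlapping_periods` is a hypothetical helper, not part of any existing tool.

```python
from datetime import date


def overlapping_periods(periods):
    """Return pairs of organization names whose periods overlap in time.

    Each period is (organization, start, end) with inclusive dates.
    """
    ordered = sorted(periods, key=lambda p: p[1])
    clashes = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second[1] > first[2]:
                break  # sorted by start date: nothing later can overlap `first`
            clashes.append((first[0], second[0]))
    return clashes
```

An empty result means the data fits the single-affiliation model; any returned pair is exactly the simultaneous-affiliation case this task is about.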

In the case of the WMF community, it seems this is not good enough. But in that case you need to care about how to define when activity is due to the organization, and when it is individual. And you need to get the organizations, the individuals, and the community to reach a consensus on this. Otherwise, some developer could consider that some activity is done as an individual, while the employer considers it affiliated activity. I'm not sure if using different email addresses or accounts is enough, because the developer may forget to switch, and/or the employer may disagree on whether the developer should switch for a certain activity.

In any case, depending on the criteria you follow for (1), we then need to map that to traces in the repositories (2). Using different identities (email addresses, accounts) is of course traceable, so that could be a valid option. But that forces you to switch, and to decide for every action which identity to use (and not forget). Consider for example writing in IRC or answering an email message. This may mean that in some cases the "ideal" definition for the real world is not practical because it is untraceable, or very likely to be applied wrongly, which forces us to reconsider (1).
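
As an illustration of the identity-based option for (2), here is a minimal sketch that derives the affiliation from the author's email address. The `DOMAIN_TO_ORG` table and the `affiliation_from_email` helper are assumptions for illustration only; nothing in this thread says that such a mapping exists or would be agreed on.

```python
# Hypothetical mapping from email domain to organization (an assumption,
# not an agreed rule).
DOMAIN_TO_ORG = {
    "wikimedia.org": "WMF",
    "wikimedia.de": "WMDE",
}


def affiliation_from_email(author_email, default="Independent"):
    """Classify one contribution by the domain of its author email."""
    domain = author_email.rsplit("@", 1)[-1].lower()
    return DOMAIN_TO_ORG.get(domain, default)
```

The sketch inherits the weakness described above: a contribution made with the "wrong" address is silently attributed to the wrong affiliation, and traces without an email address (such as IRC) cannot be classified this way at all.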

And then, once we have (1) and (2) fixed, we have to discuss how we consider people in the database, and how we represent them in the charts, listings, etc. Up to now, because of this single relationship from person to organization (considering "independent" as an organization), we always have "one person in the database" for each "real person" at any given time, which makes things simple. But if this is not enough, we will need to first decide how those "multiply affiliated persons" should be shown, and then figure out how to express that in the database and in all the tools querying it.
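
The counting question can be shown with a small sketch. The records and both helper functions below are hypothetical; they only illustrate how per-organization counts and the overall head count diverge once one person appears under two affiliations.

```python
from collections import defaultdict

# Hypothetical activity records: (person, affiliation attributed to the commit).
COMMITS = [
    ("mary", "WMF"),
    ("mary", "Independent"),
    ("mary", "WMF"),
    ("sam", "Independent"),
]


def contributors_per_org(commits):
    """Count distinct people per organization (a person may appear in several)."""
    people = defaultdict(set)
    for person, org in commits:
        people[org].add(person)
    return {org: len(members) for org, members in people.items()}


def distinct_people(commits):
    """Count real people, regardless of affiliation."""
    return len({person for person, _ in commits})
```

Here the per-organization counts add up to 3 while only 2 people contributed, so any chart that sums organizations would overstate the community size unless it deduplicates by person.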

Examples I found after looking at user affiliations in korma, to fix once this task has been resolved:
SG/shahyar, werdna, Erik Moeller aren't with WMF anymore.
Bene was with WMDE for a while; before and after that he is Independent.
Daniel Werner isn't with WMDE anymore.
Denny went from WMDE to Google.
Amir does some Independent work and also some work for WMDE.

Aklapper renamed this task from Handling multiple affiliations in tech community metrics to Handling multiple affiliations (at once; like work vs spare time) in tech community metrics. Oct 20 2015, 3:20 PM
Aklapper set Security to None.

Ignore my last (off-topic) comment that should have gone to T112527 instead.

Aklapper lowered the priority of this task from Medium to Low. Nov 9 2015, 11:20 AM
Aklapper lowered the priority of this task from Low to Lowest. Jan 6 2017, 1:25 PM

If there is a problem with the current situation/data, feel free to elaborate.
I don't see any need for implementation changes. If wanted (though for which reason?), we can always separate identities in the database.