
Consider ways to handle free-form text responses
Open, Medium, Public

Description

Some researchers want free-form text responses. Posting them publicly creates a burden on oversighters and may cause privacy problems. What can we do?

Options:

  • Collect them publicly anyway (with suitable warnings)
  • Collect them, but keep the responses private (only designated researchers can see them)
  • Don't collect them (use other tools if you need that kind of response)

Event Timeline

@Whatamidoing-WMF I would double-check with legal on this. Just because the microsurvey is a MediaWiki extension doesn't necessarily mean that it needs to adhere to the same standards as AFT, or that oversighters need to be involved. Unless I missed something. I assume the data will be captured and stored privately and securely, and made available (if at all) after the fact. So from a legal perspective whether the survey responses will be public or not will be handled by the consent form, and our default is to let people know that they are releasing their responses into the public domain.

Reviewing a large number of free-text responses for PII, etc. is still a challenge, and makes it necessary to manually review any given set of survey data before it can be released. That's a task for the researchers though. @leila and @DarTar experienced this challenge when they wanted to release the "Why we read Wikipedia" dataset. I don't remember what the outcome was, but they may be able to help.

But the vetting issue doesn't just apply to free-text responses. Depending on what forced-choice questions are being asked, and in what combination, and what metadata are collected and shared about each respondent, any non-aggregated survey data could be de-anonymized, putting people's identities at risk. It needs to be a case-by-case thing.
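To illustrate the combination risk, here is a minimal sketch of a k-anonymity style check. It is only an assumption about how such vetting might be done, not an existing tool: the function name `risky_combinations`, the field names, and the sample rows are all hypothetical.

```python
from collections import Counter

def risky_combinations(rows, quasi_identifiers, k=5):
    """Return answer/metadata combinations shared by fewer than k respondents.

    Rows whose combination appears here could single out a person
    if the non-aggregated data were released.
    """
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}

# Hypothetical, made-up survey rows (not real data).
rows = [
    {"country": "DE", "role": "editor", "answer": "yes"},
    {"country": "DE", "role": "editor", "answer": "no"},
    {"country": "IS", "role": "admin", "answer": "yes"},
]

# With k=2, the lone ("IS", "admin") respondent is flagged.
print(risky_combinations(rows, ["country", "role"], k=2))
```

A check like this only flags small groups. It does not replace the case-by-case review, and it says nothing about PII inside free-text answers.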

Who from Research is involved in this project, btw?

Reviewing a large number of free-text responses for PII, etc. is still a challenge, and makes it necessary to manually review any given set of survey data before it can be released. That's a task for the researchers though. @leila and @DarTar experienced this challenge when they wanted to release the "Why we read Wikipedia" dataset. I don't remember what the outcome was, but they may be able to help.

For that specific research project, the verdict was clear: the privacy statement of the survey would not allow us to share data before aggregation or anonymization, which meant we could not publish the free-form text responses. I must say that for that specific survey, this was an intentional decision given our resources. We could have spent resources from Legal, Security/Privacy, and Research to figure out whether the data could be released without issues (and addressed this as part of the privacy statement), but that would have meant many hours of consideration and deliberation, and for a one-time (or occasional), relatively small-scale survey we decided not to go down that path.

I hope this helps.

For a microsurvey – a little box on the screen, which shouldn't take up any more space than necessary – I don't really think it would be a good idea to have to explain about privacy policies, etc. before people can decide whether/how to respond.

I'd like to keep the approach simple enough and private-by-default enough that a little <Privacy policy> link in the box is sufficient for most users/most of the time. If free-form text responses were, for example, being posted on wiki with their usernames or IP addresses, then I think that we'd need a big warning in the box.

@Whatamidoing-WMF I'm not sure if you have seen samples of the QuickSurvey widget we used for Why We Read Wikipedia. Legal helped us a lot with making sure we could fit the privacy statement in the small space available to us. The widget would look something like this:

Screenshot from 2018-01-10 16-47-00.png (122×274 px, 12 KB)

I want it to look more like https://meta.wikimedia.org/wiki/Community_Liaisons/Process_ideas#Microsurveys (with a little link about privacy, and ideally multiple ways to make the box go away and leave you alone). It is important to me that the actual question be right in front of the user, without having to click on something to find out what the "one question" is.

It is important to me that the actual question be right in front of the user, without having to click on something to find out what the "one question" is.

Sure, the above was just an example for you to see how it looked in a small pop-up-like window with a link to Privacy. It's much better if the user can see the question and respond without having to leave the context they're in.

Yes, a link about privacy (or the purpose of a survey more generally, including privacy) would be great. I do like the way that looks, although I very much hope that this will all be handled in-house, without the need to involve third parties.

I also want a reasonable level of privacy by default, because most respondents probably won't click to another page and read it. So if free-form responses are needed (i.e., by someone who is not me), then perhaps it should default to non-public. Does that sound feasible to you?

I also want a reasonable level of privacy by default, because most respondents probably won't click to another page and read it. So if free-form responses are needed (i.e., by someone who is not me), then perhaps it should default to non-public. Does that sound feasible to you?

That's the safest option you can go with, without having to spend a lot of iterations. I don't know what questions you expect to see asked in these widgets. Depending on the nature of those questions, it may be a service to the broader community to give the ability to the user to publish the responses publicly if they choose to. That can make researchers happier in the future, and will allow them to do nice things that we don't have bandwidth to even start imagining. :)
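As a rough sketch of what "non-public by default, public only if the user chooses" could look like in a data model: the class `SurveyResponse`, its fields, and the helper `publishable` are hypothetical names for illustration, not part of any existing extension.

```python
from dataclasses import dataclass

@dataclass
class SurveyResponse:
    question_id: str
    text: str
    # Assumption: responses stay private unless the respondent opts in.
    public_opt_in: bool = False

def publishable(responses):
    """Return only the responses whose authors explicitly chose to publish."""
    return [r for r in responses if r.public_opt_in]

# Made-up example responses.
responses = [
    SurveyResponse("q1", "I was looking for a citation."),
    SurveyResponse("q1", "Fixing a typo.", public_opt_in=True),
]

for r in publishable(responses):
    print(r.text)
```

The point of the default is that a respondent who never reads the privacy link still ends up with the private behaviour. Opted-in text would still need the PII review discussed above before release.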

That's the safest option you can go with, without having to spend a lot of iterations.

+1

service to the broader community to give the ability to the user to publish the responses publicly if they choose to.

+1

I don't know what questions you expect to see asked in these widgets

Things like "What did you do before you came to this page?" OR "What problem are you trying to solve right now? Can you explain it so that a layperson would understand?" – basically open, qualitative questions.

I think @Verena, @Franziska_Heine, @TheDJ might be interested.

What I imagine this could enable is:

  • Getting qualitative, meaningful input…
  • …from a wide audience of people. For many (new or occasional editors) we currently have little qualitative data. That is bad!
  • Getting the input with a modest amount of work from us (no transcripts or travel) and from the community (no freeing up time for a 1:30 interview)
  • Establishing this as a micro-contribution: it helps the project if you answer these questions, and we can visibly make sense of the answers (if people choose to make their answers public, even together with the community)
  • More an internal thing: It seems to be framed as a research tool, which I hope makes it more sustainable.

(feel free to put it in the description, if you like)

matmarex subscribed.

Removing MediaWiki-Page-editing so that this doesn't clutter searches related to actually editing pages in MediaWiki. This project should only be on the parent task, probably (T89970). [batch edit]