Background
By implementing T274520: Move Growth configuration to on-wiki JSON file, we will transfer some control over whether a feature is enabled to the communities. Right now, when we change a feature's state on a wiki, we can note that internally and use the note when interpreting changes in report data. Once communities can turn features on or off themselves, we will no longer be able to note it ourselves, so we need a process that captures when a feature was turned on or off and lets us account for it during analysis.
Requirements
We need to be able to know when a feature was enabled or disabled, and when any other GrowthExperiments configuration was changed. This task aims to decide how we want to store this information, and to implement the chosen solution.
Possible solutions
- Create a new EventLogging schema, similar to PrefUpdate, but for wikis
- Store the current configuration state in all of our existing EventLogging schemas
- Use native MediaWiki history and the JSON blob itself.
Analysis of option 1
This requires a new schema to be created, reviewed and deployed to the analytics systems. On the other hand, it stores less data than option 2, and it makes it easy to create a derived dataset equivalent to what option 2 would provide.
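As a rough illustration of how a derived dataset like option 2 could be built from option 1 events, the sketch below replays a list of change events to recover the configuration in effect at a given time. The tuple layout and the key name `help_panel_enabled` are hypothetical and do not describe an existing schema; timestamps are assumed to be UTC ISO 8601 strings in one consistent format, so they can be compared as plain strings.

```python
def state_at(changes, when):
    """Return the configuration in effect at time `when`.

    `changes` is a list of (timestamp, key, new_value) tuples sorted by
    timestamp, oldest first. This layout is a hypothetical stand-in for
    whatever the new schema would actually record.
    """
    state = {}
    for timestamp, key, value in changes:
        if timestamp > when:
            # Later changes had not happened yet at `when`.
            break
        state[key] = value
    return state


# Hypothetical change events for one wiki.
changes = [
    ("2021-03-01T00:00:00Z", "help_panel_enabled", True),
    ("2021-04-01T00:00:00Z", "help_panel_enabled", False),
]
```

Joining each existing event against `state_at(changes, event_timestamp)` would then give the per-event configuration state that option 2 stores directly.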
Analysis of option 2
This stores a lot of redundant data: configuration changes probably won't be frequent, so most events would repeat the same unchanged state.
Analysis of option 3
This is the only option that wasn't discussed previously. The configuration will live in an on-wiki JSON file, with a form that lets communities edit it easily. That means MediaWiki history will already provide us with a lot of information (see https://cs.wikipedia.org/w/index.php?title=MediaWiki:NewcomerTasks.json&action=history for an example of how it could work). In my (@Urbanecm_WMF) opinion, it isn't hard to fetch an old revision of the JSON file to get the data as needed, or to derive the datasets that options 1 or 2 would provide, using only MediaWiki history.
This does not require any additional work, and stores no additional information. On the other hand, the data will not be directly available inside the analytics systems, and it might be harder to work with.
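To give an idea of what deriving such a dataset from page history might look like, here is a minimal sketch. It assumes the revisions of the configuration page have already been fetched (for example through the Action API's `prop=revisions` with `rvprop=timestamp|content` and `rvslots=main`) and are passed in oldest first. The function name and the sample keys are made up for illustration.

```python
import json


def config_changes(revisions):
    """Yield (timestamp, key, old_value, new_value) for every changed key.

    `revisions` is an iterable of (timestamp, json_text) pairs, oldest
    first. A key that is missing is treated the same as a key set to null.
    """
    previous = {}
    for timestamp, text in revisions:
        current = json.loads(text)
        # Compare every key seen in either revision, in a stable order.
        for key in sorted(previous.keys() | current.keys()):
            old, new = previous.get(key), current.get(key)
            if old != new:
                yield (timestamp, key, old, new)
        previous = current


# Hypothetical revisions of the on-wiki JSON file.
revisions = [
    ("2021-03-01T00:00:00Z", '{"feature_a": true}'),
    ("2021-04-01T00:00:00Z", '{"feature_a": false, "feature_b": 1}'),
]
change_log = list(config_changes(revisions))
```

The first revision reports every key as a change from nothing, which marks when the configuration page was created. The resulting rows have roughly the shape of the per-change events that option 1 would log.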
@nettrom_WMF I would appreciate your opinions on whether this would work for you.