Before we can implement validation, we need to agree upon a definition of a valid config. This spike is to do just that.
Description
Event Timeline
@Sfaci is this the task you mentioned where there are some growthbook config changes coming ?
This is one of those tasks. If you take a look at the related Epic there are several subtasks and all are related to how the new Test Kitchen API will be and work when integrated with GrowthBook. I would say that, at least for now, there is no a specific one more relevant for you. I'm also working on preparing the corresponding ones for T422376: [EPIC] Customize GrowthBook that could be also interesting for you. In general, I would say that these two epics encompass the more relevant work from your perspective: This one is related to the Test Kitchen API, which will fetch/validate/transform experiment configs from GrowthBook API, and the other is focused on changing the GrowthBook UI to be more aligned with our needs
| Field | Type | Notes |
|---|---|---|
| name | string | |
| slug | string | trackingKey for GrowthBook experiments |
| description | string | |
| phabricator_task_url | URL | customFields.phabricator_task for GrowthBook experiments |
| owner | string | The name of the team that owns the experiment |
| contact_email | string | The email address of the team or a member of the team that owns the experiment |
| phase_index | integer | {0, 1, 2, 3, …}<br><br>The index of the latest phase for GrowthBook experiments. Always 0 for TK UI experiments |
| utc_start_dt | Date | |
| utc_end_dt | Date | |
| user_identifier_type | string | |
| traffic_split | Map<string,Map<string, float>> | A map of wiki ID to a map of group name to absolute percentage of traffic. phases[$phase_index].trafficSplit for GrowthBook experiments |
| stream_name | string | customFields.stream_name for GrowthBook experiments. Defaults to product_metrics.web_base |
| contextual_attributes | string[] | customFields.contextual_attributes for GrowthBook experiments |
| risk_level | integer | customFields.risk_level for GrowthBook experiments |
| security_review_url | URL | customFields.security_review for GrowthBook experiments |
| version | string | Currently, used to invalidate in-browser exposure event cache. In future, could be used to track experiment config propagation |
| is_active | boolean | Always true for GrowthBook experiments |
| source | string | growthbook for GrowthBook experiments. test_kitchen for TK UI experiments |
An experiment config is valid if:
- slug
- Matches ^[A-Za-z0-9][-_.A-Za-z0-9]{7,62}$
- Is unique
- phase_index is:
- An integer
- >= 0
- utc_start_dt aligns with the beginning of an experiment deployment window
- utc_end_dt is
- After utc_start_dt
- Aligns with the beginning of an experiment deployment window
- trafficSplit:
- Has at least one entry
- Keys:
- Every key is a valid wiki ID
- Values:
- Every has at least one entry
- Every key matches ^[A-Za-z0-9][-_.A-Za-z0-9]{0,62}$
- Every value value is in the range [0, 1)
- risk_level
- If is_active is truthy:
- Is not null
- Matches the security level computed from contextual_attributes
- If is_active is truthy:
- stream_name must refer to a stream that is defined in $wgEventStreams
- security_review_url must be present if risk_level is not PENDING
@phuedx Thanks for writing this up. I have a few questions:
- Where is the set of wikis the experiment will run on defined?
- Do we need to record here whether an experiment is cache splitting, or will that be encoded in the user_identifier_type ?
- Do we want to require utc_end_dt to be set, or is null acceptable and we will we pick a default value (eg, 3 weeks from start) if none is stated? Since the GrowthBook default doesn't include an end date, we may see users define experiments with a null value for end date.
- There isn't a listed field for a security and legal review. Should we add a custom field for the review and require one be present if certain contextual attributes are present? I want to make sure we're providing appropriate controls to prompt users to get security and legal review when it's needed and am uncertain whether this is the right place to encode that logic.
Good catch! I've added the wikis field.
- Do we need to record here whether an experiment is cache splitting, or will that be encoded in the user_identifier_type ?
I'm not sure. Ultimately, this is how it's going to be modelled in GrowthBook. However, I'm not happy with overloading the field but then I'm also not convinced that adding another field (which will be false for all other identifier types) is the right way to go either.
- Do we want to require utc_end_dt to be set, or is null acceptable and we will we pick a default value (eg, 3 weeks from start) if none is stated? Since the GrowthBook default doesn't include an end date, we may see users define experiments with a null value for end date.
Requiring an end date matches the model that we currently have. Varnish, in particular, requires an end date. Since we won't get an end date from GrowthBook, it makes sense that the GrowthBookExperimentMarshaller (for example) – the thing that creates experiment models from GrowthBook API responses – will have to generate one.
- There isn't a listed field for a security and legal review. Should we add a custom field for the review and require one be present if certain contextual attributes are present? I want to make sure we're providing appropriate controls to prompt users to get security and legal review when it's needed and am uncertain whether this is the right place to encode that logic.
Good idea. I like that this approach enables both the validation of Security and Legal review being present and the contextual attributes matching the given risk level.
@phuedx Just a few questions/comments:
- What about adding some validation for stream_name for experiments that came from GrowthBook. That validation exists already in TK UI because stream names are pre-filled in the corresponding field and only existing ones can be selected by the user. I assume we cannot do that in the GrowthBook UI so the backend Validator could use the API Action as a source of truth to check that stream name is a valid one (that's what we use to pre-fill the field in TK UI). In Test Kitchen UI we use this way also to get the corresponding schema_title
- I would add that risk_level + contextual_attributes shouldn't be validated is the selected risk level is "Risk assessment pending". It's a current validation rule that Test Kitchen has right now. That also means that the experiment cannot be activated (A specific risk level must be selected to be able to do it). We should deactivate experiments from GrowthBook for cases like this one.
- Related to the above, a "Security and legal review" must be added in the case the risk level above is `Medium" or "High"
- We currently have some validation rules for the human-friendly name of an experiment. It must be at least 8 characters long and not longer than 63 characters. Do we want to keep those rules? To be hones I don't remember why we set those rules but I guess that we wanted to define some kind of limits
- We should also check that machine-readable name is actually unique (At least for a while we will have two platforms to register experiments and it's something that could happen and we cannot check in isolation in any of those platforms)
hi @phuedx - just trying to wrap my head around wikis, traffic, phases, groups - do you envision like the following?
{
"name": "Experiment Name,
"slug": "experiment-name,
...
"phase_index": 0,
...
"user_identifier_type": "edge-unique",
"traffic_split": {
"phases": [
0: [
"0.01": [
"enwiki"
],
"0.1": [
"frwiki",
"ptwiki"
],
"default": 0
],
1: [
"0.1": [
"arwiki",
"bnwiki"
],
"default": 0
],
},
"groups": [
"control",
"treatment"
],
...
}now i think my Q doesn't make sense - this is about fields in GB that we will poll
but we do want to incorporate phases in the responses right?
@Sfaci: Great points. I've addressed all of them in T422528#11874722 (hopefully). Would you mind taking another look?
On the contrary. It was a good question and I'm glad that you played with the shape to see if the model is expressive enough.
Re. phases, groups, and wikis:
- I think we only ever need to be concerned about the current phase (but we do need to know what phase we're talking about!); and
- Groups, wikis, and traffic allocation can be combined into a more expressive structure: trafficSplit is a map of wiki ID to a map of group name to traffic allocation
For example: Consider an everyone experiment running on enwiki with an allocation of 1% of traffic and with two groups, control and treatment:
trafficSplit: {
enwiki: {
control: 0.005,
treatment: 0.005
}
}Up until now, our model has assumed that the total traffic allocation for the experiment is divided equally between the groups. This isn't as flexible as GrowthBook's model, which allows for groups to have different traffic allocations (thereby enabling bandit experiments where traffic allocations are adjusted over time). This field is more like GrowthBook's model and can express our model also.
Done! Just one small thing about the last rule you posted there.
security_review_url must be present if risk_level is not PENDING
security_review_url is only needed when risk_level is MEDIUM or HIGH. Low risk doesn't require any security review