xLab: Update ‘Project and sample size’ module
Closed, Resolved | Public | 3 Estimated Story Points

Assigned To
Authored By
Sarai-WMF
Apr 29 2025, 3:10 PM
Referenced Files
F61782501: Screenshot 2025-06-06 at 12.21.31.png
Jun 6 2025, 10:23 AM
F61279360: Screenshot 2025-06-02 at 21.07.22.png
Jun 2 2025, 7:12 PM
F60058543: Screenshot 2025-05-16 at 11.27.54.png
May 16 2025, 9:39 AM
F60017389: Screenshot 2025-05-15 at 15.53.31.png
May 15 2025, 1:54 PM
F60011022: Screenshot 2025-05-15 at 13.07.30.png
May 15 2025, 11:07 AM
F59551440: Screenshot 2025-04-29 at 18.01.36.png
Apr 29 2025, 4:08 PM
F59548688: Screenshot 2025-04-29 at 15.17.02.png
Apr 29 2025, 3:10 PM
F59548625: Screenshot 2025-04-29 at 15.13.33.png
Apr 29 2025, 3:10 PM

Description

Problem

xLab currently provides a “Project and sample size” field that lets experiment owners define the size of the user sample to be enrolled in each of the wikis involved in their experiment. This field presents some issues:

  1. Selecting “projects” is inaccurate, since enrollment takes place on the specific domains of individual wikis (e.g., the wiki available at en.wikipedia.org rather than Wikipedia as a project);
  2. Project selection is based on dbnames, which isn't completely user-friendly (e.g., "eswiki" instead of "Spanish Wikipedia");
  3. The notion of "Sample size" is an improvement from the previous "Sample rate" (see MPIC Usability Evaluation Findings), but we could make the concept even clearer if we used terminology and input values that are more accessible for users.

Suggested solution

We should improve the “Project and sample size” module, applying the following adjustments:

  1. We replace the notion of ‘project’ with ‘wiki’ everywhere in the module (see Copy adjustments)
  2. Menu options display full, user-friendly wiki names instead of dbnames
  3. ‘Sample size’ is replaced by ‘Traffic’, whose value is now entered as a percentage from 1 to 100.
  4. Moreover, the update of the “Project and sample size” field to “User traffic per wiki” justifies amending the title of the current “Sampling” section to “Traffic and variations” instead.
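
As a rough illustration of points 2 and 3 above, the sketch below maps dbnames to user-friendly labels and converts a 1 to 100 traffic percentage into a fractional rate. The lookup table, the `WikiOption` shape, and both function names are hypothetical and not taken from the xLab codebase; whether xLab stores traffic as a fraction is also an assumption.

```typescript
// Hypothetical lookup from dbname to display name; a real list would come
// from configuration or an API rather than being hard-coded.
const WIKI_NAMES: Record<string, string> = {
  enwiki: "English Wikipedia",
  eswiki: "Spanish Wikipedia",
};

// One menu option: the dbname stays as the value, the label is what users read.
interface WikiOption {
  value: string;
  label: string;
}

// Build menu options, falling back to the dbname when no friendly name is known.
function toWikiOptions(dbnames: string[]): WikiOption[] {
  return dbnames.map((dbname) => ({
    value: dbname,
    label: WIKI_NAMES[dbname] ?? dbname,
  }));
}

// Convert a traffic percentage (1 to 100) into a fraction (0.01 to 1).
function trafficToRate(percent: number): number {
  if (!Number.isFinite(percent) || percent < 1 || percent > 100) {
    throw new RangeError("Traffic must be a percentage from 1 to 100");
  }
  return percent / 100;
}
```

Keeping the dbname as the option value means only the display layer changes; anything downstream that already keys on dbnames would be unaffected under this assumption.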

Screenshot 2025-04-29 at 15.17.02.png (1×1 px, 185 KB)

Overview of changes:

Copy adjustments
Screenshot 2025-04-28 at 20.34.55.png (1×2 px, 289 KB)
Labels, placeholders and descriptions should be adjusted to reflect the switches to the notions of ‘wiki’ and ‘traffic’.

Menu adjustments
Screenshot 2025-05-16 at 11.27.54.png (852×1 px, 158 KB)
  1. To facilitate locating the right option, the wiki selection menu will only display a max. of 7 suggestions by default. The existence of further options is indicated by the scroll bar displayed by the menu.
  2. The menu displays full, user-friendly wiki names.

Validation
Screenshot 2025-05-15 at 15.53.31.png (647×1 px, 93 KB)
Validation scenarios:
  1. The Traffic field will be validated on the client side if users enter a value greater than 100;
  2. The Traffic field will be validated on submit if users leave it empty when a corresponding wiki has been selected; and
  3. The Wiki field will be validated on submit if left empty when the corresponding traffic has been selected.
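
The three validation scenarios above could be sketched roughly as follows. This is a minimal illustration only: the `TrafficRow` shape, the error codes, and the function names are all hypothetical, and the actual validation is expected to be implemented as part of T372952.

```typescript
// One row of the 'User traffic per wiki' field (hypothetical shape).
interface TrafficRow {
  wiki: string | null; // selected wiki, or null when nothing is selected
  traffic: string;     // raw text typed into the Traffic input
}

// Hypothetical error codes, one per validation scenario.
type RowError = "traffic-too-high" | "traffic-missing" | "wiki-missing";

// Scenario 1: checked on the client side as the user types.
function validateTrafficOnInput(traffic: string): RowError | null {
  const value = Number(traffic);
  return traffic.trim() !== "" && value > 100 ? "traffic-too-high" : null;
}

// Scenarios 2 and 3: checked when the form is submitted.
function validateRowOnSubmit(row: TrafficRow): RowError[] {
  const errors: RowError[] = [];
  const hasWiki = row.wiki !== null && row.wiki !== "";
  const hasTraffic = row.traffic.trim() !== "";
  if (hasWiki && !hasTraffic) errors.push("traffic-missing");
  if (hasTraffic && !hasWiki) errors.push("wiki-missing");
  return errors;
}
```

A row with neither a wiki nor a traffic value produces no errors in this sketch, on the assumption that an entirely empty row is handled by the overall field validation rather than per row.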

Find more details in the Figma specifications

Notes
  • The description of the new ‘User traffic per wiki’ module includes a link to a documentation page that doesn’t exist yet. We decided to leave it in the designs and specs as a reminder of the relevance of providing more information and directions to users while filling in this field (e.g., regarding the recommended amount of traffic they should select per wiki).
  • The overall 'User traffic per wiki' field validation will be implemented as part of T372952: (stretch) xLab: Add better validation to forms. We'll just have to adjust some of the logic (e.g. due to the usage of percentages) as well as the copy of the error messages as part of this ticket.

Acceptance criteria
  • The ‘Project and sample size’ field in xLab’s experiment and instrument forms is updated to match the specifications of the ‘User traffic per wiki’ field
  • The title of the ‘Sampling’ section is updated to 'Traffic and variations' in the experiment form, and to 'Traffic' in the instrument form

Details

Related Changes in Gerrit:
Related Changes in GitLab:
Title | Reference | Author | Source Branch | Dest Branch
Add and update traffic and variations field sets | repos/data-engineering/mpic!180 | cjming | T392899+T392911/user-id-traffic-variations | main

Event Timeline

Milimetric added subscribers: cjming, Milimetric.

moving to in-progress and making it explicit that Claire's parent task depends on this work. @Sarai-WMF if you feel there's anything left to work on here, do bring it up in next week's design review (I won't be there). But to me it all looks good. Do make sure to coordinate with @cjming

thanks @Sarai-WMF @Milimetric - so we're moving ahead with this ticket and implementing the design changes?

mpopov subscribed.

Related to but not directly blocked by / dependent on T391955: Spike: xLab: Domains and dblists

Will likely push it to the next sprint

JVanderhoop-WMF raised the priority of this task from Low to High. May 22 2025, 3:28 PM
Milimetric set the point value for this task to 3. May 27 2025, 3:27 PM

hi @Sarai-WMF - quick Q - for the 2nd AC - The title of the ‘Sampling’ section is updated to 'Traffic and variations' -- is this for both instrument + experiment forms or just experiment?

Hey @cjming 👋🏻 My apologies for that: the AC is only referring to the experiment form. My proposal for the instrument form would be to update the 'Sampling' section's title to simply 'Traffic', for consistency with the experiment form's section while avoiding the mention of 'Variations', which don't apply there. The problem is that I don't think we got to validate this suggestion with the broader team. If the name change in the instrument form sounds good to you, I'd say we can go ahead while I quickly ask for consensus in parallel. How does that sound? I'll update the AC to reflect my suggestion.

Change #1152759 had a related patch set uploaded (by Santiago Faci; author: Santiago Faci):

[operations/deployment-charts@master] xLab: Deploying xLab v0.6.4 to staging

https://gerrit.wikimedia.org/r/1152759

Change #1152759 merged by jenkins-bot:

[operations/deployment-charts@master] xLab: Deploying xLab v0.6.4 to staging

https://gerrit.wikimedia.org/r/1152759

hi @Sarai-WMF we deployed the latest changes to the staging environment - we still have to do some minor cleanup, especially around validation - but if you want to take a look and let us know if current state passes your design review eagle eyes:

https://mpic-next.wikimedia.org/create-instrument
https://mpic-next.wikimedia.org/create-experiment

Thanks again, @cjming! Stunning work and velocity! Besides the pending validation changes, I'd say that the following adjustments are needed:

  1. Looks like some unexpected options are being displayed by the menu. Some sound like dbnames (e.g., 'arbcom_ruwiki', 'azwikimedia') or don't look like user-facing projects (e.g., 'Audit Committee')
  2. The wiki selection menu should only display a max. of 7 suggestions by default. This is to facilitate parsing the results. We can use Codex Menu's visibleItemLimit prop to easily achieve this (docs).
  3. I'm noticing some unexpected matching behavior too. It looks like, in some cases, the results aren't listed alphabetically in accordance with the input. In the screenshot, for example, the alphabetical match ('Spanish Wikicionario') for the input 'sp' is displayed on the 8th position. If you agree, this is something I'd recommend we investigate separately, to avoid blocking this task:

Screenshot 2025-06-02 at 21.07.22.png (314×619 px, 30 KB)

thanks @Sarai-WMF ! there's a pending MR to redo some of the population of the wikis list - we have the data now (i.e. the mobile domains for wikis) and I'm wondering if you already solved for how to distinguish between desktop and mobile domains?

Is there another field or checkbox we should add for inclusion/exclusion of mobile domains? If we're using proper Wikipedia project names in the chip input select list, how do we capture whether the experiment owner wants it running on desktop, mobile, or both?

While creating the latest specs for this field's update, I was operating with the information that the wiki options presented in the menu included both desktop and mobile domains by default (i.e. 'English Wikipedia' = 'en.wikipedia.org' + 'en.m.wikipedia.org'). That is why the field's description was updated to "Traffic from both desktop and mobile webs will be included by default." We reached this conclusion after several exchanges during design review.
If this isn't the case anymore, and we'd like for users to be able to select device-based domains, I can dig up some of the design options we evaluated at the time.
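
For reference, the default described above (one wiki option covering both its desktop and mobile domains) can be sketched as below. Both helper names are hypothetical, and the rule of inserting "m" after the first domain label is a simplifying assumption that matches the 'en.wikipedia.org' example but does not cover every Wikimedia domain.

```typescript
// Hypothetical helper: derive the mobile domain by inserting "m" after the
// first label (en.wikipedia.org -> en.m.wikipedia.org). Assumed rule only.
function mobileDomain(desktopDomain: string): string {
  const [first, ...rest] = desktopDomain.split(".");
  return [first, "m", ...rest].join(".");
}

// Both domains that a single wiki option would cover by default.
function domainsForWiki(desktopDomain: string): string[] {
  return [desktopDomain, mobileDomain(desktopDomain)];
}
```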

ah ok - somehow i missed that important piece -- following the convo in Slack since I have the same Q about different rates between desktop and mobile

  1. Looks like some unexpected options are being displayed by the menu. Some sound like dbnames (e.g., 'arbcom_ruwiki', 'azwikimedia') or don't look like user-facing projects (e.g., 'Audit Committee')

This has been addressed by a recent MR that has been deployed to staging and production.

  2. The wiki selection menu should only display a max. of 7 suggestions by default. This is to facilitate parsing the results. We can use Codex Menu's visibleItemLimit prop to easily achieve this (docs).

Addressed in https://gitlab.wikimedia.org/repos/data-engineering/mpic/-/merge_requests/188 - I'll let you know when it's up on staging so you can verify

  3. I'm noticing some unexpected matching behavior too. It looks like, in some cases, the results aren't listed alphabetically in accordance with the input. In the screenshot, for example, the alphabetical match ('Spanish Wikicionario') for the input 'sp' is displayed on the 8th position. If you agree, this is something I'd recommend we investigate separately, to avoid blocking this task:

Screenshot 2025-06-02 at 21.07.22.png (314×619 px, 30 KB)

Hmm - I'm not sure what we can do about that - we'd have to dig into the component. If you think it's problematic, then I think a new task for it makes sense

Since I moved outstanding issues into a new ticket, T396045: xLab: follow up tasks for traffic and variations section, I'm going to err on the side of calling this ticket done.

@Sarai-WMF when the current MR for fixes is reviewed/merged, I'll let you know on T396045 when you can do design review for the remaining issues

Sounds good. Thanks for the fixes, @cjming! I think I realized what the "problem" is with the unexpected matching behavior from point 3 (the alphabetical match 'Spanish Wikicionario' for the input 'sp' being displayed in the 8th position):

As long as the user input matches any part of the available options, they'll be displayed following the internal alphabetical order of the menu.

Screenshot 2025-06-06 at 12.21.31.png (356×678 px, 37 KB)

So, options aren't reordered based on the match between the query and the start of the wiki name. I think there might be more to lose if we addressed this (i.e., less search flexibility, being able to find a wiki based on project and/or language), so I'd vote to keep things as is.
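
To make the trade-off concrete, here is a minimal sketch of the two behaviors. It assumes the menu filters by case-insensitive substring match, as described above; the option names and both function names are hypothetical, and this is not the Codex component's actual implementation.

```typescript
// Hypothetical options, already in the menu's alphabetical order.
const wikiOptions: string[] = [
  "Esperanto Wikipedia",
  "French Wikipedia",
  "Spanish Wikipedia",
  "Spanish Wiktionary",
];

// Behavior as described: keep every option that contains the query
// anywhere, preserving the menu's existing alphabetical order.
function filterBySubstring(options: string[], query: string): string[] {
  const q = query.toLowerCase();
  return options.filter((option) => option.toLowerCase().includes(q));
}

// The reordering discussed above: options starting with the query come first.
function filterPrefixFirst(options: string[], query: string): string[] {
  const q = query.toLowerCase();
  const matches = filterBySubstring(options, query);
  const starts = matches.filter((o) => o.toLowerCase().startsWith(q));
  const rest = matches.filter((o) => !o.toLowerCase().startsWith(q));
  return [...starts, ...rest];
}
```

With these sample options, the query 'sp' lists 'Esperanto Wikipedia' ahead of the two Spanish wikis under substring matching, which mirrors the behavior in the screenshot. The same substring matching is what lets a query like 'wikt' find 'Spanish Wiktionary' by project name.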