[MEX] [M5] [SPIKE] Investigate enabling Cypress video recording for browser tests
Open, Needs Triage, Public

Description

NOTE: The task requirements have been implemented. The task remains open for further performance observation based on the data provided.

The build times (19m 44s and 22m 06s) are comparable to the build times without the recordings - it doesn't seem to be a massive performance hit. But we would need to have more data to be sure about exactly what the impact is.


In our January 2026 retrospective, we agreed that having video recordings of failed Cypress runs would be very helpful to investigate failures. We should investigate:

  • if / how we can enable video recording in CI, such that the videos are available in the test artifacts (we probably don’t need videos to be recorded locally)
  • what the performance impact of recording video would be (see also T415170) – @mahmoud.abdelsattar.wmde thought it could have a significant negative impact

Documentation: Capture Screenshots and Videos. See also wdio-mediawiki configuration.

Timebox: 8 hrs

Event Timeline

Change #1243718 had a related patch set uploaded (by Arthur taylor; author: Arthur taylor):

[mediawiki/extensions/Wikibase@master] Enable video recordings for cypress tests

https://gerrit.wikimedia.org/r/1243718

The attached patch results in videos being saved to the build artefacts for the cypress tests:

The build times (19m 44s and 22m 06s) are comparable to the build times without the recordings - it doesn't seem to be a massive performance hit. But we would need to have more data to be sure about exactly what the impact is.

Change #1243718 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Enable video recordings for cypress tests

https://gerrit.wikimedia.org/r/1243718

Shouldn’t the investigation remain open until we’ve assessed the performance impact?

@mahmoud.abdelsattar.wmde Is this ticket now a ticket about gathering that data then? If so we should probably update the description or create a subtask and re-groom it. Or what is the plan to gather job performance data?

@ArthurTaylor I don't think creating a subtask is necessary; we are just trying to observe the performance impact after implementing video recording in CI.
If the current task turns out to have a significant impact on CI build times, we can create a subtask for the adjustment, but that should also be based on observations from the other builds.
But you are right, the description should also be updated.

How will we notice if the build times are impacted? Do we have data about what the average (mean) build time (and variance?) were before we made this change?

The build times (19m 44s and 22m 06s) are comparable to the build times without the recordings - it doesn't seem to be a massive performance hit. But we would need to have more data to be sure about exactly what the impact is.

Yes, based on the data you provided in one of the previous comments.
The impact will be noticeable in the build times of other patches.
I don't expect this to be systematic; a general observation of the builds will do.

Then I don't understand. We already have a build observation - that's the observation I made in the comment (19m 44s and 22m 06s). If we are not going to do a rigorous analysis, what additional observation do we need?

I think more CI build data would be useful to make sure it is stable.
Since we also introduced some additional logic for the video recording (deleting the videos of successful tests while keeping those of failed ones), I believe we should check more builds to validate it properly.
If you believe it is stable and good to go as the result of the investigation, we can resolve the task.
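For reference, the "delete videos of successful specs, keep failed ones" logic mentioned above can be sketched roughly as follows. This is a minimal sketch modelled on the pattern in the Cypress documentation for the `after:spec` event, not the code from the merged patch. The interface names and the injected `remove` callback are hypothetical and exist only so the logic can be exercised without Cypress or the file system.

```typescript
// Structural subset of the results object that Cypress passes to an
// `after:spec` handler. Only the fields used here are declared; the
// interface names are hypothetical.
interface SketchAttempt { state: string }
interface SketchTest { attempts: SketchAttempt[] }
interface SketchSpecResults { video: string | null; tests: SketchTest[] }

// True when any attempt of any test in the spec failed. This includes
// flaky tests whose first attempt failed and whose retry passed.
function hasFailedAttempt(results: SketchSpecResults): boolean {
  return results.tests.some((test) =>
    test.attempts.some((attempt) => attempt.state === "failed"),
  );
}

// Removes the video of a spec in which every attempt passed.
// `remove` is injected (assumption: in a real config it would be
// something like `fs.unlinkSync`). Returns true if the video was removed.
function pruneVideo(
  results: SketchSpecResults | undefined,
  remove: (path: string) => void,
): boolean {
  if (!results || !results.video) {
    return false; // recording disabled or no video produced
  }
  if (hasFailedAttempt(results)) {
    return false; // keep the video for debugging
  }
  remove(results.video);
  return true;
}

// Wiring inside `setupNodeEvents` (not executed here):
//   on("after:spec", (_spec, results) => {
//     pruneVideo(results, fs.unlinkSync);
//   });
```

Note that this only controls which videos are kept as artifacts. The recording itself still runs for every spec, so it does not reduce any recording overhead.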

I don't think it's stable. I think the build time varies according to the load on the CI servers.

So what is the plan to collect more build data? Is that a task that we are going to assign to someone? How many observations should they collect? What will they compare their observations to?

Sure, the plan is to collect build times (and to watch the failed-test videos to confirm that the functionality works for failures). This could be assigned to anyone, possibly as a follow-up task or, as I would recommend, within the current task, since it is an investigation task.
If we observe a consistent shift beyond the normal spread compared to the baseline, we'll treat that as a real runtime impact and adjust accordingly; otherwise, we’ll consider it CI noise.
The recommended sample is at least 15-30 CI builds from on or after the merge date.
The data should be compared against the CI build data posted in the comments.
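One way to make "a consistent shift beyond the normal spread" concrete is sketched below. It is only an illustration under stated assumptions: build durations are given as strings like "19m 44s", the "normal spread" is taken to be the sample standard deviation of the baseline, and the threshold factor `k` is a hypothetical parameter. None of these choices were agreed in this task.

```typescript
// Summary statistics for a set of build durations, in seconds.
interface DurationSummary { n: number; mean: number; median: number; stdDev: number }

// Converts a duration such as "19m 44s" into seconds.
function parseDuration(text: string): number {
  const match = /^(\d+)m\s+(\d+)s$/.exec(text.trim());
  if (!match) {
    throw new Error("unrecognised duration: " + text);
  }
  return Number(match[1]) * 60 + Number(match[2]);
}

// Mean, median and sample standard deviation of the given durations.
function summarize(durations: number[]): DurationSummary {
  if (durations.length === 0) {
    throw new Error("no durations given");
  }
  const sorted = durations.slice().sort((a, b) => a - b);
  const n = sorted.length;
  const mean = sorted.reduce((sum, d) => sum + d, 0) / n;
  const mid = Math.floor(n / 2);
  const median = n % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const squares = sorted.reduce((sum, d) => sum + (d - mean) * (d - mean), 0);
  const stdDev = n > 1 ? Math.sqrt(squares / (n - 1)) : 0;
  return { n, mean, median, stdDev };
}

// True when the median after the change exceeds the baseline median by
// more than k baseline standard deviations (k is a hypothetical knob).
function shiftExceedsSpread(baseline: number[], after: number[], k: number = 1): boolean {
  const before = summarize(baseline);
  const current = summarize(after);
  return current.median - before.median > k * before.stdDev;
}
```

With this, the two observations posted earlier (19m 44s and 22m 06s) convert to 1184 and 1326 seconds. Two data points are too few for a meaningful comparison, which is why a larger sample on both sides of the merge is needed.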

okay. And what do we think the baseline is? How will we collect data about the baseline if the patch has already been merged?

The baseline would be the historical runtimes of the same CI job(s) before the merge. Even though the patch is already merged, we can still retrieve (I think) pre-merge durations from Jenkins build history. I don't think it will be going below the baseline, but that would be the perfect scenario.

Hi, in the Test Platform team we are working on T420590 to decrease the feedback time from CI for developers (specifically for the mediawiki/core jobs).

And I think adding videos for all tests increases the feedback time? I couldn't fully work that out from this task. Or are we saying it's not adding overhead or making things slower? How does Cypress record a video: does it use FFMPEG, or does it use Chrome tracing, cut out screenshots, and create a video from them?

I want to make sure that the new Wikibase job that runs for core is below 10 minutes in run time. The median for March is 08:59 so far.

If the change is already merged, you can use the link I added for the median in March to dig into the numbers and see if there's a change. Let me know if you need any help.

Is it possible either to record videos only for retries or to make it easy to enable them when you actually have a failing test? Cypress tests are slow by default in CI since we do not run them in parallel, and adding extra overhead is something I would like us to avoid.

Hi @Peter,

Thanks for your work speeding up the tests - it's very much appreciated!

When we enabled recordings for Cypress, we did understand that there would be a performance hit, and I have to admit we don't know exactly what the hit is. I think it does use FFMPEG - we deliberately disabled compression on the videos to minimise the performance hit.
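As an illustration of the settings discussed here, the following is a minimal sketch of the video-related Cypress options, not a copy of the Wikibase configuration. `video`, `videoCompression` and `videosFolder` are documented Cypress configuration options. The `videoOptions` helper and the use of a `CI` environment variable to detect CI runs are assumptions made for this example.

```typescript
// Video-related subset of the Cypress configuration.
interface VideoOptions {
  video: boolean;
  videoCompression: boolean | number;
  videosFolder: string;
}

// Builds the video options from the environment (hypothetical helper).
// Assumption: the CI job exports a non-empty `CI` variable, so that
// local runs record nothing.
function videoOptions(env: Record<string, string | undefined>): VideoOptions {
  return {
    // Record only in CI.
    video: env.CI !== undefined && env.CI !== "",
    // false skips the compression pass that would otherwise run after each spec.
    videoCompression: false,
    // Cypress's default location; the CI job would need to collect it as an artifact.
    videosFolder: "cypress/videos",
  };
}

// Usage in cypress.config.ts (not executed here):
//   import { defineConfig } from "cypress";
//   export default defineConfig({
//     ...videoOptions(process.env),
//     e2e: { /* existing e2e settings */ },
//   });
```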

It's a bit difficult to measure the performance hit from the video change in isolation - we've been extending the test suite during that time, and also making other performance optimisations to reduce the runtime. If you have any other data about the performance hit for video recording we would be very interested to have it as that would inform our choices here.

It might be possible just to record videos for retries, but at least the documentation I've seen only describes techniques for deleting videos when specs pass - I didn't find anything there about only enabling recordings for retries.

Noting that part of our motivation for turning on video recordings was painful experiences with flaky tests that seemed tough-to-impossible to debug from the final screenshot alone. So I don’t think “making it easy to enable [video recordings] when you actually have failing test[s]” would work very well, because it would mean we mostly don’t have recordings for the flaky tests. On the other hand, recording videos for retries would be fine, I think (either the retry still fails and we have a useful video, or it passes and then we don’t really care because it didn’t block CI).

Arian_Bozorg renamed this task from [MEX] Investigate enabling Cypress video recording for browser tests to [MEX] [M5] [SPIKE] Investigate enabling Cypress video recording for browser tests. (Apr 7 2026, 9:00 AM)
Arian_Bozorg updated the task description.

Given that the scope of this ticket was delivered on and that generally this ticket has a lot of conversation in it, I've broken out the next step to a new subtask T425542: [MEX] Enable Cypress video recording for browser tests on retry.