Peter (Peter Hedenskog)
Software Engineer, Wikimedia Foundation

User Details

User Since
Aug 17 2015, 6:48 PM (397 w, 4 d)
Availability
Available
IRC Nick
phedenskog
LDAP User
Unknown
MediaWiki User
PHedenskog (WMF) [ Global Accounts ]

Recent Activity

Yesterday

Peter added a comment to T333524: Pin FFMPEG to specific CPU:s.

If we wanna try it out again:

Fri, Mar 31, 7:44 AM · Performance-Team
Peter added a comment to T330402: Unify how we run synthetic tests on mobile vs desktop.

The easiest way to fix this is for the server that adds the job to BitBar to check out the repo, then loop through the tests and send them one by one. That way we can take advantage of having multiple phones for the same job, and it should be easy to implement.
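
A minimal sketch of that loop, assuming the tests are script files inside the checked-out repository. The function names, the `*.js` pattern, and the `submit` callback are hypothetical placeholders for illustration, not the BitBar API:

```python
from pathlib import Path
from typing import Callable, Iterable, List


def find_tests(repo_dir: Path, pattern: str = "*.js") -> List[Path]:
    """Return every test file under the checked-out repo, in a stable order."""
    return sorted(repo_dir.rglob(pattern))


def send_one_by_one(tests: Iterable[Path], submit: Callable[[Path], None]) -> int:
    """Submit each test as its own job so any free phone can pick it up."""
    count = 0
    for test in tests:
        submit(test)  # hypothetical callback: one job per test
        count += 1
    return count
```

Sending one job per test, rather than one job containing all tests, is what would let several phones work through the same queue in parallel.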

Fri, Mar 31, 7:29 AM · WebPageReplay, Performance-Device-Lab, Performance-Team
Peter closed T333524: Pin FFMPEG to specific CPU:s, a subtask of T311983: Setup tests on the new bare metal machine, as Resolved.
Fri, Mar 31, 7:08 AM · Performance-Team
Peter closed T333524: Pin FFMPEG to specific CPU:s as Resolved.

I tried this out yesterday. One limitation is that you cannot pin different CPUs to different frequencies: for example, if you have 8 CPUs and want four of them to run at 1 GHz and four at 4 GHz, that does not work. What I ended up trying was to run four CPUs at 1 GHz for Chrome and the OS and four at 1 GHz for ffmpeg. I couldn't see any difference; maybe I need to try it some more to see if other configs would work. But for now I think this is enough, let's focus on deploying everything on the bare metal server instead.
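
A minimal sketch of how such a core split could be expressed, assuming the Linux `taskset` utility is used for pinning (the comment above does not say which tool was used). The helper name and the core numbers are hypothetical, and this only sets CPU affinity, not CPU frequency:

```python
import shlex
from typing import List, Sequence


def pinned_command(cpus: Sequence[int], command: Sequence[str]) -> List[str]:
    """Prefix a command with taskset so it only runs on the given CPU cores."""
    if not cpus:
        raise ValueError("at least one CPU index is required")
    cpu_list = ",".join(str(cpu) for cpu in sorted(set(cpus)))
    return ["taskset", "--cpu-list", cpu_list, *command]


# Hypothetical split on an 8-core machine: cores 4-7 reserved for ffmpeg.
ffmpeg_cmd = pinned_command([4, 5, 6, 7], ["ffmpeg", "-version"])
print(shlex.join(ffmpeg_cmd))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting problems.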

Fri, Mar 31, 7:08 AM · Performance-Team

Thu, Mar 30

Peter added a comment to T333524: Pin FFMPEG to specific CPU:s.

We have some tests that run without Docker, so I'll try on those. Pinning within the container, well, I don't know how that works.

Thu, Mar 30, 8:37 AM · Performance-Team
Peter closed T333008: Sync bare metal setup with AWS as Resolved.
Thu, Mar 30, 6:51 AM · WebPageReplay, Performance-Team
Peter added a comment to T333008: Sync bare metal setup with AWS.

Here's the graph of the instability over time:

becnhmark-over-time.jpg (1×2 px, 319 KB)

Thu, Mar 30, 6:50 AM · WebPageReplay, Performance-Team
Peter created T333524: Pin FFMPEG to specific CPU:s.
Thu, Mar 30, 6:04 AM · Performance-Team

Wed, Mar 29

Peter added a comment to T333008: Sync bare metal setup with AWS.

Maybe it's too early to celebrate, but the first test running with a new command line for 112 produces much less instability. These are the updates: https://github.com/sitespeedio/browsertime/pull/1921/files - I wonder if OptimizationHints could have been the problem? Some months ago (half a year) it started to pop up requests back to the Google optimisation services, and I blocked that domain. Maybe the domain changed and it became more aggressive? Let's keep this change running until tomorrow.

Wed, Mar 29, 12:44 PM · WebPageReplay, Performance-Team

Tue, Mar 28

Peter added a comment to T333008: Sync bare metal setup with AWS.

I tried the 112 beta (stable 112 is released tomorrow) but no luck, it looks the same there.

Tue, Mar 28, 9:00 PM · WebPageReplay, Performance-Team
Peter added a comment to T333008: Sync bare metal setup with AWS.

I'm gonna verify that this problem also happens in Chrome 112. If that's the case I'll report it to Google, if not we can just roll forward.

Tue, Mar 28, 7:45 PM · WebPageReplay, Performance-Team
Peter added a comment to T333008: Sync bare metal setup with AWS.

I could see that the change is related to Chrome 111. I made sure that we run 110 and could see that the benchmark and the first visual change are stable; then upgrading just to 111 made the benchmark metric more unstable. I'll check if I can see some difference in the Chrome performance log.

Tue, Mar 28, 4:21 PM · WebPageReplay, Performance-Team
Peter added a comment to T333008: Sync bare metal setup with AWS.

I could verify that it's not the configuration. Gonna try with Chrome 110 first and see how that works.

Tue, Mar 28, 11:38 AM · WebPageReplay, Performance-Team
Peter reopened T332868: Grant Grafana access to babiola as "Open".

Hmm, maybe something also needs to be done on the Grafana side? When @BAbiola-WMF tries to log in to Grafana she gets 407:Proxy Authentication Required or UNEXPECTED_PROXY_AUTH.

Tue, Mar 28, 11:29 AM · SRE, LDAP-Access-Requests
Peter created T333320: Create a Slack bot to run performance tests.
Tue, Mar 28, 9:42 AM · Performance-Team-onboarding, Performance-Team, Performance-Device-Lab
Peter added a comment to T333008: Sync bare metal setup with AWS.

I switched back to the old configuration and it worked well again. However, when I checked now I could see that we run different Chrome versions and different sitespeed.io versions, so maybe it's not a configuration issue but either the browser or somehow the code?

Tue, Mar 28, 7:07 AM · WebPageReplay, Performance-Team

Mon, Mar 27

Peter updated subscribers of T332859: Higher histogram limits reached in South Africa for 75 p.

I think you are right, @Krinkle. Let me increase the time span and the old will probably just work.

Mon, Mar 27, 6:41 PM · Patch-For-Review, NavigationTiming, Performance-Team
Peter moved T333008: Sync bare metal setup with AWS from Inbox, needs triage to Doing: Prio Interrupt on the Performance-Team board.
Mon, Mar 27, 6:23 PM · WebPageReplay, Performance-Team
Peter moved T326718: WebDriverError: Failed to decode response from marionette errors for Firefox from Doing: Prio Interrupt to Backlog: Maintenance, non-prioritized on the Performance-Team board.
Mon, Mar 27, 6:20 PM · Performance-Team, Performance-Device-Lab
Peter closed T326118: No navigation metrics in Prometheus since 27/12-2022 as Resolved.

This works better now.

Mon, Mar 27, 6:20 PM · Patch-For-Review, NavigationTiming, Performance-Team
Peter closed T327472: BitBar test failed on installation as Resolved.
Mon, Mar 27, 6:19 PM · Performance-Team, Performance-Device-Lab
Peter closed T331291: New type of error for mobile login performance tests as Resolved.

See https://phabricator.wikimedia.org/T331994

Mon, Mar 27, 6:18 PM · Performance-Team, Performance-Device-Lab
Peter closed T331994: New type of errors for BitBar tests: "Timed out receiving message from renderer" as Resolved.

I think this was because of login issues when changing data centers.

Mon, Mar 27, 6:17 PM · Performance-Team, Performance-Device-Lab
Peter added a comment to T333008: Sync bare metal setup with AWS.

What seems strange here though is that the FFMPEG process should end before the CPU benchmark is collected, so maybe we have multiple problems.

Mon, Mar 27, 2:19 AM · WebPageReplay, Performance-Team

Sun, Mar 26

Peter added a comment to T333008: Sync bare metal setup with AWS.

To be sure, I just disabled the viewport change. If the benchmark goes back to the default, I'm gonna try that FFMPEG fix.

Sun, Mar 26, 8:27 PM · WebPageReplay, Performance-Team
Peter added a comment to T333008: Sync bare metal setup with AWS.

I think the problem is that we test with a larger viewport than the default setting. Running with the AWS setting I can see that we have a much more unstable CPU benchmark:

Sun, Mar 26, 8:26 PM · WebPageReplay, Performance-Team

Fri, Mar 24

Peter created T333008: Sync bare metal setup with AWS.
Fri, Mar 24, 3:29 PM · WebPageReplay, Performance-Team
Peter added a comment to T332970: query.wikidata.org/querybuilder tests on WebPageReplay have errors.

I could reproduce it locally, let me have a go next week to fix it.

Fri, Mar 24, 8:16 AM · WebPageReplay, Performance-Team
Peter created T332970: query.wikidata.org/querybuilder tests on WebPageReplay have errors.
Fri, Mar 24, 6:57 AM · WebPageReplay, Performance-Team
Peter closed T332969: Vue vs legacy search tests stopped working March 13 as Resolved.
Fri, Mar 24, 6:53 AM · Performance-Team
Peter created T332969: Vue vs legacy search tests stopped working March 13.
Fri, Mar 24, 6:34 AM · Performance-Team

Thu, Mar 23

Peter created T332868: Grant Grafana access to babiola.
Thu, Mar 23, 10:47 AM · SRE, LDAP-Access-Requests
Peter added a project to T332859: Higher histogram limits reached in South Africa for 75 p: NavigationTiming.
Thu, Mar 23, 7:47 AM · Patch-For-Review, NavigationTiming, Performance-Team
Peter created T332859: Higher histogram limits reached in South Africa for 75 p.
Thu, Mar 23, 7:47 AM · Patch-For-Review, NavigationTiming, Performance-Team
Peter closed T332857: Disable auto-install of new snaps on the AWS machines as Resolved.

I've disabled it with

Thu, Mar 23, 6:49 AM · Performance-Team
Peter created T332857: Disable auto-install of new snaps on the AWS machines.
Thu, Mar 23, 6:46 AM · Performance-Team

Wed, Mar 22

Peter added a comment to T311980: Move tests from AWS to bare metal.

Today I turned on all WebPageReplay tests that we run on AWS so they also run on bare metal. I used the exact same configuration, except that I hacked the start script on the bare metal server to change the Graphite reporting key so it reports under baremetal. If this looks OK, I think it is a good first step; then we can move the WebPageReplay tests to the bare metal server and turn off a couple of AWS servers.

Wed, Mar 22, 11:12 AM · Performance-Team

Tue, Mar 21

Peter updated the task description for T332012: Collect first input delay.
Tue, Mar 21, 2:54 PM · Patch-For-Review, NavigationTiming, Performance-Team
Peter added a comment to T307984: Sync TTFB/CPU benchmark for synthetic tests with data from the Chrome User Experience report and our own reporting.

Looking at the CPU benchmark in our testing infrastructure:

Tue, Mar 21, 8:30 AM · Performance-Team
Peter added a comment to T307984: Sync TTFB/CPU benchmark for synthetic tests with data from the Chrome User Experience report and our own reporting.

I looked at the CPU benchmark for India. The 75th percentile for mobile is 285 ms (span 259-352) and the 95th percentile is 466 ms (span 394-544). For desktop in India the mean 75th percentile is 147 ms and the 95th percentile is 349 ms.
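
For reference, a minimal sketch of how a 75th and 95th percentile can be computed from raw benchmark samples. This only illustrates the statistic; the numbers above presumably come from the team's dashboards, and the function name is hypothetical:

```python
from statistics import quantiles
from typing import Dict, Sequence


def benchmark_percentiles(samples_ms: Sequence[float]) -> Dict[str, float]:
    """Return the 75th and 95th percentile of CPU benchmark samples (ms)."""
    # n=100 yields 99 cut points; cuts[k - 1] is the k-th percentile.
    cuts = quantiles(samples_ms, n=100)
    return {"p75": cuts[74], "p95": cuts[94]}
```

`statistics.quantiles` needs at least two samples and interpolates between neighbouring values, so results can differ slightly from tools that use another percentile definition.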

Tue, Mar 21, 8:27 AM · Performance-Team
Peter renamed T307984: Sync TTFB/CPU benchmark for synthetic tests with data from the Chrome User Experience report and our own reporting from Sync TTFB for synthetic tests with data from the Chrome User Experience report to Sync TTFB/CPU benchmark for synthetic tests with data from the Chrome User Experience report and our own reporting.
Tue, Mar 21, 8:25 AM · Performance-Team

Mon, Mar 20

Peter added a comment to T331994: New type of errors for BitBar tests: "Timed out receiving message from renderer".

Our user gets this error:

Mon, Mar 20, 7:13 PM · Performance-Team, Performance-Device-Lab
Peter added a comment to T331845: Investigate: Chrome 111 seems to increase the performance on LCP and FCP on first view desktop.

Interesting: T299886 and https://bugs.chromium.org/p/chromium/issues/detail?id=1291502#c63

Mon, Mar 20, 8:04 AM · WebPageReplay, Performance-Team
Peter created T332537: LCP from video and API is a missmatch.
Mon, Mar 20, 7:58 AM · Performance-Team (Radar), Upstream, WebPageReplay

Fri, Mar 17

Peter updated subscribers of T325282: Update Grafana alerts to use metrics from Prometheus.

I think I need your help here @Krinkle - I've been looking at some metrics and am having a hard time knowing exactly how we should move on. Your trick with max_over_time doesn't work on histograms, I think?

Fri, Mar 17, 3:18 PM · Performance-Team
Peter added a comment to T331291: New type of error for mobile login performance tests.

One of the problems here is that the login gets stuck and is not redirected to the main page; instead we get this:

Screenshot 2023-03-17 at 07.36.00.png (482×1 px, 50 KB)

Fri, Mar 17, 12:37 PM · Performance-Team, Performance-Device-Lab
Peter added a comment to T331994: New type of errors for BitBar tests: "Timed out receiving message from renderer".

OK, the Chrome and Firefox errors are fixed. There's still one error for the login user journey, but it does not happen all the time. In the visual metrics script it comes down to an error that looks like:

Fri, Mar 17, 6:23 AM · Performance-Team, Performance-Device-Lab

Wed, Mar 15

Peter added a comment to T331994: New type of errors for BitBar tests: "Timed out receiving message from renderer".

WebPageReplay tests on Chrome work. For Firefox the install script works on my Samsung, so let's see how I can debug what's going on at BitBar.

Wed, Mar 15, 12:11 PM · Performance-Team, Performance-Device-Lab
Peter added a comment to T331994: New type of errors for BitBar tests: "Timed out receiving message from renderer".

For Firefox the install somehow fails. The output from adb install:

Wed, Mar 15, 10:54 AM · Performance-Team, Performance-Device-Lab
Peter added a comment to T331994: New type of errors for BitBar tests: "Timed out receiving message from renderer".

Last night I manually updated to Chrome 111 and then this morning BitBar fixed the connection problem. There are two kinds of problems left to fix:

Wed, Mar 15, 7:50 AM · Performance-Team, Performance-Device-Lab

Tue, Mar 14

Peter added a comment to T331994: New type of errors for BitBar tests: "Timed out receiving message from renderer".

Updated sitespeed.io that included Chromedriver 110 but we still get a lot of errors (Chrome on some phones is still 110):

Tue, Mar 14, 5:52 PM · Performance-Team, Performance-Device-Lab
Peter updated the task description for T331845: Investigate: Chrome 111 seems to increase the performance on LCP and FCP on first view desktop.
Tue, Mar 14, 4:00 PM · WebPageReplay, Performance-Team
Peter created T332020: Do a follow up blog post to the "why performance matters" post.
Tue, Mar 14, 2:55 PM · Performance-Team
Peter added a comment to T331845: Investigate: Chrome 111 seems to increase the performance on LCP and FCP on first view desktop.

So it turns out I was running different configurations on AWS and the bare metal. On AWS we have disabled PaintHoldingCrossOrigin, but that is fixed now, let's see if we can see any result.

Tue, Mar 14, 2:36 PM · WebPageReplay, Performance-Team
Peter updated the task description for T332012: Collect first input delay.
Tue, Mar 14, 2:19 PM · Patch-For-Review, NavigationTiming, Performance-Team
Peter created T332012: Collect first input delay.
Tue, Mar 14, 2:19 PM · Patch-For-Review, NavigationTiming, Performance-Team
Peter created T331994: New type of errors for BitBar tests: "Timed out receiving message from renderer".
Tue, Mar 14, 12:48 PM · Performance-Team, Performance-Device-Lab
Peter closed T327660: Run WebPageReplay tests on Moto G5 at BitBar as Declined.

We got a new Samsung A51 so we can set them up to run two as one. Let's do that instead of spending time testing it out on a Moto G5.

Tue, Mar 14, 12:13 PM · Performance-Device-Lab, Performance-Team
Peter closed T327660: Run WebPageReplay tests on Moto G5 at BitBar, a subtask of T327651: Re-think and change how we run our tests at BitBar, as Declined.
Tue, Mar 14, 12:13 PM · Performance-Device-Lab, Performance-Team
Peter added a comment to T325283: Update dashboards to use Prometheus metrics.

Today we have an issue for users with TTFB and I just verified that we can see the same thing with our Prometheus metrics. Looking at TTFB 75p and comparing with one/two weeks back:

Screenshot 2023-03-14 at 09.05.40.png (862×2 px, 428 KB)

Tue, Mar 14, 8:08 AM · Performance-Team
Peter triaged T331963: Make it possible to run different CPU speed on the mobile phones as Low priority.
Tue, Mar 14, 7:51 AM · Performance-Team, Performance-Device-Lab
Peter created T331963: Make it possible to run different CPU speed on the mobile phones.
Tue, Mar 14, 7:51 AM · Performance-Team, Performance-Device-Lab
Peter added a comment to T331845: Investigate: Chrome 111 seems to increase the performance on LCP and FCP on first view desktop.

First let's have a look at our RUM data for First Contentful Paint for 110 and 111 looking at p75:

Screenshot 2023-03-14 at 08.32.10.png (1×2 px, 440 KB)

Screenshot 2023-03-14 at 08.31.57.png (1×2 px, 413 KB)

Tue, Mar 14, 7:39 AM · WebPageReplay, Performance-Team

Mon, Mar 13

Peter moved T331845: Investigate: Chrome 111 seems to increase the performance on LCP and FCP on first view desktop from Inbox, needs triage to Doing: Prio Interrupt on the Performance-Team board.
Mon, Mar 13, 7:58 PM · WebPageReplay, Performance-Team
Peter added a comment to T330333: Reliable measure how fast a Wikipedia article would be without JavaScript.

Under the key minJS I pushed a test where we add minimal JS. Then under the key JS we test the same URL with the same script, except for the resource loader change, to make our comparison fairer.

Mon, Mar 13, 11:53 AM · Performance-Team
Peter added a comment to T331845: Investigate: Chrome 111 seems to increase the performance on LCP and FCP on first view desktop.

I've pushed the changes so we run Chrome 110 in all our testing and by tonight I'll push 111 and then we can see tomorrow what it looks like.

Mon, Mar 13, 11:38 AM · WebPageReplay, Performance-Team
Peter created T331845: Investigate: Chrome 111 seems to increase the performance on LCP and FCP on first view desktop.
Mon, Mar 13, 7:04 AM · WebPageReplay, Performance-Team

Fri, Mar 10

Peter added a comment to T330333: Reliable measure how fast a Wikipedia article would be without JavaScript.

I pushed tests to the new bare metal server where we use the same code, except that for one we don't change the RLPAGEMODULES. That way we get almost the same TTFB. I'll keep the test running during the weekend and then we can have a look at the differences. It's a little hard to see exactly since we are running "wiki loves".

Fri, Mar 10, 3:04 PM · Performance-Team
Peter added a comment to T330333: Reliable measure how fast a Wikipedia article would be without JavaScript.

I got hack number 3 to work today, but it adds a couple of hundred ms to TTFB, so I need to make another version where we do the exact same thing in a script (like getting the HTML body) but do not change the HTML; hopefully that can give me a good baseline. Also, one problem on the desktop version is that when I removed all JS, it always rendered only with the header (that started to happen with vector-2022), which makes it harder to verify that some metrics stay the same.

Fri, Mar 10, 1:06 PM · Performance-Team

Thu, Mar 9

Peter added a comment to T311983: Setup tests on the new bare metal machine.

I added the same URLs to run with WebPageReplay, then tomorrow I will sync CPU benchmark/TTFB and document it so we can motivate how and why we run with the configuration.

Thu, Mar 9, 11:03 AM · Performance-Team

Wed, Mar 8

Peter added a comment to T311983: Setup tests on the new bare metal machine.

I've set up three URLs that we test once every hour and I'm gonna let that run until tomorrow just to collect some data. Then I'll look at the CPU and TTFB and sync them if they should be changed.

Wed, Mar 8, 1:19 PM · Performance-Team
Peter added a comment to T317887: Upgrade to Grafana 9.

I've gone through all alerts on our side and made sure they do not fire (or at least I think I fixed them all). However, all of them that fired are still stuck in the old state or the "no data" state, even though when running the alerts preview I can see that they get data and are not firing.

Wed, Mar 8, 1:02 PM · SRE Observability (FY2022/2023-Q3), Observability-Metrics
Peter added a comment to T299886: Upgrade to Chrome 97 increased first visual change metrics for synthetic testing.

I've started setting up the bare metal server in T311983 and this bug is problematic (it always has been, but it's even more so now): if we run with default settings we still have the discrepancy between RUM metrics and visual metrics. If I turn off the change (disable-features=PaintHoldingCrossOrigin) it looks like vector-2022 introduced a new way of rendering, where sometimes Chrome chooses to render the search bar first. Compare these two:

Screenshot 2023-03-08 at 13.09.10.png (876×2 px, 447 KB)

Screenshot 2023-03-08 at 13.09.27.png (820×2 px, 631 KB)

Wed, Mar 8, 12:14 PM · Performance-Team (Radar), Upstream, WebPageReplay
Peter added a comment to T311983: Setup tests on the new bare metal machine.

I did some testing: https://s3.amazonaws.com/synthetic-tests-result-wikimedia/firstView/2023-03-08-10-07-52/pages/en_wikipedia_org/wiki/Barack_Obama/metrics.html

Wed, Mar 8, 10:22 AM · Performance-Team
Peter added a comment to T317887: Upgrade to Grafana 9.

I also have problems with other alerts. There were two alerts in https://grafana-rw.wikimedia.org/d/000000318/browsertime-alerts that correctly fired because the limits were hit. I increased the limits and ran the alert queries in the GUI by clicking on the preview button:

Screenshot 2023-03-08 at 09.55.36.png (788×2 px, 162 KB)

Wed, Mar 8, 8:58 AM · SRE Observability (FY2022/2023-Q3), Observability-Metrics
Peter added a comment to T317887: Upgrade to Grafana 9.

I think something else broke with the 9 upgrade with the alerts. I checked one alert and I think somehow the query got corrupted when we converted it:

Wed, Mar 8, 8:29 AM · SRE Observability (FY2022/2023-Q3), Observability-Metrics

Tue, Mar 7

Peter assigned T327246: Record interaction to next paint to larissagaulia.
Tue, Mar 7, 1:38 PM · Performance-Team-onboarding, NavigationTiming, Performance-Team
Peter added a comment to T327246: Record interaction to next paint.

We can follow the pattern on how we implemented First Input delay in https://phabricator.wikimedia.org/T238091

Tue, Mar 7, 1:38 PM · Performance-Team-onboarding, NavigationTiming, Performance-Team

Mon, Mar 6

Peter moved T330333: Reliable measure how fast a Wikipedia article would be without JavaScript from Inbox, needs triage to To-do: Goals, prioritized next 4 Quarters on the Performance-Team board.
Mon, Mar 6, 7:47 PM · Performance-Team
Peter moved T330402: Unify how we run synthetic tests on mobile vs desktop from Inbox, needs triage to To-do: Goals, prioritized next 4 Quarters on the Performance-Team board.
Mon, Mar 6, 7:45 PM · WebPageReplay, Performance-Device-Lab, Performance-Team
Peter moved T331261: Fix CPU benchmark dashboard for multiple datacenters from Inbox, needs triage to Backlog: Maintenance, non-prioritized on the Performance-Team board.
Mon, Mar 6, 7:36 PM · Performance-Team
Peter added a comment to T330333: Reliable measure how fast a Wikipedia article would be without JavaScript.

Great, I will try to add that in the coming weeks. We need to upgrade the version of browsertime/sitespeed.io on BitBar to be able to mock with the content.

Mon, Mar 6, 1:19 PM · Performance-Team
Peter created T331291: New type of error for mobile login performance tests.
Mon, Mar 6, 1:17 PM · Performance-Team, Performance-Device-Lab
Peter added a comment to T264032: Record long tasks in navtiming.

I think we should continue with the long tasks but also know that https://github.com/w3c/longtasks/blob/loaf-explainer/loaf-explainer.md implementation is coming. Firefox has stopped their implementation to wait for what happens with loaf.

Mon, Mar 6, 12:34 PM · MW-1.41-notes (1.41.0-wmf.2; 2023-03-27), Performance-Team-onboarding, NavigationTiming, Performance-Team
Peter closed T331260: Desktop and emulated mobile synthetic tests stopped working 5/3 as Resolved.

We were running out of disk space because some snaps get installed automatically. I've removed as much as possible, and then I'll focus on moving to bare metal so we can get rid of this problem.

Mon, Mar 6, 8:44 AM · Performance-Team
Peter created T331261: Fix CPU benchmark dashboard for multiple datacenters.
Mon, Mar 6, 6:45 AM · Performance-Team
Peter created T331260: Desktop and emulated mobile synthetic tests stopped working 5/3.
Mon, Mar 6, 6:41 AM · Performance-Team

Feb 24 2023

Peter added a comment to T311981: Set up bare metal server at Hetzner for performance tests.

This is done and documented in https://wikitech.wikimedia.org/wiki/Performance/Synthetic_testing/Bare_metal

Feb 24 2023, 9:11 AM · Performance-Team-onboarding, Performance-Team

Feb 23 2023

Peter created T330402: Unify how we run synthetic tests on mobile vs desktop.
Feb 23 2023, 1:59 PM · WebPageReplay, Performance-Device-Lab, Performance-Team

Feb 22 2023

Peter created T330333: Reliable measure how fast a Wikipedia article would be without JavaScript.
Feb 22 2023, 8:25 PM · Performance-Team
Peter added a comment to T317887: Upgrade to Grafana 9.

Hi! I wonder if we need to do something on our side with the alerts in 9? We have a lot of fired alerts with "No data" but I can see that the data is there (or at least in the same interval as before), so I think there could be a difference in how "No data" is handled between the last and the current version, do you know?

Feb 22 2023, 8:19 AM · SRE Observability (FY2022/2023-Q3), Observability-Metrics

Feb 21 2023

Peter added a comment to T328917: Custom created annotations created in the GUI do not show up in the Grafana graph.

Hmm, I added tags when I created the annotation (sorry for not including that in the screenshot). But I see, I tried on another dashboard that does not load any annotations and there it works. There's another bug we have with multiple datasources https://github.com/grafana/grafana/issues/46440 for annotations, maybe it's related.

Feb 21 2023, 10:20 AM · Observability-Metrics
Peter moved T330055: Make it easier to add URLs to the first view tests on BitBar from Backlog: Maintenance, non-prioritized to To-do: Goals, prioritized next 4 Quarters on the Performance-Team board.
Feb 21 2023, 8:41 AM · Performance-Team, Performance-Device-Lab
Peter moved T330055: Make it easier to add URLs to the first view tests on BitBar from Doing: Goals to Backlog: Maintenance, non-prioritized on the Performance-Team board.
Feb 21 2023, 8:40 AM · Performance-Team, Performance-Device-Lab
Peter added a comment to T330055: Make it easier to add URLs to the first view tests on BitBar.

I actually did a quick fix solving the issue for now and then we can put this on the backlog and can take this on later.

Feb 21 2023, 8:40 AM · Performance-Team, Performance-Device-Lab

Feb 20 2023

Peter created T330055: Make it easier to add URLs to the first view tests on BitBar.
Feb 20 2023, 9:06 AM · Performance-Team, Performance-Device-Lab

Feb 9 2023

Peter closed T329255: Tested stopped working wpr-mobile.wmftest.org 2023-02-09 as Resolved.

I killed the container and restarted everything, and after a while the tests started to run again.

Feb 9 2023, 7:32 AM · Performance-Team, WebPageReplay
Peter created T329255: Tested stopped working wpr-mobile.wmftest.org 2023-02-09.
Feb 9 2023, 7:31 AM · Performance-Team, WebPageReplay

Feb 6 2023

Peter added a comment to T327472: BitBar test failed on installation.

Hmm this started to happen again, the error we get is:

Feb 6 2023, 6:46 PM · Performance-Team, Performance-Device-Lab
Peter created T328917: Custom created annotations created in the GUI do not show up in the Grafana graph.
Feb 6 2023, 1:11 PM · Observability-Metrics