User Details
- User Since
- Jan 6 2020, 12:19 PM (237 w, 4 d)
- Availability
- Available
- LDAP User
- Unknown
- MediaWiki User
- HNowlan (WMF)
Wed, Jul 10
This appears to have been an issue with a previous version of librsvg. I've purged the caches for each of the affected images and they now render correctly. If you notice any other similarly incorrect images, purging the cache will hopefully address it.
Tue, Jul 9
kubernetes* and mw* are ready
Mon, Jul 8
Confirming that this is an issue: I'm seeing the same behaviour with a clean cache and locally with rsvg-convert. This is either an issue with the file or a new issue with rsvg-convert.
Are there plans on when and how we need to move this forward to production?
Sat, Jul 6
It seems a bad frontend server was the source of these errors, and a rolling restart appears to have addressed this. We'll follow up during the week to see if there are any obvious causes.
Fri, Jun 28
The thumbor-side issues were a side-effect of the upgrade that our tests didn't catch due to differences in config between prod and test - I see the supplied image rendering now (I've purged the cache for that image to be sure). However, I am still seeing the Undefined error, which I suspect is either a long-tail side-effect of the Thumbor issue, or something unrelated.
Thu, Jun 27
I'm sure there'll be some tweaks further down the road, but this deployment has been created. Tracking further work in T356241
Jun 26 2024
I'm sure the developers would be best positioned to say whether anything has changed, but as far as the AQS services themselves are concerned it doesn't seem like there have been any significant increases in latency: service-level view, REST gateway level view (easiest to read if you filter out the proton metrics)
Jun 24 2024
Log level reduced to stop the bleeding:
Jun 21 2024
Jun 13 2024
I believe the straggling traffic here is a misnomer/a graph misunderstanding - the API gateway's envoy config refers to traffic to the mediawiki API as "mwapi_cluster" internally before and after the hostname was changed. I believe these requests are normal and are being routed to k8s already - drop mwapi_cluster from the graph and we're at zero! 🎉
I don't think this will be needed.
Jun 12 2024
This change now uses librsvg's accept-language flag to obey `lang`.
Jun 11 2024
Jun 10 2024
Thin lines are now rendering in the test cases given
Considering this resolved as part of T355020. Please reopen if that's incorrect
With librsvg 2.54.7 in place, I'm not seeing this behaviour as fixed on-wiki.
Is this still an issue? As far as I can tell we have installed all Bengali fonts currently available in Debian (along with the Noto fonts from T184664), but SiyamRupali is not one of them. If there's an example of a broken SVG, it would help to debug this further.
The generated image post-purge/post-upgrade appears to render correctly for me. Was this fixed in the last upgrade?
Jun 7 2024
This will be done with T355020
I agree this isn't necessarily a software issue. That said, since the upgrades, the 180px version of this image is no longer cut off.
https://gerrit.wikimedia.org/r/1039778 is now ready for review. In future we might need to revisit how we do our SSIM comparisons as regards reference thumbnails. As our tools change, the distances etc are diverging and there will come a point where tweaking the test values will become a problem.
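To illustrate why fixed SSIM test values drift as the rendering tools change, here is a minimal sketch of a single-window (global) SSIM score compared against a threshold. The function name `global_ssim`, the flat-list image representation and the threshold idea are assumptions made for illustration; the actual test suite may use a windowed SSIM from an imaging library with different constants.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM over two equally sized grayscale images.

    Images are flat sequences of pixel intensities. This is a simplified
    form of SSIM (no sliding window), intended only as an illustration.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and the same size")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) * (a - mean_x) for a in x) / n
    var_y = sum((b - mean_y) * (b - mean_y) for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Standard stabilising constants from the SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x * mean_x + mean_y * mean_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical reference thumbnail and a slightly brighter new render.
reference = [0, 50, 100, 150, 200, 250]
rendered = [v + 5 for v in reference]
score = global_ssim(reference, rendered)
print(score)  # slightly below 1.0; a test passes only if this clears its threshold
```

Each renderer upgrade can shift the score a little against an old reference thumbnail, so a threshold tuned for one version may need loosening for the next. Regenerating the references would avoid that.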
Jun 6 2024
Appears resolved.
Appears resolved by librsvg upgrade
Resolving for now, following up in related issues.
This appears fixed
Appears fixed
Has this issue been fixed by the upgrade and the improved text-anchor behaviours?
I believe that, after purging, SVGs are now obeying the font list. Not closing until this is confirmed.
This appears to be resolved.
This specific issue appears resolved - I see some other unresolved issues in SVG_Test_TextAlign.svg but they are not pertinent to this issue.
This appears resolved by T265549
This appears to be fixed by T265549
Solved as of T265549
I believe this is solved by T265549
Since upgrading to 2.50.3 and doing some purges, I am seeing *some* improvement in the errors in this ticket but not in all cases.
We are using Thumbor on bullseye everywhere which means that SVGs will be rendered by 2.50.3. Keeping this task open for tracking issues for the moment.
May 30 2024
I am now seeing results when using queries with urlencoded characters. Unfortunately we will need to add a manual hack if there are other non-alphanumeric chars in other parts of the URL in future, but for now I think this works:
May 28 2024
It seems Envoy only normalises a subset of urlencoded characters:
hnowlan@plunkett ~/Code/deployment-charts (hnowlan/T365439-apigw_normalise_path_urls *) $ curl -s localhost:8087/core/v1/wikisource/a/%3A | grep original-path
"x-envoy-original-path": "/core/v1/wikisource/a/%3A"
hnowlan@plunkett ~/Code/deployment-charts (hnowlan/T365439-apigw_normalise_path_urls *) $ curl -s localhost:8087/core/v1/wikisource/a/%31 | grep original-path
"x-envoy-original-path": "/core/v1/wikisource/a/1"
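The two results above are consistent with RFC 3986 style normalisation, where only percent-escapes for unreserved characters (letters, digits, `-`, `.`, `_`, `~`) are decoded and escapes for reserved characters such as `:` are left alone. The sketch below reproduces that rule in Python. The helper `normalise_unreserved` is hypothetical, and the assumption that Envoy applies exactly this rule rests only on the two curl outputs shown.

```python
import re

# Unreserved characters per RFC 3986 section 2.3.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

PERCENT_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def normalise_unreserved(path: str) -> str:
    """Decode percent-escapes only when they encode an unreserved character."""
    def replace(match):
        char = chr(int(match.group(1), 16))
        # Reserved or other characters keep their original escape.
        return char if char in UNRESERVED else match.group(0)
    return PERCENT_ESCAPE.sub(replace, path)


print(normalise_unreserved("/core/v1/wikisource/a/%31"))  # /core/v1/wikisource/a/1
print(normalise_unreserved("/core/v1/wikisource/a/%3A"))  # /core/v1/wikisource/a/%3A
```

Under this rule a route that matches on a literal `:` would not match a request carrying `%3A`, which fits with the need for a manual workaround.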
May 27 2024
The normalisation change has unfortunately not fixed this issue - docs indicate that it should have but I suspect this is something to do with the use of regex matching as opposed to static matching. I'll try to come up with a workaround for the short term
May 23 2024
Certs updated in all DCs, alerts resolved. I sincerely hope we will have the mesh migration resolved so we can avoid having to update echostore's certificates in October, but in case something prevents that, and for reference, the process was:
- puppet cert revoke sessionstore.discovery.wmnet
- In your local checkout of the puppet repo, run ./utils/create_ecdsa_cert sessionstore.discovery.wmnet sessionstore.svc.eqiad.wmnet sessionstore.svc.codfw.wmnet
- On the puppetmaster, put the contents of /var/lib/puppet/server/ssl/ca/signed/sessionstore.discovery.wmnet.pem into certs.kask.cert in helmfile.d
- Add the contents of the new private key from ./modules/secret/secrets/ssl/sessionstore.discovery.wmnet.key to hieradata/role/common/deployment_server/kubernetes.yaml
- Validate the files and make sure everything looks okay using openssl ec/openssl x509, then git commit your changes in private
- Follow the Helm rollout process as normal, keeping an eye on the sessionstore graphs and the session loss graphs
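The validation step in the list above could be scripted roughly as follows. This is a sketch under assumptions: the helper names and file paths are hypothetical, it assumes an `openssl` binary on the PATH, and comparing the PEM public keys from `openssl x509 -pubkey` and `openssl ec -pubout` is one reasonable way to check that a certificate and key belong together, not necessarily the check that was actually performed.

```python
import subprocess


def validation_commands(cert_path: str, key_path: str) -> dict:
    """Build the openssl invocations used to inspect a cert and its EC key."""
    return {
        # Human-readable subject and expiry date of the certificate.
        "cert_details": ["openssl", "x509", "-in", cert_path,
                         "-noout", "-subject", "-enddate"],
        # Public key embedded in the certificate, as PEM.
        "cert_pubkey": ["openssl", "x509", "-in", cert_path,
                        "-noout", "-pubkey"],
        # Public key derived from the EC private key, as PEM.
        "key_pubkey": ["openssl", "ec", "-in", key_path, "-pubout"],
    }


def cert_matches_key(cert_path: str, key_path: str) -> bool:
    """Return True if the certificate's public key matches the private key."""
    commands = validation_commands(cert_path, key_path)
    cert_pub = subprocess.run(commands["cert_pubkey"], check=True,
                              capture_output=True, text=True).stdout
    key_pub = subprocess.run(commands["key_pubkey"], check=True,
                             capture_output=True, text=True).stdout
    return cert_pub.strip() == key_pub.strip()


# Hypothetical usage with local copies of the files:
# cert_matches_key("sessionstore.discovery.wmnet.pem",
#                  "sessionstore.discovery.wmnet.key")
```

A mismatch here would mean the cert in helmfile.d and the key in the private hieradata come from different generations, which is worth catching before the rollout.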
May 22 2024
Current situation - I have refreshed the .key file on the puppet master using a modified version of the create_ecdsa_cert script, and I have pushed the new key to the staging k8s secrets for sessionstore only. I've also updated the cert file in the helmfile configuration for sessionstore, but it hasn't picked it up because that part of the config wasn't checksummed to recreate the pods (fixed in this change). Tomorrow I will try to roll to codfw and, if successful, eqiad.
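The reason the pods did not pick up the new cert can be sketched as follows. Helm charts commonly put a hash of the rendered configuration into a pod template annotation, so that a config change alters the template and triggers a rollout. If the cert is not part of the hashed input, the annotation stays the same and nothing restarts. The function and variable names below are hypothetical and the chart's actual template is not shown here.

```python
import hashlib


def config_checksum(*parts: str) -> str:
    """SHA-256 over the config pieces that should trigger a pod restart."""
    digest = hashlib.sha256()
    for part in parts:
        digest.update(part.encode("utf-8"))
        digest.update(b"\0")  # separator so adjacent parts cannot blur together
    return digest.hexdigest()


app_config = "listen: 8081\n"          # hypothetical service config
old_cert = "-----OLD CERT-----"        # placeholder, not a real certificate
new_cert = "-----NEW CERT-----"        # placeholder, not a real certificate

# Cert left out of the checksum: the annotation is identical before and
# after the cert change, so the pods are not recreated.
print(config_checksum(app_config) == config_checksum(app_config))  # True

# Cert included in the checksum: the annotation changes with the cert,
# so the pods are recreated on the next deploy.
print(config_checksum(app_config, old_cert) == config_checksum(app_config, new_cert))  # False
```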
All codfw wikikube-ctrl nodes are operational
May 21 2024
I suspect the fix for this is a relatively small change on the API gateway, but the change is a global one so I will need to take some time to test this, even if the impact is to make things standards-compliant. Hoping to get to it later this week