Looks like a CentralAuth-related issue - tagging accordingly.
Sat, Dec 2
Fri, Dec 1
Wed, Nov 29
::getTemplateData was called twice, and we already fixed that. With that previous bug fixed, this code is no longer as prominent (previously it was 4 times, with a bigger "All instances" time).
Links to profiler/flamegraphs:
Special:BlankPage - https://performance.wikimedia.org/excimer/profile/dd2f547999078ddb
Featured Article - Interstate_40_in_Tennnessee - https://performance.wikimedia.org/excimer/profile/17c43b402fb432bb
Special:History for featured article - https://performance.wikimedia.org/excimer/profile/e4eb9b4ac30c4654
Special:Login with Excimer only - https://performance.wikimedia.org/excimer/profile/33bd9baa56e1ddbe
Special:Login with xhprof data: https://performance.wikimedia.org/excimer/profile/eeba2c8d7eb63c29
Tue, Nov 28
I did some quick research here, and it looks like the root cause is the uploaded PNG file. The error happens when FormatMetadata::makeFormattedData() tries to process the Flash property from the metadata. It fails because the DB holds an empty string instead of a numeric value, and PHP doesn't know how to perform a bitwise operation between a string and an int.
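To illustrate the failure mode, here is a minimal sketch of the kind of guard that would avoid it. It is written in Python rather than PHP and is not the FormatMetadata code; `decode_flash` is a hypothetical helper, and the bit layout is my reading of the EXIF Flash tag, so treat it as an assumption.

```python
def decode_flash(raw):
    """Decode an EXIF-style Flash bitfield, or return None if it is not numeric.

    Hypothetical helper: the bit layout below is an assumption based on the
    EXIF Flash tag, not taken from MediaWiki.
    """
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # An empty string (as stored in the DB here) ends up in this branch
        # instead of reaching the bitwise operators below.
        return None
    return {
        "fired": bool(value & 0x01),       # bit 0
        "return": (value >> 1) & 0x03,     # bits 1-2
        "mode": (value >> 3) & 0x03,       # bits 3-4
        "no_function": bool(value & 0x20), # bit 5
        "red_eye": bool(value & 0x40),     # bit 6
    }
```

With a guard like this, a caller can skip the Flash row when the result is None instead of failing on the whole metadata table.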
Mon, Nov 27
Profiler files for possible future analysis:
I reviewed multiple flame graphs where the entry point is index.php, and I found one low-hanging fruit, which we already solved and which most likely improved the overall execution time by ~5%. But it's difficult to spot things that could be improved with small fixes. Overall execution is pretty slow because of the number of moving parts in our system; to improve the overall time, we most likely need to simplify things and reduce the number of extensions/hooks we use for common requests.
Wed, Nov 22
A follow-up ticket was created - closing this one. @Krinkle thank you for your help on this one.
Mon, Nov 20
It comes from CentralAuth - I'm going to assign MediaWiki-Platform-Team as we're currently working on this extension.
For length we would have Less_Tree_Expression and Less_Tree_Value at least. Those are the ones that are passed via our testing suite. I applied the change and checked the logic, and it looks OK. I'll push the PR for it.
Mon, Nov 6
Oct 30 2023
Let me move this to "Blocked/waiting" on the Platform Team Board. We need the prod/beta URL template to allow links to generated traces.
Oct 27 2023
I analysed the possibility of dropping the new interface by using specific types instead. I found a couple of places we can easily fix. But I also found some methods whose return type can be any Less_Tree, or one of multiple return types.
Oct 25 2023
Nice @hashar! Thanks for cleaning it up.
Oct 23 2023
@Ladsgroup - I'd like to ask for your help here. We have another ticket where we need to investigate DB deadlocks - those happen from time to time in different parts of MediaWiki. Do you know of any possible solution (maybe pt-deadlock-logger) that could help us debug those problems? In most cases it's not possible to reproduce them locally, and trying to find out what caused them on prod is very tricky and usually fruitless.
Oct 20 2023
@CDanis FYI: when the extension is enabled for the current request - it adds the header X-Wikimedia-Debug. This header value is a list of attributes concatenated with ;.
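For reference, a minimal sketch of how a consumer could split that header value into attributes. It is in Python for illustration only; `parse_debug_header` is a hypothetical helper, and the attribute names in the usage note are made up, not the extension's actual list.

```python
def parse_debug_header(value):
    """Split a ';'-separated header value into a dict of attributes.

    Hypothetical helper: attributes written as key=value keep their value,
    bare attributes are treated as flags and mapped to True.
    """
    attrs = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate empty segments such as a trailing ';'
        key, sep, val = part.partition("=")
        attrs[key.strip()] = val.strip() if sep else True
    return attrs
```

For example, `parse_debug_header("backend=example-host; profile")` would give a dict with `backend` set to `"example-host"` and `profile` set to `True`.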
After a conversation with @CDanis, we clarified that this ticket is about modifying the WikimediaDebug extension to add a new checkbox to trigger the tracing on the edge. I'll update the ticket description and fill in all necessary information.
Oct 19 2023
Oct 16 2023
@Tgr Most likely we can resolve this ticket, as the Gerrit patch is merged. What do you think about the proposed follow-up to add the email_verified field?
Oct 10 2023
I'm trying to find a good specification of what we can expect from the profile information, and it looks like OpenID Connect's StandardClaims section specifies both email and email_verified (https://openid.net/specs/openid-connect-core-1_0.html#StandardClaims).
I tried to find such specifications for OAuth 1.0/OAuth 2.0 too, but I didn't have much luck. I found references to OpenID Connect in "OAuth 2.0 Rich Authorization Requests", which specifies email and email_verified too: https://www.rfc-editor.org/rfc/rfc9396#fig26 (because it follows the OpenID Standard Claim rules).
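As a sketch of how a consumer might use the two claims together (Python, for illustration only; `verified_email` is a hypothetical helper, and treating an absent email_verified as "not verified" is my assumption, not something the spec or this ticket mandates):

```python
def verified_email(claims):
    """Return the email claim only when email_verified is explicitly true.

    Hypothetical helper: a missing or non-boolean email_verified is treated
    as unverified, which is an assumption made for this sketch.
    """
    if claims.get("email_verified") is True:
        return claims.get("email")
    return None
```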
Sep 20 2023
Sep 18 2023
Sep 14 2023
I pushed the proposed change marked as DNM - @Dreamy_Jazz please let me know if this is what we're looking for. I haven't tested it yet, other than just running tests locally.
In c) I meant to skip adding this row entirely, and not add anything else. Yes, sorry, I messed up when copying/pasting. I didn't know whether it is safe to insert data into another table in such a case, as I don't know the big picture of what CheckUser is doing.
I see four possible solutions here:
So after investigating the code: we have a RecentChange entry, which contains information about a change in our system. RecentChange has an attribute rc_logid, which is the log entry associated with the given change. It can be 0 (no log entry set) or a non-zero integer, in which case it's a foreign key to the logging table.
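The rc_logid convention described above can be sketched like this (Python, for illustration only; `log_id_for_change` is a hypothetical helper operating on a plain dict, not CheckUser or RecentChange code):

```python
def log_id_for_change(rc_row):
    """Return the associated logging-table id, or None when rc_logid is 0.

    Hypothetical helper: rc_row is a plain dict standing in for a
    RecentChange row; a missing rc_logid is treated like 0.
    """
    log_id = int(rc_row.get("rc_logid") or 0)
    return log_id if log_id > 0 else None
```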
Sep 13 2023
@CDanis could you elaborate a little bit on this one? What do you mean by "force a distributed tracing"?
@pmiazga I was able to find it (I think) at https://logstash.wikimedia.org/app/dashboards#/doc/logstash-*/logstash-deploy-1-7.0.0-1-2023.09.04?id=xfyrX4oBi6TOjMqmYTZz
Sep 12 2023
I found another log statement from exactly the same line. The previous error comes from fixStuckGlobalRename, but the one I found comes from SpamBlacklist: https://logstash.wikimedia.org/app/dashboards#/view/c7013c90-a487-11ec-be91-b3435f0c0c49?_g=h@53fc073&_a=h@8974712 and was triggered on EditPage.
We went through a set of reviews and I'm expecting this work to be merged and this ticket to be closed this week.
Sep 11 2023
Sep 8 2023
Sep 7 2023
Today on Codemob we decided to move the Telemetry class to the includes/libs folder and make it generic. But when doing that, I found that Telemetry cannot be a global singleton, as it uses the MediaWiki config: https://github.com/wikimedia/mediawiki/blob/5405fdccfcb7fec8c0fd9b9c9c2a73ee18d98127/includes/http/Telemetry.php#L60
Sep 1 2023
I cannot reproduce it anymore, nor do I see any similar logs. This error happened twice on 2023-08-10, after each command:
Aug 31 2023
Sorry for the bit of silence on this ticket. Recently I focused a little bit more on the T344926 issue, which caused us lots of trouble due to layering issues (RESTBagOStuff and MultiHttpClient are part of the general libs, but Telemetry is MediaWiki-specific).
We spoke about those changes today on Codemob and we agreed on:
Aug 30 2023
Unless CDanis says there's urgency, I vote for skipping the temporary fix, as I'm not convinced the temporary fix is as simple as we'd hoped.
It's pretty straightforward: https://gerrit.wikimedia.org/r/953689 but I agree - I don't like it, as it introduces a layering issue - the RESTBagOStuff shouldn't know anything about Telemetry/Tracing - that is the responsibility of the HTTP client (whatever client we use), not the RESTBagOStuff.
After a pairing session with @DAlangi_WMF we decided that the best way to tackle this:
Aug 29 2023
This is caused by MultiHttpClient using raw curl_*() calls instead of HttpRequestFactory to get the HTTP client. On the prod env, sessions use the kask-session object cache, which uses RESTBagOStuff. The RESTBagOStuff could use a specific HTTP client, but we do not specify one via config (https://github.com/wikimedia/operations-mediawiki-config/blob/811b3dad11ea1cb629e06f71d94ab4e7208ccdfd/wmf-config/CommonSettings.php#L596C18-L596C3), so it falls back to the default MultiHttpClient library to send requests (https://github.com/wikimedia/mediawiki/blob/f071c22a9a3e7e399dcf3256c96a952f15291a69/includes/libs/objectcache/RESTBagOStuff.php#L149). I also noticed that it supports only MultiHttpClient, so we cannot pass anything that extends MWHttpRequest, which supports telemetry headers.
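A rough sketch of the shape of the problem, in Python for illustration only. None of these classes exist in MediaWiki: `PlainClient`, `TracingClient` and `RestStore` are hypothetical stand-ins for the default client, a telemetry-aware client and the REST-backed store, and the header name in the usage note is illustrative.

```python
class PlainClient:
    """Stand-in for a default client that sends requests without tracing headers."""

    def send(self, url, headers=None):
        # Return what would be sent, so the effect is easy to inspect.
        return {"url": url, "headers": dict(headers or {})}


class TracingClient(PlainClient):
    """Stand-in for a client that adds telemetry headers to every request."""

    def __init__(self, trace_headers):
        self.trace_headers = dict(trace_headers)

    def send(self, url, headers=None):
        merged = dict(headers or {})
        merged.update(self.trace_headers)
        return super().send(url, merged)


class RestStore:
    """Stand-in for a REST-backed store that accepts an optional client."""

    def __init__(self, base_url, client=None):
        self.base_url = base_url
        # With no client configured, fall back to the plain default,
        # which is where the tracing headers get lost.
        self.client = client if client is not None else PlainClient()

    def get(self, key):
        return self.client.send(self.base_url + key)
```

In this sketch, `RestStore("https://store.example/").get("k")` carries no tracing headers, while passing `client=TracingClient({"X-Request-Id": "abc"})` adds them without the store knowing anything about telemetry.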
Aug 28 2023
T344926 is a standalone ticket as it's an existing issue. From a quick check, sessions can use RESTBagOStuff to store/retrieve session data. When setting up a RESTBagOStuff we can specify a custom HTTP client, or it will use MultiHttpClient by default.
Aug 22 2023
For quick hacking I'm going to use the OpenTelemetry monorepo - https://github.com/open-telemetry/opentelemetry-php as the one from mszabo (https://github.com/mszabo-wikia/opentelemetry-php) seems a bit outdated.
Aug 21 2023
@Dreamy_Jazz - I quickly checked this code and I found that this is caused by this line