I work on the MediaWiki Security Team.
Wed, Oct 18
I feel like the intent of the code of conduct is that, in the event of a conflict, it is considered superior to any local policies.
Could we maybe just have a new separate view that contains user_id, user_name and user_is_emailable, where user_is_emailable is a boolean field that checks that disablemail is not on and the user's email is authenticated?
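A minimal sketch of that view, using SQLite and a made-up toy schema (the real MediaWiki tables differ: "disablemail" is a user preference rather than a column, and email authentication is stored as a timestamp), purely to illustrate the shape of the proposed `user_is_emailable` boolean:

```python
import sqlite3

# Toy schema loosely modelled on the proposal above; treat as a sketch only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (
    user_id INTEGER PRIMARY KEY,
    user_name TEXT,
    user_email_authenticated INTEGER,  -- 1 if the address is confirmed
    user_disable_email INTEGER         -- 1 if the user opted out of email
);

-- The proposed view: expose a single boolean that callers can rely on.
CREATE VIEW user_emailable AS
SELECT user_id,
       user_name,
       (user_email_authenticated = 1 AND user_disable_email = 0)
           AS user_is_emailable
FROM user;
""")

conn.executemany(
    "INSERT INTO user VALUES (?, ?, ?, ?)",
    [
        (1, "Alice", 1, 0),  # confirmed, mail enabled -> emailable
        (2, "Bob", 0, 0),    # never confirmed their address
        (3, "Carol", 1, 1),  # confirmed, but opted out of email
    ],
)

rows = conn.execute(
    "SELECT user_name, user_is_emailable FROM user_emailable ORDER BY user_id"
).fetchall()
print(rows)  # [('Alice', 1), ('Bob', 0), ('Carol', 0)]
```

The point of the view is that consumers never see the two underlying flags, so they cannot get the combination wrong.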
Umm, which page creation log specifically do you mean? Can you give a URL to where the log used to be?
Mon, Oct 16
Checked in betalabs (big watchlist (>700 pages) and very few updates happening) and production (few pages on the Watchlist and big volume of updates) - the new query works fast in both cases.
But yes, the best answer is to start using the new servers :-)
Sat, Oct 14
Ok, so suppress, spamblacklist and titleblacklist must not be in the view.
Fri, Oct 13
Btw, it seems the root cause was two things:
- straight_join preventing efficient queries in the case where the associated page has only a few links (i.e. not the giant-category use case)
- the query killer not recognizing union queries.
MediaWiki already has a lot of internal support for RSS (for history pages, watchlist, recentchanges), so I doubt it would be super onerous.
I think the idea is that people with complex needs should be encouraged to use ForeignDBRepoViaLB instead.
This probably should be using the tmp_3 index (which is on production but not in MediaWiki - it's on (rc_timestamp, rc_namespace)), but according to the EXPLAIN on production it's using the rc_timestamp index. [My quick test showed 3.48 seconds with FORCE INDEX for tmp_3 vs 42.65 for the query as is. I ran the faster query first, so any effects of a warm cache would favour the slow one.]
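As a rough illustration of forcing a composite index like tmp_3, here is an SQLite stand-in (production is MariaDB, where the hint is `FORCE INDEX`; SQLite's closest analogue is `INDEXED BY`). The schema is reduced to the two indexed columns and is not the real recentchanges table:

```python
import sqlite3

# Simplified stand-in for recentchanges, with the tmp_3-style index.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE recentchanges (
    rc_id INTEGER PRIMARY KEY,
    rc_timestamp TEXT,
    rc_namespace INTEGER
)""")
conn.execute(
    "CREATE INDEX tmp_3 ON recentchanges (rc_timestamp, rc_namespace)"
)

# SQLite's INDEXED BY plays the role of MariaDB's FORCE INDEX here:
# the planner must use tmp_3 (or error out if it cannot).
plan = conn.execute("""
EXPLAIN QUERY PLAN
SELECT rc_id FROM recentchanges INDEXED BY tmp_3
WHERE rc_timestamp > '20171001000000' AND rc_namespace = 0
""").fetchall()
print(plan)
```

The plan output should show a search using tmp_3, mirroring the hand-forced plan that ran in 3.48 seconds on production.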
It's impossible to optimize this query further unless we denormalize namespace into the revision table (which is unlikely to happen).
It would be even better if we could also have a list of open changesets of newcomers submitted during this quarter. Real-time might be too much, but perhaps renewed each month? With the same goal: identify which projects and reviewers could help, and keep newcomers engaged and learning.
One could fill a book with the areas of wikimedia that nobody or everybody jointly (which is the same as nobody) is responsible for...
What's the point of keeping newcomers engaged if they become totally ignored the minute they hit some sort of no-longer-a-newcomer class?
assuming that person was on IRC at the time :-)
There have also been discussions on Operations about whether we should alert if MediaWiki goes into read-only mode, but this can cause so many false positives that it might be paging every hour :-)
In theory, most of the time phan will already enforce type hints from the comment block, so I'm not sure it would change much in practice in terms of code reliability (unless we also got better at declaring more specific types).
Usually these filters are pretty selective, but it's possible for them to not be. Thankfully, the kinds of rows that bloat the RC table (Wikidata and categorization) are also among the kinds of rows that don't have ORES scores associated with them. So for wikis with bloated RC tables, the filter will be selective, and typical selections for these filters ("only show bad edits") are quite selective too (<10% on enwiki IIRC). However, the user could choose the reverse filter ("only show good edits") and that wouldn't be very selective.
@awight Btw, some debugging tips for next time:
So I guess part of the problem with the alerting is that it's looking at individual lag, whereas the real issue (MediaWiki going into read-only mode) comes up when all the slaves are lagged by more than 6 seconds - and as it stands, MediaWiki doesn't even create a log entry for that situation.
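The distinction can be sketched as a tiny predicate: per-host alerting fires on any lagged replica (noisy), while the condition that actually pushes MediaWiki into read-only mode is all replicas lagging past the threshold. This is a hypothetical illustration, not an existing alerting rule:

```python
# Threshold from the discussion above: read-only mode kicks in when
# every replica is lagged by more than 6 seconds.
READ_ONLY_LAG_THRESHOLD = 6  # seconds

def should_page(replica_lags, threshold=READ_ONLY_LAG_THRESHOLD):
    """Page only when *every* replica is past the threshold."""
    return bool(replica_lags) and all(lag > threshold for lag in replica_lags)

# One slow replica: per-host alerting would fire, this does not.
assert should_page([0.3, 1.2, 9.0]) is False
# Every replica lagged: MediaWiki would go read-only, so page.
assert should_page([7.5, 8.1, 9.0]) is True
```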
Thu, Oct 12
It will still probably be better than the old query for people with large but not insanely sized watchlists; the savings just won't be as spectacular for big watchlists.
Hi, there have been reports of high amounts of lag on Commons causing read-only mode, especially between 7:10-10:10 UTC today. (I filed T178094.) Perhaps the rate of deletion of Commons RC entries is too aggressive?
Oh, well in that case, if a particular filter is very selective it may make sense to allow query plans where the base table is ores_classification, with some sort of index like (oresc_model, oresc_class, oresc_probability). If we can narrow down the range of revisions (for options of the form "only show the last three days"), putting oresc_rev at the end of that index may make sense too, for index condition pushdown. This of course only makes sense if the filter is very selective, as the required filesort has a high overhead.
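To show the shape of that index, here is an SQLite sketch (production is MariaDB; the schema below is simplified, and the index name is made up). A "only show bad edits" filter is equality on model and class plus a range on probability, which is exactly the prefix this composite index serves:

```python
import sqlite3

# Simplified stand-in for the ORES extension's ores_classification table.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE ores_classification (
    oresc_rev INTEGER,
    oresc_model INTEGER,
    oresc_class INTEGER,
    oresc_probability REAL
)""")
conn.execute("""
CREATE INDEX oresc_model_class_prob
ON ores_classification (oresc_model, oresc_class, oresc_probability)
""")

# Equality on the first two columns, range on the third; INDEXED BY
# stands in for the "allow this plan" hinting discussed above.
plan = conn.execute("""
EXPLAIN QUERY PLAN
SELECT oresc_rev
FROM ores_classification INDEXED BY oresc_model_class_prob
WHERE oresc_model = 1 AND oresc_class = 1 AND oresc_probability > 0.9
""").fetchall()
print(plan)
```

If the filter is selective, this index search touches only the matching rows; if the user flips to the unselective "good edits" side, the same plan degrades badly, which is the caveat above.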
I think it might be cool to have an extension that provides ?uselang=qqs.
Reverting my tag change - wrong bug.
I wonder if it's related to deleting all the Wikibase stuff in rc(?)
Well, in general this sounds nice, but there is a potential security issue here in the event that a content handler returns an unsafe MIME type that could execute scripts - e.g. returning text/html or image/svg+xml would be bad. Some browsers do a lot of sniffing on text/plain (not sure how much that is still the case). There is also the potential for a format to be downloaded and executed unsafely (e.g. text/csv interpreted by Excel), which may or may not be an issue. Even if not executed, it could maybe cause a network request to be issued to a third-party server, which may be unacceptable privacy-wise.
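A hypothetical denylist check illustrating the concern: the function and the type list below are examples for discussion, not MediaWiki's actual policy or code.

```python
# MIME types browsers may execute in-origin or aggressively content-sniff.
# Example list only; a real policy would be more thorough.
SCRIPTABLE_OR_SNIFFED = {
    "text/html",
    "application/xhtml+xml",
    "image/svg+xml",
    "text/plain",  # some browsers sniff text/plain into something active
}

def is_safe_download_mime(mime: str) -> bool:
    """Reject MIME types a browser might render or execute in-origin."""
    base_type = mime.lower().split(";")[0].strip()  # drop e.g. charset param
    return base_type not in SCRIPTABLE_OR_SNIFFED

assert not is_safe_download_mime("text/html; charset=utf-8")
assert not is_safe_download_mime("image/svg+xml")
assert is_safe_download_mime("application/json")
```

Note a denylist like this does nothing for the Excel/CSV case, since text/csv is inert in the browser; that class of risk needs a different mitigation (e.g. forcing a download disposition).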
I don't think that index exists in production (not sure; I don't remember seeing it when I was looking about an hour ago).
Wed, Oct 11
Copying my comment from T171027#3677195 over here, as I commented on the wrong bug and it's good to keep things together for reference:
I'd be happy to see a volunteer take this on, though I note that they'd need some support from us to get the code reviewed, and a follow-up performance review undertaken to see whether it would be OK.
Well, Wikidata is almost certainly a contributing factor (particularly for Russian), but I would hesitate to blame it solely for slow ORES-on-watchlist speeds without more evidence - especially on English Wikipedia, with its large recentchanges table and relatively low usage of Wikidata.
Correction: the response is cached according to the maxage specified in the request; however, I’m not sure if this works if pages are edited or purged (as far as I can tell, the browser doesn’t validate its cached data), so it can be hard to choose the right maxage. (This workaround also requires that you parse and transform the canonical URI, which is ugly.)
My vote is redirect to #wikimedia-dev
Dropping irc channels is not really an operations thing. The channel founder would be the one who'd have to drop it.
I personally find real time code reviews useful in some circumstances:
- if there is something really complicated or tricky happening, going over it in IRC can help to make sure all the cases are covered (not really relevant to this ticket, since that is rare and generally applies to experienced contributors who are easy to contact over IRC)
- for very new contributors, where we are not really reviewing the patch so much as advising people how to contribute - real time communication can be very helpful.
- if there are problems with the patch, real time means a very small feedback loop. An issue is identified and fixed immediately, instead of a problem being noted, then fixed a week later, then looked at again 2 weeks later, by which time the reviewer has totally forgotten the context/doesn't care anymore. (Part of this is a cultural issue where reviewers are discouraged from "helping" get a patch to a ready state, because they then no longer qualify as a neutral reviewer.)
- [I suspect] having code review office hours means time gets specifically set aside to do code review. While this is a poor substitute for actually setting time aside, I suspect it's better than the status quo.
Tue, Oct 10
From a security perspective, the main risk (that a per-user off switch could address) is that these features allow making an XSS "permanent" (excluding the risk of a malicious admin). An option by itself wouldn't prevent that, as a malicious person could just turn it back on. We would need to make the user do something like re-enter their password before re-enabling it.
Do you have an implementation approach in mind - i.e. how you would solve this problem? Could you include that in the proposal?
Sat, Oct 7
Wed, Oct 4
I'm curious if this would be better served by $wgEnableScaryTranscluding ?
Tue, Oct 3
Mon, Oct 2
Fri, Sep 29
All the eval()s in the call stack of the first error are kind of weird, though. I wonder if corrupted ResourceLoaderStorage is a possibility. But that wouldn't make sense with the error being intermittent.
Realistically, this is probably the fault of those gadgets in question, and bugs should be filed with their maintainer.
Thu, Sep 28
@TBolliger Is this something your team would be interested in taking on?
Assuming this works, special pages on Commons should start updating again on Oct 5.
Wed, Sep 27
With that in mind, I'm going to close this bug. Having revdel kill all the logs automatically sounds like a very reasonable feature request, but it's not a newsletter issue.
Tue, Sep 26
Also happening for me on T176055
At the very least, I expect people will want it on their user pages, because users seem to like making overly ornate user pages. I could also imagine people using it in the Portal namespace.
IMO the RESTBase part doesn't require a security review: it's basically just a web proxy that converts between slightly different URL styles.
Mon, Sep 25
Review of a1b8a6cc70e. (Just the ReadingList extension. I haven't reviewed anything RESTBase related or client side related)
Sun, Sep 24
We should log history of users' logins.
Log the user agent string, time, and IP of successful logins on ALL wikis.
Make it viewable in the account info on each wiki.
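A minimal sketch of such an audit record, with an entirely made-up table and column naming (this is not an existing MediaWiki schema):

```python
import sqlite3
import time

# Hypothetical per-wiki login audit table.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE login_history (
    lh_user_id INTEGER,
    lh_wiki TEXT,
    lh_timestamp INTEGER,
    lh_ip TEXT,
    lh_user_agent TEXT
)""")

def record_login(user_id, wiki, ip, user_agent, now=None):
    """Append one successful-login record."""
    ts = now if now is not None else int(time.time())
    conn.execute(
        "INSERT INTO login_history VALUES (?, ?, ?, ?, ?)",
        (user_id, wiki, ts, ip, user_agent),
    )

record_login(42, "enwiki", "198.51.100.7", "Mozilla/5.0", now=1700000000)

# "Viewable in account info": fetch one user's logins.
rows = conn.execute(
    "SELECT lh_wiki, lh_ip FROM login_history WHERE lh_user_id = ?", (42,)
).fetchall()
print(rows)  # [('enwiki', '198.51.100.7')]
```

Keeping the log append-only and retention-limited would matter in practice, since IPs and user agents are sensitive data.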
Sat, Sep 23
They're not fully redundant, since rc_type for Wikidata is RC_EXTERNAL (from core, thus not Wikidata-specific).