Wed, Jun 20
@Smalyshev do you have any thoughts on the idea of reading EG(current_execute_data) from a signal handler as a profiling method?
Tue, Jun 19
The reason wmerrors is long and complicated is that PHP handled request timeout errors by trying to write out the error message from a signal handler. This was extremely unsafe and frequently crashed or deadlocked. In PHP 7.1, this was fixed: the engine now calls only signal-safe functions from the signal handler, setting a flag in the VM for graceful exit and arming a second, "hard" timeout to abort the process if the VM doesn't terminate quickly enough.
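As a userland analogy of that flag-based pattern (not the engine's actual C implementation; doWork() is a hypothetical unit of work), the shape is roughly:

```php
<?php
// Analogy only: the engine does this in C inside the VM, but the shape is
// the same. The signal handler does nothing unsafe; it only sets a flag
// which the main loop checks between units of work.
$timedOut = false;

pcntl_async_signals( true ); // PHP 7.1+: deliver signals without tick overhead

pcntl_signal( SIGALRM, function () use ( &$timedOut ) {
	$timedOut = true; // async-signal-safe: just record that the timeout fired
} );

pcntl_alarm( 5 ); // the "soft" timeout; a second, harder timeout would abort the process

while ( !$timedOut ) {
	doWork(); // hypothetical unit of work
}

echo "Timed out, exiting gracefully\n";
```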
Mon, Jun 18
@mmodell enabled auth.require-approval two days ago, presumably due to T162026, a general task discussing Phabricator vandalism that was escalated to UBN three days ago because of a vandalism spree.
Thu, Jun 14
Claiming this for MWPT at least for some infrastructure work.
T133452 can probably be the other task.
For privacy reasons, there is substantial interest within the WMF in attributing anonymous edits to a session identifier rather than to an IP address. I'm going to write up a separate task for that. This task as described, with attribution to the first IP address of the session, probably won't happen, at least not on WMF wikis.
Tue, Jun 12
The GitHub API apparently works just as well without the extension:
I don't understand how we can implement the task as described. It's intentional that write-heavy maintenance scripts go at the speed of the slowest slave. If you only wait for a majority, then up to half of the slaves could be permanently lagged, potentially by days or weeks.
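For context, the standard pattern in a write-heavy maintenance script is to wait for all replicas between batches; a minimal sketch in a MediaWiki maintenance context (the table and fields are hypothetical):

```php
// Minimal sketch of the usual batching pattern: do a batch of writes on the
// master, then block until every replica has caught up before continuing.
$dbw = wfGetDB( DB_MASTER );
$batchSize = 100;

do {
	$ids = $dbw->selectFieldValues(
		'some_table',                 // hypothetical table
		'st_id',
		[ 'st_needs_fixup' => 1 ],
		__METHOD__,
		[ 'LIMIT' => $batchSize ]
	);
	if ( $ids ) {
		$dbw->update( 'some_table',
			[ 'st_needs_fixup' => 0 ],
			[ 'st_id' => $ids ],
			__METHOD__
		);
	}
	// This waits for the slowest replica -- the behaviour the task proposes to relax.
	wfWaitForSlaves();
} while ( count( $ids ) === $batchSize );
```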
Mon, Jun 11
Sun, Jun 10
Sat, Jun 9
Wed, Jun 6
Special:UserLogin starts a session on a GET request so that it can implement CSRF protection on the login form. And that's a localised page name, so it's not easy to filter in VCL unless we change the URL in MediaWiki to something more predictable. Not sure if there are other cases; we'd need some sort of audit. If session access is very rare in the secondary DC, could we just tunnel session access to the primary DC instead of replicating? GET requests causing session creation would be slightly delayed; the user would then get their session cookie and be directed to the primary for subsequent requests.
Tue, Jun 5
To recap, SQLite 3.8+ generates an error when a read transaction is upgraded to a write transaction while another thread holds the write lock. The solution is to use BEGIN IMMEDIATE instead of BEGIN, forcing the transaction to be a write transaction from the start. As a quick hack, a constructor parameter trxMode was added in 22508bec75fb05546520398ac0a195c7c79ef87e, which can be used in CI to force all transactions to be write transactions, at the expense of performance: the transactions would all be serialized instead of allowing parallel reads.
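For reference, a minimal sketch of applying the hack when constructing the SQLite connection (the dbDirectory path and dbname are hypothetical; trxMode is the parameter added in the commit above):

```php
use Wikimedia\Rdbms\Database;

// Sketch: open every transaction with BEGIN IMMEDIATE so it takes the write
// lock up front instead of failing on a later read-to-write upgrade.
$db = Database::factory( 'sqlite', [
	'dbDirectory' => '/tmp/sqlite-test', // hypothetical path
	'dbname' => 'testwiki',
	'trxMode' => 'IMMEDIATE',
] );

$db->begin( __METHOD__ );  // now issues BEGIN IMMEDIATE
// ... reads and writes ...
$db->commit( __METHOD__ );
```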
Fri, Jun 1
Thu, May 31
PHP extensions should absolutely not be GPL. The PHP group refuses to distribute them due to incompatibility with the PHP license, and the resulting binaries cannot legally be distributed under any license. MIT is acceptable and often used in this ecosystem.
May 19 2018
Currently, ChronologyProtector times out after 10 seconds; this is configurable. A timeout causes "lagged replica mode" to be set, and Skin::lastModified() responds to this mode by putting a warning in the page footer. Maybe OutputPage::output() would be a good place to send a lagged replica mode response header. Note that lagged replica mode will also be set by LoadBalancer::getReaderIndex() if the lag exceeds the "max lag" parameter, currently 6 seconds in db-eqiad.php. Is it acceptable to send a retry hint to Varnish in either case?
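As a rough illustration of what that could look like in OutputPage::output() (the header name is made up, and the exact LBFactory accessor is an assumption, not a settled design):

```php
// Hypothetical sketch inside OutputPage::output(): if a lagged replica was
// used (ChronologyProtector timed out, or max lag was exceeded), expose that
// to Varnish via a response header it could treat as a retry hint.
$lbFactory = \MediaWiki\MediaWikiServices::getInstance()->getDBLoadBalancerFactory();
if ( $lbFactory->laggedReplicaUsed() ) {
	$this->getRequest()->response()->header( 'X-Database-Lagged: true' ); // header name is illustrative
}
```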
I explained ChronologyProtector to @Joe and @BBlack just now. They seemed happy with the idea of not sending a useDC cookie for now, and just relying on ChronologyProtector instead to ensure that the user is presented with an up-to-date view of the site. So the current idea would be to have routing based on HTTP verb, with two exceptions:
- Article 6: Is IP address attribution for anons "necessary"?
- Article 15: Right of access by the data subject. Do we need to e.g. give people access to their own checkuser data, and is it a security problem to make that accessible without a human in the loop?
- Article 17: Right to erasure. How do we respond to requests for erasure of a username or anon IP? Do we need to provide a way to fully erase content via the web UI?
- Article 20: Data portability. A personal view of "right to fork". Any additional tools needed? Preferences, watchlist export/import? Contributions mode of Special:Export? Uploaded files?
- Article 34: Communication of a personal data breach to the data subject. Currently there is no mass mailing extension. For WMF board elections we write a custom maintenance script each time.
May 18 2018
May 15 2018
If you want this to be an RFC then you need to tag it with TechCom-RFC. But I'm not sure if it's ready for public discussion yet, since it's mostly just a problem statement.
May 9 2018
Filed upstream proposal at https://github.com/php-memcached-dev/php-memcached/issues/395
Apr 18 2018
Apr 12 2018
Mar 20 2018
I took a stab at doing this myself. I added an email address and password to the LDAP user using the tools described at https://wikitech.wikimedia.org/wiki/LDAP, tested logging in as "simetrical" to wikitech.wikimedia.org, and then sent a password reminder email. You should have received a temporary password which you can use to log in and set a new password for the account. Then you should be able to log in to the Gerrit web interface and set up SSH keys using https://gerrit.wikimedia.org/r/#/settings/ssh-keys
Mar 16 2018
This is required by WCAG 2.0 SC 2.2.2: "For any moving, blinking or scrolling information that (1) starts automatically, (2) lasts more than five seconds, and (3) is presented in parallel with other content, there is a mechanism for the user to pause, stop, or hide it unless the movement, blinking, or scrolling is part of an activity where it is essential"
WCAG 2.0 SC 1.1.1 requires that, for non-text CAPTCHAs, "alternative forms of CAPTCHA using output modes for different types of sensory perception are provided to accommodate different disabilities".
Mar 14 2018
The relevant WCAG 2.0 success criterion is SC 2.4.1 Bypass Blocks: "A mechanism is available to bypass blocks of content that are repeated on multiple Web pages. (Level A) "
WCAG 2.0 SC 2.4.9 Link Purpose (Link Only): "A mechanism is available to allow the purpose of each link to be identified from link text alone, except where the purpose of the link would be ambiguous to users in general. (Level AAA) "
If the icon is conveying information then it is a violation of WCAG 2.0 SC 1.1.1 "Non-text Content" if we do not provide a text equivalent. The only way out of this is to claim that it is "pure decoration", which I don't think is true.
I think the existing behaviour is OK under WCAG 2.0 SC 1.3.1 since the same information is available in the <title> and <h1>.
The title attribute only contains the name of the page, which is the same in every case, it doesn't contain the heading or anchor. Improving the title attribute would satisfy WCAG 2.0 SC 2.4.9, although begrudgingly, since note H33 recommends using CSS to hide part of the link text instead, due to "extensive user agent limitations" with the title attribute.
Whether this is a WCAG 2.0 Level A or Level AAA conformance issue depends on whether you think the link purpose can be determined from the "programmatically determined link context", which includes other links in the same list. I think the answer is probably yes, so it's only Level AAA SC 2.4.9 "Link Purpose (Link Only): A mechanism is available to allow the purpose of each link to be identified from link text alone, except where the purpose of the link would be ambiguous to users in general."
This is not a WCAG 2.0 issue since there is a text alternative by WCAG's definition of the term.
The existing behaviour violates WCAG 2.0 SC 2.2.1 Timing Adjustable (Level A). Allowing the user to disable auto-hide in preferences would satisfy SC 2.2.1 but would probably not be Level AAA compliant due to SC 2.2.3 "No Timing".
This is really usability rather than accessibility.
The checkboxes in the Notifications tab of Special:Preferences, from the Echo extension, appear to violate WCAG 2.0 SC 1.1.1 (Level A). I don't see any label for those checkboxes, not even an empty label. Each checkbox needs to have a human-readable name that describes its purpose. Presumably that means the "web" and "email" checkboxes on each row need to have different names. An aria-label attribute would be sufficient.
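A minimal sketch of what that could look like if the checkboxes were built directly with the Html helper (field names and message keys are hypothetical; in practice Echo's preferences render through the HTMLForm layer, so the real fix would go there):

```php
// Sketch: give each per-row checkbox its own accessible name, so a screen
// reader announces something like "Edit reverted: notify by email".
// Field names and message keys below are hypothetical.
$webCheckbox = Html::element( 'input', [
	'type' => 'checkbox',
	'name' => 'echo-subscriptions-web-reverted',
	'aria-label' => wfMessage( 'echo-pref-web-reverted-label' )->text(),
] );
$emailCheckbox = Html::element( 'input', [
	'type' => 'checkbox',
	'name' => 'echo-subscriptions-email-reverted',
	'aria-label' => wfMessage( 'echo-pref-email-reverted-label' )->text(),
] );
```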
Not required by WCAG 2.0
This is a violation of WCAG 2.0 SC 1.3.1: Information, structure, and relationships conveyed through presentation can be programmatically determined or are available in text. (Level A)
WCAG 2.0 SC 2.1.2 No Keyboard Trap (Level A)
The only WCAG 2.0 violation that I see in here is "The file size textfields for the size are not labelled". This violates SC 1.1.1 "If non-text content is a control or accepts user input, then it has a name that describes its purpose." (Level A)
This is not a WCAG violation if there are no callers that use this.
This is probably a WCAG 2.0 SC 2.1.1 conformance failure. You can type the language name, but if "suggested languages" counts as functionality then it's a violation to not give keyboard access to it.
This is a violation of WCAG 2.0 Level AA 2.4.7 Focus Visible: Any keyboard operable user interface has a mode of operation where the keyboard focus indicator is visible.
Based on the screenshot, this is a failure of WCAG 2.0 Level A guideline 1.4.1 "Color is not used as the only visual means of conveying information, indicating an action, prompting a response, or distinguishing a visual element."
Not a WCAG 2.0 violation in my opinion, on the basis that the logo is pure decoration. It is deliberately hidden from screen readers.
Mar 13 2018
Mar 12 2018
So we need a Greenhouse admin to go to Configure > Email Settings, then enter "careers.wikimedia.org" for the domain and click "Register". Then click "Email your I.T. dept" and send it to email@example.com.
Mar 9 2018
To get a bit more philosophical, my theory is that improving and encouraging communication of the rationale for an edit will help to avoid conflict and anger.
Proof-of-concept display-side truncation uploaded at https://gerrit.wikimedia.org/r/#/c/417593/
I think 500 is too long for change lists; it should be more like 150. When you require the same length for storage and display, you have to make awkward trade-offs. Edit summaries on Wikipedia are usually much shorter than 500 characters, whereas commit messages in MW git are very often longer than 500 characters. I don't think MediaWiki is fundamentally more complicated than a Wikipedia article; I think Wikipedians are just accustomed to not explaining what they are doing, due to UI decisions by developers.
My preference is to leave the limit as it is, and to collapse or truncate edit summaries in the change list UI. There's no reason to limit the size on the diff page.
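To make the idea concrete, display-side truncation amounts to something like this (the helper name, limit, and call site are illustrative, not the actual patch at the Gerrit link above):

```php
// Sketch: store the full edit summary, but shorten it where change lists
// render it; the diff page keeps showing the full text.
function truncateSummaryForChangeList( Language $lang, $summary, $limit = 150 ) {
	// Language::truncate() shortens to roughly $limit bytes and appends "..."
	return $lang->truncate( $summary, $limit );
}

// Usage in a change-list formatter (illustrative):
$display = truncateSummaryForChangeList( $wgContLang, $commentText );
```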
Mar 8 2018
How about I change the task title so that it can stay open? The real problem here is that outbound email is broken; I don't care whether SPF or a subdomain is used to fix it. There's no MX record for careers.wikimedia.org.
Mar 7 2018
Mar 6 2018
Sanitizer is in fact not involved. The Parser wraps the TOC with <mw:toc>...</mw:toc>. The input to Tidy is:
Mar 1 2018
This comes from pybal monitoring. The client IP addresses identify LVS hosts, and you can see the relevant requests using tcpdump. I tried to reproduce using nc, which was mostly unsuccessful, except for one request to http://fr.wikipedia.org/wiki/Special:BlankPage.
Feb 28 2018
Fix uploaded at https://gerrit.wikimedia.org/r/#/c/415207/. Any objections to removing the policy? This is not a DOS attack. It's apparently not even a DOS attack vector, since the memory limit is doing its job, and the CPU time is short.
Feb 21 2018
The issue is that some GET requests will be routed to the secondary data centre. Eliminating all writes in such requests is not feasible and was abandoned as a goal very early in the planning. Instead, when MW calls wfGetDB(DB_MASTER) from the secondary data centre, it will connect to the remote master database. This takes about 6 RTTs per SSL connection plus 1 RTT per query, so avoiding such queries is useful for performance.
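For illustration, the usual way to keep such a query off the user-visible path is to push it into a post-send deferred update, as the comments below note (a minimal sketch; the table and write are hypothetical):

```php
// Sketch: rather than connecting to the remote master while the user waits,
// queue the write so it runs after the response has been flushed.
DeferredUpdates::addCallableUpdate( function () {
	$dbw = wfGetDB( DB_MASTER ); // still cross-DC, but now post-send
	$dbw->update( 'some_table', // hypothetical bookkeeping write
		[ 'st_touched' => $dbw->timestamp() ],
		[ 'st_id' => 123 ],
		__METHOD__
	);
} );
```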
I'm not sure this needs to be done, since it is already deferred until post-send, so it should not have any impact on user-visible latency.
It looks like this is deferred until post-send, so there shouldn't be a user-visible latency impact. I'm assuming we're talking about LocalFile::maybeUpgradeRow().
The call of ArticleCompileProcessor::compileMetadata() from ArticleMetadata::getMetadata() is supposedly a "very rare case" according to a source code comment. It's incorrect in this case to do BasicData compilation from the master: ArticleMetadata::getMetadata() has just loaded the page information from the replica, so it knows all the data is already in the replica. It should override the componentDb config so that all data comes from the replica. Note that the backtrace shows this information being prepared for the footer of a page view. Page views use existence information from the replica DB, so if you loaded PageTriage data from the master, the best you could do would be to display a triage footer on a "page does not exist" error message.
So this only affects users with user_token=''? The field is not nullable. Apparently only users created between March and June 2012 are affected by this:
I revision-deleted the revision in question.
Feb 20 2018
It's not a DOS; there aren't enough requests. It's just a broken bot hitting the same old revision via action=parse again and again, specifically 256170852, a revision of [[Barack Obama]] from 2008. The solution is to find the person running this bot and tell them to stop doing it.
There's no longer a 512MB limit imposed by wfShellExec(). Monitoring bytecode cache size may be useful if there is a risk of disk space exhaustion or performance degradation, but it is no longer necessary for T146285, so I'm removing that task as a parent and reducing the priority. Possibly this task could be declined if we switch mwscript back to HHVM and no issues are encountered.