Sun, Jun 5
@BBlack The last status update on this bug was ~18 months ago, and indicated the issue was an upstream bug and you were following up there, with a fallback to a WMF-specific patch if upstream got stuck. I see no indication there is any question this behaviour is a bug (cf. e.g. Krinkle's comment above). It's also a problem that makes certain pages inaccessible on all projects, breaks contribution histories and other core features for certain users, and necessitates manually prohibiting a character in page and user names that is intended to be permitted.
Mar 6 2022
Hmm. Then I think the problem statement is a little bit the wrong way around: it reads as if the aim is to lock down a currently reigning "Wild West" state of affairs, but in light of your clarification it sounds like the focus is more to enable a use of third-party resources that is difficult or impossible (legally) today but which would be of benefit to the Movement. And, obviously, when enabling such use it is desirable to do so in a controlled way that prioritises privacy, is sustainable, does not negatively impact performance, and so forth; but this then is not the focus so much as the consequence.
Mar 2 2022
Feb 25 2022
Feb 24 2022
The layout changes for the pagelist in this task are kinda suboptimal. There seem to be significantly increased margins around each page label, and in addition each label is now inline rather than block, so the containers (on which the background is applied) collapse down to essentially just the character boxes. See the before and after screenshots below:
Feb 13 2022
Feb 12 2022
Jan 31 2022
Jan 30 2022
Jan 24 2022
Dollars to dimes they're just scraping works for some ebook site based on the OPML.
Jan 17 2022
Jan 13 2022
It is probably worth revisiting, yes, but note that interactive performance here depends on dynamically generated "thumbnail" images, where the generation involves shelling out to ghostscript (for PDFs) and ddjvu (for DjVu) to extract a given page from a possibly ~1GB multi-page document and rendering it to a JPEG. Ghostscript, in particular, can (anecdotally) take ~20 seconds to do this.
Jan 12 2022
Last I heard (I could be wrong), MediaWiki-extensions-PdfHandler (ping @Tgr) uses Ghostscript to render PDFs to JPEG thumbnails, meaning this is most likely an upstream bug affecting certain born-digital PDFs. Best case for fixing it is probably using a newer version of ghostscript, which I'm guessing would be blocked on T289228. If it can be reproduced in base latest-version ghostscript it should probably be reported upstream, and a fix here would then also depend on when upstream makes a release with a fix. Alternately, there is T38594; but I suspect it'd be fairly resource-intensive on the MediaWiki side, and I have no idea what the relative merits of Ghostscript and MuPDF are. A switch might conceivably have a positive effect on the problem described in T242169 (or it might not; or it might make it worse).
Jan 7 2022
The user that first reported this issue on enWS has tested and confirmed that it is now fixed for them. And my own testing concurs.
Crap, I notice we don't have any regular backport windows until Monday. While this wasn't train-blocking/rollback when noticed, it's probably UBL-y enough that it needs to go out as an emergency backport. While it doesn't break the primary workflow, in that the Wikisourcen can still edit pages, it does break the zoom/pan for the reference page image (which is highly-used functionality) and partially breaks (blocks) the UI for setting page status (particularly for people with page zoom on in the browser, which is a major accessibility issue and primary workflow).
Jan 6 2022
Updated patch looks good to me, but I'm not sufficiently familiar with the context there for that to mean much.
Patchdemo looks good to me: no console errors, all the OSD stuff (zoom, rotate, etc.) looks like it works, and it even seems to fix the unresponsive radio buttons that were reported on-wiki.
Ah, oh, yes, I see. When we start, $imgContHorizontal is initialized to null (what's the point of that long chain of vars getting set to null at the top? They should either be set to something useful or their initialization might as well be deferred until first used.). It doesn't actually get assigned any value until toggleLayout() is called, and there it's wrapped inside a conditional such that it is only initialized if the layout is already horizontal (if (!isLayoutHorizontal)). When we start in vertical mode the var is never initialized, until something actually triggers toggleLayout(); but then a few lines later $imgContHorizontal.add() is called regardless of which branch was picked in the preceding conditional.
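To make the failure mode concrete, here is a minimal sketch of the pattern as I read it. It's a Python illustration rather than the actual JavaScript: the attribute name mirrors $imgContHorizontal, but `FakeContainer`, the `take_init_branch` flag and both classes are hypothetical stand-ins, and the second class is just one possible fix (initialise on first use), not necessarily what the patch does.

```python
class FakeContainer:
    """Hypothetical stand-in for a jQuery collection with an add() method."""

    def __init__(self, name):
        self.items = [name]

    def add(self, other):
        self.items.append(other)
        return self


class BuggyLayout:
    def __init__(self):
        # Declared up front, but set to nothing useful.
        self.img_cont_horizontal = None

    def toggle_layout(self, take_init_branch):
        # The container is only created inside one branch of the conditional...
        if take_init_branch:
            self.img_cont_horizontal = FakeContainer("horizontal")
        # ...but add() is called regardless of which branch ran.
        return self.img_cont_horizontal.add("toolbar")


class FixedLayout(BuggyLayout):
    def toggle_layout(self, take_init_branch):
        # Initialise on first use, independent of the branch taken.
        if self.img_cont_horizontal is None:
            self.img_cont_horizontal = FakeContainer("horizontal")
        return self.img_cont_horizontal.add("toolbar")


def fails_without_init(layout_cls):
    """Return True if toggling without the initialising branch blows up."""
    try:
        layout_cls().toggle_layout(take_init_branch=False)
    except AttributeError:
        return True
    return False
```

With the buggy version, `fails_without_init(BuggyLayout)` reports the error (the Python analogue of calling a method on an undefined value), while the fixed version does not.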
But why is $imgContHorizontal undefined here?
Jan 5 2022
@santhosh You're listed as the maintainer for jQuery.WebFonts so I tagged you on this. If this isn't your bailiwick I'd appreciate a pointer in the right direction.
Jan 3 2022
Dec 29 2021
@Agabi10 Did you give up on this patch? It looks really useful to me, and from the review comments it didn't look like there were any insurmountable problems (just maybe some documentable limitations due to the architecture context).
There's no obvious reason Scribunto should unconditionally enforce this when Lua as such doesn't, but that's an issue of defaults and somewhat orthogonal, IMO.
@RoySmith I've updated the task description to point at the relevant thread in the archives. But perhaps you (or perhaps @Pppery) could summarise the information learned in that thread for the benefit of others who stumble across similar issues and search Phabricator?
This sounds really cool, but I am having trouble seeing what the use case for it is. What kind of data would one put in a "derived slot"? And since it is programmatically generated, why not handle it in the parser and its cache?
Dec 22 2021
Dec 16 2021
Based on that error message it smells like a namespace collision. deWS has a lot of site-local JS that uses (and defines) a "SetCookie" function. That's a pretty generic name for such a function, so it may well be clashing with a name from elsewhere (e.g. the recent OSD stuff).
@Satirdan_kahraman The cover image (page 1) just needed a null edit, so there was apparently some kind of caching issue. I checked a few of the pages that have not yet been proofread and they appeared to load correctly.
Dec 2 2021
@bd808 I still can't find any mention of Toolforge on mw:GitLab or its subpages, so do we need a separate task to set policy for access/groups/directory layout in GitLab? What's currently documented only deals with WMF and WMDE components, and I am not immediately convinced that the assumptions for those hold in the context of Toolforge.
Dec 1 2021
@Tpt As best I can tell, the automatic next link on the main page was the existing behaviour. When the work's title was in $indexLinks, the if test inside the for loop would get a hit and set $current. The prev link test then explicitly excluded it by skipping index 0, but the next test just checked that it wasn't the last entry in the array ($i + 1 < $indexLinksCount) and so would add a next link on the page corresponding to the work's title.
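Roughly, the logic I'm describing looks like the sketch below. It's in Python rather than the extension's PHP, and it is an assumption-laden paraphrase: `nav_links` is a hypothetical name, I've assumed the work's title sits at index 0 of the list, and I've written the prev test as a plain "not index 0" check, which may not match the real code exactly.

```python
def nav_links(index_links, title):
    """Return the prev/next targets for `title`, given the ordered link list."""
    current = None
    for i, link in enumerate(index_links):
        if link == title:
            # The test inside the loop gets a hit and records the position.
            current = i
            break

    if current is None:
        return {"prev": None, "next": None}

    # Assumed form of the prev test: index 0 is skipped explicitly.
    prev_link = index_links[current - 1] if current > 0 else None
    # The next test only checks that this is not the last entry.
    next_link = (
        index_links[current + 1] if current + 1 < len(index_links) else None
    )
    return {"prev": prev_link, "next": next_link}
```

So for a list like `["Work", "Work/Chapter 1", "Work/Chapter 2"]`, the page for "Work" itself gets no prev link but does get a next link to the first chapter, which is the behaviour in question.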
Nov 30 2021
Hmm. I may be insufficiently caffeinated just now, but…
Nov 24 2021
@Alex_brollo I don't think reversing a change that brings big improvements is the way to go (and this cannot function as a gadget), but your use case is interesting. Instead of trying to manually undo the changes I would suggest you try to find ways to use the new facilities to do what you're after. A lot of it I would expect to already be possible (OSD has an API exposed that can be used), and what's missing are probably good candidates for adding. If you explain your needs it might be possible to suggest alternate approaches for them.
Nov 20 2021
Hmm. No smoking gun there that I can find.
Nov 18 2021
Dollars to dimes it's choking on c:File:Cyclopaedia of English Literature 1844 Volume 1 page 548.djvu which is in that category and is currently showing up with 0x0 dimensions and 0 pages. The djvudump structure for this file is:
Nov 17 2021
Hmm. Something is wonky here…
Nov 16 2021
Is this perhaps simply another symptom of T292954?
So this task can probably be closed?
@2db Which file (IA identifier) was this? Do you still get the error if you retry now? (There have been some rather big changes to relevant bits of the code recently that might conceivably have affected this)
I just retried this file and intervening hosting environment changes appear to have fixed the issue. The file is now available at c:File:Indemnities of War- Subsidies and Loans (1920).djvu
This task is strictly speaking stale: the problem here was fixed a long time ago, then Match&Split broke again due to T280806, and it has now been ported (more or less) to Pywikibot 6.x and Python 3. And while my ability to commit to owning phetools in toto is limited, I now have access to the tools account and have dug around in the code enough that if anything breaks then please tag me in and I'll see what I can do.
Verified: all the files mentioned above show correct dimensions and thumbnails, both on Commons and Wikisource, and Proofread Page is now able to work with them without choking.
Nov 10 2021
Nov 8 2021
Just in case it's useful, some info about the file itself:
Nov 1 2021
Oct 31 2021
c:File:A_Latin_Dictionary_(1984).djvu shows up with the right dimensions, thumbnails, etc.
s:File:A Latin Dictionary (1984).djvu shows up with 0x0 dimensions, no thumbs, etc.
Ok, just reuploaded a 1.67GB (1,792,272,952 bytes) file that failed a week and a half ago (it took literally 24 hours to regenerate it from scans: I really need to get a faster laptop): c:File:A Latin Dictionary (1984).djvu
@Gwennie-nyan @Askeuhd There's been a patch applied that fixes one problem that gave symptoms exactly like those described in this task. Could you both retry the uploads you mentioned to see if they go through now?
@aborrero Did you specify a -chunked to pwb.py, and if so what (5MB perhaps?)? And did you give it -async?
Based on T292954 and related tasks I'd say it'd be worthwhile to retry the various failed uploads mentioned in this task. Some or all of them may now succeed, and if any still fail then at least one big potential cause has been eliminated (the locking stuff looks like it could be a separate issue from the one recently fixed in T292954).
@Inductiveload Is this resolved now?
@Inductiveload Time to retry?
Oct 30 2021
Ok, reupload of a previously problematic file—c:File:Frank Leslie's Illustrated Newspaper Vol. 18.djvu—using Rillke's BCU.
Oct 29 2021
"Delivering HTTP/2 upload speed improvements" from Cloudflare may be of use for Swift people looking into this or similar issues. I haven't mapped the lab numbers they use to the scenario here, but high-bandwidth+high-latency usually means you run into the Long Fat Pipe problem where TCP window scaling comes into play. And since HTTP/2 is essentially reinventing TCP at the application layer (just with "frames" instead of "packets") it seems likely that what we're seeing here is essentially broken "HTTP window scaling" (cf. RFC7540 6.9.2): the receiver (in this case Swift) sets the window size, and is here using the 64k default size which libcurl faithfully obeys (curl itself, for downloads, switched its default to ~32MB in 7.52.0). In other words, it looks like this can't actually be tuned on the MW (libcurl) side but must happen on the Swift side if it's to be supported.
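For a feel of the numbers, a back-of-the-envelope sketch of the window arithmetic follows. The only figure taken from the spec is the 65,535-byte default initial window; the 150 ms round-trip time and the 100 Mbit/s link are made-up illustration values, not measurements from this task, and the function names are my own.

```python
# Default initial HTTP/2 flow-control window (the "64k default" above).
DEFAULT_H2_WINDOW_BYTES = 65_535

# Hypothetical values, chosen purely for illustration.
ASSUMED_RTT_SECONDS = 0.150
ASSUMED_LINK_BITS_PER_S = 100e6


def max_throughput_bytes_per_s(window_bytes, rtt_seconds):
    """Upper bound on throughput if at most one window is in flight per round trip."""
    return window_bytes / rtt_seconds


def window_needed_bytes(bandwidth_bits_per_s, rtt_seconds):
    """Bandwidth-delay product: bytes that must be in flight to keep the link full."""
    return bandwidth_bits_per_s / 8 * rtt_seconds


ceiling = max_throughput_bytes_per_s(DEFAULT_H2_WINDOW_BYTES, ASSUMED_RTT_SECONDS)
needed = window_needed_bytes(ASSUMED_LINK_BITS_PER_S, ASSUMED_RTT_SECONDS)

print(f"throughput ceiling with default window: {ceiling / 1e6:.2f} MB/s")
print(f"window needed to fill the assumed link: {needed / 1e6:.2f} MB")
```

Under those assumed values the default window caps an upload at well under 1 MB/s, while filling the link would need a window of nearly 2 MB, which is why the receiver-side window setting matters so much more than anything on the sending side.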
I don't think curl sends Expect: 100-continue for chunked transfers to begin with, and I don't think chunks need to be ack'ed before sending the next in HTTP/1.1.
Oct 28 2021
Oct 21 2021
Oct 15 2021
Oct 14 2021
Does the CI-related patch to PRP's tests from T292676 need to be backed out also or is that in practice unrelated?
Oct 11 2021
Ah. "Large uploads to Commons are broken and need to be fixed" seems like a good Epic, when it manifests in multiple ways (see subtasks) and will require multiple interventions to resolve (moving imginfo to bulk storage, moving OCR out of metadata, "do something" so the job queue doesn't choke on publishing these out of stash, etc.). That satisfies the "can be fixed, even if it takes time" criterion.
Note that the ProofreadPage test failure was not a broken test; it detected actual breakage. ProofreadPage stores multiple semi-structured parts of a page using noinclude sections (it predates MCR and other relevant facilities), and then hides these from users by hooking into the diff.
Aug 10 2021
It's the same issue as with references: <section> is an extension tag so the content is processed by Extension:Labeled Section Transclusion, much like <ref> is processed by Extension:Cite. I've never dived into why exactly that is, but in general there are issues with ordering and recursion that make extension tag content inside extension tag content hard to support in the general case. Which means both these issues will need specific support in the respective extensions to solve.
Aug 5 2021
Aug 1 2021
I just ran into (what I think is) this with the construct:
Jul 20 2021
Jul 19 2021
Sounds like a reasonable tradeoff to me.
It does not fall strictly within "language support" in terms of i18n/l10n.
Jul 17 2021
I think we're overcomplicating this.
Jul 9 2021
If resource consumption is a concern, I think a single page will cover the vast majority of cases; and the exceptions have a fairly benign failure mode (it's an additional optimisation). But, yeah, "what Inductiveload said", essentially.
Ok, an attempt at some concrete suggestions…