
Sat, Apr 17

Xover created T280448: Pages with non-wikitext content model are put into Pages using duplicate arguments in template calls.
Sat, Apr 17, 3:48 PM · MediaWiki-Categories, MediaWiki-Parser

Thu, Apr 8

Xover added a comment to T269628: Wikisource: investigate what data we can collect on OCR tools & potential instrumentation.

@ldelench_wmf That page is counting pages that have been marked as "Proofread" or "Validated"—using the radioboxes the Proofread Page extension adds to the edit form—as a result of a manual transcription, that may or may not have used OCR text from one of several different possible sources as a starting point. It does not directly measure anything related to OCR (but could, of course, conceivably provide an indirect measure).

Thu, Apr 8, 4:28 PM · Community-Tech, Wikimedia OCR
Xover added a comment to T269518: IA Upload: Permit duplicate IA identifier if of a different format.

Do we want to allow duplicates of the same format?

Thu, Apr 8, 12:27 PM · IA Upload, Community-Tech

Mon, Apr 5

Xover added a comment to T275100: Change IA upload disallow on duplicate to a challenge.

The very simplest way would be to just change pageForIAItem() to always return an empty string.

Mon, Apr 5, 10:38 AM · Community-Tech, IA Upload, Internet-Archive
Xover added a comment to T279118: Wikisource OCR: add support for tesseract on wikimedia ocr .

@Samwilson A couple of thoughts on skimming (and I do mean skimming) the diff…

Mon, Apr 5, 9:43 AM · Community-Tech (Kanban-2020-21-Q4), Wikimedia OCR, Wikisource

Thu, Apr 1

Xover updated subscribers of T268240: Provide a mechanism for detecting duplicate files in commons and a local wiki.

In fact, looking at the code in SpecialFileDuplicateSearch.php it looks like querying for Commons media isn't particularly more complicated than local media when inside core, and T175088 suggests Special:ListDuplicatedFiles should be on the monthly "expensive query pages" cron job in any case. In that context, is there any particular reason SpecialListDuplicatedFiles.php for a given project couldn't do a (very specialised version of a) cross-wiki join itself and stuff the results in a category?

Thu, Apr 1, 5:36 PM · Data-Services, cloud-services-team (Kanban)
Xover added a comment to T268240: Provide a mechanism for detecting duplicate files in commons and a local wiki.

The file usage section on file pages lists duplicates, including from Commons. However, there is no way to find these since Special:ListDuplicatedFiles only lists local duplicates.

Thu, Apr 1, 4:44 PM · Data-Services, cloud-services-team (Kanban)
Xover added a comment to T268240: Provide a mechanism for detecting duplicate files in commons and a local wiki.

I'm sure there are other uses for the functionality described here, but…

Thu, Apr 1, 1:32 PM · Data-Services, cloud-services-team (Kanban)
Xover added a comment to T277768: Wikisource: Investigate adding support for bulk OCR to Wikimedia OCR [16H].

I think the current OCR tool will read ahead in the current file and OCR the other pages in the background and cache the results, on the assumption that if you want one, you or others will want more. But I'm not sure how far ahead it goes.

Thu, Apr 1, 10:35 AM · Community-Tech (Kanban-2020-21-Q4), Wikimedia OCR, Wikisource
Xover added a comment to T277768: Wikisource: Investigate adding support for bulk OCR to Wikimedia OCR [16H].
  • it's not possible to add the text layer to the PDF/DjVu/etc.
Thu, Apr 1, 9:14 AM · Community-Tech (Kanban-2020-21-Q4), Wikimedia OCR, Wikisource

Wed, Mar 31

Xover added a comment to T278623: Create a Section for Numerically Sequencing Images on Index ns.

For your conversion of Lippincot's v45 from Hathi, you can do a lot better:

Wed, Mar 31, 6:46 PM · ProofreadPage
Xover added a comment to T278623: Create a Section for Numerically Sequencing Images on Index ns.

There are multiple issues with PDF.

Wed, Mar 31, 5:19 AM · ProofreadPage
Xover added a comment to T278443: Wikisource OCR: fix issue with lines being formatted incorrectly.

As Peter says, this needs some form of configurability and probably at the per-user level. English Wikisource generally unwraps lines, but even there there are users who rely on hard linebreaks when proofreading. OCR is also imperfect at detecting page features, so for some scans automatic unwrapping will end up going to the opposite extreme (all text in one big lump with no line breaks).
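
To illustrate the per-user configurability suggested above, here is a minimal sketch of an unwrapping step that can be switched off. The function name and the `unwrap` flag are hypothetical and do not come from the Wikimedia OCR code base. It joins hard-wrapped lines inside a paragraph while keeping blank-line paragraph breaks, and it does not try to rejoin words hyphenated across lines.

```python
import re


def unwrap_lines(text: str, unwrap: bool = True) -> str:
    """Join hard-wrapped lines within paragraphs; keep blank-line breaks.

    `unwrap` stands in for a hypothetical per-user preference.
    """
    if not unwrap:
        # Users who proofread against hard line breaks get the text untouched.
        return text
    # A newline that is neither preceded nor followed by another newline
    # is a wrap inside a paragraph, so it becomes a single space.
    return re.sub(r"(?<!\n)\n(?!\n)", " ", text)
```

If the OCR fails to detect paragraph breaks at all, this step would produce the "one big lump" described above, which is one reason to keep it optional.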

Wed, Mar 31, 4:47 AM · Wikimedia OCR, Wikisource, Community-Tech

Sun, Mar 28

Xover added a comment to T278104: Unable to upload to Commons: uploadstash-file-not-found: Key "187kyl5ozj74.xtav8j.51508.djvu" not found in stash.

@Aklapper I'm not entirely steady on the projects/components and their scope, so apologies if I'm hopelessly confused, but looking at the descriptions for them I would say this task falls under MediaWiki-Uploading and UploadWizard? Or is this obviously pinpointed somewhere down in the Swift part of the stack? And maybe UploadWizard is excluded since this happens via API upload too?

Sun, Mar 28, 5:55 PM · SRE-swift-storage, User-Inductiveload

Mon, Mar 22

Xover added a comment to T278104: Unable to upload to Commons: uploadstash-file-not-found: Key "187kyl5ozj74.xtav8j.51508.djvu" not found in stash.

Possibly related: T254459

Mon, Mar 22, 5:30 PM · SRE-swift-storage, User-Inductiveload
Xover added a comment to T278104: Unable to upload to Commons: uploadstash-file-not-found: Key "187kyl5ozj74.xtav8j.51508.djvu" not found in stash.

Ok, testing the >100MB file locally on enWS (I think most of the relevant bits of the stack are the same as for Commons), bigChunkedUpload.js tells me "Upload is stuck" for every single chunk (32 x 20MB chunks) but then seems to recover. After the last chunk hits 100% it tells me "Server error 0 after uploading chunk:" (I think this is an empty response from the server). After waiting and retrying a couple more times it terminates with the message "FAILED: internal_api_error_DBQueryError: [91f56af6-cec2-4969-938f-3aeaf9f35aff] Caught exception of type Wikimedia\Rdbms\DBQueryError" which I'm pretty certain is coming from somewhere inside MW proper rather than from Rillke's code.

Mon, Mar 22, 5:06 PM · SRE-swift-storage, User-Inductiveload
Xover added a comment to T278104: Unable to upload to Commons: uploadstash-file-not-found: Key "187kyl5ozj74.xtav8j.51508.djvu" not found in stash.

I've successfully uploaded several <100MB files in the time period. The one >100MB file I've tried fails (I've been blindly trying different things so exact failure symptoms are a bit vague). All uploads with bigChunkedUpload.js with stash/async deselected.

Mon, Mar 22, 2:07 PM · SRE-swift-storage, User-Inductiveload

Mar 16 2021

Xover added a comment to T276672: WS Export: Create separate credits page that can be viewed by everyone.

Random, possibly not useful or relevant, thought: there's an effort somewhere to tighten the privacy policy in such a way that IP addresses are no longer visible (not even to Checkusers). IPs are also not very useful as an entry in a "Contributors to this book" list. Perhaps both issues could be addressed by grouping all logged-out contributions at the end as "…, and n anonymous contributors."?
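
A rough sketch of the grouping idea, assuming the contributor list arrives as plain strings and that logged-out contributors can be recognised because their "name" parses as an IP address. The function names are hypothetical and this is not WS Export's actual code.

```python
import ipaddress


def is_ip(name: str) -> bool:
    """Return True if the contributor name is an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def format_credits(contributors):
    """List named contributors, then summarise logged-out ones as a count."""
    named = [c for c in contributors if not is_ip(c)]
    # Count distinct addresses so a repeated IP is only counted once.
    anon = len({c for c in contributors if is_ip(c)})
    parts = list(named)
    if anon:
        noun = "contributor" if anon == 1 else "contributors"
        prefix = "and " if named else ""
        parts.append(f"{prefix}{anon} anonymous {noun}")
    return ", ".join(parts)
```

Because only a count is emitted, no address ever reaches the exported credits page.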

Mar 16 2021, 9:23 AM · Community-Tech, WS Export
Xover added a comment to T274959: Wikisource: Create option to disable credits in WSExport form.

Credits by default may be playing it safe, but does the risk really justify that much caution?

Mar 16 2021, 9:19 AM · Community-Tech (Kanban-2020-21-Q3), WS Export, Wikisource
Xover added a comment to T277435: Include copyright metadata based on Wikidata P6216.

Hmm. Does it actually need to be machine-readable? I would have thought what was wanted was a way to just identify the license template output so that it could be rendered in the appropriate place, but otherwise just use the on-wiki rendered template. Structured data is nice for all sorts of other reasons, but for this purpose I would think a simple CSS class would be sufficient; or possibly an ID in order to ensure there is only one container for license information.

Mar 16 2021, 9:07 AM · WS Export, Community-Tech

Mar 6 2021

Xover added a comment to T274959: Wikisource: Create option to disable credits in WSExport form.

The Wikisourcen (unlike Wikipedia) do not create original content that attracts a copyright.

Mar 6 2021, 10:51 AM · Community-Tech (Kanban-2020-21-Q3), WS Export, Wikisource
Xover added a comment to T274959: Wikisource: Create option to disable credits in WSExport form.

… we won’t be showing it to most downloaders.

Mar 6 2021, 10:43 AM · Community-Tech (Kanban-2020-21-Q3), WS Export, Wikisource

Mar 4 2021

Xover added a comment to T274959: Wikisource: Create option to disable credits in WSExport form.

@Prtksxna The Wikisourcen (unlike Wikipedia) do not create original content that attracts a copyright. They merely (mechanically) reproduce public domain or already-freely-licensed works. The standard licensing terms under the edit form are for contributions outside the content namespaces (Scriptorium, User pages, Talk, etc.). Thus the only relevant licensing information is the one for the work itself, much as the licensing for a media file on Commons.

Mar 4 2021, 6:45 PM · Community-Tech (Kanban-2020-21-Q3), WS Export, Wikisource
Xover added a comment to T273708: Don't show download button on subpages, and opt-out for top-level pages.

Not a very good idea. There are works like encyclopedias or periodicals with thousands of subpages.
This solution would require adding a magic word to every subpage.

Mar 4 2021, 7:41 AM · Wikisource, WS Export, Community-Tech

Mar 2 2021

Xover added a comment to T271710: Allow sanitized CSS subpages in the Index namespace of Wikisource.

Should the config change be a separate task for Site-Requests to be visible on the board?

Mar 2 2021, 4:58 PM · MW-1.36-notes (1.36.0-wmf.32; 2021-02-23), Wikisource, TemplateStyles, ProofreadPage
Xover added a watcher for User-Inductiveload: Xover.
Mar 2 2021, 8:37 AM

Feb 27 2021

Xover added a comment to T43614: ProofreadPage does not use image's full resolution when zooming in.

Hmm. As I recall, PRP uses a hard 1024px size for the "thumbnail" it requests. I am assuming this was a value picked as a sort of compromise between full fidelity to the user and various optimization concerns.

Feb 27 2021, 9:23 AM · Wikisource, ProofreadPage
Xover added a comment to T265219: Wikisource: Internet Archive Upload Fail.

Hmm. Based on this and a few other recent failures, I'm starting to wonder if php-exec-command (which is the Command::exec(); wrapper ia-upload is using to execute binaries) is broken and returning "Command not found" for any non-zero exit status.
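
For comparison only: POSIX shells reserve exit status 127 for a command that was not found and 126 for one that was found but could not be executed; any other non-zero status comes from the program itself. The sketch below shows the distinction a wrapper would need to preserve. It is not php-exec-command's code, and the function name is hypothetical.

```python
def describe_exit_status(code: int) -> str:
    """Classify an exit status using the POSIX shell conventions."""
    if code == 0:
        return "success"
    if code == 127:
        # The shell could not find the command at all.
        return "command not found"
    if code == 126:
        # The command exists but could not be executed.
        return "command found but not executable"
    # Anything else means the binary ran and reported its own failure.
    return f"command ran and failed with status {code}"
```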

Feb 27 2021, 9:10 AM · Wikisource, IA Upload
Xover added a project to T275912: Create an Importer for Distributed Proofreaders (pgdp.net) for Wikisource: Wikisource.
Feb 27 2021, 7:55 AM · Wikisource, importbots

Feb 26 2021

Xover renamed T275735: Change api cache ttl to be an .env var from Change api cache ttl to be an .evn var to Change api cache ttl to be an .env var.
Feb 26 2021, 10:34 AM · Community-Tech (Kanban-2020-21-Q3), WS Export

Feb 25 2021

Xover added a comment to T101075: Do not save unused (or deliberately removed) suggested parameters when inserting or editing transclusions.

… there's a difference in wikitext between an empty parameter and a not-provided-at-all parameter, …
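
The distinction can be modelled with a dictionary lookup. In wikitext, `{{{foo|fallback}}}` uses the fallback only when `foo` is not passed at all, whereas `{{tpl|foo=}}` passes an empty string that suppresses it. The helper below is a hypothetical illustration of that behaviour, not parser code.

```python
def resolve_param(args: dict, name: str, default: str) -> str:
    """Mimic {{{name|default}}}: the default applies only if the key is absent."""
    return args.get(name, default)


# {{tpl|foo=}}  -> the parameter is present but empty
present_but_empty = resolve_param({"foo": ""}, "foo", "fallback")
# {{tpl}}       -> the parameter is not provided at all
not_provided = resolve_param({}, "foo", "fallback")
```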

Feb 25 2021, 8:44 AM · Skipped QA, User-Ryasmeen, MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), Editing-team (FY2020-21 Kanban Board), VisualEditor-MediaWiki-Templates, VisualEditor

Feb 19 2021

Xover added a comment to T257066: Extension:Score / Lilypond is disabled on all wikis.

There is some progress being made on various protected tasks, …

Feb 19 2021, 4:06 PM · MW-1.36-notes (1.36.0-wmf.26; 2021-01-12), User-notice, Security-Team, Security, Wikimedia-General-or-Unknown, MediaWiki-extensions-Score, SRE
Xover added a comment to T257066: Extension:Score / Lilypond is disabled on all wikis.

So… we're currently waiting for a suitable volunteer to materialize out of thin air to address an issue whose details are not public for security reasons? And in the meantime we have many thousands of broken pages across multiple projects and all we can do is bleed contributors in those areas?

Feb 19 2021, 1:21 PM · MW-1.36-notes (1.36.0-wmf.26; 2021-01-12), User-notice, Security-Team, Security, Wikimedia-General-or-Unknown, MediaWiki-extensions-Score, SRE

Feb 11 2021

Xover added a comment to T274495: Genericize language on the Wikisource download button to remove specific models of tablet.

Absent specific proposals for better wording

I gave a proposal.

Feb 11 2021, 1:06 PM · Community-Tech, WS Export
Xover added a comment to T274495: Genericize language on the Wikisource download button to remove specific models of tablet.

Absent specific proposals for better wording I think the status quo works well enough. Far from every ebook user has any conception of file formats, much less any idea what kind is best for their device, so giving them enough information suited to their frame of reference to make a sensible choice is a priority.

Feb 11 2021, 12:37 PM · Community-Tech, WS Export

Feb 6 2021

Xover added a comment to T274027: WS Export: Don't show sidebar links in Page and Index namespaces.

Let me throw an extra angel on the head of this needle: a user might conceivably want to export a work when currently on a wikipage in these namespaces, and a user might conceivably want to export a single page, as defined by a Page: wikipage, of a work.

Feb 6 2021, 8:57 AM · Community-Tech, WS Export
Xover added a comment to T269726: Make 'pdf' format an alias for 'pdf-a5'.

I think that for any inherently paged format (like PDF), print should be a primary concern. For everything else we should nudge people to ePub where content can be dynamically reflowed. I have trouble imagining that a significant number of people actually print these onto dead trees, but that is the main rationale for the design of the PDF format the way it is.

Feb 6 2021, 8:35 AM · Community-Tech (Kanban-2020-21-Q3), WS Export
Xover added a comment to T269726: Make 'pdf' format an alias for 'pdf-a5'.

Uhm. A5? Every printer in the world is designed for A4 (or its bastard offshoot, US Letter), and every sheet of printer paper sold ditto. The other sizes, including A5, are barely measurable in comparison. In fact, I think some of the B sizes may actually outsell A5 due to use in automated mass-mailings of various kinds.

Feb 6 2021, 8:03 AM · Community-Tech (Kanban-2020-21-Q3), WS Export

Jan 20 2021

Xover added a comment to T272253: WS Export: open 'choose formats' link in new tab.

No, please don't. Forcing links to open in a new tab or window to keep the user on your site is literally a dark pattern in web design. Users are quite capable of opening a link in a new tab if they want to, and, conversely, those users who have trouble with this are also apt to be confused by navigating multiple tabs or windows.

Jan 20 2021, 7:24 AM · Wikisource, Community-Tech, WS Export

Jan 13 2021

Xover created T271958: Support "width: fit-content" in TemplateStyles/Sanitized CSS.
Jan 13 2021, 5:06 PM · css-sanitizer, TemplateStyles

Jan 10 2021

Xover added a watcher for WS Export: Xover.
Jan 10 2021, 1:26 PM

Dec 21 2020

Xover added a comment to T134469: doBlockLevels() inserts <p> and </p> randomly with no regard for HTML validity.

I bet something like __NO_P_WRAP__ would be fairly easy to support. Would it get enough adoption to get us closer to our goal of turning it off by default?

Dec 21 2020, 5:21 PM · MediaWiki-Parser

Dec 20 2020

Xover added a comment to T134469: doBlockLevels() inserts <p> and </p> randomly with no regard for HTML validity.

… In ten years, I'd love for us to be at the point where we don't do <p>-wrapping at all …

Dec 20 2020, 10:14 AM · MediaWiki-Parser

Dec 18 2020

Xover added a comment to T270387: Enable OPDS catalog for English Wikisource.

Yeah, daily would be better for newly added works. For changes to existing works the frequency could be much lower with not much problem I think. Alternatively new works could be manually triggered (we have lots of manual processes already) given an interface for it.

Dec 18 2020, 8:36 PM · Community-Tech (Kanban-2020-21-Q3), WS Export

Dec 11 2020

Xover updated subscribers of T230415: Stop ignoring paragraph and region separators in DjVu file OCR text layer.

Oh, no, wait… I think I'm just being a dummy!

Dec 11 2020, 1:52 PM · MW-1.36-notes (1.36.0-wmf.37; 2021-03-30), Wikisource, MediaWiki-DjVu

Dec 10 2020

Xover added a comment to T230415: Stop ignoring paragraph and region separators in DjVu file OCR text layer.

It definitely isn't working. On this page the paragraphs run together, but the output from djvutxt thefile.djvu -page=17 -detail=page is:

Dec 10 2020, 9:41 PM · MW-1.36-notes (1.36.0-wmf.37; 2021-03-30), Wikisource, MediaWiki-DjVu
Xover added a comment to T230415: Stop ignoring paragraph and region separators in DjVu file OCR text layer.

Hmm. $wgDjvuTxt is set in CommonSettings.php, so that should be ok.

Dec 10 2020, 7:18 PM · MW-1.36-notes (1.36.0-wmf.37; 2021-03-30), Wikisource, MediaWiki-DjVu
Xover added a comment to T230415: Stop ignoring paragraph and region separators in DjVu file OCR text layer.

Hmm. I didn't think there'd be any caching of this, but I may have misunderstood. It might also be that retrieveMetaData() is called once on upload rather than on demand as I'd assumed. And we need to check what $wgDjvuTxt is set to, since this whole block is only executed if that config var isset().

Dec 10 2020, 7:28 AM · MW-1.36-notes (1.36.0-wmf.37; 2021-03-30), Wikisource, MediaWiki-DjVu

Nov 18 2020

Xover added a comment to T215858: Plan a replacement for wiki replicas that is better suited to typical OLAP use cases than the MediaWiki OLTP schema.

Just to add a perspective…

Nov 18 2020, 7:31 AM · cloud-services-team (Kanban), Data-Services, Analytics

Nov 14 2020

Xover added a comment to T228594: [phetools] Wikisource OCR deletes old contents of a page, but does not generate new text..

Could you apply this diff

Done.

Nov 14 2020, 5:30 PM · Upstream, Wikisource, Tools
Xover updated subscribers of T228594: [phetools] Wikisource OCR deletes old contents of a page, but does not generate new text..

… every word is on a new line. …

Same feedback as @Jan.Kamenicek tonight, although it seemed to work great a week ago.

Nov 14 2020, 9:25 AM · Upstream, Wikisource, Tools
Xover added a comment to T228594: [phetools] Wikisource OCR deletes old contents of a page, but does not generate new text..

@Xover, I think it is a misunderstanding
data.text.substring(0,5) != "<?xml" -> XML is accepted; if it is not XML, then it is considered an error.
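
The check under discussion can be restated as a small predicate: fall back when the response carries an error or when its text does not start with an XML declaration. This is a Python paraphrase of the quoted JavaScript condition, not the Gadget's code. The function name is hypothetical, and unlike the original it treats a missing `text` field as a reason to fall back rather than raising.

```python
def should_fall_back(data: dict) -> bool:
    """Return True when the hOCR response should be treated as an error."""
    if data.get("error"):
        return True
    text = data.get("text") or ""
    # Equivalent to data.text.substring(0,5) != "<?xml" in the quoted code.
    return not text.startswith("<?xml")
```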

Nov 14 2020, 9:13 AM · Upstream, Wikisource, Tools

Nov 13 2020

Xover added a comment to T228594: [phetools] Wikisource OCR deletes old contents of a page, but does not generate new text..

…fall back to the old OCR when the returned text is an error message instead of XML content:

function hocr_callback(data) {
	if ( data.error || data.text.substring(0,5)!="<?xml" ) {
Nov 13 2020, 9:07 AM · Upstream, Wikisource, Tools

Nov 12 2020

Xover updated subscribers of T228594: [phetools] Wikisource OCR deletes old contents of a page, but does not generate new text..

Ok, I've now had some independent testing (Big big thank you to Jan!) that confirms the tweaked Gadget code now produces results that are at least within a reasonable distance of what it used to produce.

Nov 12 2020, 1:39 PM · Upstream, Wikisource, Tools

Nov 11 2020

Xover added a comment to T228594: [phetools] Wikisource OCR deletes old contents of a page, but does not generate new text..

Ok, an update on the corrupted cache…

Nov 11 2020, 6:38 PM · Upstream, Wikisource, Tools
Xover added a comment to T228594: [phetools] Wikisource OCR deletes old contents of a page, but does not generate new text..

… the [OCR] result is very poor, …: every word is on a new line.

This is a separate problem, and is most likely related to Tesseract being upgraded to 4.x.

Nov 11 2020, 7:28 AM · Upstream, Wikisource, Tools

Nov 10 2020

Xover added a comment to T228594: [phetools] Wikisource OCR deletes old contents of a page, but does not generate new text..

Unfortunately, the OCR does not work with any of these at all

Nov 10 2020, 9:52 AM · Upstream, Wikisource, Tools

Nov 9 2020

Xover added a comment to T228594: [phetools] Wikisource OCR deletes old contents of a page, but does not generate new text..

… I tested it now e. g. on Page:John_Huss,_his_life,_teachings_and_death,_after_five_hundred_years.pdf/122 and some other pages of the same book and it still does not work here :-(

Nov 9 2020, 8:39 PM · Upstream, Wikisource, Tools
Xover added a comment to T228594: [phetools] Wikisource OCR deletes old contents of a page, but does not generate new text..

@Xover - What would be the effect of just deleting all the caches? Tesseract has been upgraded since most of those caches were generated anyway.

Nov 9 2020, 5:26 PM · Upstream, Wikisource, Tools
Xover added a comment to T228594: [phetools] Wikisource OCR deletes old contents of a page, but does not generate new text..
Nov 9 2020, 10:45 AM · Upstream, Wikisource, Tools

Nov 6 2020

Xover added a comment to T228594: [phetools] Wikisource OCR deletes old contents of a page, but does not generate new text..

The cache for a given work will be in a subdirectory of ~/cache/hocr/ created from the MD5 hash of the file's name (spaces replaced with underscores) concatenated with the invoking project's language code. So for Mexico_under_Carranza.djvu requested from English Wikisource, you can generate the hash with…
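
A minimal sketch of the hashing step as described above, under several assumptions that the comment does not confirm: the file name and language code are concatenated directly with no separator, the string is UTF-8 encoded, and the hexadecimal digest is what names the subdirectory. The function name is hypothetical, and the exact layout under ~/cache/hocr/ is not reproduced here.

```python
import hashlib


def hocr_cache_key(filename: str, lang: str) -> str:
    """MD5 of the underscore-normalised file name followed by the language code."""
    normalized = filename.replace(" ", "_")
    # Assumption: plain concatenation, UTF-8 encoding, hex digest.
    return hashlib.md5((normalized + lang).encode("utf-8")).hexdigest()


key = hocr_cache_key("Mexico under Carranza.djvu", "en")
```

If the real tool uses a separator or a different encoding, the digest will differ, so treat the result as illustrative until it has been checked against an existing cache directory.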

Nov 6 2020, 9:57 PM · Upstream, Wikisource, Tools
Xover added a comment to T228594: [phetools] Wikisource OCR deletes old contents of a page, but does not generate new text..

Ok, having gotten access to the project in connection with T265640 I've been trying to debug this a bit.

Nov 6 2020, 5:04 PM · Upstream, Wikisource, Tools

Nov 5 2020

Xover added a comment to T265640: phe-tools: Match&Split bot is not running because of python2 deprecation in pywikibot.

@JJMC89 Thanks!

Nov 5 2020, 7:42 AM · Tools

Nov 4 2020

Xover added a comment to T265640: phe-tools: Match&Split bot is not running because of python2 deprecation in pywikibot.

@Candalua Thanks!

Nov 4 2020, 8:32 PM · Tools
Xover added a comment to T265640: phe-tools: Match&Split bot is not running because of python2 deprecation in pywikibot.

@Candalua That leaves you as the only admin on phetools with any likelihood of having the spare cycles to look at this (Phe and Tpt are highly unlikely to be available for this any time soon). Any chance you could poke around here a bit?

Nov 4 2020, 9:17 AM · Tools

Nov 2 2020

Xover added a comment to T265640: phe-tools: Match&Split bot is not running because of python2 deprecation in pywikibot.

@Aklapper Indeed. Community-Tech was added as their Toolforge group account is one of the four accounts set as admin for the phetools Toolforge project.

Nov 2 2020, 3:32 PM · Tools

Nov 1 2020

Xover added a comment to T244657: Visual Editor moves ProofreadPage header / footer into page text field, duplicating them.

… is this a challenge a lot of people are encountering?

Nov 1 2020, 2:02 PM · Editing-team, Community-Tech, VisualEditor, ProofreadPage
Xover merged T202200: Visual Editor set double header in ProofreadPage header into T244657: Visual Editor moves ProofreadPage header / footer into page text field, duplicating them.
Nov 1 2020, 1:55 PM · Editing-team, Community-Tech, VisualEditor, ProofreadPage
Xover merged task T202200: Visual Editor set double header in ProofreadPage header into T244657: Visual Editor moves ProofreadPage header / footer into page text field, duplicating them.
Nov 1 2020, 1:54 PM · ProofreadPage, VisualEditor
Xover merged T198688: Switching between editors on Wikisource, the header and footer are moved into the body into T244657: Visual Editor moves ProofreadPage header / footer into page text field, duplicating them.
Nov 1 2020, 1:53 PM · Editing-team, Community-Tech, VisualEditor, ProofreadPage
Xover merged task T198688: Switching between editors on Wikisource, the header and footer are moved into the body into T244657: Visual Editor moves ProofreadPage header / footer into page text field, duplicating them.
Nov 1 2020, 1:52 PM · VisualEditor, ProofreadPage
Xover merged T212347: Proofreading on Wikisource, switching editor from source to visual to source incorrectly moves header text into page body into T244657: Visual Editor moves ProofreadPage header / footer into page text field, duplicating them.
Nov 1 2020, 1:52 PM · Editing-team, Community-Tech, VisualEditor, ProofreadPage
Xover merged task T212347: Proofreading on Wikisource, switching editor from source to visual to source incorrectly moves header text into page body into T244657: Visual Editor moves ProofreadPage header / footer into page text field, duplicating them.
Nov 1 2020, 1:51 PM · ProofreadPage
Xover merged T266942: Visual Editor issue on Bengali Wikisource into T244657: Visual Editor moves ProofreadPage header / footer into page text field, duplicating them.
Nov 1 2020, 1:49 PM · Editing-team, Community-Tech, VisualEditor, ProofreadPage
Xover merged task T266942: Visual Editor issue on Bengali Wikisource into T244657: Visual Editor moves ProofreadPage header / footer into page text field, duplicating them.
Nov 1 2020, 1:48 PM · Bengali-Sites, ProofreadPage, Wikisource, VisualEditor

Oct 19 2020

Xover added a comment to T228594: [phetools] Wikisource OCR deletes old contents of a page, but does not generate new text..

@kaldari Nope, still seeing the same failure mode. It greys out the text in the editor and then throws an error in the JS console à la: An error occurred during ocr processing: /tmp/52004_6179/page_0199.tif.

Oct 19 2020, 8:19 PM · Upstream, Wikisource, Tools

Oct 16 2020

Xover added a comment to T265571: MediaWiki 1.36/wmf.13 needlessly HTML encodes ASCII characters in DjVu text layer.

Apparently the HTML entities are fixed automatically in the English Wikisource (when I try in this book).

Oct 16 2020, 11:52 AM · MW-1.36-notes (1.36.0-wmf.14; 2020-10-20), ProofreadPage, Editing-team, MediaWiki-DjVu, Wikisource
Xover added a comment to T265704: Erroneous and broken HTML entities (such as "&#39 ;") displayed for certain characters in the ProofreadPage edit box.

This is a dup of T265571.

Oct 16 2020, 9:22 AM · ProofreadPage, Wikisource

Oct 15 2020

Xover created T265640: phe-tools: Match&Split bot is not running because of python2 deprecation in pywikibot.
Oct 15 2020, 5:45 PM · Tools
Xover added a comment to T265571: MediaWiki 1.36/wmf.13 needlessly HTML encodes ASCII characters in DjVu text layer.

Not having access to T263371 it's hard to say anything intelligent about the specific issue, but…

Oct 15 2020, 5:17 PM · MW-1.36-notes (1.36.0-wmf.14; 2020-10-20), ProofreadPage, Editing-team, MediaWiki-DjVu, Wikisource
Xover added a comment to T265571: MediaWiki 1.36/wmf.13 needlessly HTML encodes ASCII characters in DjVu text layer.

Hmm. If one API endpoint returns unencoded text, then lower-level components seem unlikely culprits. If one API returns encoded text, then higher-level components seem unlikely. IOW: those results suggest to me that this is happening at the API layer.

Oct 15 2020, 10:26 AM · MW-1.36-notes (1.36.0-wmf.14; 2020-10-20), ProofreadPage, Editing-team, MediaWiki-DjVu, Wikisource
Xover created T265571: MediaWiki 1.36/wmf.13 needlessly HTML encodes ASCII characters in DjVu text layer.
Oct 15 2020, 7:37 AM · MW-1.36-notes (1.36.0-wmf.14; 2020-10-20), ProofreadPage, Editing-team, MediaWiki-DjVu, Wikisource
Xover renamed T244657: Visual Editor moves ProofreadPage header / footer into page text field, duplicating them from Visual Editor moves ProofreadPage haeader / footer into page text field, duplicating them to Visual Editor moves ProofreadPage header / footer into page text field, duplicating them.
Oct 15 2020, 7:25 AM · Editing-team, Community-Tech, VisualEditor, ProofreadPage

Oct 4 2020

Xover added a comment to T256086: Deprecate SkinMinervaDefaultModules hook.

… That isn't what this task is about... It's about deprecating a hook that specifically allows the proofreadpage extension to disable the javascript-based mobile editor. …

Oct 4 2020, 9:55 AM · MW-1.36-notes (1.36.0-wmf.16; 2020-11-03), good first task, patch-welcome, Readers-Web-Backlog, MinervaNeue, MediaWiki-Core-Skin-Architecture
Xover added a comment to T255345: Proofreadpage Pages should have the associated pagelist's page number embedded in them.

I'm not really sure if having the header and footer will help given that they will probably be in wikitext format and extracting it from the html would be easier than attempting to parse the wikitext.

Oct 4 2020, 9:04 AM · MW-1.36-notes (1.36.0-wmf.12; 2020-10-05; NEVER DEPLOYED), ProofreadPage
Xover added a comment to T256086: Deprecate SkinMinervaDefaultModules hook.

… This would mean that, if no additional changes are made, wikis that use the ProofReadPage extension would not be able to disable the editor. Instead, all wikis with the extension would have the same, standard editor?

Oct 4 2020, 8:40 AM · MW-1.36-notes (1.36.0-wmf.16; 2020-11-03), good first task, patch-welcome, Readers-Web-Backlog, MinervaNeue, MediaWiki-Core-Skin-Architecture

Sep 4 2020

Xover added a comment to T261023: Explore content moderation issues.

I didn't see the "Has somebody already looked at this edit?" aspect (i.e. mark as patrolled) in the above list. But maybe that's included in one of the existing bullet points?

Sep 4 2020, 2:06 PM · Developer-Advocacy (Oct-Dec 2020), User-bd808, Toolhub

Aug 15 2020

Xover added a comment to T260211: ProofreadPage page body template.

@kamholz I don't understand what it is you're proposing to do here, nor see how it will have applicability outside just Balinese content. From whence comes #transliterate and what does it do? Why hard-code <br> inside ProofreadPage and provide two copies of the text? Why can this not be done with a normal template?

Aug 15 2020, 8:03 AM · Patch-For-Review, ProofreadPage

Aug 8 2020

Xover updated subscribers of T259963: Multiple Index: and Page: wikipages for a single File:.

@kamholz You may possibly be interested in this, and there is some overlap with T259645.

Aug 8 2020, 9:42 AM · ProofreadPage, Wikisource
Xover created T259963: Multiple Index: and Page: wikipages for a single File:.
Aug 8 2020, 9:38 AM · ProofreadPage, Wikisource

Aug 5 2020

Xover added a comment to T259645: ProofreadPage should recognize language specification in Index.

You have a good point about transclusion. I haven't looked into that yet, but presumably the pagelang should be applied there too?

Aug 5 2020, 12:44 PM · MW-1.36-notes (1.36.0-wmf.5; 2020-08-18), ProofreadPage
Xover added a comment to T259645: ProofreadPage should recognize language specification in Index.

How would this interact with the ability to set the language for a single Page: page? How about mainspace transclusions of Page pages?

Aug 5 2020, 8:57 AM · MW-1.36-notes (1.36.0-wmf.5; 2020-08-18), ProofreadPage

Aug 3 2020

Xover added projects to T192866: Some DjVu files have too much metadata to fit in their database column: Structured Data Engineering, Multi-Content-Revisions.
Aug 3 2020, 2:03 PM · Structured-Data-Backlog, Multi-Content-Revisions, Structured Data Engineering, User-TheDJ, MediaWiki-File-management, Wikimedia-production-error, MediaWiki-DjVu, Multimedia, Commons
Xover updated subscribers of T192866: Some DjVu files have too much metadata to fit in their database column.

@Aklapper Thanks, but as @TheDJ notes, going by that overview nobody owns multimedia features in WMF wikis now. That's a pretty sad state of affairs given how central multimedia is for almost all the projects (including Wikidata and whatever "Abstract" will end up as).

Aug 3 2020, 2:00 PM · Structured-Data-Backlog, Multi-Content-Revisions, Structured Data Engineering, User-TheDJ, MediaWiki-File-management, Wikimedia-production-error, MediaWiki-DjVu, Multimedia, Commons

Jul 30 2020

Xover added a comment to T192866: Some DjVu files have too much metadata to fit in their database column.

Just to note: Multimedia was tagged as a key player here, but that Phab team seems to have been archived. Who owns the components previously in that group's remit now?

Jul 30 2020, 2:44 PM · Structured-Data-Backlog, Multi-Content-Revisions, Structured Data Engineering, User-TheDJ, MediaWiki-File-management, Wikimedia-production-error, MediaWiki-DjVu, Multimedia, Commons

Jul 23 2020

Xover updated subscribers of T258666: RevisionAccessException when trying to import files with FileImporter.

… the import code will try to load the previous revision immediately after creating it. …

Jul 23 2020, 4:18 PM · MW-1.36-notes (1.36.0-wmf.2; 2020-07-28), Patch-For-Review, MW-1.35-notes, WMDE-QWERTY-Sprint-2020-07-22, Platform Team Workboards (Clinic Duty Team), Internet-Archive, Move-Files-To-Commons, Wikimedia-production-error
Xover added a comment to T258666: RevisionAccessException when trying to import files with FileImporter.

Presuming T212428 is an API race condition triggered by replication lag, this seems to be a different issue.

Jul 23 2020, 4:02 PM · MW-1.36-notes (1.36.0-wmf.2; 2020-07-28), Patch-For-Review, MW-1.35-notes, WMDE-QWERTY-Sprint-2020-07-22, Platform Team Workboards (Clinic Duty Team), Internet-Archive, Move-Files-To-Commons, Wikimedia-production-error
Xover added a comment to T212428: includes/Revision/RevisionStore.php: Main slot of revision (number) not found in database!.

Note that T258666 looks like it could be an expression of this problem that was exacerbated by MediaWiki 1.36/wmf.1 (which is, I think, scheduled to hit the Wikipedias today). And as I commented there: I've never before seen this problem, and I've now seen 3 reports from 3 different projects in the last 24 hours. If so, this is not just a log spamming problem with the occasional user-visible weirdness any more.

Jul 23 2020, 3:49 PM · affects-translatewiki.net, Platform Team Workboards (Clinic Duty Team), User-brennen, Platform Team Initiatives (MCR), MW-1.33-notes (1.33.0-wmf.23; 2019-03-26), MediaWiki-Revision-backend, Multi-Content-Revisions (Reactive), Wikimedia-production-error
Xover added a comment to T255981: Persistant error 500 getting category members.

The FileImporter issue is apparently T258666 and not obviously related.

Jul 23 2020, 3:28 PM · Platform Team Workboards (Clinic Duty Team), Release-Engineering-Team, Upstream, Commons, ApiFeatureUsage, Pywikibot
Xover added a comment to T258666: RevisionAccessException when trying to import files with FileImporter.

I'd never seen the issue from T212428 (and I use FileImporter a lot), but in the last 24 hours I've seen this issue reported from enWS, jpWP, and by a Chinese user (not sure which project they call home) on Commons. If it's not actually more prevalent after deploying MediaWiki 1.36/wmf.1 that's a heck of a coincidence!

Jul 23 2020, 3:27 PM · MW-1.36-notes (1.36.0-wmf.2; 2020-07-28), Patch-For-Review, MW-1.35-notes, WMDE-QWERTY-Sprint-2020-07-22, Platform Team Workboards (Clinic Duty Team), Internet-Archive, Move-Files-To-Commons, Wikimedia-production-error