
Jul 20 2020

Dominicbm added a comment to T258122: *.archives.gov in wgCopyUploadsDomains allowlist doesn't work as URLs are HTTP 302 redirects to s3.amazonaws.com.

I'm afraid this is hard to solve... IIRC it's impossible to whitelist something smaller than a domain, and we're definitely not going to whitelist Amazon - that would be equivalent to whitelisting anything.
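A quick way to observe the redirect described here, sketched with the Python requests library (the URL is the one cited in T124080 further down this feed; the commented outputs are expectations, not captured results):

```
import requests

# Media URL from the related task T124080. The server answers with an
# HTTP 302 pointing at s3.amazonaws.com, so the *.archives.gov entry in
# $wgCopyUploadsDomains never matches the final host.
url = ('https://catalog.archives.gov/OpaAPI/media/299685/content/'
       'arcmedia/media/images/40/7/40-0678a.gif')
resp = requests.head(url, allow_redirects=False)
print(resp.status_code)              # expected: 302
print(resp.headers.get('Location'))  # expected: an s3.amazonaws.com URL
```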

Jul 20 2020, 1:32 PM · MW-1.36-notes (1.36.0-wmf.10; 2020-09-22), Platform Team Workboards (Clinic Duty Team), Commons, Wikimedia-Site-requests

Jul 16 2020

Dominicbm created T258122: *.archives.gov in wgCopyUploadsDomains allowlist doesn't work as URLs are HTTP 302 redirects to s3.amazonaws.com.
Jul 16 2020, 1:34 AM · MW-1.36-notes (1.36.0-wmf.10; 2020-09-22), Platform Team Workboards (Clinic Duty Team), Commons, Wikimedia-Site-requests

May 28 2020

Dominicbm added a comment to T253591: page generators can truncate responses when there is excessive metadata (e.g. DjVu/PDF files).

Update: using the default command, the generator takes 2-3 hours to complete for this large category. Setting step to 1, it was still unfinished 2 days later when I killed it, so that wasn't really feasible. I could experiment with other increments between 1 and 50, but I already know enough to see that any value low enough to prevent truncation and lost pages will make the operation take far too long to complete, since 50 was already very slow and still getting truncated.
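For reference, a sketch of capping the per-request page count in Pywikibot (assuming APISite.categorymembers and QueryGenerator.set_query_increment; the category name and increment are hypothetical):

```
import pywikibot

site = pywikibot.Site('commons', 'commons')
cat = pywikibot.Category(site, 'Category:Example')

# categorymembers() returns an api.PageGenerator, whose
# set_query_increment() caps how many members each API request fetches.
# As noted above, very low values make large categories painfully slow.
gen = site.categorymembers(cat)
gen.set_query_increment(10)
for page in gen:
    print(page.title())
```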

May 28 2020, 7:12 PM · Pywikibot-Commons, Pywikibot

May 26 2020

Dominicbm added a comment to T253591: page generators can truncate responses when there is excessive metadata (e.g. DjVu/PDF files).

The problem is that, for my use case, I really just want page titles, but I guess since Pywikibot wants to generate all the page objects using all the metadata, there is no way around this error currently.

CategorizedPageGenerator does not load the content by default. To get it you have to pass content=True as a parameter or use an explicit page.get() later.
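A minimal sketch of that distinction, with a hypothetical category name:

```
import pywikibot
from pywikibot import pagegenerators

site = pywikibot.Site('commons', 'commons')
cat = pywikibot.Category(site, 'Category:Example')

# Default: page objects are yielded without fetching their text.
for page in pagegenerators.CategorizedPageGenerator(cat):
    print(page.title())

# content=True preloads the page text along with the listing.
for page in pagegenerators.CategorizedPageGenerator(cat, content=True):
    text = page.text
```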

May 26 2020, 4:21 PM · Pywikibot-Commons, Pywikibot
Dominicbm added a comment to T253591: page generators can truncate responses when there is excessive metadata (e.g. DjVu/PDF files).
May 26 2020, 3:36 PM · Pywikibot-Commons, Pywikibot

May 25 2020

Dominicbm created T253591: page generators can truncate responses when there is excessive metadata (e.g. DjVu/PDF files).
May 25 2020, 9:14 PM · Pywikibot-Commons, Pywikibot
Pharos awarded T121912: Better redirect handling for pageview API a Like token.
May 25 2020, 5:45 PM · Analytics
Dominicbm added a comment to T129216: Pywikibot should support async chunked uploading.

In attempting to upload to Wikimedia Commons using chunked upload, I receive a whole series of warnings and errors, including (1) "WARNING: Unexpected offset.", (2) a large traceback from a read timeout (T253236), (3) a series of internal_api_error_DBQueryError responses and retries, and then finally (4) a stashfailed error that fails the upload attempt.

May 25 2020, 1:33 PM · Patch-Needs-Improvement, Pywikibot-Commons, Pywikibot-General, Pywikibot
Dominicbm created P11299 Pywikibot chunked upload.
May 25 2020, 1:21 PM

May 22 2020

Dominicbm added a comment to T216015: Massive image upload with pywikibot.

You can upload at a faster rate with a bot account. See https://commons.wikimedia.org/wiki/Commons:Bots
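For example, a user-config.py excerpt (assuming the write rate is governed by Pywikibot's put_throttle setting; the value is illustrative, and Commons bot policy still applies):

```
# user-config.py: minimum number of seconds between write operations,
# including uploads. Lower this only once the account has a bot flag.
put_throttle = 5
```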

May 22 2020, 5:18 PM · Pywikibot

May 21 2020

Dominicbm added a comment to T248238: PAWS 504 errors.

I think it happened for many hours, or maybe more than a day, but it is not an ongoing issue.

May 21 2020, 10:59 PM · PAWS
Dominicbm created T253355: "The required parameter \"$1\" was missing." for wbgetclaims on Commons.
May 21 2020, 10:44 PM · Wikidata-Campsite, Wikidata

May 20 2020

Dominicbm created T253236: Connection timeout errors generate massive traceback from underlying python libraries.
May 20 2020, 3:33 PM · good first task, Pywikibot-network, Pywikibot
Dominicbm created T253180: "ratelimited" API error does not throttle for rate limits.
May 20 2020, 1:05 AM · Pywikibot

May 7 2020

Dominicbm added a comment to T252079: mw.wikibase.getLabelByLang('Q1','en') returning nil today.

As @WilliamGraham notes, this is actually breaking some categorization on a large scale, and those categories may take a long time to repopulate once they have been depopulated. It's not just about infoboxes on categories, though. If I am understanding right, this presumably affects millions of pages on Wikimedia Commons, considering how widespread the use of templates like Template:Artwork, Template:Institution, and Template:Creator is in image metadata. The function might be called a dozen times on a single file page alone, such as for an artist name, genre, medium, and institution, and even the labels for basic terms like "height", "width", "collection", and "genre" displayed by the template are pulled in from Wikidata.

May 7 2020, 2:36 AM · MW-1.35-notes (1.35.0-wmf.32; 2020-05-12), Wikibase - Federated Properties (Sprint 5/6 (Hike 1)), User-Addshore, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞), Wikidata, Wikibase-Lua, MediaWiki-extensions-WikibaseClient

May 6 2020

Dominicbm updated subscribers of T249597: Assess approximate total upload size for DPLA projects.

It's a bit hard to estimate total size in bytes, especially because the total is a moving target, based on how many partners sign on. @SandraF_WMF I recently gave a webinar on DPLA's work (https://youtu.be/0BSoKSYBcBI). Basically, we are now actively doing outreach to our partners (essentially, any US-based GLAM) and doing uploads on request for partners that have appropriate rights and can give us the media. There are 37 million items in DPLA's dataset, of which 1.7 million are currently licensed appropriately for Commons upload. An item can have any number of media files, and we don't really know the aggregate number (or byte size), because we don't host them—we're just working with the partners to download the assets they host and reupload them to Commons.

May 6 2020, 4:24 PM · Commons

Mar 21 2020

Dominicbm created T248238: PAWS 504 errors.
Mar 21 2020, 5:45 PM · PAWS

Mar 20 2020

Dominicbm added a comment to T248151: Big number of uploads from DPLA bot.

Hi, this is me! 😳 If it's easier, I can get on Telegram or IRC to chat with you about my project. Obviously, I've been going at a high rate, but I don't really want to break Wikimedia!

Mar 20 2020, 12:56 PM · User-fgiunchedi, Operations, SRE-swift-storage, Commons

Jan 8 2020

Spinster awarded T242225: Document input format for all data types for wbcreateclaim, wbsetclaim, wbsetclaimvalue, etc. somewhere a Like token.
Jan 8 2020, 4:40 PM · Structured-Data-Backlog, Documentation, StructuredDataOnCommons, Wikidata
Dominicbm created T242225: Document input format for all data types for wbcreateclaim, wbsetclaim, wbsetclaimvalue, etc. somewhere.
Jan 8 2020, 2:33 PM · Structured-Data-Backlog, Documentation, StructuredDataOnCommons, Wikidata

Jan 7 2020

Dominicbm added a comment to T224214: Allow structured data to be added via API:Upload.

UploadWizard aside, this functionality would also be useful in particular for users relying on the API to do bulk uploads, since it would allow them to upload a file with structured data attached—in the same way the unstructured file description is currently included in the same request as the media upload. This would reduce the number of API requests needed to add structured data statements to new uploads.
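To illustrate, a sketch of the current two-request workflow that this proposal would collapse into one (the file name, source URL, and label are hypothetical; authentication is omitted, so the writes would be rejected without a logged-in session):

```
import requests

API = 'https://commons.wikimedia.org/w/api.php'
session = requests.Session()
csrf_token = session.get(API, params={
    'action': 'query', 'meta': 'tokens', 'format': 'json',
}).json()['query']['tokens']['csrftoken']

# 1) Upload the file (here a copy upload from a URL).
session.post(API, data={
    'action': 'upload', 'filename': 'Example.jpg',
    'url': 'https://example.org/example.jpg',
    'token': csrf_token, 'format': 'json',
})

# 2) Look up the new file page's ID, then attach structured data to its
#    MediaInfo entity (M<pageid>) in a second write request.
info = session.get(API, params={
    'action': 'query', 'titles': 'File:Example.jpg', 'format': 'json',
}).json()
pageid = next(iter(info['query']['pages']))
session.post(API, data={
    'action': 'wbeditentity', 'id': 'M' + pageid,
    'data': '{"labels": {"en": {"language": "en", "value": "Example"}}}',
    'token': csrf_token, 'format': 'json',
})
```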

Jan 7 2020, 7:43 PM · Structured-Data-Backlog, Structured Data Engineering

Oct 22 2019

Dominicbm added a comment to T188684: PAWS kills active users servers that are not connected to a user session.

I would like to give this a bump. If all of the rationales provided in https://www.mediawiki.org/wiki/PAWS#Why%3F are to be believed, then simply telling users to use Toolforge for any task that might need to run for more than an hour isn't the solution. I have made hundreds of thousands of edits via PAWS, using PWB, and it would be an incredible improvement to the experience, as a user, not to have to worry about keeping a browser session connected.

Oct 22 2019, 6:22 PM · WMSE-Tools-for-Partnerships-2019-Blueprinting, PAWS, Upstream

Nov 26 2018

Restricted Application added a project to T62257: VisualEditor: Add a shortcut for strikethrough ("which"?): User-Ryasmeen.

@Jdforrester-WMF Since macOS Mojave, Shift-Command-5 is now used for an additional screenshot option, which conflicts with strikethrough. Is it possible to reconfigure? https://support.apple.com/en-us/HT201236

Nov 26 2018, 2:15 PM · User-Ryasmeen, VisualEditor pre-2015 work, VisualEditor, Verified, VisualEditor-EditingTools, VE-deploy-2014-09-18

Nov 24 2018

Krinkle awarded T121912: Better redirect handling for pageview API a Orange Medal token.
Nov 24 2018, 12:54 AM · Analytics

Sep 13 2018

MusikAnimal awarded T121912: Better redirect handling for pageview API a Like token.
Sep 13 2018, 3:57 PM · Analytics

Sep 7 2017

Dominicbm created T175289: No way to clear/edit very large watchlists.
Sep 7 2017, 5:27 PM

Jun 9 2016

Dominicbm created T137423: Please add nara.gov to the wgCopyUploadsDomains whitelist of Wikimedia Commons.
Jun 9 2016, 2:18 PM · Patch-For-Review, Commons, Wikimedia-Site-requests

Jan 20 2016

Dominicbm added a comment to T124080: Please add archives.gov to $wgCopyUploadsDomains.

Thanks @Dereckson. I linked to the catalog record, which does have tiled images for dynamic zoom in the UI. However, the download link (and the media URLs in API outputs) points to a regular media file (e.g. https://catalog.archives.gov/OpaAPI/media/299685/content/arcmedia/media/images/40/7/40-0678a.gif).

Jan 20 2016, 5:59 PM · Patch-For-Review, Wikimedia-Site-requests, Commons

Jan 19 2016

Dominicbm created T124080: Please add archives.gov to $wgCopyUploadsDomains.
Jan 19 2016, 7:39 PM · Patch-For-Review, Wikimedia-Site-requests, Commons

Dec 18 2015

Dominicbm created T121912: Better redirect handling for pageview API.
Dec 18 2015, 9:34 PM · Analytics