Mon, Mar 23
This does not yield a number I would expect. For a language like Khowar (khw), which is rarely (if ever) updated on Wikidata, searching "haslabel:khw" today returned a number close to 270k, which doesn't make much sense compared to the numbers in this table for that language.
I have a query, https://quarry.wmflabs.org/query/18763, that returns label/description/alias statistics for a set of languages (also mentioned in T197161#4300061). It was derived from a query @Pasleim used to update this table (which worked until the end of May) and its South Asian counterpart (which worked until wb_terms updates were turned off). I tried to rewrite the Quarry query to use the new databases (https://quarry.wmflabs.org/query/41692), but running it has not yet succeeded, either on Quarry or directly on tools-login. I am not sure whether the rewritten query can be simplified beyond what's written at present, so I was hoping there might be a better way of obtaining these statistics via SQL that does not time out.
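For clarity about what the query is meant to produce (independent of the SQL itself), the per-language counting can be sketched in Python over dump-style entity JSON. The entity shapes below are illustrative stand-ins, not real data:

```python
from collections import Counter

def term_counts(entities, languages):
    """Count labels, descriptions, and aliases per language
    across a collection of Wikidata-style entity dicts."""
    counts = {lang: Counter() for lang in languages}
    for entity in entities:
        for kind in ("labels", "descriptions", "aliases"):
            for lang, value in entity.get(kind, {}).items():
                if lang in counts:
                    # aliases are lists of terms; labels and
                    # descriptions are single terms
                    n = len(value) if isinstance(value, list) else 1
                    counts[lang][kind] += n
    return counts

entities = [
    {"labels": {"khw": {"value": "..."}, "bn": {"value": "..."}},
     "descriptions": {"bn": {"value": "..."}},
     "aliases": {"bn": [{"value": "..."}, {"value": "..."}]}},
    {"labels": {"khw": {"value": "..."}}},
]
stats = term_counts(entities, ["khw", "bn"])
```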
Fri, Mar 20
The use of the API works up to a point. I am noticing that for GeoJSON files at or above 250 KB I'm getting read timeouts when using Pywikibot. Is there any way to get past those errors?
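Pending a proper fix, retrying the save with backoff sometimes gets past transient read timeouts. A minimal sketch of such a wrapper (the helper and all names here are mine, not part of Pywikibot; raising Pywikibot's `socket_timeout` setting in the user config may also be worth trying):

```python
import time

def with_retries(fn, attempts=4, base_delay=1.0, retry_on=(TimeoutError,)):
    """Call fn(), retrying with exponential backoff when a
    timeout-like exception is raised; re-raise on the last attempt."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

# A fake "upload" that times out twice before succeeding,
# standing in for a large Data: page save.
calls = {"n": 0}
def flaky_upload():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("read timed out")
    return "ok"

result = with_retries(flaky_upload, base_delay=0)
```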
Jan 14 2020
Jan 7 2020
Dec 30 2019
Dec 16 2019
Nov 5 2019
So somehow @Sic19 managed to circumvent the size limit check with "Data:Canada/Nunavut.map" (I suppose AWB is old enough that it's not sensitive to the workings of tabular data). Is there a way to make the size limit check apply to the compacted data?
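For illustration, the gap is roughly the difference between the pretty-printed JSON a client submits and its compact serialization; a check that runs only on the former can be passed by content whose compacted form is far smaller. A sketch with hypothetical names and limit:

```python
import json

SIZE_LIMIT = 2 * 1024 * 1024  # hypothetical 2 MB limit, for illustration

def compact_size(data):
    """Bytes of the JSON with all optional whitespace removed."""
    return len(json.dumps(data, separators=(",", ":")).encode("utf-8"))

def pretty_size(data):
    """Bytes of the indented JSON, as a client might submit it."""
    return len(json.dumps(data, indent=4).encode("utf-8"))

data = {"data": [[i, i + 1, "cell"] for i in range(5000)]}
fits_when_compacted = compact_size(data) <= SIZE_LIMIT
```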
Oct 26 2019
As a note, this is desperately needed for a number of Indian languages (including Bengali), in which standardizing spellings according to one convention or another does not preclude other spellings from being considered acceptable (even in the absence of some characteristic unifying the alternative spellings).
Sep 14 2019
Sep 13 2019
Aug 25 2019
@StevenJ81 what are your thoughts on this request?
Aug 10 2019
32 months later, @Yurik, what's the status of implementing a new storage architecture for datasets (assuming a stopgap measure such as uploading JSON in compact form is somehow not tenable)? T200968 has officially opened the floodgates for uploading larger datasets, but even when the data is split into discrete chunks, there is still the issue of overshooting the 2MB limit. Take, for example, the boundaries of https://www.wikidata.org/wiki/Q338425 : how does one properly split the data into small chunks when the borders of its constituent elements are not known? (I'm sure people would upload those borders separately if they formed a partition, in the set-theoretic sense, of the district.)
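When the data does decompose into individual features, a greedy packer can at least keep each chunk under the limit. A sketch (the function names are mine, and note that this cannot help when a single feature is itself oversized, which is exactly the problem when the internal borders are unknown):

```python
import json

LIMIT = 2 * 1024 * 1024  # the 2 MB limit discussed above

def json_size(obj):
    """Bytes of the compact JSON serialization."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))

def split_features(collection, limit=LIMIT):
    """Greedily pack the features of a GeoJSON FeatureCollection into
    chunks whose compact serialization stays under `limit` bytes.
    Assumes no single feature exceeds the limit on its own."""
    chunks, current = [], []
    for feature in collection["features"]:
        candidate = {"type": "FeatureCollection", "features": current + [feature]}
        if current and json_size(candidate) > limit:
            chunks.append({"type": "FeatureCollection", "features": current})
            current = []
        current.append(feature)
    if current:
        chunks.append({"type": "FeatureCollection", "features": current})
    return chunks

features = [{"type": "Feature", "properties": {},
             "geometry": {"type": "Point", "coordinates": [i, i]}}
            for i in range(50)]
coll = {"type": "FeatureCollection", "features": features}
chunks = split_features(coll, limit=500)  # tiny limit, for demonstration
```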
Jul 14 2019
Jul 2 2019
Jun 21 2019
The bug, if there is one, is in GeoNames or wherever else @Lsj's bot obtains its geographic information. Ultimately it must be corrected there; otherwise there is still a risk of reintroducing it to Wikidata.
To present a more concrete use case for such functionality, @debt: the infoboxes on articles about localities in India have slots listing the national- and state-level parliamentary constituencies they're located in, along with the current representatives of those constituencies. These slots, which were at one point prefilled from Wikidata, no longer work, since the misuses of P585 ("point in time") on the Wikidata items for those constituencies that allowed such information to be present have been removed. In the absence of a property linking a Wikidata item about a constituency to an election involving that constituency (to use the example of a national-level constituency in Kolkata, from Q3348171 to Q63988950), it would be helpful to obtain a list of items which link back to an item for a constituency via something akin to haswbstatement (in this case, akin to haswbstatement:P1001=Q3348171), so that the most recent election information (electorate size, successful candidates, numbers of spoilt votes, and so on) could be obtained with a few more steps.
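For the direction that already works (finding items whose statement points at the constituency), the existing haswbstatement keyword can be driven through the standard search API. A sketch of assembling such a request (the URL construction here is mine; the keyword and the query/search API parameters are standard):

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"

def haswbstatement_url(prop, value, limit=50):
    """Search URL for items having a `prop` statement with value `value`,
    e.g. items whose P1001 points at Q3348171, per the example above."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f"haswbstatement:{prop}={value}",
        "srlimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)

url = haswbstatement_url("P1001", "Q3348171")
```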
May 22 2019
May 1 2019
Apr 30 2019
Apr 21 2019
@ArielGlenn It appears that particle physics is a massively collaborative enterprise, so that the results presented in a single paper can have thousands of people behind them, all of whom are credited (hence the particularly large revision size).
Apr 16 2019
Apr 15 2019
Apr 13 2019
Mar 21 2019
Mar 16 2019
https://quarry.wmflabs.org/query/28286 lists all pages in the Page: namespace below 500 bytes, in ascending order of page length (so that the shortest pages in the Page: namespace show up first). The commented lines, if uncommented, will list those pages that have already been proofread but not validated; this is presently based on links to the category "Proofread" rather than page properties, but I'm sure this can be fixed easily.
Mar 14 2019
While I believe it is possible to get a list of such pages via Quarry queries, being able to view these within the special pages themselves would be quite helpful (and not just for the two aforementioned namespaces, and not just for Wikisources either).
Feb 19 2019
I made the circular element for WikiProject India's regular logo (https://commons.wikimedia.org/wiki/File:WikiProject_India_bars.svg) as a substitute for the chakra in the center of the Indian flag, and not initially in the interest of having a logo fit perfectly inside a circular frame. (The logo which is present on the account to which David linked is meant to represent a datathon running from the 21st to the 24th and will be substituted with the regular logo at the datathon's conclusion.)
Feb 18 2019
That looks great!
Feb 17 2019
Is the language that odd? https://en.wikipedia.org/wiki/Okinawan_language
Feb 16 2019
Feb 15 2019
Feb 5 2019
Ideas regarding the open questions in the task description:
Feb 1 2019
Jan 26 2019
Concern has been raised about semi-protection of these items here.
Jan 8 2019
I suggested separate codes for separate scripts in this instance because of a situation in another language (Meitei) which is frequently used with two different scripts. While the CLDR indicates that the language exclusively uses the Eastern Nagari script, a contributor to Wikimedia projects (User:Awangba Mangang) has been providing localization exclusively in the Meetei Mayek script.
Nov 24 2018
I would recommend adding separate codes for ccp-beng and ccp-cakm since both writing systems (Eastern Nagari and Ajhapath) have been used to write the language.
Nov 18 2018
Oct 18 2018
Oct 17 2018
Aug 18 2018
Aug 2 2018
Jul 27 2018
Jul 23 2018
I have noticed this as well when I need to form a relatively small number of QuickStatements queries in a non-Bengali language (i.e. not in my interface language) and I have to remove "ইংরেজি" or some other language name from the text file I'm using as my canvas after ripping the results from Special:Search.
Jul 22 2018
Jul 19 2018
In related developments, we now have a LilyPond notation property. Here's hoping that it is put to good use.
Jul 18 2018
Jul 12 2018
I figured as much about sysops and bots being automatically patrolled. Sysops should be able to grant the autopatrol flag.
Jul 11 2018
Thank you for your comments @Ebe123; I will see if I can get a MediaWiki instance running on my laptop so I can examine the extension myself. I have noted your points regarding new bug reports.
Jul 9 2018
@Zoranzoki21: Yes, the Publisher: namespace should also be searched by default, in the same way that the main, Author:, Portal:, and Translation: namespaces are searched by default (or at least that's what an advanced search sets by default). Also note the comment I made on your patch.
Hi @Jonas: I blocked that particular range, among others allocated to Telefonica Germany, in an attempt to enforce this global ban. @Multichill and @MisterSynergy can tell you more about what has been going on. If you'd like me to lift that particular range block, I'd be happy to do this, but please be aware of what you may be allowing to happen.
Jul 7 2018
@Lydia_Pintscher (See the larger of the excerpts from my proposal on Wikidata:Project chat that Jc86035 included in the task description.) We initially stored population data as individual values on an item, but later began storing it as links to Commons tabular data (properties P1082 and P4179 respectively). We also store isolated mathematical expressions, not full derivations or proofs, as LaTeX snippets, and it would not surprise me if someone came around and proposed a property for links to derivations and proofs hosted on Commons (whether as raw LaTeX or otherwise). It would therefore make sense to initially store musical snippets that fit within the 400-character limit for strings as statements, and then later, once a MuseScore extension for MediaWiki comes to fruition (or if Extension:Score is significantly revamped), link to full sheet music on Commons. The specific examples on the property proposal that Jc86035 created should not cause copyright problems, and I personally would not have a problem if we need to restrict such a property's use on items for copyrighted music.
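The first stage of that plan amounts to a trivial length check against the 400-character string limit mentioned above; a sketch (the function is mine, not Wikibase code, and the snippet is an arbitrary LilyPond-style example):

```python
STRING_LIMIT = 400  # the string-value character limit mentioned above

def storable_as_statement(snippet, limit=STRING_LIMIT):
    """Whether a notation snippet fits the string datatype, i.e. can be
    stored directly as a statement value rather than as a link to
    full sheet music on Commons."""
    return len(snippet) <= limit

short = r"\relative c' { c4 d e f | g2 g }"
```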
Jul 3 2018
Jul 2 2018
Any reason why the language code for this isn't 'es-x-Q653884' or something similar, in the same way language codes are being devised for lexemes at present?
Jun 29 2018
@WMDE-leszek: no, I have also used wb_terms to clean up descriptions and aliases. For example, I will most likely find some time to move gom entries to gom-latn or gom-deva as appropriate (similar things are possible for ks, crh, gan, ku, ruq, shi, ug, and, with some difficulty, sr, tg, tt, and kk). (Apologies for the delayed reply.)
Jun 28 2018
Jun 19 2018
I do not operate any specific tool that uses wb_terms (I instead opt to write simple queries), but there are a couple of things I have found useful about that table:
May 24 2018
The language could be a superscript, as it is when labels for items in Special:Search are presented in a language different from the interface language, and the form could be a subscript. Borrowing Nicolas's example, L62 could be presented as "gwez <sub>noun</sub> <sup>Breton</sup> (L62)".
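The proposed rendering is simple to express; a small sketch in Python, using the HTML tags from the example above:

```python
def render_lexeme(lemma, category, language, lexeme_id):
    """Render a lexeme with its lexical category as a subscript and
    its language as a superscript, per the suggestion above."""
    return f"{lemma} <sub>{category}</sub> <sup>{language}</sup> ({lexeme_id})"

label = render_lexeme("gwez", "noun", "Breton", "L62")
```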
May 10 2018
At present the following (stripped of the "Index:" prefix) appear in only one of the two lists Andre linked to: