
Dec 21 2022

Aklapper renamed AGutman from AGutman-WMF to AGutman.

Dec 21 2022, 1:54 PM

Nov 17 2022

AGutman added a comment to T320263: [Wikidata] Provide a feature to link Item labels to Lexemes.

There is a discussion of my stop-gap solution on the item where I added the literal translation property: https://www.wikidata.org/wiki/Talk:Q467#Lexemes

Nov 17 2022, 9:50 AM · Wikidata, Abstract Wikipedia team

Oct 25 2022

AGutman added a comment to T320263: [Wikidata] Provide a feature to link Item labels to Lexemes.

As a stopgap solution, I'm suggesting we use the literal translation property to link items to senses. As an example of its usage, I've linked Q467 to Hebrew L63925. This seems to work well for cases where the item corresponds to a single lexeme (sense).

Oct 25 2022, 2:12 PM · Wikidata, Abstract Wikipedia team

Oct 13 2022

AGutman added a comment to T320263: [Wikidata] Provide a feature to link Item labels to Lexemes.

Yes, you're right it makes more sense to link to a sense.

Oct 13 2022, 4:01 PM · Wikidata, Abstract Wikipedia team

Oct 11 2022

AGutman added a comment to T320263: [Wikidata] Provide a feature to link Item labels to Lexemes.

As said, both issues can be solved. The issue is that, as currently construed, the labels/descriptions are not really machine-readable: currently they are usable mostly for human consumption.

Oct 11 2022, 4:13 PM · Wikidata, Abstract Wikipedia team

AGutman added a comment to T320263: [Wikidata] Provide a feature to link Item labels to Lexemes.

Do you mean that it will clutter the UI or the database itself? If the former, this can be solved by selectively showing these links in the UI. If you are referring to cluttering the database itself, I agree this would require extra capacity, but I don't think it is unmanageable.

Oct 11 2022, 2:28 PM · Wikidata, Abstract Wikipedia team

Oct 7 2022

AGutman renamed T320263: [Wikidata] Provide a feature to link Item labels to Lexemes from [Wikidata] Linking items labels to lexemes to [Wikidata] Linking Item labels to Lexemes.

Oct 7 2022, 2:36 PM · Wikidata, Abstract Wikipedia team

AGutman created T320263: [Wikidata] Provide a feature to link Item labels to Lexemes.

Oct 7 2022, 2:35 PM · Wikidata, Abstract Wikipedia team

Oct 4 2022

AGutman added a comment to T317193: Add language codes for isiNdebele.

Ok, who is responsible for this approval? Could we ping them?

Oct 4 2022, 12:14 PM · Abstract Wikipedia team, Abstract Wikipedia NLG, MW-1.40-notes (1.40.0-wmf.6; 2022-10-17), Wikidata, Language codes

Oct 3 2022

AGutman added a comment to T317193: Add language codes for isiNdebele.

Can we go forward with nd & nr codes?

Oct 3 2022, 12:21 PM · Abstract Wikipedia team, Abstract Wikipedia NLG, MW-1.40-notes (1.40.0-wmf.6; 2022-10-17), Wikidata, Language codes

Sep 7 2022

AGutman claimed T317193: Add language codes for isiNdebele.

I'm taking care of the Ndebele language codes (nd & nr) in https://gerrit.wikimedia.org/r/828887.

Sep 7 2022, 3:14 PM · Abstract Wikipedia team, Abstract Wikipedia NLG, MW-1.40-notes (1.40.0-wmf.6; 2022-10-17), Wikidata, Language codes

AGutman claimed T307820: Prototype Abstract Wikipedia in Scribunto.

A prototype has now been created in https://meta.wikimedia.org/wiki/Module:Sandbox/AbstractWikipedia.

Sep 7 2022, 12:31 PM · Abstract Wikipedia team (Phase θ – Throttling), 2022 Wikimedia Google.org Fellowship

Sep 1 2022

AGutman added a comment to T289776: Enable all ISO 639-3 codes on Wikidata.

I would like to support the idea of adding all ISO 639-3 language codes to Wikidata (and Abstract Wikipedia). Notwithstanding @mrephabricator's comments, this standard is the de facto standard used to enumerate all the world's languages, and the Ethnologue, on which the standard is based, is generally accepted as a scientifically solid resource (even though it may contain some errors). The ideological background of SIL International is in my opinion irrelevant, but I must note that the claim that they have no linguistic background is completely false. In fact, this organization has conducted extensive linguistic fieldwork in numerous parts of the world, and many of its members are trained linguists, the most famous one being Kenneth Pike.

Sep 1 2022, 10:41 AM · Patch-For-Review, Abstract Wikipedia team, Wikidata, Language-Team (Language-2022-January-March), Language codes

Aug 10 2022

AGutman added a comment to T307820: Prototype Abstract Wikipedia in Scribunto.

I agree with @ori that it's worthwhile to attempt a Lua prototype of this.
This raises some design questions, however:

Where would the NLG templates be stored? Would they exist as special pages within Wikipedia (as Wikitext templates do, AFAIU)?
Would the NLG templates be compiled into Lua code at authoring time, or will they be interpreted by a Lua parser on the go? This affects the question of how calls to sub-templates should be handled - as normal function calls or as templates which need special parsing.
In general, how would one go about executing the functions embedded within template slots? The most straightforward possibility is to use Lua's loadstring function; however, this is currently disabled in Scribunto. Also, this would allow running arbitrary Lua code in a template slot, which is arguably too much. Another option is to parse the functional expressions in the slots and call the functions through the environment variable _G.
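The last option above (parsing the functional expression in a slot and looking the function up by name instead of evaluating arbitrary code) can be sketched as follows. This is only an illustration in Python, not the Scribunto prototype: the ALLOWED table stands in for a restricted lookup through _G, and the function names, the slot syntax, and the evaluate_slot helper are all hypothetical. The sketch handles only a single call with quoted string arguments, without nesting or commas inside arguments.

```python
import re

# Hypothetical whitelist of functions a template slot may call.
# In Lua, this role would be played by a (restricted) lookup in _G.
ALLOWED = {
    "upper": lambda s: s.upper(),
    "concat": lambda *parts: "".join(parts),
}

# Assumed slot syntax: name(arg1, arg2, ...), with quoted string arguments.
CALL = re.compile(r"\s*(\w+)\s*\((.*)\)\s*")


def evaluate_slot(expr):
    """Parse a slot expression and dispatch it to a whitelisted function."""
    match = CALL.fullmatch(expr)
    if match is None:
        raise ValueError(f"not a function call: {expr!r}")
    name, raw_args = match.groups()
    if name not in ALLOWED:
        # Unknown names are rejected instead of being evaluated.
        raise KeyError(f"function not allowed in a slot: {name}")
    # Naive argument splitting: no nested calls, no commas inside strings.
    args = [a.strip().strip('"') for a in raw_args.split(",")] if raw_args.strip() else []
    return ALLOWED[name](*args)
```

Compared with a loadstring-style approach, only the listed functions can ever run, which addresses the concern that a slot should not be able to execute arbitrary code.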

Aug 10 2022, 2:04 PM · Abstract Wikipedia team (Phase θ – Throttling), 2022 Wikimedia Google.org Fellowship

Jul 25 2022

AGutman added a comment to T236593: Cannot enter multiple forms for the same language variant.

@Asaf Insofar as two forms are considered distinct lexemes, it is probably the case that not all statements hold for both forms (e.g. the pronunciation may be different, and possibly other details such as etymology). If the two forms are close enough (e.g. just minor dialectal pronunciation details), then we may indeed lump them together in one lexeme as if they were spelling variants (and then my suggested patch may become relevant). Even if we decide to split them, we may of course link the two lexemes to each other, using various properties such as "synonym of" or "derived from" etc. Anyhow, my suggested patch would make it easier to lump such variants together, as it allows re-using the same basic language code for several spelling variants.

Jul 25 2022, 12:48 PM · Wikidata Lexicographical data, Wikidata

Jul 22 2022

AGutman added a comment to T236593: Cannot enter multiple forms for the same language variant.

@LucasWerkmeister I agree with you that if two variants have two different pronunciations, they should probably be split into two different lexemes (in general, I think we should avoid having multiple forms with the same grammatical features within one lexeme). There is some leeway in this rule, however, since different dialects may have slightly different pronunciations which we still want to group into a single lexeme/form. For instance, American English "color" and British English "colour" are in fact pronounced slightly differently, but it would be overkill to split them, since the difference in pronunciation is systematic between the dialects.

Jul 22 2022, 12:34 PM · Wikidata Lexicographical data, Wikidata

Jul 12 2022

AGutman added a comment to T236593: Cannot enter multiple forms for the same language variant.

I believe the current situation, where multiple forms are added to account for spelling variations, goes against the spirit of the lexicographical data model, and in particular the idea that there should be exactly one form for each combination of grammatical features. Therefore I think it is important to unblock this situation, and I think my proposal is a simple way to go forward.

Jul 12 2022, 3:34 PM · Wikidata Lexicographical data, Wikidata

Jun 30 2022

AGutman added a comment to T236593: Cannot enter multiple forms for the same language variant.

I've now created a patch that does allow associating several spelling variants with the same private language code.

Jun 30 2022, 12:24 PM · Wikidata Lexicographical data, Wikidata

Jun 24 2022

AGutman added a comment to T236593: Cannot enter multiple forms for the same language variant.

@Fnielsen as far as I can see, each variant spelling forms its own set of inflected forms, so you have a paradigm related to mørklægge and another paradigm related to the variant spelling mørkelægge. So conceptually you don't have a single list of forms, but rather two distinct lists of forms. For this reason (and since the pronunciation slightly differs) it may make sense to separate them into two distinct lexemes.

Jun 24 2022, 2:45 PM · Wikidata Lexicographical data, Wikidata

AGutman changed the status of T236593: Cannot enter multiple forms for the same language variant from Open to In Progress.

I'm working on a patch to allow multiple forms associated with the same private language code.

Jun 24 2022, 2:15 PM · Wikidata Lexicographical data, Wikidata

AGutman added a comment to T236593: Cannot enter multiple forms for the same language variant.

@Fnielsen given that the pronunciation of these forms is in fact different (according to the X-SAMPA notation), and each has its own distinct inflection set, I would treat these as two distinct (synonymous) lexemes. I don't see the advantage of lumping all these forms in one entry. Of course, in a dictionary intended for human consumption it is convenient to list them together, but in a machine-readable dictionary, such as Wikidata, these should really be treated as two distinct lexemes.

Jun 24 2022, 2:13 PM · Wikidata Lexicographical data, Wikidata

AGutman added a comment to T236593: Cannot enter multiple forms for the same language variant.

@mxn If these are purely orthographic variants (i.e. the pronunciation is the same) I would list them under a single lexeme. And in that case, the most natural way would be to list them as spelling variants rather than distinct forms.

Jun 24 2022, 11:05 AM · Wikidata Lexicographical data, Wikidata

AGutman added a comment to T236593: Cannot enter multiple forms for the same language variant.

In T236593#8016636, @Fnielsen wrote:

In Danish, we are currently using multiple forms and linking them with https://www.wikidata.org/wiki/Property:P8530 See also the discussion at https://www.wikidata.org/wiki/Wikidata:Property_proposal/Alternative_form

Jun 24 2022, 7:39 AM · Wikidata Lexicographical data, Wikidata

Jun 21 2022

AGutman added a comment to T236593: Cannot enter multiple forms for the same language variant.

The ideal solution would be to allow (in the language code validator) arbitrary language codes including a rank identifier. For instance, for Vietnamese one should be able to use codes such as vi-x-Q8201-1, vi-x-Q8201-2 etc. Currently this doesn't pass validation, as one gets the error Invalid Item ID "Q8201-1".
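A minimal sketch of the kind of check proposed here, written in Python purely for illustration. It is not the actual Wikibase validator: the pattern, the parse_private_code helper, and the assumption that the base language subtag is two or three lowercase letters are all hypothetical. The point is only that the optional rank is split off before the Item ID is checked, so that "Q8201-1" is no longer treated as a single (invalid) Item ID.

```python
import re

# Hypothetical pattern: base language subtag, "-x-", a Wikidata Item ID,
# and an optional numeric rank suffix (e.g. vi-x-Q8201-1).
PRIVATE_CODE = re.compile(
    r"(?P<lang>[a-z]{2,3})-x-(?P<item>Q[1-9][0-9]*)(?:-(?P<rank>[1-9][0-9]*))?"
)


def parse_private_code(code):
    """Return (language, item_id, rank) or None if the code is not accepted."""
    match = PRIVATE_CODE.fullmatch(code)
    if match is None:
        return None
    rank = match.group("rank")
    # The rank is optional, so codes without it remain valid.
    return match.group("lang"), match.group("item"), int(rank) if rank else None
```

Under these assumptions, vi-x-Q8201-1 and vi-x-Q8201-2 parse to the same language and Item ID with ranks 1 and 2, while a code without a rank, such as vi-x-Q8201, is still accepted.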

Jun 21 2022, 9:23 AM · Wikidata Lexicographical data, Wikidata