
ArthurPSmith (Arthur Smith)
User

User Details

User Since
Oct 5 2015, 2:29 PM (263 w, 3 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
ArthurPSmith [ Global Accounts ]

Recent Activity

Mon, Oct 12

ArthurPSmith added a comment to T258901: Generic viewer for ZObjects.

Change 633560 in Gerrit has this adjustment, though I'm not sure how to link it to this ticket.

Mon, Oct 12, 3:52 PM · Patch-For-Review, Abstract Wikipedia (Phase β)

Thu, Oct 8

ArthurPSmith updated subscribers of T265034: Wikilambda API help page broken.
Thu, Oct 8, 12:18 PM · Abstract Wikipedia (Phase β)
ArthurPSmith created T265034: Wikilambda API help page broken.
Thu, Oct 8, 12:18 PM · Abstract Wikipedia (Phase β)

Wed, Oct 7

ArthurPSmith added a comment to T258901: Generic viewer for ZObjects.

From stand-up today I'll look at modifying the Vue components to allow a "view" mode as opposed to edit mode.

Wed, Oct 7, 8:10 PM · Patch-For-Review, Abstract Wikipedia (Phase β)
ArthurPSmith claimed T258901: Generic viewer for ZObjects.
Wed, Oct 7, 8:09 PM · Patch-For-Review, Abstract Wikipedia (Phase β)
ArthurPSmith added a comment to T264827: Z2K2 currently expects string per default.

I think this should be dealt with by having the default value for Z2K2 be null, not '' (an empty string). Alternatively, just remove the Z2K2 from the default Persistent ZObject.
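A minimal sketch of the two options in the comment above, not taken from the WikiLambda code. The function name is hypothetical, and treating `Z1K1` as the type key is an assumption; the point is only that JSON `null` does not read as a string the way `''` does.

```python
import json


def make_empty_persistent_zobject(include_value=True):
    """Return a hypothetical default Persistent ZObject (Z2)."""
    # Assumption: Z1K1 is the key that holds the object's type.
    zobject = {"Z1K1": "Z2"}
    if include_value:
        # Option 1: keep Z2K2 but default it to null rather than ''.
        zobject["Z2K2"] = None
    # Option 2 (include_value=False): leave Z2K2 out of the default entirely.
    return zobject


print(json.dumps(make_empty_persistent_zobject()))
# prints {"Z1K1": "Z2", "Z2K2": null}
```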

Wed, Oct 7, 1:28 PM · Abstract Wikipedia (Phase λ)
ArthurPSmith added a comment to T263956: Work out how Lucas was able to save a Z6 as a Z2 and stop it from happening again.

Hi Lucas - the problem is a change in the data model so that all stored objects (with a ZID, i.e. a page in the namespace) are of type Z2 - Persistent ZObject. This is assumed by the editing UI, so it was very confused to see a Z6 instead of a Z2 as the stored object. A Z6 (string) should be stored as a Z2 with key Z2K2 being the value (the string itself). So yes, probably best to just delete those two for now.
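As a rough illustration of the rule described above (every stored object is a Z2, with the actual value under Z2K2), the following sketch wraps a bare value before saving and rejects anything else. The helper names are hypothetical and `Z1K1` as the type key is an assumption; this is not the extension's actual validation code.

```python
def wrap_as_persistent(value):
    """Wrap a value in a Z2 envelope, leaving existing Z2 objects alone."""
    if isinstance(value, dict) and value.get("Z1K1") == "Z2":
        return value
    # A plain string (Z6) becomes the Z2K2 value of a new Z2.
    return {"Z1K1": "Z2", "Z2K2": value}


def is_storable(zobject):
    """Only Persistent ZObjects (Z2) may be saved as a page."""
    return isinstance(zobject, dict) and zobject.get("Z1K1") == "Z2"


stored = wrap_as_persistent("hello")
print(stored)  # prints {'Z1K1': 'Z2', 'Z2K2': 'hello'}
```

A save hook that called `is_storable` first would have refused the bare Z6 instead of storing it.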

Wed, Oct 7, 1:24 PM · Abstract Wikipedia (Phase β)

Wed, Sep 30

ArthurPSmith added a comment to T258904: Special:CreateZObject.

Just a thought here - I would guess the main difference from the generic ZObject editor for the "Create" process is that there is no page ID yet. Presumably we want to generate the IDs automatically - Z3834 followed by Z3835, Z3836, etc. Is there a standard approach to do that reliably? I know Wikidata has had issues with ID values for its items (Qxxx...) getting skipped, but at least there doesn't seem to have been any issue with the same ID being generated twice on create. Anyway, whether or not that specific solution is adopted, it sounds like that's an extra piece that's different from how normal wiki pages generally work...
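One possible shape for such an allocator, as a sketch only: a counter that is incremented under a lock, so two concurrent creates can never receive the same ID. The class name is hypothetical, and a real deployment would presumably keep the counter in the database rather than in memory.

```python
import threading


class ZidAllocator:
    """Hand out sequential ZIDs; in-memory stand-in for a database counter."""

    def __init__(self, last_used):
        self._last = last_used
        self._lock = threading.Lock()

    def next_zid(self):
        # The increment and the read happen under one lock, so no two
        # callers can observe the same value.
        with self._lock:
            self._last += 1
            return "Z%d" % self._last


allocator = ZidAllocator(last_used=3833)
print(allocator.next_zid())  # prints Z3834
print(allocator.next_zid())  # prints Z3835
```

With this design an ID is consumed even if the create later fails, which would produce skipped IDs like those mentioned for Wikidata, but never duplicates.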

Wed, Sep 30, 11:51 PM · Patch-For-Review, Abstract Wikipedia (Phase β)

Wed, Sep 23

ArthurPSmith added a comment to T262447: Initial Vue implementation for ZObject editing.

@santhosh there are a few places where English words are used that should be i18n messages - do you know how this should work with Vue in MediaWiki? Is there a good example out there now?

Wed, Sep 23, 2:32 PM · Abstract Wikipedia (Phase β)
ArthurPSmith added a comment to T262447: Initial Vue implementation for ZObject editing.

Latest patchset (19) removes the Object.entries and destructuring bit; really this wasn't needed and if we're sticking with ES5 then it should go.

Wed, Sep 23, 2:06 PM · Abstract Wikipedia (Phase β)

Sep 23 2020

ArthurPSmith added a comment to T262447: Initial Vue implementation for ZObject editing.

and another patchset to fix eslint complaints...

Sep 23 2020, 1:33 AM · Abstract Wikipedia (Phase β)
ArthurPSmith added a comment to T262447: Initial Vue implementation for ZObject editing.

Patchset 15 fixes the issue with creating new ZObjects! However, it also unintentionally altered the package-lock.json file; I'm not quite sure what happened to change that...

Sep 23 2020, 1:03 AM · Abstract Wikipedia (Phase β)

Sep 22 2020

ArthurPSmith added a comment to T262447: Initial Vue implementation for ZObject editing.

Patchset 11: call makeEmptyContent if ZObject is new

Sep 22 2020, 5:35 PM · Abstract Wikipedia (Phase β)
ArthurPSmith added a comment to T262447: Initial Vue implementation for ZObject editing.

And patchset 10 - this was using an API to fetch the labels of keys in the old implementation; obviously that's not available (yet) here, and we might prefer a different solution anyway. So I stripped that out.

Sep 22 2020, 3:19 PM · Abstract Wikipedia (Phase β)
ArthurPSmith added a comment to T262447: Initial Vue implementation for ZObject editing.

Re-pull from gerrit worked (I think). I've uploaded patchset 9 - basically I've abandoned the special handling for labels as too much has changed there from the old implementation, and it all sort of works (more crudely) with the OtherKeys implementation anyway. No language drop-down any more though... I would have had to completely rewrite those two components.
However, I'm thinking the longer term solution here is to write special components like those for each ZObject type, which will allow special things like the language handling. I sort of have special handling now for lists (Z10) and strings (Z6), though only because those are built in to the representation.

Sep 22 2020, 3:11 PM · Abstract Wikipedia (Phase β)
ArthurPSmith added a comment to T262447: Initial Vue implementation for ZObject editing.

Hmm, thanks for fixing those things. However, now I'm not sure how to submit my further changes - I get the following message after doing a merge with your changes and then trying 'git review -R':

Sep 22 2020, 2:48 PM · Abstract Wikipedia (Phase β)
ArthurPSmith added a comment to T262447: Initial Vue implementation for ZObject editing.

Two more patches to fix up ES5 vs ES6 issues (mostly const/let -> var) in the JavaScript, and also to address npm test and SonarQube complaints.

Sep 22 2020, 1:24 PM · Abstract Wikipedia (Phase β)

Sep 18 2020

ArthurPSmith added a comment to T262447: Initial Vue implementation for ZObject editing.

And just added another patch to fix the types and labels list (temporarily - these should really come from the data itself somehow).

Sep 18 2020, 2:16 PM · Abstract Wikipedia (Phase β)
ArthurPSmith added a comment to T262447: Initial Vue implementation for ZObject editing.

Not sure what's going on with the language stuff, but it sounds like it might be a local problem on my end. I'll probably reload everything at some point and see if it goes away.

Sep 18 2020, 1:32 PM · Abstract Wikipedia (Phase β)

Sep 17 2020

ArthurPSmith added a comment to T262447: Initial Vue implementation for ZObject editing.

Yeah, this is a choice I made for now to make things simpler in the model handling, but also to be closer to how MediaWiki does things. The display values and accepted short-codes are available from LanguageNameUtils, thusly:

Sep 17 2020, 1:35 PM · Abstract Wikipedia (Phase β)

Sep 16 2020

ArthurPSmith added a comment to T262447: Initial Vue implementation for ZObject editing.

very rough version (it displays stuff but I don't think any edit functionality is working yet) in gerrit - in progress! https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikiLambda/+/627887

Sep 16 2020, 5:35 PM · Abstract Wikipedia (Phase β)
ArthurPSmith added a comment to T263000: Have a type for the language on MonoLingualString.

For a reference on the (exceptionally large) number of ways that Wikidata is currently handling languages, see Lea's draft table here:
https://www.wikidata.org/wiki/User:Lea_Lacroix_(WMDE)/List_of_lists_of_languages

Sep 16 2020, 1:11 PM · Abstract Wikipedia (Phase δ)

Sep 15 2020

ArthurPSmith added a comment to T262447: Initial Vue implementation for ZObject editing.

This is moving along, but I'm noticing that currently there are some significant differences - in particular in language handling and I'm not sure how to proceed. The old 'abstracttext' had a list of languages that were themselves ZObjects - of type Z180 (language). This allowed having both the short code ('en') and a full label ('English', 'Anglais', etc.) in whatever the chosen current language was. I guess for now I'll just use and display the short codes and allow entry of whatever code they want rather than having a drop-down of languages, but maybe longer term we would want a different choice here?

Sep 15 2020, 3:22 PM · Abstract Wikipedia (Phase β)

Sep 9 2020

ArthurPSmith added a comment to T262447: Initial Vue implementation for ZObject editing.

From the standup meeting today, I'm going to try working on this over the next week or two. However, I had a basic question on how to proceed. I'll do it on its own branch for now, so merging shouldn't be an issue right away. The question: in google/abstracttext I implemented this as a separate "new edit" button, but I think it would be better to just replace the regular "edit" page for the ZObject namespace. Any strong opinions on this? This is going to be pretty experimental to start with, so of course we can change things drastically here!

Sep 9 2020, 5:33 PM · Abstract Wikipedia (Phase β)
ArthurPSmith added a subtask for T258903: Generic editor for ZObjects: T262447: Initial Vue implementation for ZObject editing.
Sep 9 2020, 5:27 PM · Abstract Wikipedia (Phase β)
ArthurPSmith added a parent task for T262447: Initial Vue implementation for ZObject editing: T258903: Generic editor for ZObjects.
Sep 9 2020, 5:27 PM · Abstract Wikipedia (Phase β)
ArthurPSmith created T262447: Initial Vue implementation for ZObject editing.
Sep 9 2020, 5:25 PM · Abstract Wikipedia (Phase β)

Sep 8 2020

ArthurPSmith added a comment to T260315: Create a file containing the core types to upload to wiki.

We chose to not risk split-brain issues with the title of the object stored in both MW's metadata and also inside the object in the Z2K1 key, so we splice in the value ...

But that value is kind of critical as an identifier within and between ZObjects - the prefix for related keys for types at least, and of course the value of any references. It means they don't make sense without the MW metadata.

Sep 8 2020, 1:06 PM · Abstract Wikipedia (Phase β)

Sep 7 2020

ArthurPSmith added a comment to T260315: Create a file containing the core types to upload to wiki.
Sep 7 2020, 2:42 PM · Abstract Wikipedia (Phase β)
ArthurPSmith added a comment to T260315: Create a file containing the core types to upload to wiki.

I was thinking about this - if we want a single file, why not make it a ZObject itself, i.e. a JSON formatted list (Z10) of these Persistent ZObjects? It will need some sort of script like the main MediaWiki maintenance/importTextFiles.php to run I assume...

Sep 7 2020, 2:10 PM · Abstract Wikipedia (Phase β)

Sep 4 2020

ArthurPSmith added a comment to T260314: Hardcode certain inalienable truths.

This sounds like protection levels? I.e. some things editable by anyone, some only by confirmed users, some only by admins, or other levels. However, you want it granular so that labels can be at a different protection level from other keys. Wikidata has desired something similar, to allow granular protection of some statements while allowing others within the same item to be editable by anyone. Anyway, would using standard MediaWiki protection procedures work for this for now?

Sep 4 2020, 5:54 PM · Abstract Wikipedia (Phase γ)
ArthurPSmith added a comment to T258904: Special:CreateZObject.

Is this the right design page: https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Early_mockups#Create_an_object ?
I guess my question is whether the create page should be separated from the edit page (Special:CreateZObject as outlined in the approach here) or just go to a generic edit page which can do everything... ??

Sep 4 2020, 4:17 PM · Patch-For-Review, Abstract Wikipedia (Phase β)
ArthurPSmith added a comment to T258904: Special:CreateZObject.

I notice this is (so far) being implemented as a plain form with OOUI - I think this is fine, though it limits what you can initially create. But Wikidata's create item is similarly limited to just the label, description and aliases in a given language. So I would suggest this should also be limited to the Z1 keys (type, label and description in a single language), with the id (Z1K2) auto-generated. Maybe I should check out the "designs" - I'm assuming the metawiki pages? (Edit: I just realized that the current design has Z2 - Persistent ZObject - replacing most of the Z1 keys, so the id and labels come from Z2 now - anyway the same point stands: the initial creation of the object could perhaps use only the required keys for a Persistent ZObject.)

Sep 4 2020, 4:05 PM · Patch-For-Review, Abstract Wikipedia (Phase β)

Jul 30 2020

ArthurPSmith added a member for Abstract Wikipedia: ArthurPSmith.
Jul 30 2020, 2:53 PM

Jul 21 2020

ArthurPSmith added a comment to T243701: Wikidata maxlag repeatedly over 5s since Jan 20, 2020 (primarily caused by the query service).

Something seems to be going on very recently that's a different pattern - did something change on the infrastructure side, or is there a change in usage pattern for the last few hours? Basically maxlag (WDQS lag specifically) has NOT gone below 5 (5 minutes for WDQS) for more than 1 hour. This hasn't happened, as far as I can tell, for many days, perhaps weeks or months. Typically maxlag recovers when bots stop editing after about 20-30 minutes, sometimes it takes almost an hour, but this is the longest delay in a long time. Specifically around 2020-07-21 14:04 the lag went over 5, and as of 15:18 it's grown to over 16.
(editing) finally at 15:35 it's coming down again (down to about 12 now).

Jul 21 2020, 3:19 PM · Wikidata-Campsite, Traffic, Performance Issue, Operations, Discovery, Wikidata-Query-Service, Wikidata

Jul 20 2020

ArthurPSmith added a comment to T242081: Pywikibot fails to access Wikidata due to high maxlag lately.

The purpose of checking maxlag is to slow the rate of EDITS to Wikidata. I don't understand why Pywikibot is using it as a reason not to READ data. There are surely a vast number of other applications out there that read from Wikidata (and query WDQS) without checking maxlag!

Jul 20 2020, 7:58 PM · Patch-For-Review, Upstream, Pywikibot, Pywikibot-tests

Jun 23 2020

ArthurPSmith added a comment to T255587: Vue should not use loadHTML.

Ah, I just noted on T253334 - I don't think RemexHtml is the right solution either - Vue templates also are not really HTML, as they include "elements" not in the HTML standard, and parsers may not handle them correctly. I ran into this just now using wmf/1.35.0-wmf.38 which has the RemexHtml parser, where I have an HTML table that has some of its rows provided by another Vue component:

Jun 23 2020, 8:51 PM · MW-1.35-notes (1.35.0-wmf.38; 2020-06-23), Vue.js, Performance-Team, MediaWiki-ResourceLoader
ArthurPSmith added a comment to T253334: Vue ResourceLoader support: JavaScript in script elements parsed as HTML leading to parsing error.

I don't think this problem was resolved correctly. What looks like HTML in templates is ALSO not really HTML. In particular, the current ResourceLoader does not handle <table> elements correctly when there is an internal component in the table, something like:

<table><tbody> <tr><th>header...</th></tr> <internal-tr-component ...></internal-tr-component> </tbody></table>

The current parsing is pulling out the "internal-tr-component" as a separate element outside of the table. This is wrong - templates should be left alone! I think an XML parser that doesn't understand HTML at all might be best for this?
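To illustrate the suggestion, here is a small sketch using Python's standard-library XML parser on a cleaned-up version of the template above (the `...` placeholders are dropped so the input is well-formed). Because an XML parser has no HTML table rules, the custom element stays where it was written. This only demonstrates the idea; it is not how ResourceLoader processes templates.

```python
import xml.etree.ElementTree as ET

# Cleaned-up version of the template from the comment above.
TEMPLATE = (
    "<table><tbody>"
    "<tr><th>header</th></tr>"
    "<internal-tr-component></internal-tr-component>"
    "</tbody></table>"
)

root = ET.fromstring(TEMPLATE)
tbody = root.find("tbody")
child_tags = [child.tag for child in tbody]
print(child_tags)  # prints ['tr', 'internal-tr-component']
```

One caveat: a strict XML parser has limits of its own, since Vue shorthand attributes such as `@click` are not valid XML attribute names and would likely be rejected.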

Jun 23 2020, 8:41 PM · Performance-Team (Radar), MediaWiki-ResourceLoader, Vue.js

Jun 2 2020

ArthurPSmith committed rTPTAB48de3f961cc5: Fix for changed behavior of iter() and dicts (ordering is based on insertion… (authored by ArthurPSmith).
Jun 2 2020, 11:21 PM
ArthurPSmith added a member for Tool-Wikidata-Periodic-Table: ArthurPSmith.
Jun 2 2020, 12:12 PM

Apr 8 2020

ArthurPSmith added a comment to T249687: gadget to add external ID as reference.

Thanks for creating this! I'm not sure what the standard citation reference for an external ID is, but what I've been using is:

  • stated in (P248) the value of "subject item of this property" (P1629) for that external ID property, if any
  • external ID property with value from the item
  • retrieved (P813) on the current date.

So it would be nice if this gadget could add these three (or 2 if no P1629 value) statements as a reference with a simple interaction...
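The reference pattern above could be sketched as follows. The function name and the placeholder IDs `P_EXT` and `Q_SOURCE` are hypothetical; only P248, P813 and P1629 come from the comment, and the output is a plain list of (property, value) pairs rather than the real Wikibase snak JSON.

```python
from datetime import date


def build_external_id_reference(ext_property, ext_value,
                                stated_in_item=None, retrieved=None):
    """Build the two or three reference statements for an external ID."""
    retrieved = retrieved or date.today()
    reference = []
    if stated_in_item is not None:
        # "stated in" (P248): the P1629 value of the external ID property.
        reference.append(("P248", stated_in_item))
    # The external ID property itself, with the value from the item.
    reference.append((ext_property, ext_value))
    # "retrieved" (P813): the current date.
    reference.append(("P813", retrieved.isoformat()))
    return reference


ref = build_external_id_reference("P_EXT", "12345", stated_in_item="Q_SOURCE",
                                  retrieved=date(2020, 4, 8))
print(ref)
# prints [('P248', 'Q_SOURCE'), ('P_EXT', '12345'), ('P813', '2020-04-08')]
```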

Apr 8 2020, 4:12 PM · Wikidata-Gadgets, patch-welcome, Wikidata

Mar 18 2020

ArthurPSmith placed T112140: Provide a wrapper function in pywikibot around wbparsevalue up for grabs.

Unassigning, I'm not working on this any more!

Mar 18 2020, 4:13 PM · Patch-Needs-Improvement, Pywikibot, Pywikibot-Wikidata
ArthurPSmith closed T160205: Add interstitial to wikidata-externalid-url as Declined.

Wow, was that really almost 3 years ago? There doesn't seem to really be a need for this, so I'm closing the request as declined.

Mar 18 2020, 4:09 PM · Tools, Wikidata
ArthurPSmith closed T160205: Add interstitial to wikidata-externalid-url, a subtask of T150939: Replace https://tools.wmflabs.org/wikidata-externalid-url by providing improved handling for external id formatter urls, as Declined.
Mar 18 2020, 4:09 PM · MediaWiki-extensions-WikibaseRepository, Wikidata

Feb 14 2020

ArthurPSmith added a comment to T119226: Very small (or very large) quantity values (represented in scientific notation) result in error in add/update via pywikibot/wikidata API.

Sorry I never got around to looking at this further. @DD063520 do you understand the above comment from @thiemowmde about using the wbparsevalue api rather than python internals?

Feb 14 2020, 1:53 PM · Pywikibot, Pywikibot-Wikidata, Wikidata

Feb 12 2020

ArthurPSmith added a comment to T243701: Wikidata maxlag repeatedly over 5s since Jan 20, 2020 (primarily caused by the query service).

@Bugreporter

I think increasing the factor will not make things better; it only increases the oscillation period

Feb 12 2020, 10:01 PM · Wikidata-Campsite, Traffic, Performance Issue, Operations, Discovery, Wikidata-Query-Service, Wikidata

Feb 11 2020

ArthurPSmith added a comment to T238045: Improve parallelism in WDQS updater.

Possibly relevant comment here: I believe there is a plan also to move to incremental updates (updating only the statements/triples that have changed) so it is probably important that any parallelism in updating be coordinated so that updates for the same item (Q value) be grouped together and done in the same process, so they don't clobber one another. Updates for separate items (different Q values) can be handled in parallel as the associated RDF triples are independent (the subject of a triple is always the item, a statement on the item, or a further node derived from the item). Even without that incremental update process, grouping updates on the same item together could be a significant speed boost, as 5 updates for Q9999 can be collapsed into just the last update under the current procedure of completely rewriting the triples.
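A sketch of the grouping idea, under the assumption that updates arrive as (item ID, revision) pairs in order; both function names are hypothetical and this is not the WDQS updater's code. Collapsing keeps only the last update per item, and routing by item ID guarantees that the same item always lands on the same worker.

```python
def collapse_updates(updates):
    """Keep only the most recent revision seen for each item."""
    latest = {}
    for item_id, revision in updates:
        # Later entries overwrite earlier ones for the same item.
        latest[item_id] = revision
    return latest


def assign_worker(item_id, worker_count):
    """Route every update for one item to the same worker."""
    return int(item_id[1:]) % worker_count


queue = [("Q9999", 1), ("Q42", 7), ("Q9999", 2), ("Q9999", 3)]
print(collapse_updates(queue))  # prints {'Q9999': 3, 'Q42': 7}
print(assign_worker("Q9999", 4))  # prints 3
```

Collapsing is only safe while each update rewrites all of an item's triples; with incremental updates the intermediate revisions would need to be applied in order instead.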

Feb 11 2020, 7:45 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata

Feb 7 2020

ArthurPSmith added a comment to T243701: Wikidata maxlag repeatedly over 5s since Jan 20, 2020 (primarily caused by the query service).

Over the past weeks, we noticed a huge increase of content in Wikidata. Maybe that's something worth looking at?

Wikidata content is growing at a fast and steady pace and has been for a few years now. For the last few months it's been expanding at a rate of around 3,500,000 new pages per month. So that seems unlikely to be connected.

Feb 7 2020, 1:34 AM · Wikidata-Campsite, Traffic, Performance Issue, Operations, Discovery, Wikidata-Query-Service, Wikidata

Feb 4 2020

ArthurPSmith added a comment to T243701: Wikidata maxlag repeatedly over 5s since Jan 20, 2020 (primarily caused by the query service).

@Addshore and others - the problem has deteriorated since Saturday - see this discussion on Wikidata: https://www.wikidata.org/wiki/Wikidata:Contact_the_development_team/Query_Service_and_search#WDQS_lag

Feb 4 2020, 4:47 PM · Wikidata-Campsite, Traffic, Performance Issue, Operations, Discovery, Wikidata-Query-Service, Wikidata

Jan 19 2020

ArthurPSmith added a comment to T221774: Add Wikidata query service lag to Wikidata maxlag.

[...]

Note that this dashboard includes metrics for both pooled and depooled servers.
So whatever you read there will likely also include data for servers that you can't actually query, and whose lag you are therefore not seeing via the query service

Jan 19 2020, 9:28 PM · MW-1.35-notes (1.35.0-wmf.31; 2020-05-05), User-Addshore, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞), MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), Patch-For-Review, observability, Wikidata-Query-Service, Wikidata

Jan 18 2020

ArthurPSmith added a comment to T221774: Add Wikidata query service lag to Wikidata maxlag.

@Bugreporter well something must have changed early today - was it previously "mean" and is now "median"? I'm not sure which is better, but having WDQS hours out of date (we're over 4 hours now) is NOT a good situation, and what this whole task was intended to avoid! @Pintoch any thoughts on this?

Jan 18 2020, 3:05 AM · MW-1.35-notes (1.35.0-wmf.31; 2020-05-05), User-Addshore, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞), MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), Patch-For-Review, observability, Wikidata-Query-Service, Wikidata

Jan 17 2020

ArthurPSmith added a comment to T240442: Design a continuous throttling policy for Wikidata bots.

Just saw this - I'm wondering technically how you would implement it? You could generate a random number between 2.5 and 5, and if maxlag is greater than your random number deny the edit?
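The idea in the comment can be sketched in a few lines. The function name and the injectable `rng` parameter are illustrative choices, not part of any existing bot framework. The effect is that an edit is never denied at or below a maxlag of 2.5, always denied above 5, and denied with linearly increasing probability in between.

```python
import random


def should_deny_edit(maxlag, low=2.5, high=5.0, rng=random):
    """Deny the edit if maxlag exceeds a threshold drawn from [low, high]."""
    threshold = rng.uniform(low, high)
    return maxlag > threshold


print(should_deny_edit(2.0))  # prints False
print(should_deny_edit(6.0))  # prints True
```

At a maxlag of 3.75, the midpoint of the range, roughly half of all edits would be denied.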

Jan 17 2020, 8:13 PM · Wikidata
ArthurPSmith added a comment to T221774: Add Wikidata query service lag to Wikidata maxlag.

Am I misreading this graph? https://grafana.wikimedia.org/d/000000489/wikidata-query-service?panelId=8&fullscreen&orgId=1&from=now-12h&to=now&refresh=10s It looks like the query service lag for 3 of the servers has been growing steadily for the past roughly 8 hours. However, edits are going through. Did something change in the maxlag logic somewhere earlier today?

Jan 17 2020, 8:09 PM · MW-1.35-notes (1.35.0-wmf.31; 2020-05-05), User-Addshore, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞), MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), Patch-For-Review, observability, Wikidata-Query-Service, Wikidata

Dec 11 2019

ArthurPSmith closed T240371: Maxlag=5 for Author Disambiguator, a subtask of T240369: Chase up bot operators whose bot keeps running when the dispatch lag is higher than 5, as Resolved.
Dec 11 2019, 3:36 PM · Wikidata
ArthurPSmith closed T240371: Maxlag=5 for Author Disambiguator as Resolved.

Marking as resolved...

Dec 11 2019, 3:36 PM · Wikidata
ArthurPSmith added a comment to T240371: Maxlag=5 for Author Disambiguator.

I increased the default number of retries to 12, so it will now retry for up to an hour. I think we're good here?

Dec 11 2019, 2:55 PM · Wikidata
ArthurPSmith added a comment to T240371: Maxlag=5 for Author Disambiguator.

(A) Pintoch's patch has been applied, and (B) I also increased the retry time from 5 seconds to 5 minutes - that still means an edit will fail after 25 minutes if maxlag doesn't drop, with only 5 retries. Is there a consensus to retry for an hour? Or if there's a better standard for handling retries let me know!
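The retry arithmetic discussed in this ticket can be sketched as below. The names `MaxlagError` and `edit_with_retries` are hypothetical and this is not the Author Disambiguator's code. With a 5 minute wait, 5 retries give up after 25 minutes and 12 retries after an hour.

```python
import time


class MaxlagError(Exception):
    """Raised by the (hypothetical) edit callable when maxlag is too high."""


def edit_with_retries(do_edit, retries=5, wait_seconds=300, sleep=time.sleep):
    """Try an edit, waiting wait_seconds after each maxlag failure."""
    for _ in range(retries):
        try:
            return do_edit()
        except MaxlagError:
            # Back off before the next attempt; total wait is at most
            # retries * wait_seconds.
            sleep(wait_seconds)
    raise MaxlagError("maxlag still too high after %d retries" % retries)
```

Passing `sleep` as a parameter keeps the function testable without actually waiting, for example by passing `waits.append` to record the delays.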

Dec 11 2019, 2:50 PM · Wikidata

Oct 22 2019

ArthurPSmith added a comment to T235247: Prepare lexeme workshop at WikidataCon 2019.

Here's a draft of slides for our workshop. Please feel free to edit this. Also I think you wanted to cover a bit more basics of how lexemes are put together - maybe that should go first? This was mostly just gathering statistics and data and then a page of questions at the end...

Oct 22 2019, 1:46 PM · WMSE-FindingGLAMs-2018 (Events), User-Alicia_Fagerving_WMSE

Oct 21 2019

ArthurPSmith added a comment to T235247: Prepare lexeme workshop at WikidataCon 2019.

I am uploading files with data on the counts of forms and senses by date for the last year (also totals in last column). There may have been some issues with this - it comes from the Lexicographical statistics pages that are generated from WDQS queries, so there were a few periods where I think the numbers were off. Anyway, it should be close to correct for most of the time period. So we can plot this along a bit of a timeline for the last year I think?

Oct 21 2019, 2:22 PM · WMSE-FindingGLAMs-2018 (Events), User-Alicia_Fagerving_WMSE

Oct 18 2019

ArthurPSmith added a comment to T235247: Prepare lexeme workshop at WikidataCon 2019.

I just did some exploring but I don't think Quarry will help with forms and senses - at least they're not "pages" in themselves with their own namespace. Actually I couldn't figure out where they were in the database schema at all... Anyway, I think I can get some rough numbers from looking at the stats page as it has changed over time, I will work on this.

Oct 18 2019, 6:55 PM · WMSE-FindingGLAMs-2018 (Events), User-Alicia_Fagerving_WMSE

Oct 11 2019

ArthurPSmith added a comment to T235247: Prepare lexeme workshop at WikidataCon 2019.

I was thinking some graphics on the growth of lexemes, forms, and senses would be good - do we already have that somewhere?

Oct 11 2019, 6:50 PM · WMSE-FindingGLAMs-2018 (Events), User-Alicia_Fagerving_WMSE

Sep 26 2019

ArthurPSmith added a comment to T233763: Error searching Lexeme:Danke on Wikidata: "Call to a member function setFragment() on null".

If you go to the search page and select "Lexeme" as the only namespace you get the same error with "thanks" in the search box, but "thank" alone works fine - the two lexemes that match are L3798 (verb) and L28468 (noun).

Sep 26 2019, 7:21 PM · MW-1.35-notes (1.35.0-wmf.1; 2019-10-08), Wikidata Lexicographical data, Discovery-Search, Wikimedia-production-error, Wikidata

Sep 12 2019

ArthurPSmith added a comment to T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites.

The Basque collection is even more complete now!
I do think some customization may be needed for Lexemes due to the different structure - the forms and senses etc. Perhaps the most useful link for a Wiktionary may be from words to senses to Wikidata items via the "item for this sense" property. That in principle allows translations to be provided, grouped by sense.

Sep 12 2019, 5:21 PM · Wiktionary, Wikidata, Wikidata Lexicographical data

Aug 1 2019

ArthurPSmith added a comment to T229604: Several selectors/experts are broken.

I see the problem also (Safari browser). When you talk about it affecting lexemes, where do you see that? I experimented with adding a form and that seemed fine.

Aug 1 2019, 6:35 PM · MW-1.34-notes (1.34.0-wmf.16; 2019-07-30), User-Ladsgroup, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞), Regression, Wikidata

Feb 18 2019

ArthurPSmith added a comment to T216208: ToolsDB overload and cleanup.

I can give a guesstimate. Given the complexity of some of the operations we are doing (especially to prevent serious data loss), services probably won't be fully recovered until at least Tuesday next week (2019-02-26).

Feb 18 2019, 3:37 PM · archived--TCB-Team, Phragile, Data-Services, cloud-services-team (Kanban)

Jan 28 2019

ArthurPSmith added a comment to T214680: Document statement URI format for RDF.

Can you add a test to the statement ID generation code that ensures it has an RDF compatible format (except for the 1 character that's a problem now), and a note that this is required for RDF support?

Jan 28 2019, 10:48 PM · Wikidata-Query-Service, Wikidata
ArthurPSmith added a comment to T214680: Document statement URI format for RDF.

promise it will always be one-to-one, no matter what happens with internal IDs

Jan 28 2019, 8:20 PM · Wikidata-Query-Service, Wikidata

Jan 26 2019

ArthurPSmith added a comment to T214680: Document statement URI format for RDF.

Another thought - even better would be if the API could be adjusted so it accepts the WDQS statement ID format as it is (all -'s).

Jan 26 2019, 5:21 PM · Wikidata-Query-Service, Wikidata
ArthurPSmith added a comment to T214680: Document statement URI format for RDF.

Thanks for creating this ticket! Actually, my use case is the opposite of Lucas's - I want to be able to go from the results of a WDQS query to fetch the full statement via the API, which requires the statement ID. So I would like to see the id conversion documented in BOTH directions - and in particular the arbitrary regex replace listed above (preg_replace( '/[^\w-]/', '-', $statementID )) would NOT work for that purpose. Rather can we just settle that the first $ or - is switched, and that's it? Or is there something else that's an issue here?
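A sketch of the two-way conversion proposed above, swapping only the first `$` or `-`. The function names are hypothetical and the statement ID is a made-up example. The sketch also shows why the regex replacement cannot be reversed: two different inputs map to the same output.

```python
import re


def api_to_wdqs(statement_id):
    """Replace only the first '$' with '-'."""
    return statement_id.replace("$", "-", 1)


def wdqs_to_api(statement_id):
    """Replace only the first '-' with '$'."""
    return statement_id.replace("-", "$", 1)


# Made-up statement ID, for illustration only.
api_id = "Q42$1234ABCD-0000-0000-0000-000000000000"
wdqs_id = api_to_wdqs(api_id)
print(wdqs_id)  # prints Q42-1234ABCD-0000-0000-0000-000000000000
print(wdqs_to_api(wdqs_id) == api_id)  # prints True

# Python equivalent of the regex from the comment: it is many-to-one.
print(re.sub(r"[^\w-]", "-", "Q42$abc") == re.sub(r"[^\w-]", "-", "Q42.abc"))
# prints True
```

The reverse direction assumes the entity ID part contains no hyphen; IDs on lexeme forms (of the shape L1-F1) do contain one, so they would need extra care.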

Jan 26 2019, 5:20 PM · Wikidata-Query-Service, Wikidata

Jan 7 2019

ArthurPSmith added a comment to T76232: [Story] nudge when editing a statement to check reference.

I didn't know about the "award token" option!

Jan 7 2019, 3:13 PM · Story, Wikidata, MediaWiki-extensions-WikibaseRepository
ArthurPSmith awarded T76232: [Story] nudge when editing a statement to check reference a Like token.
Jan 7 2019, 3:00 PM · Story, Wikidata, MediaWiki-extensions-WikibaseRepository

Nov 28 2018

ArthurPSmith added a comment to T210495: Number of Senses is decreasing on ListeriaBot's report.

Just a note - WDQS query gives different results hopping up and down - sometimes 3004 (for English lexeme senses) and sometimes 2872, over about the last 10 minutes.

Nov 28 2018, 7:16 PM · Wikidata-Query-Service, User-Smalyshev, Wikidata-Campsite, Wikidata
ArthurPSmith updated subscribers of T210495: Number of Senses is decreasing on ListeriaBot's report.

@Smalyshev I'd forgotten there was a phabricator ticket for this - anyway, this is what I was referring to... Last night's update bumped the number down again to 2718; however when I run the query directly on WDQS I get 3004 right now. Something's not right!

Nov 28 2018, 6:57 PM · Wikidata-Query-Service, User-Smalyshev, Wikidata-Campsite, Wikidata

Nov 27 2018

ArthurPSmith added a comment to T210495: Number of Senses is decreasing on ListeriaBot's report.

I ran a manual update and the total for English bumped up to 2819 - so it doesn't look as if we've actually lost lexeme senses, just that some of the query servers don't know about all of them?

Nov 27 2018, 6:42 PM · Wikidata-Query-Service, User-Smalyshev, Wikidata-Campsite, Wikidata
ArthurPSmith added a comment to T210495: Number of Senses is decreasing on ListeriaBot's report.

I wouldn't be surprised if it's a WDQS problem, this is definitely generated from an RDF query.

Nov 27 2018, 6:39 PM · Wikidata-Query-Service, User-Smalyshev, Wikidata-Campsite, Wikidata

Oct 16 2018

ArthurPSmith added a comment to T160259: [Story] RDF for Lexemes, Forms and Senses.

According to https://www.mediawiki.org/wiki/Extension:WikibaseLexeme/RDF_mapping a lexeme should be "a wikibase:Lexeme" as well as "a ontolex:LexicalEntry", but in the query service I can only find things via the latter relation. Similarly for forms and "wikibase:Form". Was something left out of the dump?

Oct 16 2018, 3:09 PM · Wikidata Lexicographical data, Wikidata

Jun 29 2018

ArthurPSmith added a comment to T197145: Create special pages for lexemes.

WDQS works for me! I'm not sure where that is of course - I guess I could check Phabricator!

Jun 29 2018, 6:32 PM · Wikidata Lexicographical data, Wikidata

Jun 19 2018

ArthurPSmith added a comment to T197145: Create special pages for lexemes.

Does "alphabetical" ordering even make sense for words in a collection of vastly different writing systems? If this is done I would recommend it be accompanied by some filtering - for language, part of speech, grammatical features, certain properties perhaps.

Jun 19 2018, 5:59 PM · Wikidata Lexicographical data, Wikidata
ArthurPSmith committed rTPTAB4dfa171976e5: Fix for problem with new time unit being used for half lives - rely on wikidata… (authored by ArthurPSmith).
Jun 19 2018, 5:19 PM

Jun 1 2018

ArthurPSmith added a comment to T195740: Decide on a way forward for acceptable languages for lemmas and representations.

I am in general favorable to Micru's proposal, and perhaps Pamputt's elaboration of it above: using wikidata items directly allows representation of the lemma language naturally in the user's own script/language for one, and brings the other automatic bonuses of using items given the structured data ethos, etc. However I'm a little confused about the details of how this would work - specifically, the most commonly used lexemes would usually have the same spelling, use etc. across all variants of a language; do we give that a more general language ("en" = Q1860 say) and only use the specific items mentioned ("en-US" = Q7976, "en-GB" = Q7979, "en-CA" = Q44676, etc.) where there really are variations? Or would it be possible to attach multiple language items to a single lexeme, to indicate it applies to several specific variants?

Jun 1 2018, 3:15 PM · Wikidata, Wikidata Lexicographical data

May 29 2018

ArthurPSmith added a comment to T193728: Address concerns about perceived legal uncertainty of Wikidata .

Here's a specific question that might be detailed enough in description: suppose we have a collection of facts (say the names, countries, inception dates, and official websites for a collection of organizations) that has been extracted from multiple sources, including various language wikipedias, a CC-0 data source (for example https://grid.ac/) and a non-CC-0 non-wikipedia data source - these sources would be indicated in wikidata by the reference/source section on each statement. This extraction has been done by users either manually or running bots with the understanding that they are adding facts to a CC-0 database (wikidata). Reconciling the facts - for example merging duplicates with slightly different names, dates, or URLs - has been done by users manually or semi-automatically, again with the understanding they are contributing to a CC-0 database. Are there any copyright or other rights constraints that apply to this collection, or can it be fully considered to legally be CC-0?

May 29 2018, 3:55 PM · WMF-Legal, Wikidata
ArthurPSmith added a comment to T163642: Index Wikidata strings in statements for fulltext search.

Hmm, I'm not sure this is all that useful, at least as it stands. Most external IDs can be found just as easily now via the Wikidata Resolver tool - https://tools.wmflabs.org/wikidata-todo/resolver.php. However, what I would find useful would be a way to locate, for example, partial street addresses - this (P969) is often entered as a qualifier on headquarters location (P159). Searching for 'haswbstatement:P969=Main' now finds something, but only because the matching item oddly has just 'Main' as the value for P969, and making the string lowercase ("main") finds nothing, which is definitely not what I would expect on this... I don't think treating string values as if they were identifiers is the right approach; the usefulness of a search engine is in normalizing string values so you can find them without having the exact matching string. And qualifiers should be folded in somehow!
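
To illustrate the difference, here is a toy sketch contrasting identifier-style exact matching with normalized token matching. It is not how CirrusSearch works internally; the function names and the sample address are made up for illustration.

```python
import re
import unicodedata
from typing import List


def normalize(value: str) -> List[str]:
    """Case-fold the string and split it into word tokens."""
    folded = unicodedata.normalize("NFKC", value).casefold()
    return re.findall(r"\w+", folded)


def exact_match(query: str, value: str) -> bool:
    """Identifier-style matching: the whole string must be identical."""
    return query == value


def normalized_match(query: str, value: str) -> bool:
    """Search-style matching: every query token must occur in the value."""
    query_tokens = normalize(query)
    value_tokens = set(normalize(value))
    return bool(query_tokens) and all(tok in value_tokens for tok in query_tokens)


if __name__ == "__main__":
    address = "123 Main St."  # hypothetical P969 value
    print(exact_match("main", address))       # identifier-style lookup
    print(normalized_match("main", address))  # normalized lookup
```

With exact matching, only the full, identically cased string is found; with normalization, a lowercase fragment such as "main" matches any address containing that word.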

May 29 2018, 2:52 PM · MW-1.32-notes (WMF-deploy-2018-09-25 (1.32.0-wmf.23)), Discovery-Search (Current work), User-Smalyshev, User-aude, CirrusSearch, Discovery, Wikidata

May 28 2018

ArthurPSmith added a comment to T193728: Address concerns about perceived legal uncertainty of Wikidata .

Hi - my most recent response was following MisterSynergy's comment on Denny's proposed questions, and specifically the meaning of "processes that in bulk extract facts from Wikipedia articles," - it sounds like from subsequent discussion that we are not talking solely of automated "processes", so I think I echo MisterSynergy's comment that the question needs to be better defined to "describe how these processes look like". On the one hand there's overall averages, with less than one "fact" per wikipedia article; on the other hand the distribution is probably quite wide, with some articles having dozens of "facts" extracted from them. Since CC-BY-SA applies to each article individually, does extraction of too much factual data from one article potentially violate its copyright?

May 28 2018, 2:30 PM · WMF-Legal, Wikidata

May 26 2018

ArthurPSmith added a comment to T193728: Address concerns about perceived legal uncertainty of Wikidata .

based on the fact that we have ~42M “imported from” references and ~64M sitelinks in Wikidata

May 26 2018, 7:07 PM · WMF-Legal, Wikidata

May 25 2018

ArthurPSmith added a comment to T193728: Address concerns about perceived legal uncertainty of Wikidata .

Some references on why CC0 is essential for a free public database:
https://wiki.creativecommons.org/wiki/CC0_use_for_data
"Databases may contain facts that, in and of themselves, are not protected by copyright law. However, the copyright laws of many jurisdictions cover creatively selected or arranged compilations of facts and creative database design and structure, and some jurisdictions like those in the European Union have enacted additional sui generis laws that restrict uses of databases without regard for applicable copyright law. CC0 is intended to cover all copyright and database rights, so that however data and databases are restricted (under copyright or otherwise), those rights are all surrendered"

May 25 2018, 5:47 PM · WMF-Legal, Wikidata

May 23 2018

ArthurPSmith added a comment to T195382: show Lemma on Special:AllPages.

FYI I agree with VIGNERON on what it should look like - but at least something more than the id!

May 23 2018, 3:44 PM · MW-1.32-notes (WMF-deploy-2018-06-12 (1.32.0-wmf.8)), Wikidata-Editor-Experience-Improvements-Iteration0, Wikidata-Turtles-Sprint #5, Patch-For-Review, Wikidata, Wikidata Lexicographical data

May 22 2018

ArthurPSmith added a comment to T193728: Address concerns about perceived legal uncertainty of Wikidata .

It has been asserted here several times that OSM data has been wholesale imported into Wikidata - do we know that has happened? Wikidata has two properties related to OSM, one that relates wikidata items to OSM tags like "lighthouse", and one that is essentially deprecated (see T145284), so I assume those are not the issue. According to https://www.wikidata.org/wiki/Wikidata:OpenStreetMap (text which has been there since at least last September) "it is not possible to import coordinates from OpenStreetMap to Wikidata". If the issue is coordinates imported via wikipedia infoboxes that originated with OSM, I can see there might be an issue there, and maybe that should be added to Denny's suggested question in some fashion. But as far as actual importing of OSM data, the only specific cases that I noticed explicitly cited above are (A) a bot request that has been rejected, and (B) a discussion from 2013 where the copyright issue was explicitly raised right away.

May 22 2018, 6:59 PM · WMF-Legal, Wikidata

Oct 11 2017

ArthurPSmith committed rTPTAB4c9cb15dbd2e: Group 1 ID has changed due to a merge (authored by ArthurPSmith).
Oct 11 2017, 3:29 PM

Jul 21 2017

ArthurPSmith added a comment to T171092: WDQS sync (?) issue for certain recently created items.

Of course, now these examples I gave are working - probably because I updated them recently. However, I found more that are not now, or only partially - for example Q2256713:

Jul 21 2017, 3:10 PM · TestMe, Wikidata-Query-Service, Discovery, Wikidata

Jul 19 2017

ArthurPSmith created T171092: WDQS sync (?) issue for certain recently created items.
Jul 19 2017, 7:09 PM · TestMe, Wikidata-Query-Service, Discovery, Wikidata

Jul 14 2017

ArthurPSmith raised the priority of T54564: Allow sitelinks to redirect pages to fix the 'Bonnie and Clyde problem' from Lowest to Medium.

I don't understand why Multichill can unilaterally alter the priority on this request in the face of an active wikidata RFC where the voting has been 2:1 in support of this change. It would also be nice to get some actual feedback from developers - is this really "against the core data model of Wikidata"? I don't see it - particularly as the workarounds in place now prove it can be easily supported.

Jul 14 2017, 2:46 PM · Wikidata, MediaWiki-extensions-WikibaseRepository

Jul 13 2017

ArthurPSmith added a comment to T170614: constraint gadget always shows an error for P279 (subclass of) statements.

Thanks! I did search through the open tasks first and didn't find anything on this....

Jul 13 2017, 7:07 PM · Wikidata, Wikibase-Quality-Constraints
ArthurPSmith created T170614: constraint gadget always shows an error for P279 (subclass of) statements.
Jul 13 2017, 6:09 PM · Wikidata, Wikibase-Quality-Constraints

Jun 6 2017

ArthurPSmith added a comment to T143486: [feature request] remove sitelinks / update sitelinks on Wikidata when pages are deleted/moved on client wikis (all users).

The dummy user solution sounds good to me. Magnus Manske is doing something like this with his QuickStatementsBot so maybe a special purpose Bot account on wikidata for this?

Jun 6 2017, 1:54 PM · Wikidata

Mar 23 2017

ArthurPSmith added a comment to T150939: Replace https://tools.wmflabs.org/wikidata-externalid-url by providing improved handling for external id formatter urls.

I believe a way this could be done would be to allow the attachment of regular expressions to the formatter URL, and have the external id URL conversion code understand them. That is, if there was a qualifier property that specified "regex substitution" for example, the ISNI problem (of additional spaces within the id that must be removed for the formatter URL) would be handled by a value something like "s/\s+//g" (remove all spaces). Some of the others might need a "regex match" on the id that allows specifying a $1, $2, $3 grouping pattern, and the formatter URL then looks something like http://...../$1/$2/$3 (or that could also possibly be handled by a substitution as in the ISNI case). The IMDB case is more difficult because it's essentially 4 different formatter URLs based on the first characters of the id, so it might need a "regex filter" that limits the scope of each formatter URL based on the id; wikibase would then need to look through the filter regexes to find a matching formatter URL and use that.
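
A minimal sketch of how that lookup might work, purely as an illustration and not Wikibase code: the `Formatter` class and `format_url` function are hypothetical, the example URLs are assumptions, and for brevity a single `match` regex serves as both the "regex match" and the "regex filter".

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Formatter:
    url: str  # formatter URL with $1, $2, ... placeholders
    substitution: Optional[Tuple[str, str]] = None  # (pattern, replacement) applied to the id first
    match: Optional[str] = None  # regex whose groups fill $1, $2, ...; also filters which ids apply


def format_url(external_id: str, formatters: List[Formatter]) -> Optional[str]:
    """Return the URL from the first formatter that accepts the id, else None."""
    for fmt in formatters:
        value = external_id
        if fmt.substitution:
            pattern, replacement = fmt.substitution
            value = re.sub(pattern, replacement, value)
        if fmt.match is None:
            # No match regex: the whole (substituted) id becomes $1.
            return fmt.url.replace("$1", value)
        found = re.fullmatch(fmt.match, value)
        if found:
            url = fmt.url
            # Fill the highest-numbered placeholder first so $1 cannot clobber $10.
            for index in range(len(found.groups()), 0, -1):
                url = url.replace(f"${index}", found.group(index) or "")
            return url
    return None


# Illustrative formatter sets; the URLs are assumptions, not authoritative values.
ISNI = [Formatter("https://isni.org/isni/$1", substitution=(r"\s+", ""))]
IMDB = [
    Formatter("https://www.imdb.com/title/$1/", match=r"(tt\d+)"),
    Formatter("https://www.imdb.com/name/$1/", match=r"(nm\d+)"),
]

if __name__ == "__main__":
    print(format_url("0000 0001 2103 2683", ISNI))
    print(format_url("nm0000001", IMDB))
```

Trying the formatters in order and taking the first match keeps the lookup simple, at the cost of requiring that the filter regexes for one property do not overlap.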

Mar 23 2017, 3:37 PM · MediaWiki-extensions-WikibaseRepository, Wikidata

Mar 22 2017

ArthurPSmith added a comment to T150939: Replace https://tools.wmflabs.org/wikidata-externalid-url by providing improved handling for external id formatter urls.

As background, I'm seeing about 2000 "hits" per day on this service right now, with about a dozen properties linking through it to their databases.

Mar 22 2017, 8:23 PM · MediaWiki-extensions-WikibaseRepository, Wikidata
ArthurPSmith added a comment to T160205: Add interstitial to wikidata-externalid-url.

@Esc3300 well, I developed this tool because links for IMDB and a handful of other properties were broken when we made the change from string to "external identifier" last year, where the wikidata UI started putting the links in directly (previously it had been done by a javascript gadget - which meant the links wouldn't be available to re-users either). So "work without this tool" would break a lot of stuff in wikidata and for everybody using it.

Mar 22 2017, 8:16 PM · Tools, Wikidata

Mar 21 2017

ArthurPSmith added a comment to T160205: Add interstitial to wikidata-externalid-url.

Hmm, Ok, I read through the discussion you linked with @coren - I certainly see there can be a privacy violation regarding expectations in cases as were discussed there. I think this is a quite different case though (for example, the links are exclusively to third-party sites, not anything I or any other WMF person controls) and would like to hear directly from somebody with WMF (and some voices from wikidata) on this. If there is a clearly posted policy somewhere that would be great too. The policy linked by @coren focused on the Labs user collecting personal information, which is not at all happening here, and said nothing specifically about redirects per se.

Mar 21 2017, 9:08 PM · Tools, Wikidata
ArthurPSmith claimed T160205: Add interstitial to wikidata-externalid-url.

(claiming task - if this really needs to be done I can certainly take care of it)

Mar 21 2017, 8:50 PM · Tools, Wikidata
ArthurPSmith added a comment to T160205: Add interstitial to wikidata-externalid-url.

Hmm, I think the big issue may be point 3. Do you have an example where this might have come up? I could certainly make it an interstitial easily enough, but that makes these links a bit less convenient for people (extra click); if the links are being included with or without a warning elsewhere based on the wmflabs URL then I can see how it may be important to address this somehow. Also is there boilerplate text we should use if we really do need to put this in?

Mar 21 2017, 8:48 PM · Tools, Wikidata