
Jun 23 2020

ArthurPSmith added a comment to T255587: Vue should not use loadHTML.

Ah, I just noted on T253334 that I don't think RemexHtml is the right solution either - Vue templates are also not really HTML, as they include "elements" not in the HTML standard, and parsers may not handle them correctly. I ran into this just now using wmf/1.35.0-wmf.38, which has the RemexHtml parser, where I have an HTML table that has some of its rows provided by another Vue component:

Jun 23 2020, 8:51 PM · MW-1.35-notes (1.35.0-wmf.38; 2020-06-23), Design-Systems-team-20200324-20220422, MediaWiki-ResourceLoader, Performance-Team
ArthurPSmith added a comment to T253334: Vue ResourceLoader support: JavaScript in script elements parsed as HTML leading to parsing error.

I don't think this problem was resolved correctly. What looks like HTML in templates is ALSO not really HTML. In particular, the current ResourceLoader does not handle <table> elements correctly when there is an internal component in the table, something like:

<table><tbody> <tr><th>header...</th></tr> <internal-tr-component ...></internal-tr-component> </tbody></table>

The current parsing pulls the "internal-tr-component" out as a separate element outside of the table. This is wrong - templates should be left alone! I think an XML parser that doesn't understand HTML at all might be best for this?
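The XML-parser idea can be sketched with Python's stdlib `xml.etree`, which has no HTML tree-building rules and so leaves the custom element exactly where the template put it. This is an illustration only - a real implementation would still have to cope with Vue template syntax that isn't well-formed XML:

```python
import xml.etree.ElementTree as ET

# A Vue template fragment: a custom row component inside <tbody>.
# An HTML5 tree builder would "foster-parent" the unknown element
# out of the table; a plain XML parser leaves the tree alone.
template = (
    "<table><tbody>"
    "<tr><th>header</th></tr>"
    "<internal-tr-component></internal-tr-component>"
    "</tbody></table>"
)

root = ET.fromstring(template)
tbody = root.find("tbody")

# The custom element is still a child of <tbody>, exactly as written.
children = [child.tag for child in tbody]
print(children)  # ['tr', 'internal-tr-component']
```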

Jun 23 2020, 8:41 PM · Performance-Team (Radar), MediaWiki-ResourceLoader, Design-Systems-team-20200324-20220422

Jun 2 2020

ArthurPSmith committed rTPTAB48de3f961cc5: Fix for changed behavior of iter() and dicts (ordering is based on insertion….
Jun 2 2020, 11:21 PM
ArthurPSmith added a member for Tool-Wikidata-Periodic-Table: ArthurPSmith.
Jun 2 2020, 12:12 PM

Apr 8 2020

ArthurPSmith added a comment to T249687: gadget to add external ID as reference.

Thanks for creating this! I'm not sure what the standard citation reference for an external ID is, but what I've been using is:

  • stated in (P248) the value of "subject item of this property" (P1629) for that external ID property, if any
  • external ID property with value from the item
  • retrieved (P813) on the current date.

So it would be nice if this gadget could add these three (or 2 if no P1629 value) statements as a reference with a simple interaction...
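A sketch of what such a gadget would assemble, as Wikibase-API-style reference snaks (the JSON shape used by wbsetreference). P248 and P813 are the properties listed above; the P496/Q51044 pair in the usage example (ORCID iD and its subject item) is just a plausible illustration:

```python
from datetime import date

def build_reference_snaks(ext_id_prop, ext_id_value, stated_in_item=None):
    """Assemble the three-part reference described above as Wikibase
    API snaks. stated_in_item is the "subject item of this property"
    (P1629) value of the external-ID property, if it has one."""
    snaks = {}
    if stated_in_item is not None:
        # stated in (P248)
        snaks["P248"] = [{
            "snaktype": "value", "property": "P248",
            "datavalue": {"type": "wikibase-entityid",
                          "value": {"id": stated_in_item}},
        }]
    # the external ID property with the value from the item
    snaks[ext_id_prop] = [{
        "snaktype": "value", "property": ext_id_prop,
        "datavalue": {"type": "string", "value": ext_id_value},
    }]
    # retrieved (P813) on the current date (day precision = 11)
    snaks["P813"] = [{
        "snaktype": "value", "property": "P813",
        "datavalue": {"type": "time", "value": {
            "time": date.today().strftime("+%Y-%m-%dT00:00:00Z"),
            "precision": 11,
            "calendarmodel": "http://www.wikidata.org/entity/Q1985727",
        }},
    }]
    return snaks

# e.g. an ORCID iD (P496) whose P1629 value is the ORCID item
ref = build_reference_snaks("P496", "0000-0002-1825-0097", "Q51044")
```

When the external-ID property has no P1629 value, passing `stated_in_item=None` yields the two-snak variant.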

Apr 8 2020, 4:12 PM · Wikidata-Gadgets, patch-welcome, Wikidata

Mar 18 2020

ArthurPSmith placed T112140: Provide a wrapper function in pywikibot around wbparsevalue up for grabs.

Unassigning, I'm not working on this any more!

Mar 18 2020, 4:13 PM · Pywikibot, Pywikibot-Wikidata
ArthurPSmith closed T160205: Add interstitial to wikidata-externalid-url as Declined.

Wow, was that really almost 3 years ago? There doesn't seem to be a real need for this, so I'm closing the request as declined.

Mar 18 2020, 4:09 PM · Tools, Wikidata
ArthurPSmith closed T160205: Add interstitial to wikidata-externalid-url, a subtask of T150939: Replace https://tools.wmflabs.org/wikidata-externalid-url by providing improved handling for external id formatter urls, as Declined.
Mar 18 2020, 4:09 PM · MediaWiki-extensions-WikibaseRepository, Wikidata

Feb 14 2020

ArthurPSmith added a comment to T119226: Very small (or very large) quantity values (represented in scientific notation) result in error in add/update via pywikibot/wikidata API.

Sorry I never got around to looking at this further. @DD063520 do you understand the above comment from @thiemowmde about using the wbparsevalue api rather than python internals?

Feb 14 2020, 1:53 PM · Pywikibot, Pywikibot-Wikidata, Wikidata

Feb 12 2020

ArthurPSmith added a comment to T243701: Wikidata maxlag repeatedly over 5s since Jan 20, 2020 (primarily caused by the query service).

@Bugreporter

I think increasing the factor will not make things better; it will only increase the oscillation period

Feb 12 2020, 10:01 PM · Wikidata-Campsite, Traffic, Performance Issue, SRE, Discovery-ARCHIVED, Wikidata-Query-Service, Wikidata

Feb 11 2020

ArthurPSmith added a comment to T238045: Improve parallelism in WDQS updater.

Possibly relevant comment here: I believe there is also a plan to move to incremental updates (updating only the statements/triples that have changed), so it is probably important that any parallelism in updating be coordinated: updates for the same item (Q value) should be grouped together and done in the same process, so they don't clobber one another. Updates for separate items (different Q values) can be handled in parallel, as the associated RDF triples are independent (the subject of a triple is always the item, a statement on the item, or a further node derived from the item). Even without that incremental update process, grouping updates on the same item together could be a significant speed boost, as 5 updates for Q9999 can be collapsed into just the last update under the current procedure of completely rewriting the triples.
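The grouping idea can be sketched in a few lines (illustrative only - `collapse_updates` and the event tuples are invented for this sketch, not WDQS updater code):

```python
def collapse_updates(events):
    """Keep only the latest update per item (Q value). Since the RDF
    triples for different items are independent, each surviving entry
    can then be dispatched to a separate parallel worker without the
    workers clobbering one another."""
    latest = {}
    for item_id, revision in events:
        latest[item_id] = revision  # later events overwrite earlier ones
    return list(latest.items())

events = [("Q9999", 1), ("Q42", 7), ("Q9999", 2), ("Q9999", 3),
          ("Q9999", 4), ("Q42", 8), ("Q9999", 5)]
# The five updates for Q9999 collapse into a single rewrite of its triples.
print(collapse_updates(events))  # [('Q9999', 5), ('Q42', 8)]
```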

Feb 11 2020, 7:45 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Feb 7 2020

ArthurPSmith added a comment to T243701: Wikidata maxlag repeatedly over 5s since Jan 20, 2020 (primarily caused by the query service).

Over the past weeks, we noticed a huge increase of content in Wikidata. Maybe that's something worth looking at?

Wikidata content is growing at a fast and steady pace and has been for a few years now. For the last few months it's been expanding at a rate of around 3,500,000 new pages per month. So that seems unlikely to be connected.

Feb 7 2020, 1:34 AM · Wikidata-Campsite, Traffic, Performance Issue, SRE, Discovery-ARCHIVED, Wikidata-Query-Service, Wikidata

Feb 4 2020

ArthurPSmith added a comment to T243701: Wikidata maxlag repeatedly over 5s since Jan 20, 2020 (primarily caused by the query service).

@Addshore and others - the problem has deteriorated since Saturday - see this discussion on Wikidata: https://www.wikidata.org/wiki/Wikidata:Contact_the_development_team/Query_Service_and_search#WDQS_lag

Feb 4 2020, 4:47 PM · Wikidata-Campsite, Traffic, Performance Issue, SRE, Discovery-ARCHIVED, Wikidata-Query-Service, Wikidata

Jan 19 2020

ArthurPSmith added a comment to T221774: Add Wikidata query service lag to Wikidata maxlag.

[...]

Note that this dashboard includes metrics for both pooled and depooled servers.
So whatever you read there will likely also be reporting data for servers that you can't actually query, and thus whose lag you are not seeing via the query service

Jan 19 2020, 9:28 PM · MW-1.35-notes (1.35.0-wmf.31; 2020-05-05), User-Addshore, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), Patch-For-Review, observability, Wikidata-Query-Service, Wikidata

Jan 18 2020

ArthurPSmith added a comment to T221774: Add Wikidata query service lag to Wikidata maxlag.

@Bugreporter well something must have changed early today - was it previously "mean" and is now "median"? I'm not sure which is better, but having WDQS hours out of date (we're over 4 hours now) is NOT a good situation, and what this whole task was intended to avoid! @Pintoch any thoughts on this?

Jan 18 2020, 3:05 AM · MW-1.35-notes (1.35.0-wmf.31; 2020-05-05), User-Addshore, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), Patch-For-Review, observability, Wikidata-Query-Service, Wikidata

Jan 17 2020

ArthurPSmith added a comment to T240442: Design a continuous throttling policy for Wikidata bots.

Just saw this - I'm wondering how, technically, you would implement it? You could generate a random number between 2.5 and 5, and deny the edit if maxlag is greater than your random number?
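That random-threshold check might look something like this (a sketch; `allow_edit` is a made-up name, and the 2.5-5 band is the one suggested above):

```python
import random

def allow_edit(maxlag, low=2.5, high=5.0, rng=random):
    """Continuous throttle: draw a random threshold from [low, high]
    and deny the edit if maxlag exceeds it. Below `low` every edit is
    allowed, above `high` every edit is denied, and in between the
    denial probability ramps up linearly instead of a hard cutoff."""
    return maxlag <= rng.uniform(low, high)

# With lag at 2.0 the edit always goes through; at 6.0 it never does.
print(allow_edit(2.0), allow_edit(6.0))  # True False
```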

Jan 17 2020, 8:13 PM · Wikidata
ArthurPSmith added a comment to T221774: Add Wikidata query service lag to Wikidata maxlag.

Am I misreading this graph? https://grafana.wikimedia.org/d/000000489/wikidata-query-service?panelId=8&fullscreen&orgId=1&from=now-12h&to=now&refresh=10s It looks like the query service lag for 3 of the servers has been growing steadily for the past roughly 8 hours. However, edits are going through. Did something change in the maxlag logic somewhere earlier today?

Jan 17 2020, 8:09 PM · MW-1.35-notes (1.35.0-wmf.31; 2020-05-05), User-Addshore, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), Patch-For-Review, observability, Wikidata-Query-Service, Wikidata

Dec 11 2019

ArthurPSmith closed T240371: Maxlag=5 for Author Disambiguator, a subtask of T240369: Chase up bot operators whose bot keeps running when the dispatch lag is higher than 5, as Resolved.
Dec 11 2019, 3:36 PM · User-Addshore, Wikidata
ArthurPSmith closed T240371: Maxlag=5 for Author Disambiguator as Resolved.

Marking as resolved...

Dec 11 2019, 3:36 PM · Wikidata
ArthurPSmith added a comment to T240371: Maxlag=5 for Author Disambiguator.

I increased the default number of retries to 12, so it will now retry for up to an hour. I think we're good here?

Dec 11 2019, 2:55 PM · Wikidata
ArthurPSmith added a comment to T240371: Maxlag=5 for Author Disambiguator.

(A) Pintoch's patch has been applied, and (B) I also increased the retry time from 5 seconds to 5 minutes - that still means an edit will fail after 25 minutes if maxlag doesn't drop, with only 5 retries. Is there a consensus to retry for an hour? Or if there's a better standard for handling retries let me know!
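The retry arithmetic above can be sketched as follows (illustrative, not the actual Author Disambiguator code; `do_edit` stands in for whatever makes the API call and returns False on a maxlag error):

```python
import time

def edit_with_retries(do_edit, retries=5, delay=5 * 60, sleep=time.sleep):
    """Attempt an edit; on a maxlag failure, wait `delay` seconds
    before each retry. With 5 retries at 5 minutes the edit gives up
    after 25 minutes of waiting; 12 retries stretches that to an hour."""
    if do_edit():
        return True
    for _ in range(retries):
        sleep(delay)
        if do_edit():
            return True
    return False

# Dry run with a stubbed clock: maxlag never drops, so every retry fails.
waited = []
ok = edit_with_retries(lambda: False, sleep=waited.append)
print(ok, sum(waited) / 60)  # False 25.0
```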

Dec 11 2019, 2:50 PM · Wikidata

Oct 22 2019

ArthurPSmith added a comment to T235247: Prepare lexeme workshop at WikidataCon 2019.

Here's a draft of slides for our workshop. Please feel free to edit this. Also I think you wanted to cover a bit more basics of how lexemes are put together - maybe that should go first? This was mostly just gathering statistics and data and then a page of questions at the end...

Oct 22 2019, 1:46 PM · WMSE-FindingGLAMs-2018 (GLAM Events), User-Alicia_Fagerving_WMSE

Oct 21 2019

ArthurPSmith added a comment to T235247: Prepare lexeme workshop at WikidataCon 2019.

I am uploading files with data on the counts of forms and senses by date for the last year (also totals in the last column). There may have been some issues with this - it comes from the Lexicographical statistics pages that are generated from WDQS queries, so there were a few periods where I think the numbers were off. Anyway, it should be close to correct for most of the time period. So I think we can plot this along a timeline for the last year?

Oct 21 2019, 2:22 PM · WMSE-FindingGLAMs-2018 (GLAM Events), User-Alicia_Fagerving_WMSE

Oct 18 2019

ArthurPSmith added a comment to T235247: Prepare lexeme workshop at WikidataCon 2019.

I just did some exploring but I don't think Quarry will help with forms and senses - at least they're not "pages" in themselves with their own namespace. Actually I couldn't figure out where they were in the database schema at all... Anyway, I think I can get some rough numbers from looking at the stats page as it has changed over time, I will work on this.

Oct 18 2019, 6:55 PM · WMSE-FindingGLAMs-2018 (GLAM Events), User-Alicia_Fagerving_WMSE

Oct 11 2019

ArthurPSmith added a comment to T235247: Prepare lexeme workshop at WikidataCon 2019.

I was thinking some graphics on the growth of lexemes, forms, and senses would be good - do we already have that somewhere?

Oct 11 2019, 6:50 PM · WMSE-FindingGLAMs-2018 (GLAM Events), User-Alicia_Fagerving_WMSE

Sep 26 2019

ArthurPSmith added a comment to T233763: Error searching Lexeme:Danke on Wikidata: "Call to a member function setFragment() on null".

If you go to the search page and select "Lexeme" as the only namespace you get the same error with "thanks" in the search box, but "thank" alone works fine - the two lexemes that match are L3798 (verb) and L28468 (noun).

Sep 26 2019, 7:21 PM · MW-1.35-notes (1.35.0-wmf.1; 2019-10-08), Wikidata Lexicographical data, Discovery-Search, Wikimedia-production-error, Wikidata

Sep 12 2019

ArthurPSmith added a comment to T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites.

The Basque collection is even more complete now!
I do think some customization may be needed for Lexemes due to the different structure - the forms and senses etc. Perhaps the most useful link for a wiktionary may be from words to senses to wikidata items via the "item for this sense" property. That in principle allows translations to be provided, grouped by sense.

Sep 12 2019, 5:21 PM · All-and-every-Wiktionary, Wikidata, Wikidata Lexicographical data

Aug 1 2019

ArthurPSmith added a comment to T229604: Several selectors/experts are broken.

I see the problem also (Safari browser). When you talk about it affecting lexemes, where do you see that? I experimented with adding a form and that seemed fine.

Aug 1 2019, 6:35 PM · MW-1.34-notes (1.34.0-wmf.16; 2019-07-30), User-Ladsgroup, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Regression, Wikidata

Feb 18 2019

ArthurPSmith added a comment to T216208: ToolsDB overload and cleanup.

I can give a guess-estimate. Given the complexity of some of the operations we are doing (especially to prevent serious data loss), services probably won't be fully recovered until at least Tuesday next week (2019-02-26).

Feb 18 2019, 3:37 PM · TCB-Team (now WMDE-TechWish), Phragile, Data-Services, cloud-services-team (Kanban)

Jan 28 2019

ArthurPSmith added a comment to T214680: Document statement URI format for RDF.

Can you add a test to the statement ID generation code that ensures it has an RDF compatible format (except for the 1 character that's a problem now), and a note that this is required for RDF support?

Jan 28 2019, 10:48 PM · Wikidata-Query-Service, Wikidata
ArthurPSmith added a comment to T214680: Document statement URI format for RDF.

promise it will always be one-to-one, no matter what happens with internal IDs

Jan 28 2019, 8:20 PM · Wikidata-Query-Service, Wikidata

Jan 26 2019

ArthurPSmith added a comment to T214680: Document statement URI format for RDF.

Another thought - even better would be if the API could be adjusted to accept the WDQS statement ID format as it is (all dashes).

Jan 26 2019, 5:21 PM · Wikidata-Query-Service, Wikidata
ArthurPSmith added a comment to T214680: Document statement URI format for RDF.

Thanks for creating this ticket! Actually, my use case is the opposite of Lucas's - I want to be able to go from the results of a WDQS query to fetch the full statement via the API, which requires the statement ID. So I would like to see the id conversion documented in BOTH directions - and in particular the arbitrary regex replace listed above (preg_replace( '/[^\w-]/', '-', $statementID )) would NOT work for that purpose. Rather, can we just settle that the first $ or - is switched, and that's it? Or is there something else that's an issue here?
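The two directions can be illustrated in Python (a sketch; the GUID below is a made-up example). The blanket replace works going to the WDQS form, but the reverse only works if we settle the convention that the first dash is the converted $:

```python
import re

def api_to_wdqs(statement_id):
    """One-way conversion as in the preg_replace above: every
    non-word, non-dash character (here, the '$') becomes a dash."""
    return re.sub(r"[^\w-]", "-", statement_id)

def wdqs_to_api(wdqs_id):
    """Reverse conversion, possible only under the convention that the
    first separator is the converted '$': put it back, leave the rest."""
    return wdqs_id.replace("-", "$", 1)

api_id = "Q42$F078E5B3-F9A8-480E-B7AC-D97778CBBEF9"
wdqs_id = api_to_wdqs(api_id)
print(wdqs_id)                         # Q42-F078E5B3-F9A8-480E-B7AC-D97778CBBEF9
print(wdqs_to_api(wdqs_id) == api_id)  # True
```

The round trip only holds because the '$' always precedes any dash in the GUID; a blanket reverse replace of dashes would mangle the UUID part.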

Jan 26 2019, 5:20 PM · Wikidata-Query-Service, Wikidata

Jan 7 2019

ArthurPSmith added a comment to T76232: [Story] nudge when editing a statement to check reference.

I didn't know about the "award token" option!

Jan 7 2019, 3:13 PM · Story, Wikidata, MediaWiki-extensions-WikibaseRepository
ArthurPSmith awarded T76232: [Story] nudge when editing a statement to check reference a Like token.
Jan 7 2019, 3:00 PM · Story, Wikidata, MediaWiki-extensions-WikibaseRepository

Nov 28 2018

ArthurPSmith added a comment to T210495: Number of Senses is decreasing on ListeriaBot's report.

Just a note - the WDQS query gives different results, hopping up and down - sometimes 3004 (for English lexeme senses) and sometimes 2872, over about the last 10 minutes.

Nov 28 2018, 7:16 PM · Wikidata-Query-Service, User-Smalyshev, Wikidata-Campsite, Wikidata
ArthurPSmith updated subscribers of T210495: Number of Senses is decreasing on ListeriaBot's report.

@Smalyshev I'd forgotten there was a phabricator ticket for this - anyway, this is what I was referring to... Last night's update bumped the number down again to 2718; however when I run the query directly on WDQS I get 3004 right now. Something's not right!

Nov 28 2018, 6:57 PM · Wikidata-Query-Service, User-Smalyshev, Wikidata-Campsite, Wikidata

Nov 27 2018

ArthurPSmith added a comment to T210495: Number of Senses is decreasing on ListeriaBot's report.

I ran a manual update and the total for English bumped up to 2819 - so it doesn't look as if we've actually lost lexeme senses, just that some of the query servers don't know about all of them?

Nov 27 2018, 6:42 PM · Wikidata-Query-Service, User-Smalyshev, Wikidata-Campsite, Wikidata
ArthurPSmith added a comment to T210495: Number of Senses is decreasing on ListeriaBot's report.

I wouldn't be surprised if it's a WDQS problem, this is definitely generated from an RDF query.

Nov 27 2018, 6:39 PM · Wikidata-Query-Service, User-Smalyshev, Wikidata-Campsite, Wikidata

Oct 16 2018

ArthurPSmith added a comment to T160259: [Story] RDF for Lexemes, Forms and Senses.

According to https://www.mediawiki.org/wiki/Extension:WikibaseLexeme/RDF_mapping a lexeme should be "a wikibase:Lexeme" as well as "a ontolex:LexicalEntry", but in the query service I can only find things via the latter relation. Similarly for forms and "wikibase:Form". Was something left out of the dump?

Oct 16 2018, 3:09 PM · Wikidata Lexicographical data, Wikidata

Jun 29 2018

ArthurPSmith added a comment to T197145: Create special pages for lexemes.

WDQS works for me! I'm not sure where that is of course - I guess I could check Phabricator!

Jun 29 2018, 6:32 PM · Wikidata, Wikidata Lexicographical data

Jun 19 2018

ArthurPSmith added a comment to T197145: Create special pages for lexemes.

Does "alphabetical" ordering even make sense for words in a collection of vastly different writing systems? If this is done I would recommend it be accompanied by some filtering - for language, part of speech, grammatical features, certain properties perhaps.

Jun 19 2018, 5:59 PM · Wikidata, Wikidata Lexicographical data
ArthurPSmith committed rTPTAB4dfa171976e5: Fix for problem with new time unit being used for half lives - rely on wikidata….
Jun 19 2018, 5:19 PM

Jun 1 2018

ArthurPSmith added a comment to T195740: Decide on a way forward for acceptable languages for lemmas and representations.

I am generally in favor of Micru's proposal, and perhaps Pamputt's elaboration of it above: using wikidata items directly allows the lemma language to be represented naturally in the user's own script/language, along with the other automatic bonuses of using items given the structured-data ethos. However, I'm a little confused about the details of how this would work. Specifically, the most commonly used lexemes would usually have the same spelling, use, etc. across all variants of a language; do we give those a more general language ("en" = Q1860, say) and only use the specific items mentioned ("en-US" = Q7976, "en-GB" = Q7979, "en-CA" = Q44676, etc.) where there really are variations? Or would it be possible to attach multiple language items to a single lexeme, to indicate it applies to several specific variants?

Jun 1 2018, 3:15 PM · Language codes, Wikidata Lexicographical data, Wikidata

May 29 2018

ArthurPSmith added a comment to T193728: Address concerns about perceived legal uncertainty of Wikidata .

Here's a specific question that might be detailed enough in description: suppose we have a collection of facts (say the names, countries, inception dates, and official websites for a collection of organizations) that has been extracted from multiple sources, including various language wikipedias, a CC-0 data source (for example https://grid.ac/) and a non-CC-0 non-wikipedia data source - these sources would be indicated in wikidata by the reference/source section on each statement. This extraction has been done by users either manually or running bots with the understanding that they are adding facts to a CC-0 database (wikidata). Reconciling the facts - for example merging duplicates with slightly different names, dates, or URL's - has been done by users manually or semi-automatically, again with the understanding they are contributing to a CC-0 database. Are there any copyright or other rights constraints that apply to this collection, or can it be fully considered to legally be CC-0?

May 29 2018, 3:55 PM · WMF-Legal, Wikidata
ArthurPSmith added a comment to T163642: Index Wikidata strings in statements for fulltext search.

Hmm, I'm not sure this is all that useful, at least as it stands. Most external IDs can be found just as easily now via the Wikidata Resolver tool - https://tools.wmflabs.org/wikidata-todo/resolver.php - However, what I would find useful is a way to locate, for example, partial street addresses - this property (P969) is often entered as a qualifier on headquarters location (P159). Searching for 'haswbstatement:P969=Main' now finds something, but only because that item oddly has just 'Main' as the value for P969, and making the string lowercase ("main") finds nothing, which is definitely not what I would expect... I don't think treating string values as if they were identifiers is the right approach; the usefulness of a search engine is in normalizing string values so you can find them without having the exact matching string. And qualifiers should be folded in somehow!
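The identifier-vs-text distinction can be sketched like this (illustrative functions, not CirrusSearch internals):

```python
def keyword_match(field_value, query):
    """Identifier-style matching: the stored value must equal the
    query exactly - which is why 'main' misses a stored 'Main'."""
    return field_value == query

def text_match(field_value, query):
    """Full-text-style matching: normalize (lowercase) and tokenize
    both sides, then require every query token among the field's."""
    tokens = field_value.lower().split()
    return all(t in tokens for t in query.lower().split())

address = "123 Main Street"
print(keyword_match(address, "main"))  # False
print(text_match(address, "main"))     # True
```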

May 29 2018, 2:52 PM · MW-1.32-notes (WMF-deploy-2018-09-25 (1.32.0-wmf.23)), Discovery-Search (Current work), User-Smalyshev, User-aude, CirrusSearch, Discovery-ARCHIVED, Wikidata

May 28 2018

ArthurPSmith added a comment to T193728: Address concerns about perceived legal uncertainty of Wikidata .

Hi - my most recent response was following MisterSynergy's comment on Denny's proposed questions, and specifically the meaning of "processes that in bulk extract facts from Wikipedia articles" - it sounds from subsequent discussion like we are not talking solely of automated "processes", so I echo MisterSynergy's comment that the question needs to be better defined to "describe how these processes look like". On the one hand there are overall averages, with less than one "fact" per wikipedia article; on the other hand the distribution is probably quite wide, with some articles having dozens of "facts" extracted from them. Since CC-BY-SA applies to each article individually, does extraction of too much factual data from one article potentially violate its copyright?

May 28 2018, 2:30 PM · WMF-Legal, Wikidata

May 26 2018

ArthurPSmith added a comment to T193728: Address concerns about perceived legal uncertainty of Wikidata .

based on the fact that we have ~42M “imported from” references and ~64M sitelinks in Wikidata

May 26 2018, 7:07 PM · WMF-Legal, Wikidata

May 25 2018

ArthurPSmith added a comment to T193728: Address concerns about perceived legal uncertainty of Wikidata .

Some references on why CC0 is essential for a free public database:
https://wiki.creativecommons.org/wiki/CC0_use_for_data
"Databases may contain facts that, in and of themselves, are not protected by copyright law. However, the copyright laws of many jurisdictions cover creatively selected or arranged compilations of facts and creative database design and structure, and some jurisdictions like those in the European Union have enacted additional sui generis laws that restrict uses of databases without regard for applicable copyright law. CC0 is intended to cover all copyright and database rights, so that however data and databases are restricted (under copyright or otherwise), those rights are all surrendered"

May 25 2018, 5:47 PM · WMF-Legal, Wikidata

May 23 2018

ArthurPSmith added a comment to T195382: show Lemma on Special:AllPages.

FYI I agree with VIGNERON on what it should look like - but at least something more than the id!

May 23 2018, 3:44 PM · MW-1.32-notes (WMF-deploy-2018-06-12 (1.32.0-wmf.8)), Wikidata-Editor-Experience-Improvements-Iteration0, Wikidata-Turtles-Sprint #5, Patch-For-Review, Wikidata, Wikidata Lexicographical data

May 22 2018

ArthurPSmith added a comment to T193728: Address concerns about perceived legal uncertainty of Wikidata .

It has been asserted here several times that OSM data has been wholesale imported into Wikidata - do we know that has happened? Wikidata has two properties related to OSM, one that relates wikidata items to OSM tags like "lighthouse", and one that is essentially deprecated (see T145284), so I assume those are not the issue. According to https://www.wikidata.org/wiki/Wikidata:OpenStreetMap (text which has been there since at least last September) "it is not possible to import coordinates from OpenStreetMap to Wikidata". If the issue is coordinates imported via wikipedia infoboxes that originated with OSM, I can see there might be an issue there, and maybe that should be added to Denny's suggested question in some fashion. But as far as actual importing of OSM data, the only specific cases that I noticed explicitly cited above are (A) a bot request that has been rejected, and (B) a discussion from 2013 where the copyright issue was explicitly raised right away.

May 22 2018, 6:59 PM · WMF-Legal, Wikidata

Oct 11 2017

ArthurPSmith committed rTPTAB4c9cb15dbd2e: Group 1 ID has changed due to a merge.
Oct 11 2017, 3:29 PM

Jul 21 2017

ArthurPSmith added a comment to T171092: WDQS sync (?) issue for certain recently created items.

Of course, now these examples I gave are working - probably because I updated them recently. However, I found more that are not now, or only partially - for example Q2256713:

Jul 21 2017, 3:10 PM · TestMe, Wikidata-Query-Service, Discovery-ARCHIVED, Wikidata

Jul 19 2017

ArthurPSmith created T171092: WDQS sync (?) issue for certain recently created items.
Jul 19 2017, 7:09 PM · TestMe, Wikidata-Query-Service, Discovery-ARCHIVED, Wikidata

Jul 14 2017

ArthurPSmith raised the priority of T54564: Allow sitelinks to redirect pages to fix the 'Bonnie and Clyde problem' from Lowest to Medium.

I don't understand why Multichill can unilaterally alter the priority on this request in the face of an active wikidata RFC where the voting has been 2:1 in support of this change. It would also be nice to get some actual feedback from developers - is this really "against the core data model of Wikidata"? I don't see it - particularly as the workarounds in place now prove it can be easily supported.

Jul 14 2017, 2:46 PM · User-notice-archive, Wikidata, MediaWiki-extensions-WikibaseRepository

Jul 13 2017

ArthurPSmith added a comment to T170614: constraint gadget always shows an error for P279 (subclass of) statements.

Thanks! I did search through the open tasks first and didn't find anything on this....

Jul 13 2017, 7:07 PM · Wikidata, Wikibase-Quality-Constraints
ArthurPSmith created T170614: constraint gadget always shows an error for P279 (subclass of) statements.
Jul 13 2017, 6:09 PM · Wikidata, Wikibase-Quality-Constraints

Jun 6 2017

ArthurPSmith added a comment to T143486: In some cases, moving or deleting pages on a client wiki does not result in sitelink updates / removal on Wikidata.

The dummy user solution sounds good to me. Magnus Manske is doing something like this with his QuickStatementsBot so maybe a special purpose Bot account on wikidata for this?

Jun 6 2017, 1:54 PM · Wikidata Sitelinks, Wikidata-Campsite, Wikidata

Mar 23 2017

ArthurPSmith added a comment to T150939: Replace https://tools.wmflabs.org/wikidata-externalid-url by providing improved handling for external id formatter urls.

I believe a way this could be done would be to allow the attachment of regular expressions to the formatter URL, and to have the external-id URL conversion code understand them. That is, if there were a qualifier property that specified "regex substitution", for example, the ISNI problem (of additional spaces within the id that must be removed for the formatter URL) would be handled by a value something like "s/\s+//g" (remove all spaces). Some of the others might need a "regex match" on the id that allows specifying a $1, $2, $3 grouping pattern, with the formatter URL then looking something like http://...../$1/$2/$3 (or that could also possibly be handled by a substitution as in the ISNI case). The IMDB case is more difficult because it's essentially four different formatter URLs based on the first characters of the id, so it might need a "regex filter" that limits the scope of each formatter URL based on the id; wikibase would then need to look through the filter regexes to find a matching formatter URL and use that.
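The "regex substitution" and "regex filter" flavors can be sketched in Python (the URL templates and patterns here are illustrative, not the actual property settings):

```python
import re

def format_isni(isni):
    """'Regex substitution' case: strip the spaces (s/\s+//g) before
    dropping the id into the formatter URL."""
    cleaned = re.sub(r"\s+", "", isni)
    return "https://isni.org/isni/" + cleaned

# 'Regex filter' case: one formatter URL per id prefix; the first
# pattern that matches the id selects the URL to use.
IMDB_FORMATTERS = [
    (re.compile(r"^tt\d+$"), "https://www.imdb.com/title/{}/"),
    (re.compile(r"^nm\d+$"), "https://www.imdb.com/name/{}/"),
    (re.compile(r"^ch\d+$"), "https://www.imdb.com/character/{}/"),
    (re.compile(r"^co\d+$"), "https://www.imdb.com/company/{}/"),
]

def format_imdb(imdb_id):
    for pattern, url in IMDB_FORMATTERS:
        if pattern.match(imdb_id):
            return url.format(imdb_id)
    return None  # no filter matched

print(format_isni("0000 0001 2345 6789"))
# https://isni.org/isni/0000000123456789
print(format_imdb("tt0068646"))
# https://www.imdb.com/title/tt0068646/
```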

Mar 23 2017, 3:37 PM · MediaWiki-extensions-WikibaseRepository, Wikidata

Mar 22 2017

ArthurPSmith added a comment to T150939: Replace https://tools.wmflabs.org/wikidata-externalid-url by providing improved handling for external id formatter urls.

As background, I'm seeing about 2000 "hits" per day on this service right now, with about a dozen properties linking through it to their databases.

Mar 22 2017, 8:23 PM · MediaWiki-extensions-WikibaseRepository, Wikidata
ArthurPSmith added a comment to T160205: Add interstitial to wikidata-externalid-url.

@Esc3300 well, I developed this tool because links for IMDB and a handful of other properties were broken when we made the change from string to "external identifier" last year, where the wikidata UI started putting the links in directly (previously it had been done by a javascript gadget - which meant the links wouldn't be available to re-users either). So "work without this tool" would break a lot of stuff in wikidata and for everybody using it.

Mar 22 2017, 8:16 PM · Tools, Wikidata

Mar 21 2017

ArthurPSmith added a comment to T160205: Add interstitial to wikidata-externalid-url.

Hmm, OK, I read through the discussion you linked with @coren - I certainly see there can be a privacy violation regarding expectations in cases like those discussed there. I think this is a quite different case though (for example, the links are exclusively to third-party sites, not anything I or any other WMF person controls) and I would like to hear directly from somebody with WMF (and some voices from wikidata) on this. If there is a clearly posted policy somewhere that would be great too. The policy linked by @coren focused on the Labs user collecting personal information, which is not at all happening here, and said nothing specifically about redirects per se.

Mar 21 2017, 9:08 PM · Tools, Wikidata
ArthurPSmith claimed T160205: Add interstitial to wikidata-externalid-url.

(claiming task - if this really needs to be done I can certainly take care of it)

Mar 21 2017, 8:50 PM · Tools, Wikidata
ArthurPSmith added a comment to T160205: Add interstitial to wikidata-externalid-url.

Hmm, I think the big issue may be point 3. Do you have an example where this might have come up? I could certainly make it an interstitial easily enough, but that makes these links a bit less convenient for people (extra click); if the links are being included with or without a warning elsewhere based on the wmflabs URL then I can see how it may be important to address this somehow. Also is there boilerplate text we should use if we really do need to put this in?

Mar 21 2017, 8:48 PM · Tools, Wikidata
ArthurPSmith added a comment to T160205: Add interstitial to wikidata-externalid-url.

Specifically, looking at The Godfather, which you mention here, there are close to 3 dozen OTHER external id links that similarly would show user IP information if followed.

Mar 21 2017, 7:59 PM · Tools, Wikidata
ArthurPSmith added a comment to T160205: Add interstitial to wikidata-externalid-url.

@Dispenser, ok the issue is that people clicking an "external id" link are going to an external site? Is there any situation in which it is not obvious this is going to an external website? Every wikidata item with "external id" values has links directly to third party sites, without any interstitial or warning other than that it is external. I don't see the harm or potential for anybody's expectations of privacy to be violated.

Mar 21 2017, 7:56 PM · Tools, Wikidata
ArthurPSmith added a comment to T160205: Add interstitial to wikidata-externalid-url.

@Dispenser, wikidata-externalid-url is installed on Tool Labs, which fully preserves user privacy; I'm not sure what your concern is. Please clarify where you think any policy has been violated.

Mar 21 2017, 6:53 PM · Tools, Wikidata

Nov 16 2016

ArthurPSmith closed T150803: Information leak on wikidata-externalid-url as Invalid.

@jeblad I'm resolving this as invalid, as the initial claim of an information leak appears to be incorrect. However, you might want to open a separate Phabricator ticket with your detailed suggestion on how to do formatter URLs better - I think allowing components to be pulled out via the "regular expression" syntax is a promising approach.

Nov 16 2016, 8:42 PM · Wikidata, Toolforge, Datasets-General-or-Unknown, Privacy
ArthurPSmith added a comment to T150803: Information leak on wikidata-externalid-url.

Ha, if I'd actually looked at the logs I would have known that. Yes, all the IP addresses in the file are 10.68 addresses, which are locally identified as "tools-proxy....wmflabs", so no external IP addresses are visible to the service.

Nov 16 2016, 3:27 PM · Wikidata, Toolforge, Datasets-General-or-Unknown, Privacy
ArthurPSmith added a comment to T150803: Information leak on wikidata-externalid-url.

Or if there's some privacy agreement to sign as jeblad suggested then I'm happy to do that too. I met Lydia Pintscher in person last week so she can vouch for who I am :)

Nov 16 2016, 1:28 AM · Wikidata, Toolforge, Datasets-General-or-Unknown, Privacy
ArthurPSmith added a comment to T150803: Information leak on wikidata-externalid-url.

There are two basic issues that the URL redirect script tackles: IDs that need cleaning up (such as ISNI, which is supposed to be entered as an ID with space characters even though the URL requires the spaces to be removed), and formatter URLs that require more sophisticated handling than a single $1 substitution - the IMDb case, for example, where the first two characters of the ID determine the specific formatter URL to be used. It's not clear to me where the best place for either of those pieces of logic is. Wikibase could have some code for this (feel free to import what I've written), perhaps exposed as some sort of service, but anybody using the P1630 values directly wouldn't benefit from that. For now, if there's some protocol for wiping log files, or not recording them at all on the Tool Labs server, I'd be happy to implement that too. I have no interest in these log files.
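As a rough illustration of the two kinds of handling described above (a sketch only, not the actual tool's code - the helper names and URL patterns here are hypothetical):

```python
def clean_isni(value):
    # ISNI identifiers are entered with spaces, but the target URL needs them removed
    return value.replace(" ", "")

def imdb_formatter(value):
    # The first two characters of an IMDb ID (e.g. "tt" for titles,
    # "nm" for names) select which formatter URL pattern applies,
    # so a single $1 substitution is not enough.
    patterns = {
        "tt": "https://www.imdb.com/title/{}/",
        "nm": "https://www.imdb.com/name/{}/",
    }
    return patterns[value[:2]].format(value)

print(clean_isni("0000 0001 2103 2683"))
print(imdb_formatter("tt0068646"))
```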

Nov 16 2016, 1:23 AM · Wikidata, Toolforge, Datasets-General-or-Unknown, Privacy

Oct 14 2016

ArthurPSmith added a comment to T122706: Create a WDQS-based ElementProvider.

I see you've closed - looks good, by the way. Anyway, on the question of retaining WDQ - no, I don't think that's necessary; I think Magnus would like to shut it down eventually. I don't see that WDQ adds anything to this tool now that SPARQL is working reliably - it's fast and stable. So feel free to remove...

Oct 14 2016, 8:36 PM · Wikidata, Tool-Wikidata-Periodic-Table

Sep 26 2016

ArthurPSmith added a comment to T143594: Add unit support to WbQuantity.

I'm not sure what the issue is here - you can enter a unit URL via the WbQuantity initializer (unit = 'http://www.wikidata.org/entity/Q....') and it works fine. The documentation in __init__.py seems to be out of date on this, though.

Sep 26 2016, 2:04 PM · Patch-For-Review, Pywikibot-Wikidata, Pywikibot

Sep 23 2016

ArthurPSmith added a comment to T120452: Allow structured datasets on a central repository (CSV, TSV, JSON, GeoJSON, XML, ...).

@Yurik and all, I'm glad to see all this work going on; I was pointed to this after I made a comment on a Wikidata property proposal that I thought would be best addressed by somehow allowing a tabular data value rather than a single value. However, I'm wondering if this might be best driven by specific problem cases rather than trying to tackle generic "data" records.

One of the most common needs is for time-series data: population of a city vs. time, economic data by point in time, physical data like temperature vs. time, etc. The simplest extension beyond the single value allowed by Wikidata would be to allow a set of pairs defined by two Wikidata properties (e.g. P585 "point in time" and P1082 "population"). The relation to Wikidata takes care of localization (those properties have labels in many different languages) and defines the value types (time and quantity in this case), and the dataset would somehow be a statement attached to a Wikidata item (e.g. a particular city), so that the item and the pair of properties fully define the meaning of the collection of pairs. The underlying structure of the pairs doesn't really matter much. But there seems to be something missing here - I think it might be best addressed in Wikidata itself...
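A minimal sketch of what such a property-pair dataset might look like (the shape and the numbers are purely illustrative - this is not an actual Wikidata or Commons data format):

```python
import json

# Hypothetical time-series dataset attached to an item (here a city),
# whose meaning is fully defined by the item plus the two property IDs:
# P585 ("point in time") and P1082 ("population").
dataset = {
    "item": "Q64",               # e.g. Berlin
    "pair": ["P585", "P1082"],   # (time, quantity) value types come from the properties
    "values": [
        ["1990-01-01", 3400000],  # illustrative placeholder figures
        ["2000-01-01", 3380000],
        ["2010-01-01", 3460000],
    ],
}

print(json.dumps(dataset, indent=2))
```

Because the value types and labels come from the referenced properties, the dataset itself needs no localization or schema of its own.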

Sep 23 2016, 8:06 PM · Analytics-Radar, Commons-Datasets, MW-1.27-release (WMF-deploy-2016-04-26_(1.27.0-wmf.22)), Crosswiki, Multimedia, Commons, Wikidata, Community-Wishlist-Survey-2015
ArthurPSmith added a comment to T142432: ptable app is broken again!.

Excellent, thanks! I probably should have sent you an email...

Sep 23 2016, 3:40 PM · Wikidata, Tool-Wikidata-Periodic-Table

Aug 10 2016

ArthurPSmith triaged T142432: ptable app is broken again! as High priority.

So I updated to https in my local copy and that definitely fixed the problem. I'm not sure if @Ricordisamoa is around? I don't have permission right now to do anything with ptable, but I do have an account (apsmith) on tools.wmflabs.org, so if I were in the right group I could maybe help out here...

Aug 10 2016, 5:05 PM · Wikidata, Tool-Wikidata-Periodic-Table
ArthurPSmith added a comment to T142432: ptable app is broken again!.

Still broken (at least 3 days now). I can't see the error messages but I tried running my own copy and ran into:

Aug 10 2016, 4:57 PM · Wikidata, Tool-Wikidata-Periodic-Table

Aug 8 2016

ArthurPSmith created T142432: ptable app is broken again!.
Aug 8 2016, 8:39 PM · Wikidata, Tool-Wikidata-Periodic-Table

Jul 11 2016

ArthurPSmith added a comment to T112140: Provide a wrapper function in pywikibot around wbparsevalue.

Ok, the WbRepresentation superclass looks like it might help simplify this. But FilePage, ItemPage, and PropertyPage (and basestring) are not subclasses of it, so I think just returning the JSON hash would be best there. But the function could certainly run fromWikibase for the other types; that seems pretty easy - I'll look into that.

Jul 11 2016, 2:12 PM · Pywikibot, Pywikibot-Wikidata

Jul 7 2016

ArthurPSmith added a comment to T112140: Provide a wrapper function in pywikibot around wbparsevalue.

@Multichill - could be; I'm not familiar with WbTime other than a glance at the code. Are there edge cases (e.g. 10^20 years into the future?) that would break the "int/long" assumptions? But it definitely does NOT work for WbQuantity the way things currently are. Fixing WbQuantity seemed to be out of scope here, though it does need to be done. Coordinate may have similar issues, as it uses floats.

Jul 7 2016, 8:10 PM · Pywikibot, Pywikibot-Wikidata
ArthurPSmith added a comment to T112140: Provide a wrapper function in pywikibot around wbparsevalue.

>>! In T112140#2435122, @Multichill wrote:

The function should return an object. Possibilities seem to be commonsMedia, globe-coordinate, monolingualtext, quantity, string, time, url, external-id, wikibase-item, wikibase-property, math

Jul 7 2016, 1:45 PM · Pywikibot, Pywikibot-Wikidata

Jul 6 2016

ArthurPSmith added a comment to T112140: Provide a wrapper function in pywikibot around wbparsevalue.

See https://gerrit.wikimedia.org/r/#/c/297637/ for proposed implementation...

Jul 6 2016, 8:02 PM · Pywikibot, Pywikibot-Wikidata

Jul 5 2016

ArthurPSmith added a comment to T119226: Very small (or very large) quantity values (represented in scientific notation) result in error in add/update via pywikibot/wikidata API.

Ok, that echoes something Tobias has also said about using strings and avoiding IEEE floating point. I'm going to look at getting T112140 working first, and then see if I can bring that implementation to bear on this.
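The motivation for strings over IEEE floating point can be seen with a quick check (a generic illustration of the rounding issue, not pywikibot code):

```python
from decimal import Decimal

# IEEE doubles cannot represent most decimal fractions exactly,
# so arithmetic on floats drifts:
print(0.1 + 0.2 == 0.3)  # False

# Keeping the amount as a string (or Decimal built from one)
# avoids any binary rounding:
total = Decimal("0.1") + Decimal("0.2")
print(total == Decimal("0.3"))  # True
```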

Jul 5 2016, 3:06 PM · Pywikibot, Pywikibot-Wikidata, Wikidata
ArthurPSmith claimed T112140: Provide a wrapper function in pywikibot around wbparsevalue.

I'm going to have a shot at implementing this - it looks like it will be useful for a number of other open Phabricator issues for pywikibot. I was thinking of a function that takes all the parameters the API offers (datatype - a string, values - a list of strings, options - a dict, validate - a boolean). Any other recommendations?
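A sketch of the parameter shape such a wrapper might assemble for the wbparsevalue API module (the function name is hypothetical, and real pywikibot code would send this through its own request machinery):

```python
import json

def build_parse_value_request(datatype, values, options=None, validate=False):
    """Assemble request parameters for a wbparsevalue API call.

    datatype: a string, values: a list of strings,
    options: a dict, validate: a boolean - matching the API's parameters.
    """
    params = {
        "action": "wbparsevalue",
        "format": "json",
        "datatype": datatype,
        "values": "|".join(values),  # MediaWiki multi-values are pipe-separated
    }
    if options:
        params["options"] = json.dumps(options)
    if validate:
        params["validate"] = ""  # boolean API flags are present-or-absent
    return params

req = build_parse_value_request("quantity", ["1.5e-10", "3"])
print(req)
```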

Jul 5 2016, 2:49 PM · Pywikibot, Pywikibot-Wikidata

Jul 4 2016

ArthurPSmith added a comment to T119226: Very small (or very large) quantity values (represented in scientific notation) result in error in add/update via pywikibot/wikidata API.

You're the one who brought up JSON! It sounds like the issue is something different, though - internal representation as strings? Anyway, are you recommending pywikibot use the wbparsevalue API for all (or at least numerical) input? That could be a good idea. Looks like there was already a Phabricator ticket on this - T112140.

Jul 4 2016, 11:36 PM · Pywikibot, Pywikibot-Wikidata, Wikidata

Jul 2 2016

ArthurPSmith added a comment to T119226: Very small (or very large) quantity values (represented in scientific notation) result in error in add/update via pywikibot/wikidata API.

That restriction is NOT in the JSON spec: http://tools.ietf.org/html/rfc7159.html#section-6 - and the leading plus is not required by JSON either. Is there some other reason for the limitation in the Wikidata code? DataValues is a Wikidata-specific PHP library, right? I can't think of any good reason to keep this limitation on input values.
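RFC 7159's number grammar is `[ minus ] int [ frac ] [ exp ]`, so scientific notation is valid JSON while a leading plus on a bare number is not - easy to confirm with any conforming parser, e.g. Python's stdlib:

```python
import json

# An exponent part is valid JSON per RFC 7159 section 6:
value = json.loads("1.602e-19")
print(value)

# ...while a leading plus sign on a bare number is rejected:
try:
    json.loads("+1.0")
    plus_rejected = False
except json.JSONDecodeError:
    plus_rejected = True
print(plus_rejected)  # True
```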

Jul 2 2016, 12:02 PM · Pywikibot, Pywikibot-Wikidata, Wikidata

Jul 1 2016

ArthurPSmith added a comment to T119226: Very small (or very large) quantity values (represented in scientific notation) result in error in add/update via pywikibot/wikidata API.

Hmm. So is it a pywikibot problem or a wikibase API problem? Is pywikibot sending in JSON format?

Jul 1 2016, 5:20 PM · Pywikibot, Pywikibot-Wikidata, Wikidata
ArthurPSmith added a comment to T119226: Very small (or very large) quantity values (represented in scientific notation) result in error in add/update via pywikibot/wikidata API.

As far as testing goes, I have (in my own copy) added the following to the pywikibot tests/wikibase_edit_tests.py file (within the class TestWikibaseMakeClaim):

Jul 1 2016, 2:59 PM · Pywikibot, Pywikibot-Wikidata, Wikidata

Jun 27 2016

ArthurPSmith added a comment to T119226: Very small (or very large) quantity values (represented in scientific notation) result in error in add/update via pywikibot/wikidata API.

Please note this is still an issue with the latest pywikibot code and current wikidata release - as of June 23, 2016. The following is the fix I have in the pywikibot core pywikibot/__init__.py file:

Jun 27 2016, 5:37 PM · Pywikibot, Pywikibot-Wikidata, Wikidata

Apr 28 2016

ArthurPSmith committed rTPTABf196998d24ce: Added a wikidata-based "chart of the nuclides" under the periodic table app….
Added a wikidata-based "chart of the nuclides" under the periodic table app…
Apr 28 2016, 6:11 AM
ArthurPSmith committed rTPTAB69dffd4cb729: Added a wikidata-based "chart of the nuclides" under the periodic table app….
Added a wikidata-based "chart of the nuclides" under the periodic table app…
Apr 28 2016, 6:11 AM
ArthurPSmith committed rTPTABcad0ce558c9f: Added a wikidata-based "chart of the nuclides" under the periodic table app….
Added a wikidata-based "chart of the nuclides" under the periodic table app…
Apr 28 2016, 6:11 AM
ArthurPSmith committed rTPTAB9f25f1f91c73: Added a Wikidata-based "chart of the nuclides" under /nuclides.
Added a Wikidata-based "chart of the nuclides" under /nuclides
Apr 28 2016, 6:11 AM
ArthurPSmith committed rTPTAB5a33c573599d: Added a Wikidata-based "chart of the nuclides" under /nuclides.
Added a Wikidata-based "chart of the nuclides" under /nuclides
Apr 28 2016, 6:11 AM
ArthurPSmith committed rTPTAB96948e4c816b: Added a Wikidata-based "chart of the nuclides" under /nuclides.
Added a Wikidata-based "chart of the nuclides" under /nuclides
Apr 28 2016, 6:11 AM
ArthurPSmith committed rTPTABff80e0efd65c: Added a Wikidata-based "chart of the nuclides" under /nuclides.
Added a Wikidata-based "chart of the nuclides" under /nuclides
Apr 28 2016, 6:11 AM
ArthurPSmith committed rTPTAB85f311ab7beb: Added a Wikidata-based "chart of the nuclides" under /nuclides.
Added a Wikidata-based "chart of the nuclides" under /nuclides
Apr 28 2016, 6:11 AM
ArthurPSmith committed rTPTABf2a6299a2280: Added a Wikidata-based "chart of the nuclides" under /nuclides.
Added a Wikidata-based "chart of the nuclides" under /nuclides
Apr 28 2016, 6:11 AM
ArthurPSmith committed rTPTABaf6f8b8bdcba: Fixes for nuclides to handle problem of duplicate (and some wrong) returns….
Fixes for nuclides to handle problem of duplicate (and some wrong) returns…
Apr 28 2016, 6:11 AM
ArthurPSmith committed rTPTABd47ac7f9c34d: Updated nuclides charts to display using SVG; includes some refactoring into….
Updated nuclides charts to display using SVG; includes some refactoring into…
Apr 28 2016, 6:11 AM