That's great! Is there a list of the supported external ids?
Tue, May 21
Is this the same thing as T223803?
@LucasWerkmeister has allocated a machine for that. There are a number of changes that we would like to make to OpenRefine in order to make it more suitable to be hosted online (such as connecting to Wikidata by OAuth). None of them should be blocking for a proof of concept though (especially since you will probably not use Wikidata editing from OpenRefine in the scope of Mortar).
We have some tutorials for OpenRefine and Wikidata, although they are not targeted at Commons users:
- WikiCite 2018 tutorial session: https://www.youtube.com/watch?v=xORXSkE0yCE
- Demo at WikidataCon 2017: https://www.youtube.com/watch?v=kIpPZyZp1kI
- Screencast about reconciliation in OpenRefine: https://www.youtube.com/playlist?list=PL_0jeq3PjvtADzbovAgHNzOFvOlyF6uL1
Ah, never mind, there is already a link; I just did not look on the right-hand side.
Sun, May 19
The EditGroups external tool now has documentation for developers, explaining the current infrastructure: https://editgroups.readthedocs.io/en/latest/architecture.html.
I would be happy to expand on the points that are unclear or missing.
Thu, May 16
Thank you so much @Lucas_Werkmeister_WMDE for working on this!
Mon, May 13
Thanks all for your patience with this! Excited to see my first commit making it into Wikibase \o/
Apr 25 2019
Apr 24 2019
Thanks for the ping! I am not using wb_terms in any of my projects. Good luck with the migration, it looks epic.
Apr 16 2019
Let me spell out my use case in more detail then.
Apr 15 2019
Ok great! I'll move the field to the end and try to make Jenkins happy then.
Sure, it also makes sense to add full text URLs.
Apr 9 2019
@Lydia_Pintscher we would need your thoughts about this.
This is nice! However, when visualizing properties by category, it seems that subclasses are not taken into account: only the properties bearing that exact category as P31 value are listed. This gives a pretty inaccurate view: it is crucial to respect this hierarchy, just like the prop-explorer tool does:
Apr 8 2019
Apr 2 2019
I agree with @Nicolastorzec above.
@Smalyshev okay! Sorry if this is not the right place: I would be happy to migrate the patch to another ticket. Indeed the patch only adds entity-level metadata, not dump-level metadata. I think this would be less of a breaking change, given that it does not require changing the dump structure (and of course it is more useful to me, haha!)
I think Wikidata-Toolkit could be used for that:
Obviously it would mean making sure the RDF serialization produced by it is consistent with what is being fed in WDQS at the moment.
I am wondering what the status of this is: is more discussion needed about what version information to include, or are we simply waiting for a patch?
Concerning the dumps, it should be possible to add versioning information on a per-entity basis, for instance by adding the revision id in the JSON serialization of the entity, as is currently done in Special:EntityData. This would arguably be more useful than a per-dump versioning, given that the dump generation process is not atomic. It would also be less of a breaking change: it would just amount to making the JSON serialization of entities more uniform. This is debated in T87283.
Mar 24 2019
Just confirming that the bug has occurred today again and the proposed fix worked perfectly. Thanks again!
Wow, thanks for the very thorough analysis! I cannot reproduce this anymore. Thanks for the GOMAXPROCS trick! I will add it to the docs.
Mar 23 2019
Mar 22 2019
Mar 20 2019
I would also be interested in this, specifically for Wikidata where the diff structure could be exploited even further as suggested by @Yair_rand.
Mar 10 2019
Mar 6 2019
Feb 27 2019
If you need a mapping from ISO language codes to Wikimedia ones, Wikidata-Toolkit has such a mapping: https://github.com/Wikidata/Wikidata-Toolkit/blob/3e62f93b137c25961c5a12172c7f213a720ecb67/wdtk-datamodel/src/main/java/org/wikidata/wdtk/datamodel/interfaces/WikimediaLanguageCodes.java
What exactly would you do with this information? i.e. what's the actual use case that makes you file this request?
Feb 22 2019
What is the protocol to go forward on this? Should we hold an RFC on-wiki to let people choose among the possible solutions above?
Feb 21 2019
We have this problem in https://dissem.in/. This project is set up on Translatewiki, the code is hosted on GitHub and uses Travis for CI. We use Django's localization system, which is based on gettext. We compile messages in the CI to check that they are valid. Sometimes translators add incorrect translations (such as translations not reusing the same variables as the msgid, or using a different format). This breaks our build, as any incorrect translation stops the entire compilation process. It is not clear if and how the translation compilation process could be configured to ignore invalid messages.
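To make the failure mode concrete, here is a minimal sketch of the kind of check involved: comparing the placeholders of a msgid with those of its translation, and dropping the entries that do not match. This is not Django's or gettext's own tooling; the function names and the plain dict standing in for a parsed message catalog are hypothetical, and only simple `%s`-style and `%(name)s`-style placeholders are handled.

```python
import re
from collections import Counter

# Matches a literal "%%" or a simple placeholder such as %s, %d or %(name)s.
# Assumption: flags, widths and precisions are not used in the messages.
PLACEHOLDER = re.compile(r"%%|%(?:\(\w+\))?[sdifr]")


def placeholders(text):
    """Return a multiset of the placeholders in text, ignoring literal '%%'."""
    return Counter(m for m in PLACEHOLDER.findall(text) if m != "%%")


def is_consistent(msgid, msgstr):
    """An empty msgstr means 'untranslated' and is always acceptable."""
    return msgstr == "" or placeholders(msgid) == placeholders(msgstr)


def drop_invalid(catalog):
    """Keep only entries whose translation reuses the msgid placeholders."""
    return {k: v for k, v in catalog.items() if is_consistent(k, v)}


# Hypothetical catalog: the second entry renames a variable and is dropped.
catalog = {
    "Hello %(name)s": "Bonjour %(name)s",
    "%(count)d papers": "%(nombre)d articles",
    "100%% done": "100%% fini",
    "Save": "",
}
valid = drop_invalid(catalog)
```

A filter of this shape would have to run before compilation, so that a single bad translation falls back to the source string instead of failing the whole build.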
Feb 19 2019
Feb 11 2019
Any help with finishing the migration is welcome of course, I am currently busy with dissemin but I will try to come back to this at some point.
@Samwalton9 yes that is due to me starting the migration… and not completing it yet!
Feb 2 2019
I have updated the Wikibase data model docs, which incorrectly mentioned precisions of hours, minutes and seconds. I assume that they were there because they were part of an earlier design?
Jan 25 2019
Useful solution from Nikki: add in your common.css:
Jan 9 2019
I have pinged a few interface admins on wiki to enable this.
Jan 7 2019
Oh can they? Sorry I had no idea! Thanks, I will try to enable it myself.
Jan 5 2019
I currently use my own custom hacky script to create properties, but having something stable and usable by anyone would be highly beneficial.
Dec 2 2018
@Lucas_Werkmeister_WMDE thank you very much for that!
Nov 12 2018
I have taken the liberty of removing "Cloud Services" as a subscriber to this ticket, as I do not think every toollabs user wants to receive notifications about this.
Nov 6 2018
Nov 5 2018
As explained in T164152 I am happy to mentor anyone for this.
@Daniel_Mietchen regarding https://twitter.com/EvoMRI/status/1055785761574813696 (I do not read Twitter notifications - but happily interact on open platforms such as Mastodon):
Nov 2 2018
The search interface can also be used for that thanks to the haswbstatement command. That only gets you one id per query, so it might not be suited for all tools. I don't know if the lag is lower in this interface.
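For illustration, a lookup through the search interface could be built roughly as follows. This is only a sketch: it constructs the request URL for the standard MediaWiki search API with a haswbstatement query and does not send it. The helper name is hypothetical and the identifier value is a made-up example (P496 is the ORCID iD property).

```python
from urllib.parse import urlencode

API_URL = "https://www.wikidata.org/w/api.php"


def search_url(property_id, value, limit=5):
    """Build a search API URL for items with the given identifier value."""
    params = {
        "action": "query",
        "list": "search",
        # One identifier per query, as noted above.
        "srsearch": f"haswbstatement:{property_id}={value}",
        "srlimit": limit,
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


# Made-up identifier value, for illustration only.
url = search_url("P496", "0000-0000-0000-0000")
```

The matching items, if any, should be listed under `query.search` in the JSON response, with the item id as the `title` of each hit.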
Retrieving items by identifiers is quite crucial in many tools so it would be useful to have a solid interface for that instead of relying on SPARQL (which feels indeed like using a sledgehammer to crack a nut).
@Gehel my service has been quite unstable for some time, but I haven't found the time yet to find out exactly where the problem is coming from - it could be SPARQL, the Wikidata API, redis or the webservice itself. I will add a few more metrics to understand what is going on and report back here.
Nov 1 2018
@Criscod yes that would be a great idea.
Oct 31 2018
Thanks for the ping Lydia! Off the top of my head, the only uses of SPARQL in the tools I maintain are in the openrefine-wikidata interface:
- queries to retrieve the list of subclasses of a given class - lag is not critical at all for this as the ontology is assumed to be stable. (These results are cached on my side for 24 hours, for any root class.)
- queries to retrieve items by external identifiers or sitelinks - lag can be more of an issue for this but I would not consider it critical. (These results are not cached.)
What matters much more for this tool is getting quick results and as little downtime as possible - lag is not really a concern.
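As a rough illustration of the first kind of query and its caching, here is a minimal sketch. It is not the actual openrefine-wikidata code: the class and function names are hypothetical, an in-memory dict stands in for whatever cache backend the service uses, and the function that actually runs the query is passed in by the caller.

```python
import time


class TTLCache:
    """Cache the result of fetch(key) for ttl_seconds (24 hours by default)."""

    def __init__(self, fetch, ttl_seconds=24 * 3600, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (time stored, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: no new query
        value = self._fetch(key)
        self._entries[key] = (now, value)
        return value


def subclass_query(root_qid):
    """SPARQL text listing all transitive subclasses of root_qid."""
    return f"SELECT ?cls WHERE {{ ?cls wdt:P279* wd:{root_qid} }}"
```

With this shape, a stale ontology is served for at most one day per root class, which matches the assumption that lag is not critical for subclass lookups.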
Oct 29 2018
Just to let you know that the problem with the ".0" will be solved in the next version of OpenRefine.
In the meantime, you can solve the issue by transforming your column with the following expression: value.toString().replace(".0",""). Hope it helps!
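One caveat with a plain replace is that it removes every occurrence of ".0", not just a trailing one, so a value such as 1.05 would be altered too. That does not matter for columns that only contain whole numbers. For comparison, here is a Python sketch (not GREL, and with a hypothetical function name) of a stricter variant that only strips a trailing ".0":

```python
import re


def strip_trailing_zero_decimal(value):
    """Render value as text and drop a trailing '.0', if there is one."""
    return re.sub(r"\.0$", "", str(value))


whole = strip_trailing_zero_decimal(123.0)      # "123"
decimal = strip_trailing_zero_decimal("1.05")   # left unchanged
```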
Oct 27 2018
So I had the opportunity to annoy a lot of people by shouting OpenRefine repeatedly in their ears over the past 48 hours.
Oct 26 2018
Awesome! \o/ Actually OpenRefine could potentially help you already at that stage to do the matching - let me know if you want a quick demo :)
I would be happy to help. I have a T-shirt with an OpenRefine logo (the blue diamond).
I have left some ideas here:
Oct 24 2018
I will be available to help with OpenRefine. It is exactly designed for this workflow indeed so I hope it will be a match :)
For reconciliation help, have you seen this page?
I would be interested in helping with this - I can guide you through the uploading process with OpenRefine.
If you want to prepare for this, feel free to download OpenRefine and have a look at tutorials, like these:
The videos at http://openrefine.org/ are also useful to get an idea of what OpenRefine does (with no reference to Wikidata).
Oct 19 2018
Some of the OpenRefine edits were not tagged during development, but all edits done with a released version should be. Some of the OpenRefine batches are uploaded via QuickStatements, in which case they are tagged as such. (The main benefits of using QS with OpenRefine are to run batches in the background and to have statement matching rules when updating existing claims.)
Oct 13 2018
Sure, happy to help any time! (Online or at the Wiki TechStorm)
Oct 12 2018
I think this ticket can be closed given that we cannot figure out what it is supposed to be about.
Sep 28 2018
Sep 25 2018
Sep 19 2018
I was thinking of the opposite: consider the violations related to the revision R of the item I to be the violations of the statements of I with respect to the state of Wikidata just before R+1 was saved.
@Lydia_Pintscher yes indeed! For instance the aggregation at batch-level would probably not be meaningful for inverse constraints (unless there is a way to detect all the violations added and solved by an edit, not just on the item where the edit was made). But isn't this a problem that you have anyway, even when storing only the latest violations? For instance, if I add a "subclass of (P279)" statement between two items, don't you need to recompute type violations for all items which are instances of some transitive subclass of the new subclass? I am not sure how this invalidation is done at the moment.
Sep 18 2018
@Lydia_Pintscher personally here is what I would concretely implement in the EditGroups tool. For each edit that is part of an edit group:
- fetch the constraint violations before and after the edit (this fetching would happen as the edit is retrieved, so in near real-time)
- compute the difference in constraint violations of each type (for instance, 1 new "value type constraint" violation and 2 fewer "statement required constraint" violations)
- aggregate these statistics at a batch level and expose them in batch views (for instance, this batch added 342 new "value type constraint" violations and solved 764 "statement required constraint" violations)
Together with the number of reverted edits in a batch (which the tool already aggregates), this could potentially make it easier to spot problematic batches.
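The steps above can be sketched as follows. This is only an illustration of the bookkeeping, not EditGroups code: the function names are hypothetical, and it assumes the violations of an item are already available as a list of constraint type names, one entry per violation.

```python
from collections import Counter


def violation_delta(before, after):
    """Per-type change in violation counts for one edit (positive = added)."""
    b, a = Counter(before), Counter(after)
    return {t: a[t] - b[t] for t in set(a) | set(b) if a[t] != b[t]}


def aggregate_batch(deltas):
    """Sum per-edit deltas into (added, solved) counts for a whole batch."""
    added, solved = Counter(), Counter()
    for delta in deltas:
        for ctype, change in delta.items():
            if change > 0:
                added[ctype] += change
            else:
                solved[ctype] -= change  # change is negative here
    return dict(added), dict(solved)


# One edit: 1 new "value type constraint" violation, 2 fewer
# "statement required constraint" violations.
delta = violation_delta(
    ["statement required constraint", "statement required constraint"],
    ["value type constraint"],
)
added, solved = aggregate_batch([delta, delta])
```

Note that these are net figures per type: an edit that adds one violation and solves another of the same type would show no change.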
Sep 17 2018
This ticket is fantastic news.