Tue, Apr 16
Let me spell out my use case in more detail then.
Mon, Apr 15
Ok great! I'll move the field to the end and try to make Jenkins happy then.
Sure, it also makes sense to add full text URLs.
Tue, Apr 9
@Lydia_Pintscher we would need your thoughts about this.
This is nice! However, when visualizing properties by category, it seems that subclasses are not taken into account: only the properties bearing that exact category as P31 value are listed. This gives a pretty inaccurate view: it is crucial to respect this hierarchy, just like the prop-explorer tool does:
Mon, Apr 8
Tue, Apr 2
I agree with @Nicolastorzec above.
@Smalyshev okay! Sorry if this is not the right place: I would be happy to migrate the patch to another ticket. Indeed this only adds entity-level metadata, not dump-level metadata. I think this would be less of a breaking change, given that it does not require changing the dump structure (and of course it is more useful to me, haha!)
I think Wikidata-Toolkit could be used for that:
Obviously it would mean making sure the RDF serialization produced by it is consistent with what is being fed in WDQS at the moment.
I am wondering what the status of this is: is more discussion needed about what version information to include, or are we simply waiting for a patch? I vote for the revision id to serve as version id (possibly with other metadata such as timestamp, as in Special:EntityData). If there is consensus for that, and if directed to the relevant part of the code, I could contribute a patch.
Concerning the dumps, it should be possible to add versioning information on a per-entity basis, for instance by adding the revision id in the JSON serialization of the entity, as is currently done in Special:EntityData. This would arguably be more useful than a per-dump versioning, given that the dump generation process is not atomic. It would also be less of a breaking change: it would just amount to making the JSON serialization of entities more uniform. This is debated in T87283.
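To make the per-entity idea concrete: Special:EntityData output carries `lastrevid` and `modified` fields next to the entity content, so a dump consumer could read the same fields from each dump entity if they were present there. A minimal sketch, assuming dump entities carried these two fields; the helper name and the sample values are made up for illustration:

```python
import json


def entity_version(entity):
    """Return (entity id, revision id, timestamp) for one serialized entity.

    Assumes the entity JSON carries 'lastrevid' and 'modified' the way
    Special:EntityData output does; missing fields come back as None.
    """
    return entity.get("id"), entity.get("lastrevid"), entity.get("modified")


# Hypothetical dump line: the revision id and timestamp are invented.
line = '{"id": "Q42", "type": "item", "lastrevid": 123456, "modified": "2019-04-01T12:00:00Z"}'
print(entity_version(json.loads(line)))
```

A consumer could then compare the revision id against the live entity to detect stale dump entries, without relying on a single dump-wide timestamp.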
Sun, Mar 24
Just confirming that the bug has occurred today again and the proposed fix worked perfectly. Thanks again!
Wow, thanks for the very thorough analysis! I cannot reproduce this anymore. Thanks for the GOMAXPROCS trick! I will add it to the docs.
Sat, Mar 23
Fri, Mar 22
Mar 20 2019
I would also be interested in this, specifically for Wikidata where the diff structure could be exploited even further as suggested by @Yair_rand.
Mar 10 2019
Mar 6 2019
Feb 27 2019
If you need a mapping from ISO language codes to Wikimedia ones, Wikidata-Toolkit has such a mapping: https://github.com/Wikidata/Wikidata-Toolkit/blob/3e62f93b137c25961c5a12172c7f213a720ecb67/wdtk-datamodel/src/main/java/org/wikidata/wdtk/datamodel/interfaces/WikimediaLanguageCodes.java
What exactly would you do with this information? i.e. what's the actual use case that makes you file this request?
Feb 22 2019
What is the protocol to go forward on this? Should we hold an RFC on-wiki to let people choose among the possible solutions above?
Feb 21 2019
We have this problem in https://dissem.in/ . This project is set up on Translatewiki, the code is hosted on GitHub and uses Travis for CI. We use Django's localization system which is based on gettext. We compile messages in the CI to check that they are valid. Sometimes translators add incorrect translations (such as translations not reusing the same variables as the msgid, or in a different format). This breaks our build as any incorrect translation will stop the entire compilation process. It is not clear if and how it would be possible to configure the translation compilation process to ignore invalid messages.
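One possible workaround on the project side would be a pre-check that drops or flags inconsistent entries before compilation. This is only a sketch under assumptions: it is not what gettext itself does, the function names are hypothetical, and it only understands simple Python %-style placeholders such as `%s` and `%(name)s`:

```python
import re
from collections import Counter

# Simple Python %-style placeholders: %s, %d, %i, %f, %r, optionally named.
_PLACEHOLDER = re.compile(r"%(?:\(\w+\))?[sdifr]")


def placeholders(text):
    """Count the placeholders in a message, ignoring literal '%%'."""
    return Counter(_PLACEHOLDER.findall(text.replace("%%", "")))


def is_consistent(msgid, msgstr):
    """True if the translation reuses exactly the placeholders of the msgid.

    An empty msgstr means "not translated yet" and is treated as harmless.
    """
    return not msgstr or placeholders(msgid) == placeholders(msgstr)


print(is_consistent("Hello %(name)s", "Bonjour %(name)s"))
print(is_consistent("Hello %(name)s", "Bonjour %(nom)s"))
```

Entries failing the check could be blanked out in a copy of the .po file so that the rest of the catalogue still compiles.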
Feb 19 2019
Feb 11 2019
Any help with finishing the migration is welcome of course, I am currently busy with dissemin but I will try to come back to this at some point.
@Samwalton9 yes that is due to me starting the migration… and not completing it yet!
Feb 2 2019
I have updated the Wikibase data model docs, which incorrectly mentioned precisions of hours, minutes and seconds. I assume that they were there because they were part of an earlier design?
Jan 25 2019
Useful solution from Nikki: add in your common.css:
Jan 9 2019
I have pinged a few interface admins on wiki to enable this.
Jan 7 2019
Oh can they? Sorry I had no idea! Thanks, I will try to enable it myself.
Jan 5 2019
I currently use my own custom hacky script to create properties, but having something stable and usable by anyone would be highly beneficial.
Dec 2 2018
@Lucas_Werkmeister_WMDE thank you very much for that!
Nov 12 2018
I have taken the liberty to remove "Cloud Services" as a subscriber to this ticket as I do not think every toollabs user wants to receive notifications about this.
Nov 6 2018
Nov 5 2018
As explained in T164152 I am happy to mentor anyone for this.
@Daniel_Mietchen regarding https://twitter.com/EvoMRI/status/1055785761574813696 (I do not read Twitter notifications - but happily interact on open platforms such as Mastodon):
Nov 2 2018
The search interface can also be used for that thanks to the haswbstatement command. That only gets you one id per query, so it might not be suited for all tools. I don't know if the lag is lower in this interface.
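For reference, a lookup of this kind can be expressed as a regular search API request. The sketch below only builds the URL and does not send it; the property and value are placeholders, and values containing spaces or special characters would presumably need extra quoting that is not handled here:

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def search_url(property_id, value, limit=5):
    """Build a search API URL that looks up items by one statement value."""
    params = {
        "action": "query",
        "list": "search",
        # One property=value pair per query, as noted above.
        "srsearch": f"haswbstatement:{property_id}={value}",
        "srlimit": limit,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


# Placeholder identifier value, for illustration only.
print(search_url("P214", "12345"))
```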
Retrieving items by identifiers is quite crucial in many tools so it would be useful to have a solid interface for that instead of relying on SPARQL (which feels indeed like using a sledgehammer to crack a nut).
@Gehel my service has been quite unstable for some time, but I haven't found the time yet to find out exactly where the problem is coming from - it could be SPARQL, the Wikidata API, redis or the webservice itself. I will add a few more metrics to understand what is going on and report back here.
Nov 1 2018
@Criscod yes that would be a great idea.
Oct 31 2018
Thanks for the ping Lydia! Off the top of my head, the only uses of SPARQL in the tools I maintain are in the openrefine-wikidata interface:
- queries to retrieve the list of subclasses of a given class - lag is not critical at all for this as the ontology is assumed to be stable. (These results are cached on my side for 24 hours, for any root class.)
- queries to retrieve items by external identifiers or sitelinks - lag can be more of an issue for this but I would not consider it critical. (These results are not cached.)
What matters much more for this tool is getting quick results and as little downtime as possible - lag is not really a concern.
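To illustrate the caching pattern described for the subclass queries, here is a rough in-memory sketch. It is a stand-in, not the tool's actual code: the class and function names are hypothetical, and the query text is just one plausible way to ask for transitive subclasses via P279:

```python
import time


class TTLCache:
    """Tiny time-based cache: entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested
        self._store = {}

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh
        value = compute(key)
        self._store[key] = (now, value)
        return value


def subclass_query(root_qid):
    """SPARQL text listing all transitive subclasses of a root class."""
    return f"SELECT ?cls WHERE {{ ?cls wdt:P279* wd:{root_qid} }}"


cache = TTLCache(ttl_seconds=24 * 3600)
# A real tool would run the query here; this sketch only caches its text.
print(cache.get_or_compute("Q5", subclass_query))
```

Because the ontology is assumed to be stable, a stale answer for up to 24 hours is acceptable, which is why lag on the query service matters little for this kind of query.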
Oct 29 2018
Just to let you know that the problem with the ".0" will be solved in the next version of OpenRefine.
In the meantime, you can solve the issue by transforming your column with the following expression: value.toString().replace(".0",""). Hope it helps!
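One caveat: a plain replace removes every ".0" in the cell, so it is only safe when the column holds whole numbers. If the column might also contain real decimals, stripping only a trailing ".0" is safer. The idea, sketched in Python rather than GREL (the function name is made up):

```python
import re


def strip_trailing_zero_decimal(value):
    """Drop a '.0' suffix only when it ends the string."""
    return re.sub(r"\.0$", "", str(value))


print(strip_trailing_zero_decimal(12.0))     # whole number stored as a float
print(strip_trailing_zero_decimal("10.05"))  # genuine decimal, left untouched
print("10.05".replace(".0", ""))             # what a plain replace would do
```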
Oct 27 2018
So I had the opportunity to annoy a lot of people by shouting OpenRefine repeatedly in their ears over the past 48 hours.
Oct 26 2018
Awesome! \o/ Actually OpenRefine could potentially help you already at that stage to do the matching - let me know if you want a quick demo :)
I would be happy to help. I have a T-shirt with an OpenRefine logo (the blue diamond).
I have left some ideas here:
Oct 24 2018
I will be available to help with OpenRefine. It is exactly designed for this workflow indeed so I hope it will be a match :)
For reconciliation help, have you seen this page?
I would be interested in helping with this - I can guide you through the uploading process with OpenRefine.
If you want to prepare for this, feel free to download OpenRefine and have a look at tutorials, like these:
The videos at http://openrefine.org/ are also useful to get an idea of what OpenRefine does (with no reference to Wikidata).
Oct 19 2018
Some of the OpenRefine edits were not tagged during development but all edits done with a released version should be. Some of the OpenRefine batches are uploaded via QuickStatements, in which case they are tagged as such. (The main benefits of using QS with OpenRefine are running batches in the background and having statement matching rules when updating existing claims.)
Oct 13 2018
Sure, happy to help any time! (Online or at the Wiki TechStorm)
Oct 12 2018
I think this ticket can be closed given that we cannot figure out what it is supposed to be about.
Sep 28 2018
Sep 25 2018
Sep 19 2018
I was thinking of the opposite: consider the violations related to the revision R of the item I to be the violations of the statements of I with respect to the state of Wikidata just before R+1 was saved.
@Lydia_Pintscher yes indeed! For instance the aggregation at batch-level would probably not be meaningful for inverse constraints (unless there is a way to detect all the violations added and solved by an edit, not just on the item where the edit was made). But isn't this a problem that you have anyway, even when storing only the latest violations? For instance, if I add a "subclass of (P279)" statement between two items, don't you need to recompute type violations for all items which are instances of some transitive subclass of the new subclass? I am not sure how this invalidation is done at the moment.
Sep 18 2018
@Lydia_Pintscher personally here is what I would concretely implement in the EditGroups tool. For each edit that is part of an edit group:
- fetch the constraints violations before and after the edit (this fetching would happen as the edit is retrieved, so in near real-time)
- compute the difference of constraints violations of each type (for instance, 1 new "value type constraint" violation and 2 fewer "statement required constraint" violations)
- aggregate these statistics at a batch level and expose them in batch views (for instance, this batch added 342 new "value type constraint" violations and solved 764 "statement required constraint" violations)
Together with the number of reverted edits in a batch (which the tool already aggregates), this could potentially make it easier to spot problematic batches.
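The diff and aggregation steps could look roughly like this. It is a sketch only: fetching the violations is left out, each edit is assumed to come as a pair of lists of violated constraint type names (before, after), and the function names are hypothetical:

```python
from collections import Counter


def violation_delta(before, after):
    """Per-type change in violation counts caused by one edit.

    Positive numbers are new violations, negative numbers are solved ones.
    """
    delta = Counter(after)
    delta.subtract(Counter(before))
    return {ctype: n for ctype, n in delta.items() if n != 0}


def aggregate_batch(edits):
    """Sum the per-edit deltas over a batch into (added, solved) counts."""
    added, solved = Counter(), Counter()
    for before, after in edits:
        for ctype, change in violation_delta(before, after).items():
            if change > 0:
                added[ctype] += change
            else:
                solved[ctype] -= change
    return dict(added), dict(solved)


batch = [
    (["statement required", "statement required"], ["value type"]),
    ([], ["value type"]),
]
print(aggregate_batch(batch))
```

Keeping added and solved counts separate, rather than netting them out, preserves the signal when a batch fixes one kind of violation while introducing another.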
Sep 17 2018
This ticket is fantastic news.
Sep 16 2018
@martin.monperrus see my first comment in this thread.
@martin.monperrus Let me emphasize that this is a significant change that should get community approval first. There has already been a lot of discussion about similar changes to the DOI template on the English Wikipedia and there is clearly a consensus against this IMHO.
Sep 14 2018
@aborrero thanks for the ping. I do not recognize the shape of the queries as coming from this tool though. The openrefine-wikidata tool should do relatively few SPARQL queries, whose results are cached in redis. How did you determine that this tool is the source of the problem?
Sep 5 2018
@Lydia_Pintscher @Ladsgroup any idea how I could be notified of any new automatic edit summaries, such as the wbeditentity-create-item that this change introduced? For any such summary, I need to add it to EditGroups, especially if the new auto summary replaces a highly-used existing one, as in this case. Otherwise, this breaks the tagging of batches.
I think reworking this implementation would be very welcome because at the moment it is not pretty, to put it politely.
But I am not convinced by the alternative either. Why would Reference inherit from BaseClaim? A reference is not a claim. What would the getSnakType method mean when called on a Reference?
It might be worth giving the bot author some control over this feature:
- there should be some opt-in / opt-out mechanism
- there should be some control over what constitutes a batch. Some users might want to create multiple logical batches during the same run of a bot, or share the same batch id across consecutive runs of the same python script (for instance if it is called by a bash script…
Jul 27 2018
Etalab (which runs the open data portal of the French government) has released a statement (in French) concerning the attribution requirement of their "licence ouverte", confirming that it only applies to the first re-user.
Jul 17 2018
@Chicocvenancio I agree with Yury - it makes it significantly harder to deploy Django projects.
Jun 18 2018
This would be very useful for T197588. It would make a lot of sense for Wikibase Quality Constraints in particular.