Woooohooooo!!!!
Oct 27 2018
So I had the opportunity to annoy a lot of people by shouting OpenRefine repeatedly in their ears over the past 48 hours.
Oct 26 2018
Awesome! \o/ Actually OpenRefine could potentially help you already at that stage to do the matching - let me know if you want a quick demo :)
I would be happy to help. I have a t-shirt with an OpenRefine logo (the blue diamond).
I have left some ideas here:
Oct 24 2018
I will be available to help with OpenRefine. It is designed for exactly this workflow, so I hope it will be a match :)
For reconciliation help, have you seen this page?
https://github.com/OpenRefine/OpenRefine/wiki/Reconciliation
I would be interested in helping with this - I can guide you through the uploading process with OpenRefine.
If you want to prepare for this, feel free to download OpenRefine and have a look at tutorials like these:
- https://www.wikidata.org/wiki/Wikidata:Tools/OpenRefine/Editing/Tutorials/Basic_editing
- https://www.wikidata.org/wiki/Wikidata:Tools/OpenRefine/Editing/Tutorials/Video
The videos at http://openrefine.org/ are also useful to get an idea of what OpenRefine does (with no reference to Wikidata).
Oct 19 2018
Some of the OpenRefine edits were not tagged during development, but all edits done with a released version should be. Some of the OpenRefine batches are uploaded via QuickStatements, in which case they are tagged as such. (The main benefits of using QS with OpenRefine are running batches in the background and having statement matching rules applied when updating existing claims.)
Oct 13 2018
Sure, happy to help any time! (Online or at the Wiki TechStorm)
Oct 12 2018
I think this ticket can be closed given that we cannot figure out what it is supposed to be about.
Sep 28 2018
Sep 25 2018
Sep 19 2018
I was thinking of the opposite: consider the violations related to the revision R of the item I to be the violations of the statements of I with respect to the state of Wikidata just before R+1 was saved.
@Lydia_Pintscher yes indeed! For instance the aggregation at batch-level would probably not be meaningful for inverse constraints (unless there is a way to detect all the violations added and solved by an edit, not just on the item where the edit was made). But isn't this a problem that you have anyway, even when storing only the latest violations? For instance, if I add a "subclass of (P279)" statement between two items, don't you need to recompute type violations for all items which are instances of some transitive subclass of the new subclass? I am not sure how this invalidation is done at the moment.
Sep 18 2018
@Lydia_Pintscher personally here is what I would concretely implement in the EditGroups tool. For each edit that is part of an edit group:
- fetch the constraint violations before and after the edit (this fetching would happen as the edit is retrieved, so in near real time)
- compute the difference in constraint violations of each type (for instance, 1 new "value type constraint" violation and 2 fewer "statement required constraint" violations)
- aggregate these statistics at the batch level and expose them in batch views (for instance, this batch added 342 new "value type constraint" violations and solved 764 "statement required constraint" violations)
Together with the number of reverted edits in a batch (which the tool already aggregates), this could potentially make it easier to spot problematic batches.
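The per-edit diffing and batch-level aggregation described above could be sketched roughly as follows. This is a hypothetical illustration, not EditGroups code: the constraint-type names and the shape of the violation lists are assumptions.

```python
from collections import Counter

def violation_diff(before, after):
    """Per-constraint-type change in violation counts for one edit.

    `before` and `after` are lists of constraint-type names, one entry
    per violation found on the edited item before/after the edit.
    Returns (violations added, violations solved).
    """
    return Counter(after) - Counter(before), Counter(before) - Counter(after)

# Example: one new "value type" violation, two "statement required"
# violations solved by the edit.
before = ["statement required", "statement required", "format"]
after = ["value type", "format"]
added, solved = violation_diff(before, after)

# Batch-level aggregation: sum the per-edit diffs over the whole group.
batch_added, batch_solved = Counter(), Counter()
for a, s in [(added, solved)]:  # one (added, solved) pair per edit
    batch_added += a
    batch_solved += s
```

Counter subtraction keeps only positive counts, which conveniently separates violations introduced from violations solved.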
Sep 17 2018
This ticket is fantastic news.
Sep 16 2018
@martin.monperrus see my first comment in this thread.
@martin.monperrus Let me emphasize that this is a significant change that should get community approval first. There has already been a lot of discussion about similar changes to the DOI template on the English Wikipedia and there is clearly a consensus against this IMHO.
Sep 14 2018
@aborrero thanks for the ping. I do not recognize the shape of the queries as coming from this tool though. The openrefine-wikidata tool should do relatively few SPARQL queries, whose results are cached in redis. How did you determine that this tool is the source of the problem?
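The caching strategy mentioned above (SPARQL results cached in redis) can be sketched as follows. This is a simplified stand-in, not the openrefine-wikidata implementation: a plain dict replaces redis, and the `fetch` callback is hypothetical.

```python
import time

class QueryCache:
    """Cache SPARQL results keyed by query text, with a TTL.

    A plain dict stands in for redis here (a deliberate simplification);
    a real deployment would use redis with key expiry instead.
    """
    def __init__(self, fetch, ttl=3600):
        self.fetch = fetch  # function mapping a query string to results
        self.ttl = ttl
        self.store = {}

    def get(self, query):
        entry = self.store.get(query)
        if entry is not None and time.time() - entry[0] < self.ttl:
            return entry[1]  # fresh cached result: no SPARQL request made
        results = self.fetch(query)
        self.store[query] = (time.time(), results)
        return results
```

With such a cache, repeated identical queries within the TTL hit the endpoint only once, which is why the tool should generate relatively little SPARQL traffic.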
Sep 5 2018
@Lydia_Pintscher @Ladsgroup any idea how I could be notified of any new automatic edit summaries, such as the wbeditentity-create-item that this change introduced? For any such summary, I need to add it to EditGroups, especially if the new auto summary replaces a highly-used existing one, as in this case. Otherwise, this breaks the tagging of batches.
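The tagging problem above boils down to recognizing the auto-summary code in each edit summary. A rough sketch, assuming the standard MediaWiki `/* code:args */` summary wrapper; the set of known codes below is illustrative, not EditGroups' actual list.

```python
import re

# MediaWiki wraps automatic edit summaries in /* code:args */ markers;
# a tool like EditGroups needs to recognize each such code. Any code it
# has never seen (e.g. a newly introduced one) silently breaks tagging.
AUTO_SUMMARY_RE = re.compile(r"/\*\s*([a-z\-]+)(?::[^*]*)?\*/")

KNOWN_CODES = {"wbeditentity-update", "wbeditentity-create-item"}  # illustrative

def extract_code(summary):
    """Return the auto-summary code of an edit summary, or None."""
    match = AUTO_SUMMARY_RE.search(summary)
    return match.group(1) if match else None

def is_recognized(summary):
    return extract_code(summary) in KNOWN_CODES
```

Edits whose extracted code is not in the known set would be the ones to flag, which is why advance notice of new summary codes matters.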
I think reworking this implementation would be very welcome, because at the moment it is not pretty, to put it politely.
But I am not convinced by the alternative either. Why would Reference inherit from BaseClaim? A reference is not a claim. What would the getSnakType method mean when called on a Reference?
It might be worth giving the bot author some control over this feature:
- there should be some opt-in / opt-out mechanism
- there should be some control over what constitutes a batch. Some users might want to create multiple logical batches during the same run of a bot, or share the same batch id across consecutive runs of the same python script (for instance if it is called by a bash script…
Jul 27 2018
Etalab (which runs the open data portal of the French government) has released a statement (in French) concerning the attribution requirement of their "licence ouverte", confirming that it only applies to the first re-user.
https://github.com/etalab/wiki-data-gouv#point-juridique
Jul 17 2018
@Chicocvenancio I agree with Yury - it makes it significantly harder to deploy Django projects.
Jun 18 2018
This would be very useful for T197588. It would make a lot of sense for Wikibase Quality Constraints in particular.
One other approach to this problem would be to consider that these manifest files are not expected to be necessarily hosted by the Wikibase instance itself - these configuration files could be user-contributed and hosted anywhere (or derived automatically from the Wikibase Registry). The downside is that this requires more work from the community (users need to maintain these manifest files themselves) but it could be necessary if we want to include things like URLs of external tools like QuickStatements.
A sample of what such a manifest could look like is here:
https://gist.github.com/despens/d6ae4110c4e97944ddba29f23d78899f
It could be served at a predictable location for each wikibase instance - such as, for instance,
https://www.wikidata.org/manifest-v0.1.json
or something similar
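Consuming such a manifest would then be straightforward for client tools. A minimal sketch, where the well-known path and the required keys are both hypothetical, since no schema has been standardized yet:

```python
import json
from urllib.parse import urljoin

# Hypothetical well-known path; the actual file name and schema would
# need to be agreed on first.
MANIFEST_PATH = "manifest-v0.1.json"

def manifest_url(wikibase_root):
    """Build the predictable manifest location for a Wikibase instance."""
    return urljoin(wikibase_root, MANIFEST_PATH)

def parse_manifest(raw):
    """Check a few illustrative fields such a manifest might declare."""
    data = json.loads(raw)
    for key in ("name", "api_endpoint"):  # hypothetical required keys
        if key not in data:
            raise ValueError("missing manifest key: " + key)
    return data
```

A client would fetch `manifest_url(root)` over HTTP and fall back to user-supplied configuration when the instance serves no manifest.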
Jun 15 2018
Jun 4 2018
Just noting that this prevents us from adding examples on lexeme-related properties, such as https://www.wikidata.org/wiki/Property:P5244.
Jun 2 2018
Thanks for keeping me in the loop! @RazShuty, do you mean any of this?
- migrate the existing reconciliation service (https://tools.wmflabs.org/openrefine-wikidata/) to work on any Wikibase install
- create a Wikibase extension that already provides a reconciliation API natively, without having to create a wrapper like I did
- anything else?
May 28 2018
I have observed this bug multiple times now (also using Firefox).
May 27 2018
May 20 2018
May 19 2018
May 18 2018
After discussion with @Tpt, for now we are just going to change Wikidata-Toolkit's behaviour to use 0 in the After parameter as well… but that's just because it's really hard to shift the default now.
Oh I meant 10:30, fixing that now
@bcampbell that would be nice! but only if it's not too much effort :)
May 17 2018
As a lower hanging fruit, we can also "run OAbot on Wikidata", which would basically mean importing the ids to publication items. @Tpt and I started making a distributed game for that but I think a lot of these could be fully automated. That's a good hackathon-style project if anybody is interested.
May 16 2018
When running software on localhost, the client needs to have OAuth consumer credentials, which are supposed to be private. If I apply for an OAuth consumer for OpenRefine, I cannot put the credentials in OpenRefine's source code, because it would allow anyone to reuse them for any other application. So every user would need to go through the OAuth registration themselves (and then OAuth login).
Note to self: for this we would need to rethink Wikidata authentication in OpenRefine, migrating it to OAuth. This would include adding OAuth support in Wikidata-Toolkit. This has not been done yet because OAuth is not suited for open source software that is run directly by the user on their own machine.
May 15 2018
As soon as this is supported by the Wikibase API, then it makes sense to build support for this directly in Wikidata-Toolkit. This is something that would be massively useful for many people.
May 7 2018
I won't work on this for the next 2 weeks; the floor is yours!
@Nemo_bis that's probably because the edits were cached and generated by an earlier version
May 5 2018
It would be fantastic to have more meaningful edit summaries with wbeditentity. It's of course hard to do this in general, but it would be great to have this for some common cases where a short summary seems doable (adding multiple statements with the same property, for instance).
Both properties mentioned above have been created in the meantime.
So this constraint could be useful, I think.
May 4 2018
I am glad I got the discussion going then: you now have one concrete example to look at (or maybe two? you did not comment on PMC). I think it is fair to say that this is not exactly an isolated case (but I am surprised that you seem (to pretend) not to know? Maybe for legal reasons?) How do you think these problematic uploads should be treated? If there is a drift between the practices of the community and the rules of the project, that problem should be solved.
The import was discussed at various places, including at the data import hub, the property talk page and my talk page.
I think there are plenty of examples of non-CC0 data being imported in Wikidata.
Apr 11 2018
Apr 9 2018
Mar 23 2018
Okay - I don't really know much about the MySQL ecosystem to be honest so I cannot really judge (I use postgres when I can). I haven't run into performance issues with pymysql yet - if you are worried about the impact of this maybe we can wait and see if my app (https://tools.wmflabs.org/editgroups/) scales fine in this state first.
Mar 22 2018
Hmmm… I am not sure I understand your reaction… Are you opposing this addition, then? Should I keep monkey-patching my libraries to use the pure python alternative? In that case, why is libmysqlclient-dev included in the Python 2 docker image in the first place?
Mar 21 2018
https://pypi.python.org/pypi/mysqlclient is the one recommended as MySQL backend for Django and is compatible with python3.
https://docs.djangoproject.com/en/2.0/ref/databases/#mysql-db-api-drivers
Mar 11 2018
@bd808 thanks a lot!!
Mar 7 2018
Feb 10 2018
This bug seems to be fairly new; it was probably introduced by a recent code change. I have just run into it, and it is definitely new behavior.
Jan 28 2018
Jan 26 2018
I think in general if we have a good pipeline to parse citations, a LOT of people would be interested in that, and yes it would be sad to just throw the results away…
It would make sense but I fear this might go against some enwp guidelines. Editors are free to choose between citation formats but citations should be uniform in a given page, so migrating just a few citations in a page is discouraged: http://enwp.org/WP:CITEVAR.
Link to bilbo: https://github.com/OpenEdition/bilbo
Online demo: http://bilbo.openeditionlab.org/