Create a Phabricator project for Python library WikibaseDataModel
Closed, Declined | Public

Description

There should be a reusable, modern, and well-maintained Python package for Wikibase data types, data streams, etc.

@JeroenDeDauw has built https://pypi.python.org/pypi/WikibaseDataModel hosted at https://github.com/JeroenDeDauw/WikibaseDataModelPython , which is a good starting point, but needs active maintainers to bring it up to date and ensure it stays in sync with Wikibase capabilities.

Various code style and structure discussions are needed to gain consensus from potential stakeholders in this new Python library.

The new library will need to encompass the existing functionality available in other Python Wikibase components, or at least provide clean classes and APIs that other Python Wikibase components can reuse effectively.


Pywikibot has its own classes for the Wikibase data types, e.g. WbCoordinate at https://github.com/wikimedia/pywikibot-core/blob/master/pywikibot/__init__.py#L199 , with WbTime directly below it. There are other feature requests in Pywikibot-Wikidata.

Some other Python Wikibase components:

Here is another one:
https://github.com/asdil12/pywikibase/blob/master/things.py

Some custom property detection
https://github.com/frimelle/wikibase-stuff/blob/master/obsolete-property.py

datatype detection here:
https://github.com/mkroetzsch/wda/blob/1e5b1582344a5405609177a3965ca04e9707488b/includes/epTurtleFileWriter.py#L79

Some good stuff in here:
https://github.com/asciimoo/searx/blob/aac8d3a7bfdd77a5369e52a4ece99b20669a4625/searx/engines/wikidata.py

nasty parsing here:
https://github.com/gnowledge/gstudio/blob/mongokit/gnowsys-ndf/gnowsys_ndf/ndf/management/commands/iterative_script.py#L359

more
https://github.com/WikidataQuality/WikidataQuality/blob/master/external%20validation/wikidata/datatypes.py
https://github.com/Wikidata-lib/PropertySuggester-Python/blob/master/propertysuggester/parser/JsonReader.py
https://github.com/hay/chantek/blob/master/commands/wikidata/entity.py

In addition to the classes, the library needs to ship the reference data that the Wikibase datatypes depend on.

For example pywikibot includes the following in the wikidata family file:

def globes(self, code):
    """Supported globes for Coordinate datatype."""
    return {
        'ariel': 'http://www.wikidata.org/entity/Q3343',
        'callisto': 'http://www.wikidata.org/entity/Q3134',
        'ceres': 'http://www.wikidata.org/entity/Q596',
        'deimos': 'http://www.wikidata.org/entity/Q7548',
        'dione': 'http://www.wikidata.org/entity/Q15040',
        'earth': 'http://www.wikidata.org/entity/Q2',
        'enceladus': 'http://www.wikidata.org/entity/Q3303',
        'europa': 'http://www.wikidata.org/entity/Q3143',
        'ganymede': 'http://www.wikidata.org/entity/Q3169',
        'hyperion': 'http://www.wikidata.org/entity/Q15037',
        'iapetus': 'http://www.wikidata.org/entity/Q17958',
        'io': 'http://www.wikidata.org/entity/Q3123',
        'jupiter': 'http://www.wikidata.org/entity/Q319',
        'mars': 'http://www.wikidata.org/entity/Q111',
        'mercury': 'http://www.wikidata.org/entity/Q308',
        'mimas': 'http://www.wikidata.org/entity/Q15034',
        'miranda': 'http://www.wikidata.org/entity/Q3352',
        'moon': 'http://www.wikidata.org/entity/Q405',
        'oberon': 'http://www.wikidata.org/entity/Q3332',
        'phobos': 'http://www.wikidata.org/entity/Q7547',
        'phoebe': 'http://www.wikidata.org/entity/Q17975',
        'pluto': 'http://www.wikidata.org/entity/Q339',
        'rhea': 'http://www.wikidata.org/entity/Q108419',
        'tethys': 'http://www.wikidata.org/entity/Q15047',
        'titan': 'http://www.wikidata.org/entity/Q2565',
        'titania': 'http://www.wikidata.org/entity/Q3322',
        'triton': 'http://www.wikidata.org/entity/Q3359',
        'umbriel': 'http://www.wikidata.org/entity/Q3338',
        'venus': 'http://www.wikidata.org/entity/Q313',
        'vesta': 'http://www.wikidata.org/entity/Q3030',
    }

Note that these are http, not https. I can't find any task about Wikidata converting these URIs to HTTPS, and I don't expect it would or should. At least Q2 is baked into the underlying component, so we can rely on it.
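As a rough sketch of how a shared library could expose this data outside a Pywikibot family file: the names GLOBES, DEFAULT_GLOBE, globe_uri and globe_name below are hypothetical, and the table is only a subset of the mapping quoted above. Lookups are exact string matches, so an https variant of a URI is deliberately not recognised.

```python
# Hypothetical module-level table; a subset of the pywikibot mapping above.
GLOBES = {
    'earth': 'http://www.wikidata.org/entity/Q2',
    'mars': 'http://www.wikidata.org/entity/Q111',
    'moon': 'http://www.wikidata.org/entity/Q405',
}

# Assumption for this sketch: Earth is the fallback when no globe is given.
DEFAULT_GLOBE = 'earth'


def globe_uri(name=None):
    """Return the entity URI for a globe name, defaulting to Earth."""
    key = (name or DEFAULT_GLOBE).lower()
    try:
        return GLOBES[key]
    except KeyError:
        raise ValueError('unknown globe: %r' % name) from None


def globe_name(uri):
    """Return the globe name for an entity URI, or None if it is not listed."""
    for name, known_uri in GLOBES.items():
        if known_uri == uri:
            return name
    return None
```

Keeping the table as plain module data rather than a method would let Pywikibot and other consumers share one copy.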

https://github.com/DataValues/Geo/blob/master/src/Values/GlobeCoordinateValue.php#L37 - Q2 = Earth as a default only

https://github.com/DataValues/Time/blob/master/src/DataValues/TimeValue.php#L36 Gregorian vs Julian (used frequently)
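To make the two defaults concrete, here is a minimal sketch of what immutable value objects could look like in Python. The class names, the reduced field sets and the snake_case attribute names are assumptions made for illustration, not the API of any existing library. The calendar URIs are the ones I understand the linked TimeValue.php to use, so check them against that file.

```python
from dataclasses import dataclass

# Entity URIs; note http, not https.
GLOBE_EARTH = 'http://www.wikidata.org/entity/Q2'
CALENDAR_GREGORIAN = 'http://www.wikidata.org/entity/Q1985727'
CALENDAR_JULIAN = 'http://www.wikidata.org/entity/Q1985786'


@dataclass(frozen=True)
class GlobeCoordinate:
    """Hypothetical coordinate value; Earth is only a default."""
    latitude: float
    longitude: float
    precision: float
    globe: str = GLOBE_EARTH


@dataclass(frozen=True)
class TimeValue:
    """Hypothetical time value with a reduced set of fields."""
    time: str        # e.g. '+2015-08-23T00:00:00Z'
    precision: int   # Wikibase precision code; 11 means day
    calendar_model: str = CALENDAR_GREGORIAN


# Usage: defaults apply unless the caller overrides them.
berlin = GlobeCoordinate(latitude=52.5, longitude=13.4, precision=0.1)
old_style_date = TimeValue('+1582-10-04T00:00:00Z', 11, CALENDAR_JULIAN)
```

Frozen dataclasses give equality and hashing for free, which suits value objects, but they require Python 3.7 or later.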

Event Timeline

jayvdb raised the priority of this task from to High.
jayvdb updated the task description.
jayvdb added subscribers: jayvdb, hashar, JeroenDeDauw and 9 others.

I support the development of a compliant Python implementation of DataValues and am willing to help integrate it with Pywikibot.
I'm unable to put much effort into it though.

Presumably unsurprisingly, I support the creation of such libraries :) My position has not changed much since the linked mail from 2013: I am still willing to help but will not drive it on my own.

@JeroenDeDauw, are you happy for your library to be moved into Gerrit to form the basis of the new library?

Bikeshedding names: it looks like https://pypi.python.org/pypi/datavalues and https://pypi.python.org/pypi/wikibase are available.

The question is whether we want to split the datavalues part from the item/property/etc. part.

I'm fine with moving that to Gerrit. If that is indeed helpful... I did not know Python very well two years ago. Personally I prefer GitHub for hosting the source, however if you want Gerrit, by all means go for it.

Why not use the existing WikibaseDataModel name? Just "wikibase" is too broad. I think it's fine to initially put the datavalues stuff at the same place.

@Qgil, Can we set up this new project with the code managed by Diffusion?

Not sure what "code managed by Diffusion" means, but we currently won't have Diffusion-only projects as we'd like to avoid a divide in tools (some stuff in Gerrit but some stuff in Diffusion only). But projects can be imported/mirrored into Diffusion (@demon should know more).

> @Qgil, Can we set up this new project with the code managed by Diffusion?

-1 to Differential if that's what you mean.

> Not sure what "code managed by Diffusion" means, but we currently won't have Diffusion-only projects as we'd like to avoid a divide in tools (some stuff in Gerrit but some stuff in Diffusion only). But projects can be imported/mirrored into Diffusion (@demon should know more).

@Aklapper, some projects use GitHub instead of Gerrit, so I don't see why we can't have some projects make the leap to a Phab-managed / Arcanist code repo. This would be a good candidate for a proof of concept, as it is a small project that will need an intense period of design and development (a few months). After that it will only need an occasional patch when features are added to Wikibase, or an occasional feature added to the existing data types. Those changes will still need a lot of code review and discussion to confirm there is broad consensus that the feature should be in this central data model library.

> @Qgil, Can we set up this new project with the code managed by Diffusion?
>
> -1 to Differential if that's what you mean.

90% of the code is already written, though of course more unit tests will be needed. 90% of this project is discussion to establish a scope, style and structure that we all find acceptable.

I have no objections to you guys using Differential

My concern with starting this project in Wikimedia Gerrit is that it is going to be shut down soon and not migrated. I want to keep all our code review discussions. So if Phab can't be the code manager, maybe it is better for it to stay on GitHub, or maybe use http://gerrithub.io/ and hope they don't close their doors without providing an export capability.

> My concern with starting this project in Wikimedia Gerrit is that it is going to be shut down soon and not migrated. I want to keep all our code review discussions. So if Phab can't be the code manager, maybe it is better for it to stay on GitHub, or maybe use http://gerrithub.io/ and hope they don't close their doors without providing an export capability.

Soon? I don't see anyone using Differential.
Gerrit is stable and under our own control, unlike GitHub/GerritHub.

> Gerrit is stable and under our own control, unlike GitHub/GerritHub.

That is very relative. I have more control over GitHub repos I create than over any of "my" Gerrit repos. Furthermore, in the case of GitHub I do not have to worry about plans to migrate to some totally different tool, as is the case for Gerrit... so much for stability.

@JeroenDeDauw do data values support human-readable diffs?

Human readable diffing is a presentation concern, which is not something you want to put in value objects.

> Human readable diffing is a presentation concern, which is not something you want to put in value objects.

I meant returning differences between value objects in a structured format that can later be converted to HTML etc.

The Python objects I created have no such functionality, but then they are very incomplete to begin with. The PHP ones have a toArray method, so you could do an array_diff between those, though that is a rather crude approach.

I suspect it is better to not have diffing functionality in such objects. However, if that is what whoever starts working on this wants to do, then by all means go ahead, and we can see if any real problems occur.
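For illustration, a structured diff can live outside the value objects and operate on their serialised form, in the spirit of the toArray idea. This is a minimal sketch, assuming each value can be turned into a flat dict. The function name diff_values and the MISSING sentinel are hypothetical, and unlike PHP's array_diff this version reports both sides of every changed field.

```python
# Sentinel marking a field that is absent on one side (hypothetical name).
MISSING = object()


def diff_values(old, new):
    """Return {field: (old_value, new_value)} for every field that differs.

    Both arguments are flat dicts, e.g. the serialised form of two value
    objects. Fields present on only one side are paired with MISSING.
    """
    changes = {}
    for key in sorted(set(old) | set(new)):
        before = old.get(key, MISSING)
        after = new.get(key, MISSING)
        if before != after:
            changes[key] = (before, after)
    return changes


# Usage: two serialised coordinates that differ only in longitude.
old = {'latitude': 52.5, 'longitude': 13.4,
       'globe': 'http://www.wikidata.org/entity/Q2'}
new = {'latitude': 52.5, 'longitude': 13.5,
       'globe': 'http://www.wikidata.org/entity/Q2'}
changes = diff_values(old, new)  # {'longitude': (13.4, 13.5)}
```

The result is plain data, so turning it into HTML or any other presentation stays a separate concern.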

I'm not sure what's exactly requested here when it comes to creating new projects in Phabricator so I'm removing the Project-Admins tag for the time being. If there is consensus and a clear request, feel free to re-add.

jayvdb renamed this task from Python library for Wikibase data values to Create a Phabricator project for Python library WikibaseDataModel. Aug 23 2015, 3:41 AM
jayvdb updated the task description.
jayvdb added a project: Project-Admins.

Regarding the "create a Phabricator project" part here, https://www.mediawiki.org/wiki/Extension:Wikibase_DataModel lists @JeroenDeDauw as maintainer, lists https://github.com/wmde/WikibaseDataModel/ as the project page, and seems to use https://github.com/wmde/WikibaseDataModel/issues as its current issue tracker.

I'd avoid having two separate issue trackers for one codebase if the maintainer(s) do not explicitly agree on the request.

> Regarding the "create a Phabricator project" part here, https://www.mediawiki.org/wiki/Extension:Wikibase_DataModel lists @JeroenDeDauw as maintainer, lists https://github.com/wmde/WikibaseDataModel/ as the project page, and seems to use https://github.com/wmde/WikibaseDataModel/issues as its current issue tracker.
>
> I'd avoid having two separate issue trackers for one codebase if the maintainer(s) do not explicitly agree on the request.

This request is for the Python version of it, which is (essentially) unmaintained:

https://github.com/JeroenDeDauw/WikibaseDataModelPython/issues

> This request is for the Python version of it, which is (essentially) unmaintained:
> https://github.com/JeroenDeDauw/WikibaseDataModelPython/issues

Would like to hear a statement from @JeroenDeDauw about the maintenance status before creating this.

@Aklapper all fine from my side - I'd like for someone to create such a library and the unfinished experimental code I wrote some years back can be ignored.

Is this still a request to create a project? If so, please add the required data (name, type, description), otherwise please remove the Project-Admins tag. Thanks :)

Luke081515 lowered the priority of this task from High to Low. May 3 2016, 9:51 PM
Danny_B changed the task status from Open to Stalled. May 11 2016, 4:07 AM
Danny_B subscribed.

Unknown current status of the request... -> Stalled

No reply to T102741#2057094 hence declining. Please reopen if still wanted and provide needed information.