It seems that huge parts of pywikibot are devoted to handling the Wikibase data model, which should not be the case. More importantly, that code is not usable outside of pywikibot. I have already created pywikibot/wikibase, and it needs some work.
@jayvdb, It's a high priority because it's a direct dependency on work we want to do next.
It also seems that this might be considered a duplicate of T102741, but it is not a dependency. If T102741 is not addressed in the short term, we'll still need to make progress on this since we have a deadline to meet.
@Halfak, what work do you want to do next? What do you need? What is your deadline? I am struggling to understand what "pywikibot/wikibase" solves as opposed to "pywikibot (with wikibase functionality)". I suspect that you want something different from either of those.
T102741 is waiting on a Phab project being created, so perhaps you might nudge whoever can help with that.
A well designed and stable data model layer should be a pre-req for building a new client access layer.
And a new client access layer would look nothing like pywikibot's wikibase, which was mostly designed to fit within the existing wikitext-centric model of pywikibot; its design decisions mostly only make sense within a pywikibot framework.
@jayvdb, we need to build feature extractors for Wikidata in order to predict vandalism. @Ladsgroup's work is poised to make that feature extraction easier. We're not using pywikibot for a lot of reasons -- many of which have to do with its bloat and the poor fit between its API and the needs of revscoring's dependency injection framework.
FWIW, we already have the data model for Wikidata extracted and we've been iterating on its structure. See https://github.com/wikimedia/pywikibot-wikibase
I am strictly against splitting main parts of the framework out of the current project. On the bot owner's side, it is more difficult to combine a working bot; on the developer's side, it is more difficult to check the code when it is distributed across several parts. Moreover, the current and further development of Wikibase support happens inside the pywikibot/core framework, not inside pywikibot/wikibase.
@Xqt, maybe we could discuss this before you decline the task? I encourage you to consider https://en.wikipedia.org/wiki/Unix_philosophy and https://en.wikipedia.org/wiki/Coupling_(computer_programming).
On the bot owner's side, it is more difficult to combine a working bot
Why is that? pywikibot already has dependencies (e.g. mwoauth), so why not add another? Dependencies can be pinned to specific versions, so you can guarantee that the code doesn't change until the pinned version is bumped.
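To make the pinning point concrete, here is a sketch of how such a dependency declaration could look in a requirements file. The package name pywikibase appearing alongside mwoauth and all version numbers are illustrative assumptions, not actual published pins:

```
# requirements.txt sketch -- version numbers are illustrative
# assumptions, not pywikibot's actual pins
mwoauth>=0.2.4,<0.4    # compatible-range pin for an existing dependency
pywikibase==0.0.4      # exact pin: the API cannot change until we bump it
```

With an exact pin like `pywikibase==0.0.4`, pywikibot's behavior cannot be changed by an upstream release until a maintainer deliberately raises the pin.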
on the developer's side, it is more difficult to check the code when it is distributed across several parts
Again, I point to other dependencies. I don't think it scales to have the same small set of developers reviewing all changes. Would you like to help me review pull requests to mwoauth? Maybe not, because you don't think about the internal workings of MediaWiki's OAuth implementation (or maybe you do; I don't know). But I'll focus on making sure that we practice good (if not great!) software management in mwoauth so it remains a happy dependency. Why not do the same for pywikibase?
One more note. As it stands, we have pywikibase as a separate library that a bunch of things depend on (e.g. revscoring and ORES). We'll never depend on pywikibot because it's an awkward monolith and we only really want/need the functionality of pywikibase. If pywikibot implements a parallel version of pywikibase-like code, that duplicate functionality will have to be maintained in both places. In the best-case scenario, bugs will be fixed in both libraries and features will be implemented twice. In the more likely scenario, the two libraries will fall out of sync, work will simply be duplicated, and end-user programmers will be confused. I don't see this as a good outcome at all.
As far as I can see, splitting pywikibot into parts has never worked in the past. There are a few examples where this failed:
- spelling is unsupported for 10 years and archived now
- wiktionary is unsupported for 8 years and archived now
- commonsdelinker is unsupported for 5 years
- opencv is unsupported for 5 years and archived now
- pycolorname is unsupported for 5 years and archived now
- wikibase is unsupported for 1 year and never combined with the pywikibot code
- misc was never used and has been deleted months ago
- pywikibot 2.0 patch backporting from the master branch lagged for months and was dropped
- i18n is the only external library which is supported, thanks to automatic updates from translatewiki.net (twn)
- mwapi is supported and used quite widely
- mwxml is widely used, supported and vastly more performant than pywikibase's XML dump processing utility
- mwbase is currently used by ORES and revscoring, so long-term support can be expected; I'd recommend its adoption within pywikibot.
- mwparserfromhell has always been separate and is well supported