
Meeting: Developers of python libraries for MediaWiki
Closed, Resolved. Public.


There are a lot of Python libraries for doing work with MediaWiki. It turns out that there's at least 10 for interacting with the API! Let's meet up to talk about what's working and what's not. Let's also talk about opportunities where we could consolidate effort, share code, work together, etc.

Event Timeline

Halfak created this task. May 3 2015, 7:23 PM
Halfak raised the priority of this task from to Needs Triage.
Halfak updated the task description.
Halfak moved this task to Meeting proposals on the Wikimedia-Hackathon-2015 board.
Halfak added a subscriber: Halfak.
Restricted Application added a subscriber: Aklapper. May 3 2015, 7:23 PM
yuvipanda added a subscriber: yuvipanda.
Harej awarded a token. May 3 2015, 7:25 PM
Harej added a subscriber: Harej.
Legoktm renamed this task from Developers of python libraries for mediawiki to Developers of python libraries for MediaWiki meetup at 2015 Lyon hackathon. May 3 2015, 7:26 PM
Legoktm updated the task description.
Legoktm set Security to None.
Legoktm added a subscriber: Legoktm.
Ricordisamoa added a subscriber: Ricordisamoa.
hashar added a subscriber: hashar. May 6 2015, 9:50 AM
Qgil added a subscriber: Qgil. May 7 2015, 2:08 PM

Are you still planning to run this session? Would you prefer to have it scheduled in advance (i.e. to promote it in our circles)?

Halfak added a comment. May 7 2015, 2:22 PM

@Qgil yes. I'd like this session to take place. Is there something you feel that we have missed? Maybe there's a specific place that you think we should promote this but we haven't yet?

Interested mainly in regard to Flow. pywikibot support is in progress as a GSOC (T67119), would be glad to reach out to the other library developers if applicable.

Qgil added a comment. May 18 2015, 11:12 AM

It is time to promote Wikimedia-Hackathon-2015 activities in the program (training sessions and meetings) and main wiki page (hacking projects and other ongoing activities). Follow the instructions, please. If you have questions about this message, ask here.

This will be at 17:00 in the Croix Rousse room (right after the action API session).

Halfak renamed this task from Developers of python libraries for MediaWiki meetup at 2015 Lyon hackathon to Meeting: Developers of python libraries for MediaWiki. May 23 2015, 7:41 AM
Halfak updated the task description. May 23 2015, 3:11 PM
Halfak updated the task description.

This is actually at WP2

Notes pasted at P675

People: jayvdb, legoktm, yurik, Pierre Selim, Jean-Fred, Lyon Epitech students (Antoine and Dimitri), valhallasw, multichill, halfak, ladsgroup, とある白い猫, hashar, yuri, Yuvi
Title: Meeting: Developers of python libraries for MediaWiki

== Round of introductions ==
We all hate compat

Existing Python libraries:

API libraries
 See
 See
pywikibot-core -- BOT OPERATING SYSTEM

 -- API library and BOT OPERATING SYSTEM
mwapi -- Basic API library
mwclient -- Basic API library
wikitools -- API library that implements API Structure
 -- General collection of utilities for extracting MediaWiki data (research focus). API, DB, XML & utilities for extracting sessions, reverts and title parsing

 (thin python layer for api.php query parameters)

TL;DR: there are too many of them!

 -- General MediaWiki OAuth utility
 -- Flask MediaWiki OAuth routes

Machine learning / Artificial Intelligence
 -- Machine learning and feature extraction system that focuses on the revision
 (ORES) -- Restful web service
 -- Handcoding utility (Restful server & gadget)
 -- A machine learning system for predicting WP 1.0 assessment classes

Live systems support
 -- A generalized event datasource (reads DB, API, RCStream and [IRC])

Data extraction utilities
 (mwcites)
 -- Extracts <ref> tags from XML dumps (both current and historical)
 -- Extracts basic behavioral stats for new editors using the MediaWiki DB
 -- A hadoop stream processing framework for extracting information from XML dumps
 -- A content persistence extraction system that uses the MediaWiki API
 -- Parses WikiText into abstract syntax trees

== Discussion ==
* halfak: figure out what libraries are out there and try and work together
* yurik: pywikibot's Page object is a kitchen sink, Site should be used to interact with the API
* yurik: library should use requests, and do very simple things, separation of storage objects and "Site" object
* legoktm: pywikibot is heavy, it needs

We could publish pywikibot to pypi if we wanted to. Would need a stable release that we could maintain.
Yurik: asking about splitting pywikibot to a thin/lower level and the heavy one
Jay: yeah we are actually doing it
valhallasw: can do it already! You can do this already.
Jay: that is the Site object, named methods correspond to the API requests
Yurik: makes it hard to catch up with upstream changes
valhallasw: there is a network layer, a simple API layer, and then a layer on top of that (Site)
jay: ...explore api programmatically via action=paraminfo
hashar: lacking docs on how to use lower level parts of pywikibot, made it easier for me to write what I needed instead of reusing existing code.
halfak: smallest library we can possibly use
Jean-Fred: pywikibot was a hassle to install, for example in venvs (using tox) for testing; install should be made easier (pip install pywikibot ??)
jay: problem with the list of wiki families growing.
- one library that loads interwiki.cdb, builds interwiki matrix
- i18n is in its own package since it keeps being updated over and over (ex: namespaces, edit comments)
halfak: how so? that is a utility! not a library!! A pain in the ass is all the things meant for bots

ACTION: define the layers of pywikibot and where we draw the limit.

Segments of pywikibot
- interwiki.cdb (families)
- i18n data
- pywikibot (Recursive?)
- scripts
- API

valhallasw: you can use pip install right now, also by pointing it to the git repo
jay: some dependencies are just optional

legoktm: agreement to split the framework in well isolated libraries.
valhallasw: someone who wants to run a script would want to download a tarball that has all the dependencies included.
ToAruShiroiNeku: everything through pip, tarballs for people who want them - not necessarily with any programming knowledge (just to run scripts perhaps)
hashar: can use wheels maybe? :-D

yurik: one reference implementation per language to set standard for low level libraries, well known, commonly

* API Auto Documentation for the low layers of pywikibot
** Jay, Marteen, Antoine, Amir, legoktm
** AGREED to use Sphinx and .rst
** AGREED Publish it to
*** English up to date docs first, then look into how to maybe (??) localize them

* Second documentation work is to write specs/RFC/architecture/design documentation
* AGREED For now we agree that we won't do i18n on generated documentation which is geared toward devs
* AGREED Scripts we keep the (i18ned) documentation on
* ACTION: define user groups and their doc requirements

Expected outcome: easier for developers to reuse the code potentially as a library.

low level API library "mwapi"
* mwapi is a bad name - has to be python specific
** AGREED to bikeshed about renaming on mailing list later on. One potential proposal: pymwapi
* AGREED minimum dependencies e.g. "requests"
* AGREED no hardcoding of any API names (exceptions: login)
* AGREED Supports Login and Session (non-stored but accessible)
* AGREED no automatic badtoken handling (middleware should handle that). I.e. a badtoken should raise an exception and would not attempt to relogin automatically
* AGREED Supports all api value types, e.g. timestamps, list -> "str|str|str"
* AGREED Does not handle errors. Only reports them wrapped in an APIError
* AGREED Configurable retries for HTTP errors and exponential back-off? Left up to the user who constructs the session object.
* AGREED Query / continuation (uses new style continuation, does not handle errors during continuation)
* maxlag <-- Proposal?
* Discussion
** Can we repurpose the name "mwapi"? Who owns that utility? YUVI! Yay! -- use this one, it's empty, I created it (if you want of course)
** Can create another repo as needed.
** Where do we draw the limit between layers?
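
The AGREED points above could be sketched roughly as follows. This is only an illustration under assumptions, not the actual mwapi code: the names `Session`, `APIError`, `normalize` and the injected `transport` callable are all hypothetical. The transport is injected so the sketch runs without a network; in practice it would presumably wrap a `requests.Session`, which is also where login cookies and any retry policy would live.

```python
from datetime import datetime, timezone


class APIError(Exception):
    """Wraps an error reported by the API. The library reports, it does not handle."""

    def __init__(self, code, info):
        super().__init__(f"{code}: {info}")
        self.code = code
        self.info = info


def normalize(value):
    """Convert a Python value to the string form api.php expects."""
    if isinstance(value, datetime):
        # Timestamps are sent as ISO 8601 in UTC (aware datetimes assumed).
        return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    if isinstance(value, (list, tuple)):
        # Multi-value parameters: list -> "str|str|str"
        return "|".join(normalize(v) for v in value)
    return str(value)


class Session:
    def __init__(self, transport):
        # transport (hypothetical): callable taking a dict of string
        # parameters and returning the decoded JSON response as a dict.
        self.transport = transport

    def request(self, params):
        """Send one request; raise APIError if the API reports an error."""
        params = {key: normalize(value) for key, value in params.items()}
        params.setdefault("format", "json")
        doc = self.transport(params)
        if "error" in doc:
            error = doc["error"]
            # No automatic badtoken handling or relogin: just raise.
            raise APIError(error.get("code"), error.get("info"))
        return doc

    def query(self, params):
        """Yield each response of a query, following new-style continuation."""
        params = dict(params)
        params["action"] = "query"
        # An empty "continue" opts in to new-style continuation on
        # MediaWiki versions where it is not yet the default.
        params.setdefault("continue", "")
        while True:
            doc = self.request(params)
            yield doc
            if "continue" not in doc:
                return
            # Feed every continuation value back into the next request.
            params.update(doc["continue"])
```

Errors raised part-way through `query` simply propagate to the caller, matching the "does not handle errors during continuation" point.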

Middle level?
* Handles errors and stuff? I guess?
* OAUTH ???
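
One thing such a middle layer might do is the badtoken handling that the low-level library is only meant to raise: catch the error, fetch a fresh token and retry. A hedged sketch, where `APIError`, `with_token_retry`, `do_request` and `fetch_token` are hypothetical names and not any existing library's API:

```python
class APIError(Exception):
    """Stand-in for the error type a low-level library would raise."""

    def __init__(self, code, info):
        super().__init__(f"{code}: {info}")
        self.code = code
        self.info = info


def with_token_retry(do_request, fetch_token, params, max_retries=1):
    """Call do_request(params) with a token, refreshing it on badtoken.

    do_request and fetch_token are callables supplied by the caller;
    any error other than badtoken is re-raised untouched.
    """
    attempts = 0
    while True:
        try:
            return do_request(dict(params, token=fetch_token()))
        except APIError as error:
            if error.code != "badtoken" or attempts >= max_retries:
                raise
            attempts += 1
```

Keeping this outside the low-level library means callers who do not want silent retries never get them.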

other libraries?

Revscoring integration with pywikibot

We are all crazy.
+ 1
mais pas du tout mon bon monsieur! (not at all, my good sir!)
Oh mon dieu du francais =P (Oh my god, French =P)

Qgil closed this task as Resolved. May 24 2015, 8:22 PM
Qgil claimed this task.