
MediaWiki Action API design discussion: the amazing/good/bad/ugly
Closed, Resolved (Public)

Description

@Anomie has mentioned that he doesn't get much feedback on the MediaWiki action API (the /w/api.php endpoint). This session is intended to discuss this API's design and gather feedback on it. API documentation needs may be discussed as well (common use cases, documentation gaps, discoverability, etc.).

Etherpad: https://etherpad.wikimedia.org/p/WikiDev16-ApiUsability

Event Timeline

Fhocutt set the priority of this task to Needs Triage.
Fhocutt updated the task description.
Fhocutt added subscribers: Fhocutt, Anomie.

Oooh! I'm going to want to talk about ways that you can query the revision table efficiently that are currently not possible with the action API. I'll be there!

Could you link to the etherpad for this unconference? For those of us interested in parallel sessions.

Halfak set Security to None.

excerpt from etherpad:

Action items with owners:
* Fhocutt: suggest API use-case categorization for hackathon
* !Brad: ask Brad/anomie to review code for API modules, and set aside time to deal with resulting comments. Add anomie as a reviewer on an API-related patch, and if he's not looking at it ping him via email/IRC.
* vague, no one is assigned to it: fix up API documentation. Make a list of pages that need fixing?

Wikimedia Developer Summit 2016 ended two weeks ago. This task is still open. If the session in this task took place, please make sure 1) that the session Etherpad notes are linked from this task, 2) that followup tasks for any actions identified have been created and linked from this task, 3) to change the status of this task to "resolved". If this session did not take place, change the task status to "declined". If this task itself has become a well-defined action which is not finished yet, drag and drop this task into the "Work continues after Summit" column on the project workboard. Thank you for your help!

This task about a WikiDevSummit 2016 session is still open, has no owner, and has no active projects associated.
Does anyone plan to create specific followup tasks if considered appropriate, before closing this task as resolved?

Anomie claimed this task.

Notes from this session:

Session name: MediaWiki Action API design discussion: the amazing/good/bad/ugly
Meeting goal: Anomie has been working on the MediaWiki API; let's gather ideas
Meeting style: Problem-solving (problem discovery?): surveying many possible solutions
Phabricator task link: https://phabricator.wikimedia.org/T122818

Topics for discussion:

Use cases
* Bots/tools/gadgets
  * historical primary use case
  * need to query content & perform actions
  * action API geared towards information about lots of pages
* Google: they want to get clean Wikipedia data. They've written a wikitext parser (parsing to structured data) and access templates from the API, but template contents are still different from what's visible on the HTML page: what the user sees is different from the template. They are trying to clean up templates to unify implementations. Similar to Wikidata's goal: human- and machine-readable data.
  * If you access infoboxes by template vs. HTML, even the number of infoboxes on the page is different.
* Broader issue: language agnosticism. The action API is for a specific installation; RESTBase is a "Cassandra-backed persistent cache layer", with modules.

Pain points
* What is the best way to query infobox information? ...can there be better ways?
  * one problem with infoboxes is that they are written by different people, with different inputs and outputs; Wikidata is one answer to standardise that
  * See also content format discussions: https://phabricator.wikimedia.org/T119022
* Discoverability of existing features
  * for example, it is hard to understand what each API module will give back
  * Cirrus is another example; people might not be interested in that
  * automatically generated documentation: https://en.wikipedia.org/w/api.php
  * human-(un)maintained documentation: https://www.mediawiki.org/wiki/API:Main_page
  * API sandbox: https://en.wikipedia.org/wiki/Special:ApiSandbox
    * currently undergoing a rewrite by anomie
  * modules are hard to categorise and relate to each other (e.g. "if you are doing x on page see also module y")
  * Ctrl-F stopped working with the API redesign
    * all help in a single page: https://en.wikipedia.org/w/api.php?action=help&recursivesubmodules=1 (!!!!)
* The way the XML dumps, the database and the API represent deleted fields is different and poorly documented.
  * Related: https://phabricator.wikimedia.org/T114019
  * Inconsistencies between API access and dumps (e.g. bitfields)
* A lot of the "actions" aren't actually actions. action=query and action=edit make sense; action=flow doesn't help me "flow" something. "action" has become a top-level categorization.
  * YES.
  * Following on from the point about best practices when writing API modules, this is an important part of the code review process (as well as clear documentation)
  * "action" is really which module to ask
* Too many ways of doing similar but not identical tasks (e.g. fetching current page text)
  * part of the problem is fragmentation; often the solution is to ask somebody who has come across the same problem
* Versioning: let's talk about it. Versioning modules. Brad: where possible, add a new parameter instead of versioning. Issues: complexity creep, how to balance?
  * Versioning could help substantially with addressing the inconsistencies between data (API/XML/Database/etc). Without versioning, we can't refactor without breaking things.

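To make the "deleted fields" pain point concrete, here is a minimal sketch of how a database-style `rev_deleted` bitfield could be translated into the boolean flags that the action API attaches to a revision. It assumes the bit values used by MediaWiki core's revision-deletion constants (1 = text, 2 = comment, 4 = user, 8 = restricted) and the API flag names `texthidden`, `commenthidden`, `userhidden` and `suppressed`; the helper name `decode_rev_deleted` is hypothetical, and the mapping should be checked against the MediaWiki version in use.

```python
from typing import List

# Bit values assumed from MediaWiki core's revision-deletion constants.
DELETED_TEXT = 1
DELETED_COMMENT = 2
DELETED_USER = 4
DELETED_RESTRICTED = 8

# Assumed pairing of each bit with the flag name the action API reports.
_FLAG_NAMES = (
    (DELETED_TEXT, "texthidden"),
    (DELETED_COMMENT, "commenthidden"),
    (DELETED_USER, "userhidden"),
    (DELETED_RESTRICTED, "suppressed"),
)


def decode_rev_deleted(bitfield: int) -> List[str]:
    """Return the API-style flag names for every bit set in the bitfield."""
    return [name for bit, name in _FLAG_NAMES if bitfield & bit]
```

For example, a stored value of 5 (text and user bits set) would correspond to the `texthidden` and `userhidden` flags under these assumptions.
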
Design features
* Querying revisions independent of page/user (SELECT * FROM revision WHERE rev_timestamp BETWEEN "2014" and "2015")
  * check out the allrevisions module (https://www.mediawiki.org/wiki/API:Allrevisions)
  * example of discoverability issues
* Useful: provide a link to the example queries in API Sandbox (in api.php module docs)
* More caching:
  * Can caching work for sub-modules of the action API?
    * possible, but needs someone willing to work on it. anomie happy to review.
  * RESTBase being single-page-oriented is easier to cache/purge; the action API not so much, since it operates on many pages
* Mobile views API module should work on more than one article at a time. (depends on the MobileFrontend extension)
* Can we query the API via PHP in MediaWiki? Most queries/actions internally access the databases directly.
  * not at the moment; going back and changing that is a huge amount of work to properly separate things
  * Would the team be interested in someone working on this with them? Yes! "I'd like to review that code." --anomie
  * Can standardize how we access data because there are some nuances in normalization/etc.
  * Standardization on this can provide common language
* Unified way of accessing page properties
* [discoverability] Grouping of actions: what goes together? E.g. Cirrus-related could go together so only people who care about it notice it
  * possible GCI/hackathon project; make a place for information to go, maybe on mw.org
  * Grouping of actions would deal with the action=flow issue (mentioned above), where that action is essentially a group of everything Flow

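As an illustration of the allrevisions suggestion, the following sketch builds a request roughly equivalent to the SQL above. It only constructs the URL and does not send it. The endpoint is the English Wikipedia one linked earlier in these notes; the helper names and the choice of `arvprop` fields are assumptions for illustration, and a real client would also need to follow the `continue` values returned by the API to page through results.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def allrevisions_params(start, end, limit=50):
    """Parameters for listing revisions between two timestamps, oldest first."""
    return {
        "action": "query",
        "list": "allrevisions",
        "arvdir": "newer",    # oldest first, so arvstart is the earlier bound
        "arvstart": start,
        "arvend": end,
        "arvprop": "ids|timestamp|user",  # illustrative choice of fields
        "arvlimit": str(limit),
        "format": "json",
    }


def build_url(params):
    """Join the endpoint and the URL-encoded parameters."""
    return API_ENDPOINT + "?" + urlencode(params)


url = build_url(allrevisions_params("2014-01-01T00:00:00Z", "2015-01-01T00:00:00Z"))
```

Unlike the SQL query, this goes through the API's own limits and continuation, so covering a full year takes many requests.
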
General notes
* Is there a long-term plan for the action API? (Currently work is done ad hoc)
  * https://www.mediawiki.org/wiki/Requests_for_comment/API_roadmap
  * https://www.mediawiki.org/wiki/API/Architecture_work/Planning
  * bd808's notion of code pioneer/settler/city planner for code (http://blog.gardeviance.org/2015/03/on-pioneers-settlers-town-planners-and.html among others)
* Is the purpose to avoid dealing with wikitext? No, not really: you can get HTML out of it, but also handle wikitext.
* API in layers (wikitext, template, other information) to allow user parsing?
* Quarry (web interface for db queries) records queries, which can be a useful learning tool for newcomers. Replicate the same for the API sandbox?
  * on the same theme, see also jupyterhub on labs to control pywikibot

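The point that the API can hand back either HTML or wikitext can be sketched with `action=parse`, which takes a `prop` parameter selecting the output. The snippet below only builds the two URLs; the helper name `parse_url` and the `want_html` switch are hypothetical, and the endpoint is the English Wikipedia one linked earlier in these notes.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def parse_url(title, want_html=True):
    """Build an action=parse URL asking for rendered HTML or raw wikitext."""
    params = {
        "action": "parse",
        "page": title,
        "prop": "text" if want_html else "wikitext",  # "text" is the parsed HTML
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


html_url = parse_url("Earth")
wikitext_url = parse_url("Earth", want_html=False)
```

This is only one of the several ways to fetch page content that the notes mention, which is part of the fragmentation described under pain points.
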
Action items with owners:
* Fhocutt: suggest API use-case categorization for hackathon
* !Brad: ask Brad/anomie to review code for API modules, and set aside time to deal with resulting comments. Add anomie as a reviewer on an API-related patch, and if he's not looking at it ping him via email/IRC.
* vague, no one is assigned to it: fix up API documentation. Make a list of pages that need fixing?

Conversations to have:

Attendees:
* Aaron Halfaker
* Filippo Giunchedi
* Darian Fitzpatrick
* Niklas Laxström
* Jordan Adler (Google)
* Bryan Davis
* Zhicheng Zheng (Google)
* Yanan Qian (Google)
* Stas Malyshev
* Frances Hocutt
* Sam Smith
* Joaquin Hernandez

DON’T FORGET: When the meeting is over, copy any relevant notes (especially areas of agreement or disagreement, useful proposals, and action items) into the Phabricator task.

See https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit_2016/Session_checklist for more details.