MediaWiki Action API design discussion: the amazing/good/bad/ugly
Closed, Resolved, Public

Description

@Anomie has mentioned that he doesn't get much feedback on the MediaWiki action API (the /w/api.php endpoint). This session is intended to discuss this API's design and gather feedback on it. API documentation needs may be discussed as well (common use cases, documentation gaps, discoverability, etc.).

Etherpad: https://etherpad.wikimedia.org/p/WikiDev16-ApiUsability

Fhocutt updated the task description.
Fhocutt raised the priority of this task from to Needs Triage.
Fhocutt added subscribers: Fhocutt, Anomie.
Restricted Application added subscribers: StudiesWorld, Aklapper. Jan 4 2016, 6:23 PM
Halfak added a subscriber: Halfak. Jan 4 2016, 7:19 PM

Oooh! I'm going to want to talk about ways that you can query the revision table efficiently that are currently not possible with the action API. I'll be there!

Could you link to the etherpad for this unconference? For those of us interested in parallel sessions.

Halfak updated the task description. Jan 4 2016, 7:43 PM
Halfak set Security to None.

excerpt from etherpad:

Action items with owners:
* Fhocutt: suggest API use-case categorization for hackathon
* !Brad: ask Brad/anomie to review code for API modules, and set aside time to deal with resulting comments. Add anomie as a reviewer on an API-related patch, and if he's not looking at it ping him via email/IRC.
* vague, no one is assigned to it: fix up API documentation. Make a list of pages that need fixing?

Wikimedia Developer Summit 2016 ended two weeks ago. This task is still open. If the session in this task took place, please make sure 1) that the session Etherpad notes are linked from this task, 2) that followup tasks for any actions identified have been created and linked from this task, 3) to change the status of this task to "resolved". If this session did not take place, change the task status to "declined". If this task itself has become a well-defined action which is not finished yet, drag and drop this task into the "Work continues after Summit" column on the project workboard. Thank you for your help!

This task about a WikiDevSummit 2016 session is still open, has no owner, and has no active projects associated.
Does anyone plan to create specific followup tasks if considered appropriate, before closing this task as resolved?

Anomie closed this task as Resolved. Oct 17 2016, 2:40 PM
Anomie claimed this task.

Notes from this session:

Session name: MediaWiki Action API design discussion: the amazing/good/bad/ugly
Meeting goal: Anomie has been working on the MediaWiki API; let's gather ideas
Meeting style: Problem-solving (problem discovery?): surveying many possible solutions
Phabricator task link: https://phabricator.wikimedia.org/T122818

Topics for discussion:
Use cases
Bots/tools/gadgets
historical primary use-case
need to query content & perform actions
action API geared towards information about lots of pages
Google: want to get clean Wikipedia data. They've written a wikitext parser (parses to structured data) and access templates from the API. Template contents still differ from what's visible on the HTML page: what the user sees is different from the template. Trying to clean templates to unify implementations. Similar to Wikidata's goal: human- and machine-readable data.
If you access infoboxes by template vs. HTML, even the number of infoboxes on the page is different.
Broader issue: language agnosticism. Action API for specific installation; RESTBase is a "Cassandra-backed persistent cache layer", with modules.
Pain points
What is the best way to query infobox information? ...can there be better ways?
one problem with infoboxes is that they are written by different people, with different inputs and outputs; Wikidata is one answer to standardise that
See also content format discussions https://phabricator.wikimedia.org/T119022
Discoverability of existing features
for example it is hard to understand what each API module will give back
Cirrus is another example, people might not be interested in that
automatically generated documentation: https://en.wikipedia.org/w/api.php
human-(un)maintained documentation https://www.mediawiki.org/wiki/API:Main_page
API sandbox https://en.wikipedia.org/wiki/Special:ApiSandbox
currently undergoing a rewrite by anomie
modules are hard to categorise and relate to each other (e.g. "if you are doing x on page see also module y")
Ctrl-F stopped working with the API redesign
all help in a single page https://en.wikipedia.org/w/api.php?action=help&recursivesubmodules=1 (!!!!)
The way the XML dumps, the database and the API represent deleted fields is different and poorly documented.
Related https://phabricator.wikimedia.org/T114019
Inconsistencies between API access and dumps (e.g. bitfields)
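
The bitfield mismatch noted above can be sketched roughly as follows. This is an illustration, not taken from the session: it assumes the database column rev_deleted packs four visibility flags into one integer (1, 2, 4, 8), and that the action API reports the same information as separate keys named texthidden, commenthidden, userhidden and suppressed. Both the constant values and the key names are recalled from MediaWiki core and should be verified before relying on them.

```python
# Assumed flag values for the rev_deleted bitfield (to be verified against MediaWiki core).
DELETED_TEXT = 1
DELETED_COMMENT = 2
DELETED_USER = 4
DELETED_RESTRICTED = 8

# Assumed names of the corresponding per-revision keys in action API output.
API_FLAG_NAMES = {
    DELETED_TEXT: "texthidden",
    DELETED_COMMENT: "commenthidden",
    DELETED_USER: "userhidden",
    DELETED_RESTRICTED: "suppressed",
}


def decode_rev_deleted(bitfield: int) -> dict:
    """Translate a database-style bitfield into API-style boolean flags."""
    return {name: bool(bitfield & flag) for flag, name in API_FLAG_NAMES.items()}


# A revision whose text and user are hidden, but whose comment is visible.
flags = decode_rev_deleted(DELETED_TEXT | DELETED_USER)
print(flags)
```

A consumer that mixes dump data, database replicas and API results would need a translation layer along these lines, which is the kind of undocumented detail the notes complain about.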
A lot of the "actions" aren't actually actions. action=query and action=edit make sense; action=flow doesn't help me flow something. "action" has become a top-level categorization
YES.
Following on from the point about best practices when writing API modules, this is an important part of the code review process (as well as clear documentation)
"action" is really which module to ask
Too many ways of doing similar but not identical tasks (e.g. fetching current page text)
part of the problem is fragmentation; often the solution is to ask somebody who has come across the same problem
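
To make the two points above concrete, here is a minimal sketch that builds three different requests which all return the current wikitext of a page. It only constructs URLs and sends nothing. The function name and the index.php base URL are assumptions for illustration; the parameter combinations (action=query with prop=revisions, action=parse with prop=wikitext, and action=raw on index.php) are ones believed to exist, but each returns the text wrapped in a different response shape.

```python
from urllib.parse import urlencode


def page_text_requests(title,
                       api_url="https://en.wikipedia.org/w/api.php",
                       index_url="https://en.wikipedia.org/w/index.php"):
    """Return three URLs that each fetch the current wikitext of `title`."""
    # "action" selects the module; the remaining parameters belong to that module.
    query_revisions = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "titles": title,
        "format": "json",
    }
    parse_wikitext = {
        "action": "parse",
        "page": title,
        "prop": "wikitext",
        "format": "json",
    }
    # Not part of api.php at all: the raw view served by index.php.
    raw_view = {"title": title, "action": "raw"}
    return {
        "query": f"{api_url}?{urlencode(query_revisions)}",
        "parse": f"{api_url}?{urlencode(parse_wikitext)}",
        "raw": f"{index_url}?{urlencode(raw_view)}",
    }


for name, url in page_text_requests("Main Page").items():
    print(name, url)
```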
Versioning: let's talk about it. Versioning modules. Brad: where possible, add a new parameter instead of versioning. Issues: complexity creep, how to balance?
Versioning could help substantially with addressing the inconsistencies between data (API/XML/Database/etc). Without versioning, we can't refactor without breaking things.
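
The "new parameter instead of versioning" approach can be sketched as below. Everything here is hypothetical: format_revision and the tsformat parameter are invented for illustration and are not part of the action API. The point is only that existing clients, which never send the new parameter, keep getting the old output, while each added option costs another branch (the complexity creep mentioned above).

```python
from datetime import datetime, timezone


def format_revision(rev_time: datetime, params: dict) -> dict:
    """Hypothetical module output; 'tsformat' is an invented opt-in parameter."""
    # Old clients never send 'tsformat', so they keep the original output shape.
    ts_format = params.get("tsformat", "legacy")
    if ts_format == "legacy":
        timestamp = rev_time.strftime("%Y%m%d%H%M%S")
    elif ts_format == "iso":
        timestamp = rev_time.strftime("%Y-%m-%dT%H:%M:%SZ")
    else:
        raise ValueError(f"unrecognized tsformat: {ts_format!r}")
    return {"timestamp": timestamp}


when = datetime(2016, 1, 4, 18, 23, 0, tzinfo=timezone.utc)
print(format_revision(when, {}))                   # existing behaviour, unchanged
print(format_revision(when, {"tsformat": "iso"}))  # opt-in new behaviour
```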

Design features
Querying revisions independent of page/user (SELECT * FROM revision WHERE rev_timestamp BETWEEN "2014" and "2015")
check out the allrevisions module (https://www.mediawiki.org/wiki/API:Allrevisions)
example of discoverability issues
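
As a rough sketch, the SQL-style query above might map onto the allrevisions module like this. The code only builds the request URL. It assumes that "2014" and "2015" mean the start of each year, that the list=allrevisions parameters are arvstart, arvend, arvdir, arvprop and arvlimit as remembered from the module documentation, and that the helper name allrevisions_url is free to invent.

```python
from urllib.parse import urlencode


def allrevisions_url(start, end,
                     api_url="https://en.wikipedia.org/w/api.php",
                     limit=50):
    """Build a list=allrevisions request for revisions between two timestamps."""
    params = {
        "action": "query",
        "list": "allrevisions",
        # With arvdir=newer the listing runs oldest first, so start must precede end.
        "arvdir": "newer",
        "arvstart": start,
        "arvend": end,
        "arvprop": "ids|timestamp|user",
        "arvlimit": limit,
        "format": "json",
    }
    return f"{api_url}?{urlencode(params)}"


print(allrevisions_url("2014-01-01T00:00:00Z", "2015-01-01T00:00:00Z"))
```

Results come back in pages, so a real client would also have to follow the API's continuation values until the range is exhausted; that loop is left out here.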
Useful: provide a link to the example queries in API Sandbox (in api.php module docs)
More caching:
Can caching work for sub-modules of the action API?
possible, but needs someone willing to work on it. anomie happy to review.
RESTBase being single-page-oriented is easier to cache/purge, action API not so much since it operates on many pages
Mobile views API module should work on more than one article at a time. (depends on the MobileFrontend extension)
Can we query the API via PHP in MediaWiki? Most queries/actions internally directly access the databases.
not ATM; going back and changing that is a huge amount of work to properly separate things
Would the team be interested in someone working on this with them? Yes! "I'd like to review that code." --anomie
Can standardize how we access data because there are some nuances in normalization/etc.
Standardization on this can provide common language
Unified way of accessing page properties
[discoverability] Grouping of actions--what goes together? E.g. Cirrus-related could go together so only people who care about it notice it
possible GCI/hackathon project; make a place for information to go, maybe on mw.org
Grouping of actions would deal with the action=flow issue (mentioned above), where that action is essentially a group of everything Flow

General notes
Is there a long-term plan for the action API? (Currently work is done ad-hoc)
https://www.mediawiki.org/wiki/Requests_for_comment/API_roadmap
https://www.mediawiki.org/wiki/API/Architecture_work/Planning
bd808's notion of pioneer/settler/city planner for code (http://blog.gardeviance.org/2015/03/on-pioneers-settlers-town-planners-and.html among others)
Is the purpose to avoid dealing with wikitext? No, not really--you can get HTML out of it, but also handle wikitext.
API in layers--wikitext, template, other information to allow user parsing?
Quarry (web interface for db queries) records queries, which can be a useful learning tool for newcomers. Replicate the same for API sandbox?
on the same theme, see also JupyterHub on Labs to control Pywikibot

Action items with owners:
Fhocutt: suggest API use-case categorization for hackathon
!Brad: ask Brad/anomie to review code for API modules, and set aside time to deal with resulting comments. Add anomie as a reviewer on an API-related patch, and if he's not looking at it ping him via email/IRC.
vague, no one is assigned to it: fix up API documentation. Make a list of pages that need fixing?

Conversations to have:

Attendees:
Aaron Halfaker
Filippo Giunchedi
Darian Fitzpatrick
Niklas Laxström
Jordan Adler (Google)
Bryan Davis
Zhicheng Zheng (Google)
Yanan Qian (Google)
Stas Malyshev
Frances Hocutt
Sam Smith
Joaquin Hernandez

Seb35 added a subscriber: Seb35. Nov 28 2016, 7:50 AM
Seb35 removed a subscriber: Seb35.