Investigate what data can be pulled from the Wikitext and HTML to simplify our API calls
Closed, Resolved | Public | 8 Estimated Story Points

Description

When we fetch the metadata for an article, we make one very large Action API call. To make our API requests leaner and faster, we want to investigate what we can extract from the Wikitext and HTML instead of requesting it through the Action API.

Acceptance criteria
Tickets with actionable items describing what we can extract, and how, have been created.

Things to consider

  • we can extract Templates
  • we can extract Categories
  • what other things can we extract?
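
As a rough illustration of the first two bullets, the sketch below pulls category names and template names straight out of raw wikitext using only Python's standard library. The function name `extract_from_wikitext` and the sample text are hypothetical, and the regular expressions are a deliberately crude approximation: they skip parser functions such as `{{#if:}}`, but they will also pick up magic words and cannot see categories that are added by templates. A real parser would likely be more reliable.

```python
import re

# [[Category:Name]] or [[Category:Name|sort key]]; namespace match is case-insensitive.
CATEGORY_RE = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]", re.IGNORECASE
)
# The name at the start of a template call: {{Name}} or {{Name|...
# Names starting with '#' (parser functions) are excluded on purpose.
TEMPLATE_RE = re.compile(r"\{\{\s*([^{}|#]+?)\s*(?:\||\}\})")


def extract_from_wikitext(wikitext):
    """Return category and template names found in a wikitext string."""
    categories = [m.group(1) for m in CATEGORY_RE.finditer(wikitext)]
    templates = [m.group(1) for m in TEMPLATE_RE.finditer(wikitext)]
    return {"categories": categories, "templates": templates}


# Hypothetical sample, not taken from a real article.
sample_wikitext = (
    "{{Infobox person|name=Ada}}\n"
    "Some text {{cn}}.\n"
    "[[Category:Mathematicians|Lovelace]]\n"
    "[[category: Programmers ]]"
)
print(extract_from_wikitext(sample_wikitext))
```

Anything this sketch misses (for example categories injected by templates) would still need the rendered HTML or an API call.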

Parsers to evaluate
https://gitlab.wikimedia.org/repos/research/html-dumps
https://github.com/earwig/mwparserfromhell
https://www.mediawiki.org/wiki/Alternative_parsers
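
For comparison when evaluating these parsers, here is a minimal standard-library baseline for the HTML side. It assumes the HTML is Parsoid output in which categories appear as `<link rel="mw:PageProp/Category" href="./Category:...">` and template calls are marked with `typeof="mw:Transclusion"` plus a JSON `data-mw` attribute; that is my reading of the Parsoid DOM conventions and should be checked against real page HTML. The class name `ParsoidMetadataParser`, the helper `extract_from_html`, and the sample markup are hypothetical.

```python
import json
from html.parser import HTMLParser
from urllib.parse import unquote


class ParsoidMetadataParser(HTMLParser):
    """Collects category and template names from (assumed) Parsoid HTML."""

    def __init__(self):
        super().__init__()
        self.categories = []
        self.templates = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").split()
        if tag == "link" and "mw:PageProp/Category" in rel:
            # href is assumed to look like "./Category:Name#Sort_key".
            href = (attrs.get("href") or "").split("#", 1)[0]
            title = unquote(href).split(":", 1)[-1]
            self.categories.append(title.replace("_", " "))

        typeof = (attrs.get("typeof") or "").split()
        if "mw:Transclusion" in typeof and attrs.get("data-mw"):
            try:
                data = json.loads(attrs["data-mw"])
            except ValueError:
                return  # Skip malformed data-mw rather than fail the whole page.
            for part in data.get("parts", []):
                # Parts can also be plain strings (wikitext between templates).
                if isinstance(part, dict) and "template" in part:
                    name = part["template"].get("target", {}).get("wt", "")
                    if name.strip():
                        self.templates.append(name.strip())


def extract_from_html(html):
    parser = ParsoidMetadataParser()
    parser.feed(html)
    parser.close()
    return {"categories": parser.categories, "templates": parser.templates}


# Hypothetical sample markup, not copied from a real page.
sample_html = (
    '<p><span typeof="mw:Transclusion" data-mw=\'{"parts":[{"template":'
    '{"target":{"wt":"Infobox person","href":"./Template:Infobox_person"},'
    '"params":{}}}]}\'>x</span></p>'
    '<link rel="mw:PageProp/Category" '
    'href="./Category:English_mathematicians#Lovelace"/>'
)
print(extract_from_html(sample_html))
```

Unlike a wikitext scan, this approach also sees categories that templates add, since they are present in the rendered HTML.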

Event Timeline

Protsack.stephan updated the task description.
AnnaMikla changed the task status from Open to In Progress. Oct 6 2022, 12:26 PM
AnnaMikla changed the task status from In Progress to Open. Oct 28 2022, 12:27 PM
AnnaMikla changed the task status from Open to In Progress.
Daria_Kevana changed the task status from In Progress to Open. Nov 6 2022, 11:17 PM