
More metadata in Parsoid output
Closed, Resolved · Public

Description

It would be good to have access to more page-wide metadata in the Parsoid output so that the Mobile-Content-Service (MCS) doesn't have to make extra mobileview / MW API requests. In order of importance:

  • Page id
  • languagecount
  • protection
  • isMainpage
  • thumbnail image URL
  • Wikidata description

Event Timeline

bearND created this task. · Nov 21 2015, 12:14 AM
bearND set the priority of this task to Needs Triage.
bearND updated the task description.
bearND moved this task to Backlog on the Parsoid board.
bearND added a subscriber: bearND.
Restricted Application added subscribers: StudiesWorld, Aklapper. · Nov 21 2015, 12:14 AM
Arlolra triaged this task as High priority. · Jan 19 2016, 8:29 PM
Arlolra added a subscriber: Arlolra.
ssastry moved this task from Backlog to Next Up on the Parsoid board. · May 27 2016, 11:19 PM
ssastry added a subscriber: ssastry. · Edited · Aug 31 2016, 9:48 PM

The only other pieces of info Parsoid already fetches that we could add are the page id and whether the page is the main page.

protection and languagecount are not provided by the revisions endpoint, so we would have to make additional API calls to add them. As for the thumbnail image URL, what is that? Is it something the API provides, or is it an inferred / heuristic value set by looking at the page output?

ssastry moved this task from Next Up to In Progress on the Parsoid board. · Aug 31 2016, 9:58 PM

Change 307880 had a related patch set uploaded (by Subramanya Sastry):
WIP: T119265: Add more page-level metadata required by MCS

https://gerrit.wikimedia.org/r/307880

bearND added a comment. · Edited · Sep 12 2016, 2:50 PM

The thumbnail image URL is the information about the lead image for the page. It's provided by the mobileview action.

If you can't get all of the values, I'd say it's probably not worth it for Parsoid to make additional calls. Similarly for MCS, we may not want to change what we're doing if we still need to make additional requests.

ssastry claimed this task. · Dec 5 2016, 4:30 PM
bearND added a comment. · Dec 6 2016, 4:12 PM

We're also interested in Wikibase_item ID and isDisambiguation.

Change 307880 merged by jenkins-bot:
T119265: Add more page-level metadata that MCS can use

https://gerrit.wikimedia.org/r/307880

Mentioned in SAL (#wikimedia-operations) [2016-12-14T21:27:00Z] <arlolra> Parsoid updated to version 60ee19ac (T119265, T104523, T104662)

ssastry closed this task as Resolved. · Apr 24 2017, 9:58 PM

> We're also interested in Wikibase_item ID and isDisambiguation.

This information is also not available without making additional requests.

> If you can't get all of the values, I'd say it's probably not worth it for Parsoid to make additional calls. Similarly for MCS, we may not want to change what we're doing if we still need to make additional requests.

Based on this, I am going to close this ticket. Please reopen it, or create a fresh ticket, if anything changes here.
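
For readers wondering what the "additional requests" discussed above might look like, here is a minimal sketch that builds a single MediaWiki action API query for several of the fields Parsoid does not get from the revisions endpoint (protection, Wikibase item ID, disambiguation flag, thumbnail). It is not taken from the Parsoid or MCS code. The endpoint URL, the function name `build_metadata_query`, the default thumbnail width, and the choice of `prop=info|pageprops|pageimages` are all assumptions made for illustration; language count and the Wikidata description are not covered, and the `pageimages` thumbnail may only approximate the lead image that mobileview returns.

```python
from urllib.parse import urlencode

# Assumption: the action API endpoint of English Wikipedia; any wiki's api.php would do.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_metadata_query(title: str, thumb_width: int = 320) -> str:
    """Build the URL of one action=query request for extra page-level metadata.

    Hypothetical helper for illustration only; it performs no network access.
    """
    params = {
        "action": "query",
        "format": "json",
        "titles": title,
        # info -> protection, pageprops -> Wikibase item / disambiguation,
        # pageimages -> thumbnail (requires the PageImages extension).
        "prop": "info|pageprops|pageimages",
        "inprop": "protection",
        "ppprop": "wikibase_item|disambiguation",
        "piprop": "thumbnail",
        "pithumbsize": thumb_width,
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


url = build_metadata_query("Main Page")
print(url)
```

Even when batched into one request like this, it is still an extra round trip per page, which is the cost the comments above decided was not worth adding to Parsoid.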