
Expose real page title from Article in mwparserfromhtml
Closed, Resolved (Public)

Description

AFAICT, there is not currently a way to retrieve the actual page title from an Article instance.

article.py has a title property, but this returns the display title rather than the actual page title.
The two are often the same, but not always, so I would like to be able to retrieve the actual page title as well.
An example of such a page is https://en.wikipedia.org/wiki/Conditions_of_My_Parole, where the real page title is Conditions of My Parole, but the display title (which is what article.title returns) is <i>Conditions of My Parole</i>.
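
To illustrate the difference, here is a minimal sketch of the kind of workaround a caller would otherwise need: stripping markup from the display title. It uses only the Python standard library; strip_markup is a hypothetical helper, not part of mwparserfromhtml.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(display_title: str) -> str:
    """Hypothetical helper: drop tags from a display title, keep the text."""
    parser = _TextOnly()
    parser.feed(display_title)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)


display_title = "<i>Conditions of My Parole</i>"
plain_title = strip_markup(display_title)
```

This happens to recover the real title for the page above, but only because the two differ by markup alone. A display title can also differ from the page title in other ways (for instance, the case of the first letter), so stripping tags is not a general substitute for the real title.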

The real title *is* available in the raw dumps, but get_metadata explicitly excludes "name" from the metadata it exposes.
Simply making name available from article.metadata should resolve this task.
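
A rough sketch of the proposed change, under stated assumptions: get_metadata_sketch is a hypothetical stand-in and not the library's actual get_metadata, and the record below is a made-up dump line in which only the "name" key is taken from this task ("url" and "article_body" are assumed for illustration).

```python
import json

# Hypothetical dump line; only "name" is confirmed by this task.
line = json.dumps({
    "name": "Conditions of My Parole",
    "url": "https://en.wikipedia.org/wiki/Conditions_of_My_Parole",
    "article_body": {"html": "<html>...</html>"},
})


def get_metadata_sketch(record: dict, suppressed: frozenset) -> dict:
    """Return every top-level field of the record except the suppressed keys."""
    return {key: value for key, value in record.items() if key not in suppressed}


record = json.loads(line)

# Behaviour described in this task: "name" is withheld from the metadata.
before = get_metadata_sketch(record, frozenset({"article_body", "name"}))

# Proposed behaviour: stop suppressing "name" so the real title is exposed.
after = get_metadata_sketch(record, frozenset({"article_body"}))
```

With the second variant, a caller could read the real page title from the metadata (after["name"] in this sketch) and keep using the title property for the display title.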

Note: also filed on GitLab

Event Timeline

Hi @matthiasmullie, one question here! Is this work for Research (@Isaac), or is this something that can be done on the Structured Data end?

Isaac claimed this task.

Whoops - I can answer. Matthias kindly put together the patch and it's been released!