
Separating infoboxes and navboxes from article content
Open, Needs Triage, Public

Description

An overview of this topic was presented toward the end of a session at Wikimania 2015.

We now have a number of different proposals for revamping navboxes and infoboxes:

The goal of this task is to separate out infoboxes and navboxes from article content, and to provide fluent editors for them. This would allow mobile (for example) to more easily tweak the presentation of infoboxes, without having to parse the article content, recognize them, and pull them out. A desirable side-effect is to make it more routine to source data and label texts for infoboxes/navboxes from wikidata, so that they can be shared among wikis in different languages. An extensible design would allow a local wiki to inherit most of an infobox, but still override, delete, or add fields, to empower local editors and account for political sensitivities.
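
The inherit-and-override behaviour described above could be modelled as a simple layered merge. The sketch below is only an illustration in Python, not a proposed implementation: the function name `resolve_infobox`, the field names, and the convention that a local value of `None` deletes a field are all assumptions made for the example.

```python
from __future__ import annotations

from typing import Optional


def resolve_infobox(
    shared: dict[str, str],
    local: dict[str, Optional[str]],
) -> dict[str, str]:
    """Layer a wiki's local fields over shared (e.g. Wikidata-sourced) fields.

    Hypothetical convention: a local value of None deletes the shared field;
    any other local value overrides the shared value or adds a new field.
    """
    resolved = dict(shared)  # copy so the shared data is never mutated
    for field, value in local.items():
        if value is None:
            resolved.pop(field, None)  # local wiki suppresses this field
        else:
            resolved[field] = value  # local wiki overrides or adds
    return resolved


# Made-up example data: shared values plus one wiki's local adjustments.
shared = {"name": "Example City", "population": "100000", "mayor": "A. Person"}
local = {"population": "101500", "mayor": None, "nickname": "The Example"}

result = resolve_infobox(shared, local)
print(result)
```

Keeping the local layer as a small diff against the shared data, rather than a full copy, is what would let most of an infobox stay shared across languages while local editors keep the final say.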

A key component could be T107595: [RFC] Multi-Content Revisions, which has proposed a new content model for articles in core which would allow storage of infoboxes/navboxes separate from the article content.

SUMMIT PLAN

  • Leave with a unified plan for the future of infoboxes and navboxes, including buy-in from wikidata, the mediawiki-core team (for storage separate from article content), mobile (to replace their current display hacks), editing (to create the necessary tools), and analytics (to ensure that infoboxes remain easy to edit, but not significantly easier to vandalize).

This card tracks a proposal from the 2015 Community Wishlist Survey: https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey

This proposal received 49 support votes, and was ranked #16 out of 107 proposals. https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey/Wikidata#Make_it_easy_to_build_infoboxes_that_display_information_from_wikidata

Event Timeline

cscott raised the priority of this task from to Needs Triage. Sep 17 2015, 10:09 PM
cscott updated the task description.
cscott added subscribers: cscott, brion.

Great topic for discussion, @cscott! Do you think this is an exhaustive list of alternatives, or is this "the list"?

The fastest way to make a change will be if we have a finite list of viable alternatives, and then use the meeting as an opportunity to hash out our collective favorite. However, that's not realistic without a lot of upfront work to get everyone to agree on the composition of the list. Do you have an intermediate goal for narrowing down our options to a finite list to choose from?

cscott set Security to None.

These are all the alternatives I know of, and I talked to a lot of people about this at Wikimania. But there may be more; there are three-ish months from now until the summit for folks to add anything I missed.

I suspect the collective decision will be to proceed with a hybrid of the "best of" these various alternatives. For example, Capiunto is functional, but I think it wants tweaks to Wikidata, and we'd like it not to be embedded in the article content. And perhaps we can learn from Wikia's implementation something about the Wikidata content model required. So the goal would be not just to identify an already-existing winner, but to describe the remaining work to be done and then to marshal the resources needed to get the work done.

I'm of the opinion that this isn't an engineering problem, it's a community buy-in problem.

As you can see from the task description, we've got multiple implementations for Wikidata-backed infoboxes in the wild. There are working infoboxes on production wikis which are completely Wikidata-backed (e.g. https://fr.wikipedia.org/wiki/Mod%C3%A8le:Infobox_Fromage). Yet they've not seen widespread adoption. Why is this? Why are people not using them? What do people think is missing? What do we need to do next to support wider rollout?

I'd like to see any session around this be entirely focused on answering the above questions instead. The engineering solution can come later.

Well, I think there are significant engineering issues as well. But I agree that there is an opportunity to use the dev summit to get everyone on the same page.

cscott updated the task description.

> I'm of the opinion that this isn't an engineering problem, it's a community buy-in problem.
>
> As you can see from the task description, we've got multiple implementations for Wikidata-backed infoboxes in the wild. There are working infoboxes on production wikis which are completely Wikidata-backed (e.g. https://fr.wikipedia.org/wiki/Mod%C3%A8le:Infobox_Fromage). Yet they've not seen widespread adoption. Why is this? Why are people not using them? What do people think is missing? What do we need to do next to support wider rollout?
>
> I'd like to see any session around this be entirely focused on answering the above questions instead. The engineering solution can come later.

+1000 This!
Anything in this direction has the potential to cause a huge backlash. If this is done wrong, it can cause Wikidata adoption on the Wikipedias to be set back to 0 for a long time. This needs to be carefully considered from the product perspective first, and I need to be involved in this.
Dan: We need to have a discussion about this when you're in Berlin.

I believe the distinction between "engineering solutions" and "community use" is a false one. The point of this summit topic would be to learn from everyone (community, existing implementation experience, analytics, etc.) so that we can (a) come up with engineering solutions to problems with existing implementations, and (b) ensure that we build the Right Thing, the thing that the community wants and needs. We shouldn't exclude technical talk, any more than we should exclude community input. We need to consider both together.

But the summit is the wrong venue for editor community input.

I'll defer to @RobLa on this, since every significant topic proposed for the summit ought to incorporate community input. I believe the plan is to actively reach out to representatives from affected communities, although "template authors" was the specific example discussed.

I am in agreement with @Deskana here and with @Lydia_Pintscher's comment about potential backlash. If nothing else, the proposal should be retitled -- the "kill infoboxes and navboxes" phrase should go -- because that is not the intention.

Well, I hope we don't need to scrub the fun out of our session titles. Removing infoboxes from content while they "long live" (as separate items associated with the article, not commingled with the content) is indeed exactly what is proposed.

OK, a little braindump.

What people want; and by "people" I mean "experienced and inexperienced Wikipedians who edit in English and other languages":

  • Not to go to Wikidata to edit data. Even if Wikidata has nicely-designed forms and internationalization. Practically nobody I talk to about this wants to leave the home wiki. The nicely-designed form must appear in the home wiki, in one way or another, and of course it must be localizable and seamlessly integrated. Yes, it's a big thing, but everybody says it.
  • Not to lose control over the local design of the infoboxes.
  • Not to lose the ability to override the data locally. As just one example, at least some Wikipedians who write in Hebrew won't appreciate it if Wikidata doesn't show Jerusalem as the capital of Israel in the country infobox, at least in the Hebrew Wikipedia, and there are plenty of other examples. (Yes, in an ideal world infoboxes in all languages will show all POVs neutrally, and it will be backed by common linked data. I'm OK with getting to an ideal world in stages.)
  • As an important positive point, people definitely do want to invest as little time as possible in translating infoboxes when they translate articles. I am certainly biased on this point as the ContentTranslation PM, but in all honesty, this is one of ContentTranslation's most-requested features, and rightly so. For better or worse, it's not quite right for ContentTranslation to do it, whereas something like Wikidata, done correctly, sounds far more suitable.
cscott renamed this task from "A plan to kill infoboxes and navboxes from article content (long live infoboxes and navboxes)!" to "Separating infoboxes and navboxes from article content". Sep 21 2015, 10:01 PM
cscott updated the task description.

This task seems to be a mix of separating *boxes from article source (ie. they should not be stored in the main wikitext content but in a separate content chunk or entirely autogenerated or whatever) and separating them from rendered output (ie. they should not output raw HTML but some kind of structured data, which might then be turned into HTML but is also available directly to apps or alternative skins).

I think that's confusing as they are largely independent (separating in source would help with separation in output but is not a necessity) and have different hardships. Separating in source would be a huge workflow change, so that's more of a UX problem than an engineering problem (or rather, it should only become an engineering problem once the UX problem is solved). Also the case for it is weaker, IMO - most of the benefits are associated with the output-level separation, which should be an uncontroversial architectural change that would only affect template maintainers.
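
To make the output-level separation concrete, here is a minimal sketch in Python, written under stated assumptions. The structure of the `infobox_data` dictionary, the function names, and the example values are all hypothetical; nothing here reflects an existing MediaWiki API. The idea is only that a template first produces structured data, and HTML is just one consumer of it.

```python
import json
from html import escape


def infobox_data(title, fields):
    """Build the structured form of an infobox (hypothetical layout)."""
    return {
        "type": "infobox",
        "title": title,
        "fields": [{"label": label, "value": value} for label, value in fields],
    }


def render_html(box):
    """One possible consumer: turn the structured data into an HTML table."""
    rows = "".join(
        f"<tr><th>{escape(f['label'])}</th><td>{escape(f['value'])}</td></tr>"
        for f in box["fields"]
    )
    return (
        f'<table class="infobox"><caption>{escape(box["title"])}</caption>'
        f"{rows}</table>"
    )


# Made-up example values.
box = infobox_data("Example cheese", [("Milk", "Cow & goat"), ("Region", "Exampleland")])

# An app or alternative skin could consume the data directly as JSON...
as_json = json.dumps(box)

# ...while the default skin renders the same data to HTML.
as_html = render_html(box)
print(as_html)
```

In this arrangement only the code that builds `box` would need to change per template, which matches the point that output-level separation mainly affects template maintainers.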

See T114251 which possibly overlaps with this.

Congratulations! This is one of the 52 proposals that made it through the first deadline of the Wikimedia-Developer-Summit-2016 selection process. Please pay attention to the next one:

> By 6 Nov 2015, all Summit proposals must have active discussions and a Summit plan documented in the description. Proposals not reaching this critical mass can continue at their own path out of the Summit.

This is definitely an engineering problem, but one that needs research into how different projects handle their infoboxes, navboxes, and even related variants. The question is how it can be made to work across projects with different needs (and what those needs are), with or without wikidata, still easily editable...

Even within a single project I can see a lot of different ways these would be used just depending on the part of the project, but could it be standardised? How far could skins go with these? Would it behave like a ToC or thumb placed within the content, really much as before?

Generating infoboxes from wikidata is the subject of two different proposals in the 2015 Community Wishlist Survey.

Hey, @RobLa-WMF -- I noticed this is scheduled for an 80 minute slot starting 11:30AM on the first day of the summit -- opposite an open slot in the larger Robertson 1. What's the expected format of this session? Should I start by presenting an overview of the various options to set the stage?

@cscott, thanks for pinging me, and yes, your proposal to introduce and then open it up sounds perfect, thanks! Think "lightning talk" for your intro. 5 minutes is ideal, 10 minutes for something more complicated, everyone has a right to be annoyed if you take 15 minutes, and we cut off your microphone at 20 minutes. :-) That make sense?

@hoo Are you attending the summit? Can you talk a little bit about T114251?

@GWicke Can you talk a bit about Content Widgets?

@daniel Will you be there, can you talk about T107595?

> @hoo Are you attending the summit? Can you talk a little bit about T114251?

I planned to talk about it, but due to various communication problems that never went anywhere, so I pulled it out of the summit at some point. Nevertheless, I will still attend the summit.

I guess the RfC is going to go through the normal process then or maybe at the Hackathon or during Wikimania or something like that. Although I don't really know how much I will be able to work on that next year (from WMDE), I still hope to be able to work on that within 2016.

So one of my goals here is cross-fertilization. We've got a bunch of folks who have worked on various flavors of "better infobox". I'd like to make sure we get a chance to talk shop and say what we liked/didn't like about our various approaches, and see if we can't come up with a better mousetrap (or agree that someone's particular mousetrap is already the best).

@RobLa-WMF suggests that this is important from the perspective of the "Next Generation Content Loading and Routing" track as well, since we're proposing to decouple infoboxes from article content in various ways.

@hoo do you think you could pull together 2-5 slides for a "lightning talk" on your work? Maybe pull from your [Wikimedia talk https://wikimania2015.wikimedia.org/wiki/Submissions/Making_Infoboxes_easier_to_edit_and_maintain_with_the_help_of_Wikidata] and from T114251: [RFC] Magic Infobox implementation. Not a whole wikimania-length talk, just enough to outline your basic approach and get the conversation started.

Same request applies to @GWicke, @daniel and @Jdlrobson (and anyone else who has work on infoboxes they'd like to *briefly* present).

> @hoo do you think you could pull together 2-5 slides for a "lightning talk" on your work? Maybe pull from your [Wikimedia talk https://wikimania2015.wikimedia.org/wiki/Submissions/Making_Infoboxes_easier_to_edit_and_maintain_with_the_help_of_Wikidata] and from T114251: [RFC] Magic Infobox implementation. Not a whole wikimania-length talk, just enough to outline your basic approach and get the conversation started.

Will do.

I put together a few slides for an overview at the start of the session: https://docs.google.com/presentation/d/1pFUIoC0rQioUBYEs5DQPVJvi9WdCXFEUebUdlo1v0RE/edit?usp=sharing
Hopefully @hoo can describe his stuff at the end, and maybe if @daniel is present he can say a few words about how T107595: [RFC] Multi-Content Revisions might be relevant.

Etherpad of the discussion at MWDS 2016: https://etherpad.wikimedia.org/p/WikiDev16-T112987

Copy:

Session name: Separating infoboxes and navboxes from article content
Meeting goal: "unified plan for the future of infoboxes and navboxes (overly optimistic)"
Meeting style: Scope Narrowing
Phabricator task link: https://phabricator.wikimedia.org/T112987

Topics for discussion:
        * Should infoboxes be stored inline, or separately from the wikitext? (See also Multi-Content revisions RFC)
        * Daniel: migrating already existing infoboxes problematic
        * Implementations other than hoo's

General notes
Everybody knows navboxes and infoboxes. Used and abused.
There are several technical solutions, e.g. Wikia and WMF's Mobile apps. Hoo has another plan.

Wikia: New markup, xml-based.
Mobile: They turned it off because it was too easy to edit them.
Gabriel: "content widgets", syntax for direct queries from Wikidata


C. Scott's suggestion: 
    
    
Marius (hoo) - "Magic infoboxes" implementation:
    
    capiunto - small extension for Scribunto that helps make creating / formatting infoboxes easier (as they exist now)
    decouple infoboxes from page content:

    * separate schema definition

    * separate data definition (can be local and/or from wikidata)

    * lua modules for formatting (e.g. capiunto)
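
The three-part split noted above (schema, data, formatting) can be sketched roughly as follows. This is an illustration in Python only; the notes describe Lua modules for the formatting step, and the schema layout, function names, and sample values here are all assumptions, not hoo's actual design.

```python
# Hypothetical schema: which fields an infobox type has, and in what order.
SCHEMA = [
    {"key": "name", "label": "Name"},
    {"key": "country", "label": "Country"},
    {"key": "founded", "label": "Founded"},
]


def collect_values(schema, wikidata_values, local_values):
    """Pick a value per schema field, preferring local data over Wikidata."""
    values = {}
    for field in schema:
        key = field["key"]
        if key in local_values:
            values[key] = local_values[key]
        elif key in wikidata_values:
            values[key] = wikidata_values[key]
        # Fields with no data in either source are simply left out.
    return values


def format_rows(schema, values):
    """Formatting step (a Lua module in the notes; plain text rows here)."""
    return [
        f"{field['label']}: {values[field['key']]}"
        for field in schema
        if field["key"] in values
    ]


# Made-up data: one value is overridden locally, one key is not in the schema.
wikidata_values = {"name": "Example Cheese", "country": "France", "unused": "x"}
local_values = {"country": "Switzerland"}

rows = format_rows(SCHEMA, collect_values(SCHEMA, wikidata_values, local_values))
print(rows)
```

Because the schema drives both steps, data that the schema does not mention never reaches the output, and the formatter never needs to know where a value came from.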

CScott: this (consistent formatting) makes scraping easier

Daniel: parameters still would be wikitext

Trevor: note templates containing subtemplates or runs of multiple templates which open/continue/close a single transclusion
JamesF: E.g. https://en.wikipedia.org/wiki/Template:Infobox_ship_begin#Special_capabilities (contains 4 subtemplates including the flags)
marktraceur (EPL comment): I think ^ would present more problems than just the subtemplates, based on it being a "begin" template, and therefore probably an unbalanced DOM fragment
quiddity (EPL comment): or https://en.wikipedia.org/wiki/Template:Infobox_currency which uses {{native name}}, {{flag}}, and {{url}}.

DJ: how to handle converting units (e.g.)?

Daniel: there would be formatters for dates, units, etc., and Lua can be used to help deal with specific types of information (e.g. timezones)
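
As a rough illustration of what a unit formatter could look like, here is a small Python sketch. It is not taken from the session: the function name `format_distance`, the `system` parameter, and the one-decimal rounding are assumptions, and the notes suggest the real toolbox would be written in Lua.

```python
KM_PER_MILE = 1.609344  # exact definition of the international mile in km


def format_distance(kilometres: float, system: str = "metric") -> str:
    """Format a stored distance for display in the reader's unit system.

    The value is assumed to be stored once, in kilometres; only the
    presentation changes per wiki or per reader.
    """
    if system == "imperial":
        return f"{kilometres / KM_PER_MILE:.1f} mi"
    return f"{kilometres:.1f} km"


print(format_distance(10, "metric"))
print(format_distance(10, "imperial"))
```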

CScott: have toolbox in lua for most stuff

Trevor: Why even use DOM?

CScott: Because I haven't finished [..?..] yet.

Jiang (etherpad): many infoboxes are not as simple as key-value, e.g.:
    https://en.wikipedia.org/wiki/Michael_Jordan
    https://en.wikipedia.org/wiki/Carl_Lewis
    

TheDJ: infoboxes don't just contain simple key-value pairs. E.g.
https://en.wikipedia.org/wiki/Monterrey_Institute_of_Technology_and_Higher_Education 
https://en.wikipedia.org/wiki/McKail,_Western_Australia (a list of suburbs around a town)

Tim: Module:Infobox is used on 2.5 million pages in en.wikipedia

CScott - topics to discuss:

    * any implementation other than hoo's?

    * migration issue


not all data / properties are represented on wikidata, but there is still the option of having local data for such cases

Use the ~100 colors in Wikidata? http://tools.wmflabs.org/wikidata-todo/tree.html?q=1075&rp=279&method=list

CScott: One goal is to make data easier to read for machines, but another important goal is to make the data more accessible to humans, in particular in "small" wikis.

TheDJ: Could we do it similarly to Commons metadata migration?

Potential action items

    * "Dumb" migration, like commons metadata

    * Migration toolkit

    * Survey/classification of existing templates (survey quality and distribution)

    2.5 million+ infoboxes, but how many *types* of infoboxes?

    * https://en.wikipedia.org/wiki/Wikipedia:List_of_infoboxes

Wikimedia Developer Summit 2016 ended two weeks ago. This task is still open. If the session in this task took place, please make sure 1) that the session Etherpad notes are linked from this task, 2) that followup tasks for any actions identified have been created and linked from this task, 3) to change the status of this task to "resolved". If this session did not take place, change the task status to "declined". If this task itself has become a well-defined action which is not finished yet, drag and drop this task into the "Work continues after Summit" column on the project workboard. Thank you for your help!

IMPORTANT: If you are a community developer interested in working on this task: The Wikimedia Hackathon 2016 (Jerusalem, March 31 - April 3) focuses on #Community-Wishlist-Survey projects. There is some budget for sponsoring volunteer developers. THE DEADLINE TO REQUEST TRAVEL SPONSORSHIP IS TODAY, JANUARY 21. Exceptions can be made for developers focusing on Community Wishlist projects until the end of Sunday 24, but not beyond. If you or someone you know is interested, please REGISTER NOW.

There was some discussion at the DevSummit in Jan 2016 (see above). Afterwards this task went dormant.

T112987#1915536 offered potential action items; T112987#1915536 contains the DevSummit Etherpad notes and also lists potential action items.

@cscott: Is there general consensus on those action items? (T112987#1661396 implied that scope is unclear. T112987#1795019 stated this needs more research.) Who could break them down into subtasks? (I don't see any "blocked by"/subtasks in this task.)

Aklapper removed a subscriber: RobLa-WMF.

This task has been assigned to the same task owner for more than two years. Resetting task assignee due to inactivity, to decrease task cookie-licking and to get a slightly more realistic overview of plans. Please feel free to assign this task to yourself again if you still realistically work or plan to work on this task - it would be welcome!

For tips on how to manage individual work in Phabricator (noisy notifications, lists of tasks, etc.), see https://phabricator.wikimedia.org/T228575#6237124 for available options.
(For the record, two emails were sent to assignee addresses before resetting assignees. See T228575 for more info and for potential feedback. Thanks!)

Izno updated the task description.