
Attempt to convert wikitext talk pages into structured conversations.
Closed, Declined (Public)

Description

When Flow imports a wikitext talk page, it could attempt to parse the page into a set of topics (based on sections) and then split those further into comments (by signatures).

We can imagine many edge cases where this can run into issues, such as when user A makes a list and user B replies inline to a particular element of the list. We aren't sure how prevalent these are, though.

It seems plausible that a significant number of wikitext talk pages could be cleanly converted, especially less-trafficked talk pages that have only a couple of sections. There is already a place in the Flow conversion process to report this information; we just have not written anything to attempt this.

I'm thinking we should write something that works at a very basic level as a first attempt. It would read only the current version of the talk page, split it into sections, and then split each section by signatures into individual attributable comments. Some sort of score should be generated representing our confidence that the conversion is complete. If this produces reasonable results, with proper conversions on some pages, we can make it a little more robust by starting with the first edit to a talk page and stepping through the revisions, but this gets more complicated due to archiving and such.
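A minimal sketch of that first attempt, in Python, might look like the following. It is illustrative only and is not Flow code: the names `parse_talk_page`, `Topic`, and `Comment` are hypothetical, the signature pattern is a simplified assumption (a `[[User:...]]` link followed on the same line by a `HH:MM, D Month YYYY (UTC)` timestamp), and the score (the fraction of non-blank lines that end up inside a signed comment) is just one possible way to express confidence.

```python
import re
from dataclasses import dataclass, field

# A level-2 heading such as "== Topic title ==" on a line of its own.
HEADING_RE = re.compile(r"^==(?!=)[ \t]*(.+?)[ \t]*==[ \t]*$", re.MULTILINE)

# Simplified signature (an assumption): a [[User:Name...]] link followed,
# later on the same line, by a "HH:MM, D Month YYYY (UTC)" timestamp.
SIGNATURE_RE = re.compile(
    r"\[\[User:(?P<user>[^|\]/#]+)[^\]]*\]\].*?"
    r"(?P<ts>\d{2}:\d{2}, \d{1,2} \w+ \d{4} \(UTC\))"
)


@dataclass
class Comment:
    author: str
    timestamp: str
    text: str


@dataclass
class Topic:
    title: str
    comments: list = field(default_factory=list)


def parse_talk_page(wikitext):
    """Split wikitext into topics and signed comments.

    Returns (topics, confidence), where confidence is the fraction of
    non-blank lines that were attributed to a signed comment.
    """
    # With one capturing group, split() yields:
    # [text before first heading, title1, body1, title2, body2, ...]
    parts = HEADING_RE.split(wikitext)

    # Text before the first heading cannot be attributed to any topic.
    total_lines = sum(1 for line in parts[0].splitlines() if line.strip())
    attributed_lines = 0
    topics = []

    for title, body in zip(parts[1::2], parts[2::2]):
        topic = Topic(title=title)
        pending = []  # non-blank lines since the last signature
        for line in body.splitlines():
            if not line.strip():
                continue
            total_lines += 1
            pending.append(line)
            match = SIGNATURE_RE.search(line)
            if match:
                # A signature closes the current comment.
                topic.comments.append(
                    Comment(match["user"].strip(), match["ts"], "\n".join(pending))
                )
                attributed_lines += len(pending)
                pending = []
        # Lines left in `pending` are unsigned and lower the score.
        topics.append(topic)

    confidence = attributed_lines / total_lines if total_lines else 0.0
    return topics, confidence
```

This sketch deliberately ignores the hard cases mentioned above: an inline reply inside another user's list is merged into whichever signed comment follows it, and a line that mentions one user before another user's signature is attributed to the first user mentioned.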

The main goal is not to convert all wikitext talk pages to Flow; many are just too complex and need to be archived. The goal is to have a process that can attempt the conversion, and then to convert only pages whose confidence score is above some predetermined level.
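The selection step could be as simple as the sketch below. The function name `pages_to_convert`, the score range of 0.0 to 1.0, and the default threshold of 0.95 are all assumptions for illustration; the actual level would have to be chosen after looking at real conversion results.

```python
def pages_to_convert(scores, threshold=0.95):
    """Return page titles whose score is strictly above the threshold.

    `scores` maps a page title to a confidence score between 0.0 and 1.0.
    Titles are returned with the highest-scoring pages first.
    """
    eligible = [(title, score) for title, score in scores.items() if score > threshold]
    eligible.sort(key=lambda item: item[1], reverse=True)
    return [title for title, _ in eligible]
```

Pages that fall at or below the threshold would be left for archiving rather than converted.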

Event Timeline

EBernhardson raised the priority of this task from to Needs Triage.
EBernhardson updated the task description.
EBernhardson subscribed.
EBernhardson set Security to None.

It is time to promote Wikimedia-Hackathon-2015 activities in the program (training sessions and meetings) and main wiki page (hacking projects and other ongoing activities). Follow the instructions, please. If you have questions about this message, ask here.

Was there any attempt in Lyon? Will there be an attempt in Mexico City?

I don't think that anyone worked on this at the Lyon hackathon. It's a perfect idea for a hackathon project, and if anybody outside the Collab team wants to take it on for Mexico City, I'm sure the Collab developers would be happy to offer assistance.

Prior to the re-org, @EBernhardson was planning to work on this at the hackathon. I'm not sure if he did or not.

I didn't end up working on this at the hackathon. Most of my hackathon time was spent with the Discovery team's new hire, David. We poked around a variety of search-related things, but not this task.

Pppery subscribed.

Obsolete now that we have DiscussionTools and Flow is superseded.