Endpoint to make it easier for apps to display talk pages.
The endpoint takes a talk page title (and an optional revision ID) and returns a structured JSON representation of the talk page, preserving only certain elements.
Payload is structured as `replies` within `topics`:
**Topics:**
{F29052233}
Topics correspond to sections:
- `id` comes from the section id
- `depth` corresponds to section depth - i.e. the H tag's number (or 1 if from root level)
- `text` comes from section H tag text
- `shas.text` is first 7 chars of a sha on `text`
- `shas.replies` is first 7 chars of a sha on a string which combines this topic's replies shas (easy way to know if any reply has changed)
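A minimal sketch of how the topic `shas` could be derived. The description above only says "a sha", so the use of SHA-256 here is an assumption, as is combining the reply shas by plain in-order concatenation. The helper names `short_sha` and `topic_shas` are hypothetical and not part of the endpoint.

```python
from __future__ import annotations

import hashlib


def short_sha(text: str) -> str:
    """Return the first 7 hex chars of a digest of `text`.

    Assumption: SHA-256 is used; the endpoint may use a different SHA variant.
    """
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:7]


def topic_shas(topic_text: str, reply_shas: list[str]) -> dict[str, str]:
    """Build the `shas` object for one topic.

    Assumption: the replies' shas are joined in order with no separator
    before hashing; the real combination scheme is not specified.
    """
    return {
        "text": short_sha(topic_text),
        "replies": short_sha("".join(reply_shas)),
    }
```

Under these assumptions, a client can compare a cached `shas.replies` against a fresh one to detect that some reply in the topic changed without diffing every reply.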
**Replies:**
{F29012742}
Replies correspond, as best as can be determined, to individual messages within a topic. The primary challenge is teasing out replies from one another:
- `text` comes from the message, the boundaries of which are determined by a combined heuristic of message `depth` and user signature
- `depth` corresponds to the level of indentation of the reply (as indicated by depth of nesting within depth-indicating tags, i.e. `DL`, `UL`, `OL`, or similar depth-indicating wiki markup, i.e. `:`)
- `sha` is first 7 chars of sha of `text`
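The `depth` part of the heuristic can be illustrated with a small sketch that counts nesting inside `DL`, `UL` and `OL` tags using Python's standard `html.parser`. This is only an illustration of the nesting idea: it does not attempt the signature detection, and the names `DepthTracker` and `text_depths` are hypothetical.

```python
from html.parser import HTMLParser

# Tags whose nesting level indicates reply indentation.
DEPTH_TAGS = {"dl", "ul", "ol"}


class DepthTracker(HTMLParser):
    """Record each non-empty text chunk with its current nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.chunks = []  # list of (depth, text) tuples

    def handle_starttag(self, tag, attrs):
        if tag in DEPTH_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in DEPTH_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append((self.depth, text))


def text_depths(html):
    """Return (depth, text) pairs for the given HTML fragment."""
    tracker = DepthTracker()
    tracker.feed(html)
    tracker.close()
    return tracker.chunks
```

A real implementation would still need to merge adjacent chunks into a single reply and split them at user signatures, which is the harder part described above.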
A subset of markup will be preserved - presently `B`, `I`, `A`, `UL`, `OL` and `LI`. Other tags' content is converted to plain text.
Certain tags will be converted to bold - presently `BIG`, `CODE` and `DT`.
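The two rules above could be sketched as a small tag filter built on Python's standard `html.parser`. The tag lists come from the description; everything else is an assumption made for illustration, including keeping only the `href` attribute on `A` tags and the hypothetical names `MarkupFilter` and `filter_markup`.

```python
from html import escape
from html.parser import HTMLParser

PRESERVED = {"b", "i", "a", "ul", "ol", "li"}  # kept as-is
TO_BOLD = {"big", "code", "dt"}                # rewritten to <b>


class MarkupFilter(HTMLParser):
    """Keep a small tag subset, bold a few others, and flatten the rest."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    @staticmethod
    def _map(tag):
        if tag in PRESERVED:
            return tag
        if tag in TO_BOLD:
            return "b"
        return None  # tag is dropped; its text content is still emitted

    def handle_starttag(self, tag, attrs):
        mapped = self._map(tag)
        if mapped is None:
            return
        # Assumption: only the link target is worth keeping on <a>.
        href = dict(attrs).get("href") if mapped == "a" else None
        if href:
            self.out.append('<a href="%s">' % escape(href, quote=True))
        else:
            self.out.append("<%s>" % mapped)

    def handle_endtag(self, tag):
        mapped = self._map(tag)
        if mapped is not None:
            self.out.append("</%s>" % mapped)

    def handle_data(self, data):
        # Re-escape text so the output stays valid HTML.
        self.out.append(escape(data, quote=False))


def filter_markup(html):
    """Return `html` reduced to the preserved subset of markup."""
    parser = MarkupFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

With this sketch, table and paragraph tags disappear while their text survives, which matches the "only preserving their text" behaviour shown for tables in the screenshots.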
-----
The left side of each image below shows an example from a complex user talk page.
The right side shows simple ajax output of the WIP endpoint's data (as defined above) for the same part of the page.
Note the indentation of both `topics`, as outlined in blue, and `replies`, as outlined in red, at their correct respective depths. Also make note of replies (red outline) being correctly distinguished from one another:
{F29013191}
Note the preservation of indentation inside a reply, as seen in two places inside the first (outlined in red) reply:
{F29013190}
Tables are obviously not going to look the same when we're only preserving their text, but, aside from the lack of pie, the gist of the message is discernible:
{F29013189}
Topics from H3 (or greater) are correctly set to the depth corresponding to the H tag - as seen on the second topic here (outlined in blue):
{F29013188}
`CODE` tag contents are converted to `B` so they stand out. The table is not great, but basically readable.
{F29013187}
The gist of this table is better preserved in text form:
{F29013186}
Both ordered and unordered lists in one topic! Oh my! (note the bug in the 3rd reply signature being pulled up one line... will fix!)
{F29013185}