Endpoint to make it easier for apps to display talk pages.
The endpoint takes a talk page title (and an optional revision ID) and returns a structured JSON representation of the talk page, preserving only certain elements.
Payload is structured as `replies` within `topics`:
**Topics:**
{F29470254}
Topics correspond to sections:
- `id` comes from the section id
- `depth` corresponds to section depth - i.e. the H tag's number (or 1 if from root level)
- `html` comes from section H tag contents
- `shas.html` is sha on `html` appended to `id`
- `shas.indicator` is sha on this topic's replies shas appended to this topic's `html` (easy way to know if any reply has changed)
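A rough sketch of how the two topic shas could be derived, to make the change-detection idea concrete. This is only one reading of the bullets above: the hash algorithm (SHA-256), the exact concatenation order, and the helper names `sha` and `topic_shas` are assumptions, not the endpoint's actual implementation.

```python
import hashlib


def sha(text):
    """Hex digest of the UTF-8 encoded text (SHA-256 is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def topic_shas(topic_id, html, reply_shas):
    """Build a hypothetical `shas` object for one topic.

    Assumed reading of the description:
    - `html`: hash of the topic's `html` appended to its `id`
    - `indicator`: hash of all reply shas appended to the topic's `html`
    """
    return {
        "html": sha(str(topic_id) + html),
        "indicator": sha(html + "".join(reply_shas)),
    }


before = topic_shas(1, "<b>Hello</b>", ["aaa", "bbb"])
after = topic_shas(1, "<b>Hello</b>", ["aaa", "ccc"])  # second reply changed
print(before["html"] == after["html"])            # heading is unchanged
print(before["indicator"] == after["indicator"])  # indicator differs
```

Under these assumptions a client only needs to compare `shas.indicator` to find out whether anything inside a topic changed, without diffing every reply.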
**Replies:**
{F29470282}
Replies correspond, as best as can be determined, to individual messages within a topic. The primary challenge is teasing out replies from one another:
- `html` is the body of the reply, the boundaries of which are determined by a combined heuristic of message `depth` and user "signature" detection (user and user talk page links and timestamps are considered for this heuristic)
- `depth` corresponds to the level of indentation of the reply (as indicated by depth of nesting within depth-indicating tags, i.e. `DL`, `UL`, `OL`, or similar depth-indicating wiki markup, i.e. `:`)
- `sha` is sha of this reply's index appended to its `html`
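The depth part of the heuristic can be illustrated with a small sketch that walks rendered HTML and records how deeply each text fragment is nested inside `DL`, `UL` or `OL`. It deliberately leaves out signature detection, and the names `DepthTracker` and `text_depths` are hypothetical; the real endpoint may compute depth quite differently.

```python
from html.parser import HTMLParser

# Tags treated as depth-indicating, per the description above.
DEPTH_TAGS = {"dl", "ul", "ol"}


class DepthTracker(HTMLParser):
    """Records (depth, text) for every non-empty text fragment."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.fragments = []

    def handle_starttag(self, tag, attrs):
        if tag in DEPTH_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in DEPTH_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.fragments.append((self.depth, text))


def text_depths(html):
    """Return a list of (depth, text) pairs in document order."""
    parser = DepthTracker()
    parser.feed(html)
    parser.close()
    return parser.fragments


sample = "<p>Hello</p><dl><dd>Reply<dl><dd>Nested</dd></dl></dd></dl>"
print(text_depths(sample))
```

A change in depth between neighbouring fragments is one signal that a new reply may have started; the description above combines it with signature detection before splitting replies.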
A subset of markup will be preserved - presently `B`, `I`, `A`, `UL`, `OL` and `LI`. Other tags' content is converted to plain text.
Certain tags will be converted to bold - presently `BIG`, `CODE` and `DT`.
Images are converted to links to the image.
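The markup rules above could be approximated like this. It is a minimal sketch, not the endpoint's code: the class and function names are hypothetical, and using the image `src` as both link target and link text is an assumption.

```python
from html import escape
from html.parser import HTMLParser

KEEP = {"b", "i", "a", "ul", "ol", "li"}  # tags preserved as-is
TO_BOLD = {"big", "code", "dt"}           # tags rewritten to <b>


class Simplifier(HTMLParser):
    """Rebuilds HTML, keeping only the subset of markup described above."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img":
            # Assumption: the image becomes a link whose text is its src.
            src = attrs.get("src") or ""
            self.out.append('<a href="%s">%s</a>' % (escape(src), escape(src)))
        elif tag == "a":
            href = attrs.get("href") or ""
            self.out.append('<a href="%s">' % escape(href))
        elif tag in KEEP:
            self.out.append("<%s>" % tag)
        elif tag in TO_BOLD:
            self.out.append("<b>")
        # Any other tag is dropped; its text still arrives via handle_data.

    def handle_endtag(self, tag):
        if tag in KEEP:
            self.out.append("</%s>" % tag)
        elif tag in TO_BOLD:
            self.out.append("</b>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def simplify(html):
    """Return html reduced to the preserved subset of tags."""
    parser = Simplifier()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(simplify("<p>See <code>x &lt; y</code> and <span>plain</span></p>"))
```

With this approach a table keeps only the text of its cells, which matches the table examples shown in the screenshots.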
-----
The left side of each image below shows an example from a complex user talk page.
The right side shows simple AJAX output of the WIP endpoint's data (as defined above) for the same part of the page.
Note the indentation of both `topics`, as outlined in blue, and `replies`, as outlined in red, at their correct respective depths. Also make note of replies (red outline) being correctly distinguished from one another:
{F29013191}
Note the preservation of indentation inside a reply, as seen in two places inside the first (outlined in red) reply:
{F29013190}
Tables are obviously not going to look the same when we're only preserving their text, but, aside from the lack of pie, the gist of the message is discernible:
{F29013189}
Topics from H3 (or greater) are correctly set to depth corresponding to the H tag - as seen on the second topic here (outlined in blue):
{F29013188}
`CODE` tag contents are converted to `B` so they stand out. The table is not great, but basically readable.
{F29013187}
The gist of this table is better preserved in text form:
{F29013186}
Both ordered and unordered lists in one topic! Oh my! (note the bug in the 3rd reply signature being pulled up one line... will fix!)
{F29013185}