
Export TOCData in the action API
Closed, Resolved, Public

Description

Using prop=sections returns the "legacy" sections array. But this has some historical artifacts (T319141) and doesn't include top-level extension data for the TOC.

We should add a new prop=tocdata which returns the JSON-serialized form of the TOCData, with the modernized property names, extension data, etc., so that we can eventually deprecate the sections property (and uses of SectionMetadata::toLegacyArray()).

$result_array['tocdata'] = JsonCodec::serializeOne( $p_result->getTOCData() );

would be the logical way to write this, but serializeOne is currently private in JsonCodec. What we need is a way to get the "array form" of a serializable object, recursively expanding its values as needed, so that we can then hand it to the Api framework to encode as JSON, XML, or any other output format.
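As a rough illustration of what "array form" means here, the following Python sketch recursively expands nested serializable objects into plain dicts, lists, and scalars that any encoder could consume. It is not the JsonCodec implementation: the `to_json_array()` method and the `Section` and `Toc` classes are hypothetical names invented for this example.

```python
from typing import Any


def to_array_form(value: Any) -> Any:
    """Recursively expand a value into plain dicts, lists, and scalars."""
    # Hypothetical convention: a serializable object exposes to_json_array().
    if hasattr(value, "to_json_array"):
        return to_array_form(value.to_json_array())
    if isinstance(value, dict):
        return {key: to_array_form(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_array_form(item) for item in value]
    # Scalars (str, int, float, bool, None) are already in array form.
    return value


class Section:
    """Hypothetical stand-in for one section's metadata."""

    def __init__(self, toc_level: int, number: str) -> None:
        self.toc_level = toc_level
        self.number = number

    def to_json_array(self) -> dict:
        return {"tocLevel": self.toc_level, "number": self.number}


class Toc:
    """Hypothetical stand-in for the whole table of contents."""

    def __init__(self, sections: list) -> None:
        self.sections = sections

    def to_json_array(self) -> dict:
        # The nested Section objects are left unexpanded on purpose;
        # to_array_form() takes care of the recursion.
        return {"sections": self.sections, "extensionData": {}}
```

The point of the sketch is that each object only describes its own level, while the single recursive helper produces the fully expanded structure that the output layer then encodes.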

Event Timeline

Change #1201328 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/core@master] Deprecate action=parse&prop=sections; introduce prop=tocdata as replacement

https://gerrit.wikimedia.org/r/1201328

Change #1201328 merged by jenkins-bot:

[mediawiki/core@master] ApiParse: Introduce prop=tocdata as replacement for prop=sections

https://gerrit.wikimedia.org/r/1201328

Change #1203159 had a related patch set uploaded (by Jforrester; author: C. Scott Ananian):

[mediawiki/core@REL1_45] ApiParse: Introduce prop=tocdata as replacement for prop=sections

https://gerrit.wikimedia.org/r/1203159

Change #1203163 had a related patch set uploaded (by Jforrester; author: Jforrester):

[mediawiki/core@REL1_44] ApiParse: Introduce prop=tocdata as replacement for prop=sections

https://gerrit.wikimedia.org/r/1203163

Change #1203164 had a related patch set uploaded (by Jforrester; author: Jforrester):

[mediawiki/core@REL1_43] ApiParse: Introduce prop=tocdata as replacement for prop=sections

https://gerrit.wikimedia.org/r/1203164

Change #1203163 merged by jenkins-bot:

[mediawiki/core@REL1_44] ApiParse: Introduce prop=tocdata as replacement for prop=sections

https://gerrit.wikimedia.org/r/1203163

Change #1203164 merged by jenkins-bot:

[mediawiki/core@REL1_43] ApiParse: Introduce prop=tocdata as replacement for prop=sections

https://gerrit.wikimedia.org/r/1203164

Change #1203159 merged by jenkins-bot:

[mediawiki/core@REL1_45] ApiParse: Introduce prop=tocdata as replacement for prop=sections

https://gerrit.wikimedia.org/r/1203159

MSantos triaged this task as Medium priority. Fri, Nov 21, 10:26 AM

You have a bug. In this API query, parsing a page with a whitespace-only section heading, the output from prop=sections correctly has line and anchor set to the empty string while the output from prop=tocdata incorrectly omits these properties entirely.

"sections": [
    {
        "toclevel": 1,
        "level": "2",
        "line": "",
        "number": "1",
        "index": "1",
        "fromtitle": "API",
        "byteoffset": 0,
        "anchor": "",
        "linkAnchor": ""
    }
],
"tocdata": {
    "sections": [
        {
            "tocLevel": 1,
            "hLevel": 2,
            "number": "1",
            "index": "1",
            "fromTitle": "API",
            "codepointOffset": 0
        }
    ],
    "extensionData": []
}

The linkAnchor also seems to be omitted in general, but that's defensible as a removal of redundancy in the new format since it's generally the same as anchor (unless % characters are involved).

Each field in the TOCData representation has a default value if it is not present; the default value for 'line' and 'anchor' is the empty string. The default value for 'linkAnchor' is 'anchor'. https://github.com/wikimedia/mediawiki-services-parsoid/blob/08d22d0c01e135ddc850fcbeb50def29904cfe4d/src/Core/SectionMetadata.php#L340

This output is correct, but this behavior needs to be documented: T410979: Document (and/or tweak) TOCData API representation.
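Until that documentation exists, a client can apply the defaults described above itself. The Python sketch below is a minimal example covering only the three fields named in this thread ('line', 'anchor', 'linkAnchor'); the helper name `with_defaults` is made up for illustration, and defaults for any other field are not assumed.

```python
from __future__ import annotations

from typing import Any


def with_defaults(section: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of one tocdata section with omitted fields filled in."""
    filled = dict(section)  # shallow copy; the input is not modified
    # 'line' and 'anchor' default to the empty string when absent.
    filled.setdefault("line", "")
    filled.setdefault("anchor", "")
    # 'linkAnchor' defaults to the (possibly defaulted) value of 'anchor'.
    filled.setdefault("linkAnchor", filled["anchor"])
    return filled


# The section from the example output above, which omits all three fields.
section = {
    "tocLevel": 1,
    "hLevel": 2,
    "number": "1",
    "index": "1",
    "fromTitle": "API",
    "codepointOffset": 0,
}
normalized = with_defaults(section)
```

For the whitespace-only heading above, this yields empty strings for all three fields, matching what prop=sections reports for line, anchor, and linkAnchor.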