User Story: Improve the parsing of lists in Structured Contents. The current parser struggles with nested lists, lists inside templates/infoboxes, and hybrid structures, leading to incomplete or inaccurate JSON output.
**To Do**
- Review existing list parser and document findings
- Refactor parser to correctly handle:
-- Unordered, ordered, and definition lists
-- Nested / hierarchical lists
-- Lists inside infoboxes
- Normalize list output into consistent JSON
- Validate against representative articles (some examples in PRD)
**Acceptance Criteria**
- Lists are represented as hierarchical JSON structures (no flattening)
- Lists inside infoboxes are included in structured output
- The bug with empty lists is fixed
- Known list-related parsing issues from prior feedback are resolved
**Background**
As a product team, we would like a feasibility investigation that evaluates the initial list feature and estimates the work needed to release lists in beta to Structured Contents (SC On-demand and SC Snapshots).
There is an initial list parser as part of the sections of the Structured Contents parser, but this wasn't part of the metrics framework and official beta release. What would it take to release it?
Investigation steps:
- Review the current list parser (discuss w/ Ruairi)
- Estimate level of effort: document known issues, the work needed, and how to QA
- Agree on follow-ups w/ product
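To make the "hierarchical JSON, no flattening" criterion concrete, here is a minimal sketch of one possible output shape. It is not the actual Structured Contents schema or parser: the names `ListTreeBuilder` and `parse_lists` and the fields `style`, `items`, `role`, `text`, and `children` are all hypothetical, and the sketch assumes well-formed HTML input with explicit closing tags. It keeps nested lists under their parent item, finds lists inside other markup such as an infobox table, and drops empty lists.

```python
import json
from html.parser import HTMLParser

# Hypothetical mapping from HTML list tags to a "style" field.
STYLES = {"ul": "unordered", "ol": "ordered", "dl": "definition"}
ITEM_TAGS = {"li", "dt", "dd"}


class ListTreeBuilder(HTMLParser):
    """Builds nested list nodes from HTML without flattening them."""

    def __init__(self):
        super().__init__()
        self.roots = []  # top-level lists found in the document
        self.stack = []  # currently open list and item nodes, innermost last

    def handle_starttag(self, tag, attrs):
        if tag in STYLES:
            self.stack.append({"type": "list", "style": STYLES[tag], "items": []})
        elif tag in ITEM_TAGS and self.stack and self.stack[-1]["type"] == "list":
            item = {"type": "item", "role": tag, "text": "", "children": []}
            self.stack[-1]["items"].append(item)
            self.stack.append(item)

    def handle_data(self, data):
        # Only text directly inside an open item is collected.
        if self.stack and self.stack[-1]["type"] == "item":
            self.stack[-1]["text"] += data

    def handle_endtag(self, tag):
        if not self.stack:
            return
        node = self.stack[-1]
        if tag in ITEM_TAGS and node["type"] == "item":
            node["text"] = " ".join(node["text"].split())  # normalize whitespace
            self.stack.pop()
        elif tag in STYLES and node["type"] == "list":
            self.stack.pop()
            if not node["items"]:
                return  # drop empty lists instead of emitting them
            parent = self.stack[-1] if self.stack else None
            if parent is not None and parent["type"] == "item":
                parent["children"].append(node)  # keep the hierarchy
            else:
                self.roots.append(node)


def parse_lists(html):
    """Return all top-level lists in `html` as nested dictionaries."""
    builder = ListTreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.roots


if __name__ == "__main__":
    sample = (
        "<table class='infobox'><tr><td>"
        "<ul><li>Fruit<ul><li>Apple</li><li>Pear</li></ul></li><li>Veg</li></ul>"
        "</td></tr></table><ol></ol>"
    )
    print(json.dumps(parse_lists(sample), indent=2))
```

Attaching nested lists to the item that contains them (rather than to the parent list) keeps the parent-child relationship explicit, which is one way to satisfy the "no flattening" criterion; the real schema may choose a different shape.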