Currently, ParseWiki does not support parsing MediaWiki tables ({| ... |}). This task aims to extend the parser to support table syntax and include tables in the generated parse tree.
Requirements:
- Detect and parse table syntax:
  - Table start {| and end |}
  - Row separator |-
  - Header cells !
  - Regular cells |
  - Attributes and classes (e.g., class="wikitable")
- Add unit tests for various table cases (simple, nested, with attributes, etc.).
- Ensure the code follows the ParseWiki project’s current structure and style.
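
As a starting point for discussion, the syntax elements listed above could be handled by a small line-based recursive parser. The following is a minimal sketch in Python, not ParseWiki's actual API: the names `Table`, `Cell`, `parse_table`, and `_split_cell` are hypothetical, and the node types would need to be adapted to the project's existing parse tree. It assumes the table starts on its own line, keeps attributes as raw strings, and does not handle captions (`|+`) or templates that span cells.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    header: bool      # True for "!" cells, False for "|" cells
    attrs: str        # raw attribute string, e.g. 'style="color:red"'
    content: list     # text fragments and nested Table objects


@dataclass
class Table:
    attrs: str = ""                            # raw attributes from the "{|" line
    rows: list = field(default_factory=list)   # each row is a list of Cell


def _split_cell(text):
    """Split 'attrs | content' on the first pipe (hypothetical helper).

    A pipe that follows '[[' or '{{' is assumed to belong to a link or
    template, so the whole text is treated as content in that case.
    """
    head, sep, tail = text.partition("|")
    if sep and "[[" not in head and "{{" not in head:
        return head.strip(), tail.strip()
    return "", text.strip()


def parse_table(lines, i=0):
    """Parse the table whose '{|' line is lines[i].

    Returns (Table, index of the first line after the closing '|}').
    """
    table = Table(attrs=lines[i].strip()[2:].strip())
    row = None  # current row; created lazily when the first cell appears
    i += 1
    while i < len(lines):
        line = lines[i].strip()
        if line.startswith("{|"):
            # Nested table: attach it to the most recent cell, if any.
            nested, i = parse_table(lines, i)
            if row:
                row[-1].content.append(nested)
            continue
        if line.startswith("|}"):
            return table, i + 1
        if line.startswith("|-"):
            row = None  # the next cell starts a new row
        elif line.startswith(("!", "|")):
            header = line[0] == "!"
            body = line[1:]
            if header:
                body = body.replace("!!", "||")  # "!!" separates inline headers
            if row is None:
                row = []
                table.rows.append(row)
            for part in body.split("||"):
                attrs, text = _split_cell(part)
                row.append(Cell(header, attrs, [text] if text else []))
        elif row and line:
            # Continuation line: extra content for the most recent cell.
            row[-1].content.append(line)
        i += 1
    raise ValueError("unterminated table: missing '|}'")


if __name__ == "__main__":
    sample = """{| class="wikitable"
! Name !! Value
|-
| style="color:red" | a || 1
|}""".splitlines()
    table, _ = parse_table(sample)
    print(table.attrs, len(table.rows))
```

The same inputs could double as unit-test cases for the simple, nested, and attribute scenarios. Checking `|}` and `|-` before the generic `|` branch matters, because all three begin with a pipe.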
Resources:
- MediaWiki table syntax: https://www.mediawiki.org/wiki/Help:Tables
- For reference: Parsoid’s handling of tables
Impact:
Supporting tables will significantly improve the accuracy and completeness of ParseWiki output, especially for use cases involving data extraction or wiki content analysis.