- Determine which table format is used
- Extract information for a specific table format
|Resolved||None||T103701 [WLM] Epic: Possibility for direct image upload for (German) WLM via monument lists in Wikipedia with a predefined category|
|Declined||Tobi_WMDE_SW||T105280 [WLM] Write bot to update lists|
|Declined||None||T105881 [WLM] Update-bot: parsing the table format|
After running an analysis bot that returned over 200 possible table formats, I'd recommend splitting the functionality of parsing tables and converting them into agreed-upon templates (one for each county) into a separate bot. That bot could also create lists of tables that can't be parsed. It is probably out of scope for this sprint and maybe out of scope for WLM 2015.
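To make the idea of a separate parsing bot more concrete, here is a minimal sketch of how tables could be grouped by their header row, with everything that does not match a known format collected into an "unparseable" list. The function names `header_signature` and `group_by_format`, the `known_formats` mapping, and the sample headers are all hypothetical; the header parsing is deliberately naive and would, for example, break on piped links inside header cells.

```python
def header_signature(table_wikitext):
    """Return a tuple of normalized header cell texts for one wikitext table."""
    headers = []
    for line in table_wikitext.splitlines():
        line = line.strip()
        if line.startswith("!"):
            # Header cells that share a line are separated by "!!".
            for cell in line.lstrip("!").split("!!"):
                # Naively drop cell attributes such as 'style="..." | Text'.
                text = cell.split("|")[-1]
                headers.append(text.strip().lower())
        elif line.startswith("|-") and headers:
            # The first row separator after the header cells ends the header.
            break
    return tuple(headers)


def group_by_format(tables, known_formats):
    """Split (page, wikitext) pairs into parseable and unparseable tables.

    known_formats maps a header signature to a format name (an assumption
    of this sketch, not an existing configuration).
    """
    parsed, unparsed = {}, []
    for page, wikitext in tables:
        signature = header_signature(wikitext)
        name = known_formats.get(signature)
        if name is None:
            unparsed.append((page, signature))
        else:
            parsed.setdefault(name, []).append(page)
    return parsed, unparsed
```

The `unparsed` list is what such a bot could publish for manual review, while each entry in `parsed` would be handed to a format-specific extractor.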
The column headings used for the monuments' unique IDs vary greatly. I've thrown together a regex that matches all the different "id-like" column names:
However, the ID is sometimes not unique within a page, and sometimes fields have to be combined, so this is definitely a nontrivial task.
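As a rough illustration of the checks involved, the sketch below finds an ID-like column, reports IDs that occur more than once on a page, and builds a combined key from several fields. The pattern `ID_LIKE` is a hypothetical stand-in and is not the regex mentioned above; the function names and sample column headings are likewise assumptions.

```python
import re
from collections import Counter

# Hypothetical pattern for illustration only, NOT the regex from this task.
ID_LIKE = re.compile(r"^(objekt|denkmal)?[\s\-]*(id|nr\.?|nummer)$", re.IGNORECASE)


def find_id_column(headers):
    """Return the index of the first ID-like column heading, or None."""
    for index, heading in enumerate(headers):
        if ID_LIKE.match(heading.strip()):
            return index
    return None


def duplicate_ids(rows, id_col):
    """Return the sorted ID values that occur more than once in the rows."""
    counts = Counter(row[id_col].strip() for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


def combined_key(row, cols):
    """Join several fields into one key when a single column is not unique."""
    return "-".join(row[c].strip() for c in cols)
```

A page for which `duplicate_ids` returns a non-empty list would need either a combined key or manual handling.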
Using templates is the way forward and will make the lists more semantic and useful, so I've declined this.
In the future, we could suggest writing a limited-scope update bot or checker-bot equivalent for the communities that still use tables.