We have been successfully developing a model to measure the readability of Wikipedia articles (project-page on metawiki). As a next step, we would like to develop a model that could improve the readability of Wikipedia articles (along the lines of text simplification), taking advantage of recent advances in the availability and performance of large language models.
In this task, we first want to scope the project in more detail. Specifically, we would like to get a better overview of potential approaches for implementation. This involves:
- Reviewing recent literature
- Reviewing existing models for text summarization and text simplification and comparing them with the infrastructure available (e.g., LiftWing) to train and host a model
- Reviewing approaches for evaluation
- Identifying relevant benchmark/evaluation datasets
- From the above, synthesizing a work plan for implementing and testing an exploratory model for simplification