We are using the [[ https://huggingface.co/collections/CohereForAI/c4ai-aya-23-664f4cda3fa1a30553b221dc | Aya23 model ]] to generate simple summaries of (sections of) Wikipedia articles. In the initial experiments, we have used Cohere's API endpoint (see this [[ https://public-paws.wmcloud.org/User:MGerlach%20(WMF)/text-simplification/section-gists_v01.ipynb | PAWS-notebook ]] for an example).
In this task, we want to figure out whether we could host the model in LiftWing. There are different versions:
[x] [[ https://huggingface.co/CohereForAI/aya-23-8B | Aya-23-8 ]]
[ ] [[ https://huggingface.co/CohereForAI/aya-23-35B | Aya-23-35-B ]]
Alternatively, we also consider the next generation of this model, Aya-expanse because i) it is supposedly strictly better than the Aya-23, ii) the larger version has a slightly smaller memory-footprint, iii) it supports the same 23 languages as aya-23.
[ ] [[ https://huggingface.co/CohereForAI/aya-expanse-8b | aya-expanse-8b ]]
[ ] [[ https://huggingface.co/CohereForAI/aya-expanse-32b | aya-expanse-32b ]]
Additional notes:
* Scope: The aim of this task is to test whether we can host one of these models as a proof-of-concept and not as a production-ready service. Most likely, if the model can be hosted, it will require additional work around optimization which will be captured in follow-up work.
* Context: This work supports hypothesis WE.3.1.3: If we develop models for remixing content such as a content simplification or summarization that can be hosted and served via our infrastructure (e.g. LiftWing), we will establish the technical direction for work focused on increasing reader retention through new content discovery features.