This is a tracking task for our project to improve the current article recommendation system in the Android app.
- We want to test a new recommendation algorithm, namely Citolytics. Citolytics would be an additional source for the read more feature (related pages project page).
- In contrast to MoreLikeThis, which is currently in use, Citolytics finds related Wikipedia pages based on links instead of the article text. With this different algorithmic approach, we hope to increase user engagement by providing better recommendations.
- The evaluation framework can be used as a general blueprint for future optimizations of the related pages feature.
In order to test the recommendations, we must modify or access three main components of the Wikipedia system (see sub-tasks for details):
- MediaWiki/CirrusSearch: The article recommendations can be integrated with a custom KeywordFeature, e.g. CitolyticsKeywordFeature, that is triggered by the citolytics: prefix and retrieves the Citolytics recommendations. The citolytics: prefix can then be accessed via the API. T143197
- Android app: Citolytics needs to be added as an additional read more source by allowing citolytics: as a query prefix alongside morelike:.
- EventLogging: We want to use the tracking data from the Android app to evaluate the recommendation system, so we need access to aggregate data from the EventLogging system.
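To illustrate how the two prefixes would differ from a client's point of view, here is a minimal sketch that builds a search API request for either prefix. It assumes the citolytics: keyword is exposed through the standard `list=search` module in the same way morelike: is. The endpoint, the function name `build_related_query`, and the default limit are illustrative assumptions and not taken from the app's code.

```python
from urllib.parse import urlencode

# Assumption: any MediaWiki API endpoint with CirrusSearch enabled.
# The citolytics: prefix only works where the keyword feature is deployed.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_related_query(title, prefix="morelike", limit=3, endpoint=API_ENDPOINT):
    """Return a search API URL asking for pages related to `title`.

    `prefix` selects the recommendation source, e.g. "morelike" or
    "citolytics" (hypothetical helper, for illustration only).
    """
    params = {
        "action": "query",
        "format": "json",
        "list": "search",
        "srsearch": f"{prefix}:{title}",
        "srlimit": limit,
    }
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_related_query("Albert Einstein", prefix="morelike"))
    print(build_related_query("Albert Einstein", prefix="citolytics"))
```

Because only the prefix changes, a client could switch between the two sources without touching the rest of the request or the response handling.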
- A MediaWiki setup that demonstrates the feature based on simplewiki can be found here: http://citolytics-demo.wmflabs.org/
- A guide for setting up the demo can be found here: https://github.com/mschwarzer/citolytics-demo/
- The Android app demo can be used as soon as the API changes are live.
Goal Visibility & Success Metrics
- The Citolytics recommendations will be available in the Android app. See T148833.
- The Android app's event logging data should also be used to evaluate success. See T149682.
- Deployment - How can we perform a test run and collect data to evaluate the recommendation performance?
- Evaluation data - What are the requirements for publishing data from the EventLogging system? The findings of this experiment should be as transparent as possible.
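As a starting point for the evaluation question, one possible comparison is the click-through rate per recommendation source. The sketch below assumes the aggregate data can be reduced to (source, action) pairs, where the action is either "shown" or "clicked". That shape, the action names, and the function name are assumptions for illustration and do not reflect the actual EventLogging schema. The sample events in the usage section are made up.

```python
from collections import defaultdict


def click_through_rates(events):
    """Compute clicks divided by impressions for each recommendation source.

    `events` is an iterable of (source, action) pairs; the action values
    "shown" and "clicked" are hypothetical, not the real schema fields.
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for source, action in events:
        if action == "shown":
            shown[source] += 1
        elif action == "clicked":
            clicked[source] += 1
    # Sources without any impressions are skipped to avoid division by zero.
    return {s: clicked[s] / shown[s] for s in shown if shown[s] > 0}


if __name__ == "__main__":
    # Made-up sample events, for demonstration only.
    sample = [
        ("morelike", "shown"),
        ("morelike", "shown"),
        ("morelike", "clicked"),
        ("citolytics", "shown"),
        ("citolytics", "clicked"),
    ]
    print(click_through_rates(sample))
```

Working on aggregated counts rather than per-user records may also make it easier to meet the requirements for publishing the results.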