#### Description
[Extension:ImageSuggestions](https://www.mediawiki.org/wiki/Extension:ImageSuggestions) is an extension that will send/display notifications to users when there is a likely suitable image for an unillustrated article on their watchlist.
It essentially consists of 3 parts:
1. Data pipelines feed matches (article & relevant images) into HDFS/Cassandra
2. A (weekly) maintenance script queries Cassandra and creates notifications for those matches (only 1 user/image per unillustrated watched article)
3. Echo notification config, to display these notifications etc.
#1 & #2 are async processes; only #3 really runs live on-wiki.
From a user's POV, all they will see is a notification (max. 2 per user per week) that informs them about an article & a potentially relevant image.
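The two limits above (one user/image per article, at most 2 notifications per user per week) can be sketched as follows. This is only an illustration in Python, not the extension's actual code: the function name, the shape of `matches` and `watchers`, and the "first eligible watcher wins" rule are all assumptions made for the example.

```python
from collections import defaultdict

MAX_PER_USER_PER_WEEK = 2  # weekly cap stated in the description


def select_notifications(matches, watchers, max_per_user=MAX_PER_USER_PER_WEEK):
    """Pick at most one (user, image) per article and at most max_per_user per user.

    matches:  list of (article, image) pairs; input order is treated as priority (assumption)
    watchers: dict mapping article -> list of users watching it (assumption)
    """
    sent_per_user = defaultdict(int)
    handled_articles = set()
    notifications = []
    for article, image in matches:
        if article in handled_articles:
            continue  # only one image per article
        for user in watchers.get(article, []):
            if sent_per_user[user] < max_per_user:
                notifications.append((user, article, image))
                sent_per_user[user] += 1
                handled_articles.add(article)
                break  # only one user per article
    return notifications
```

With hypothetical data where one user watches three matched articles, that user receives two notifications and the third article falls through to the next watcher, if there is one.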
#### Preview environment
The notification will be deployed on beta, but we will not be able to reliably generate image suggestions there like we will in production: the data pipelines don't capture data from beta (and there really isn't any relevant data on beta to work with anyway).
#1 and #2 can't really be tested there.
We can, however, do a dry-run of the maintenance script (which outputs what notifications would be created, without creating them).
We will also manually generate some sample notifications on beta to test #3.
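As a rough illustration of what a dry-run means here, the pattern is to report every planned notification but only perform the write when the dry-run flag is off. The sketch below is in Python purely for illustration; the function name, its parameters, and the output format are hypothetical and do not reflect the real maintenance script's options.

```python
def run(notifications, create, dry_run=True, out=print):
    """Report each planned notification; call create() only when not a dry run.

    notifications: iterable of (user, article, image) tuples (assumed shape)
    create:        callable that actually creates a notification (hypothetical)
    Returns the number of notifications actually created.
    """
    created = 0
    prefix = "[dry-run] " if dry_run else ""
    for user, article, image in notifications:
        out(f"{prefix}notify {user}: {image} for {article}")
        if not dry_run:
            create(user, article, image)
            created += 1
    return created
```

Defaulting `dry_run` to `True` in this sketch means that forgetting the flag produces output only, never notifications.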
#### Which code to review
#2 & #3 will live in the Extension:ImageSuggestions repo: https://gerrit.wikimedia.org/g/mediawiki/extensions/ImageSuggestions. WIP patches: [#2](https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ImageSuggestions/+/778216), [#3](https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ImageSuggestions/+/778215)
#1 is pyspark code running on the stats cluster: https://gitlab.wikimedia.org/repos/generated-data-platform/datapipelines/-/tree/T296814-image-suggestions/image-suggestions
Commits to deploy the extension and schedule the maintenance script have not yet been started.
#### Performance assessment
- //What work has been done to ensure the best possible performance of the feature?//
Not too much; it's essentially just a standard Echo notification implementation, like many others.
The initial data gathering is done asynchronously on the stats cluster and shouldn't impact site performance.
The maintenance script will only run once a week; it contains a query with a couple of joins, but that query uses indexes. We will do a dry-run of the script when it is run for the first time.
- //What are likely to be the weak areas (e.g. bottlenecks) of the code in terms of performance?//
None
- //Are there potential optimisations that haven't been performed yet?//
None
- //Please list which performance measurements are in place for the feature and/or what you've measured ad-hoc so far. If you are unsure what to measure, ask the Performance Team for advice: [[ mailto:performance-team@wikimedia.org | performance-team@wikimedia.org ]].//
None