We'd like to create an event stream that emits external links as they are added to or removed from Wikipedia, along with the metadata associated with each change. What's the best way of doing so?
We'll go with MediaWiki + the EventBus extension.
Here are some alternatives:
- Previously EventLogging was used for this purpose, but it didn't fully materialize. Is EventLogging still the right solution?
- Should we use MediaWiki + the EventBus extension?
- Should we use Change Propagation?
- Should the RecentChanges event stream be expanded to include links?
- Anything else?
Given that MediaWiki templates/modules generate links, I think the right approach is to parse HTML diffs (as opposed to Wikitext diffs) for links by listening to Parsoid events. Any issues with that? Or a better way of doing it?
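To make the HTML-diff idea concrete, here is a minimal sketch of comparing the external links in two rendered revisions. It is not tied to any actual Parsoid or EventBus API: the function names are hypothetical, the two HTML strings are assumed to have been fetched already, and "external" is approximated as any anchor whose href has an http or https scheme.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ExternalLinkCollector(HTMLParser):
    """Collects href values of <a> tags that point to absolute http(s) URLs."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Assumption: an absolute http(s) URL counts as an external link.
        # Relative and protocol-relative hrefs are skipped in this sketch.
        if href and urlparse(href).scheme in ("http", "https"):
            self.links.add(href)


def external_links(html):
    """Return the set of external link targets found in an HTML string."""
    collector = ExternalLinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


def diff_external_links(old_html, new_html):
    """Return the external links added and removed between two revisions."""
    old, new = external_links(old_html), external_links(new_html)
    return {"added": sorted(new - old), "removed": sorted(old - new)}
```

Because the comparison runs on rendered HTML, links produced by templates or modules are included, which a Wikitext diff would miss. Using sets means a link that appears several times is reported only when its last occurrence is removed or its first is added.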
Can references added to Wikidata statements be captured too?
- No, but links are being captured. See T214706#4952539 for an example.