
Add 'page_is_redirect' field to the mediawiki_history Data Lake tables
Closed, Duplicate (Public)


The mediawiki_history data lake tables contain various fields that record the value of something both at the time the edit event took place and at snapshot time (e.g. page_namespace vs. page_namespace_latest).

It would be very useful to have the same for whether the page is a redirect. For example, in the context of T149021, several people have been trying to approximate this by parsing edit summaries, which is not entirely reliable. In other words: add a new field page_is_redirect alongside the existing page_is_redirect_latest.

Per a brief IRC conversation with @Milimetric, this requires parsing the actual text of the revision, which is already planned but will need quite a bit of effort.
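
To illustrate what "parsing the actual text of the revision" could involve, here is a minimal Python sketch. It is not the planned implementation: the function names are hypothetical, and it assumes only the English "#REDIRECT" keyword at the very start of the wikitext, whereas MediaWiki also accepts localized aliases of the keyword, which this ignores.

```python
import re

# Assumption: only the English "#REDIRECT" keyword is recognized, and it must
# be the very first thing in the revision text. Localized aliases are ignored.
_REDIRECT_RE = re.compile(r"#REDIRECT\s*:?\s*\[\[([^\]|]+)", re.IGNORECASE)


def redirect_target(wikitext):
    """Return the redirect target title, or None if the text is not a redirect."""
    m = _REDIRECT_RE.match(wikitext)  # match() anchors at the start of the text
    if m is None:
        return None
    return m.group(1).strip()


def is_redirect(wikitext):
    """Hypothetical per-revision value for a page_is_redirect field."""
    return redirect_target(wikitext) is not None
```

Applied to every revision's text, a check like this would yield the value at edit time, while page_is_redirect_latest would correspond to the same check on the page's most recent revision at snapshot time.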