==Overview==
When an interwiki import happens, edit history is brought over from one wiki to another. By default, performers of the imported edits appear with an interwiki prefix, like meta>Example. This is documented at https://www.mediawiki.org/wiki/Help:Import.
- See an on-wiki example of an imported history, here.
- See how that example shows up in our core tables data, via this query
- See how that example shows up in Mediawiki History, via this query
(FYI, the same issue affects other cross-wiki operations. The one we found in T425443 relates to user-group changes made from CentralAuth on metawiki, which can affect users on possibly all sister projects.)
For interwiki imports of revisions, the imported revisions show up with usernames in several different forms, such as with a:
- wiki prefix (e.g. en>USERTEXT, de>USERTEXT, meta>USERTEXT, strategywiki>USERTEXT)
- “import” prefix (import>USERTEXT)
- letter prefix (e.g. b>USERTEXT, w>USERTEXT)
- miscellaneous prefix (e.g. regiowiki.at>USERTEXT, *>USERTEXT)
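The prefix forms listed above can be told apart programmatically. The following is a minimal Python sketch, not the logic that MediaWiki or the MediaWiki History pipeline uses. The function name `classify_import_prefix` and the category rules (for example, treating any multi-character lowercase token as a wiki prefix) are assumptions made for illustration only.

```python
import re


def classify_import_prefix(event_user_text):
    """Classify the part before the first '>' in a user text.

    Returns None when there is no '>' delimiter, otherwise one of
    'import', 'letter', 'wiki' or 'misc'. The rules are heuristic
    assumptions, not an official taxonomy.
    """
    prefix, delimiter, _name = event_user_text.partition(">")
    if not delimiter:
        return None  # no interwiki-style prefix present
    if prefix == "import":
        return "import"
    if re.fullmatch(r"[a-z]", prefix):
        return "letter"  # e.g. b>USERTEXT, w>USERTEXT
    if re.fullmatch(r"[a-z][a-z0-9_-]+", prefix):
        return "wiki"  # e.g. en>, meta>, strategywiki>
    return "misc"  # e.g. regiowiki.at>, *>


for sample in ["meta>Example", "import>Example", "b>Example", "*>Example"]:
    print(sample, "->", classify_import_prefix(sample))
```

A breakdown like this could help size each prefix family before deciding whether they should all be treated the same way downstream.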
==Implications==
These imported revisions may be inflating anonymous revision counts, as well as any revision counts that do not exclude anonymous revisions. The reason is that they all show up in mediawiki_history with event_user_is_anonymous = TRUE:
SELECT event_user_is_anonymous, COUNT(*) FROM wmf.mediawiki_history WHERE snapshot = '2026-04' AND event_user_text LIKE '%>%' AND event_entity = 'revision' AND event_type = 'create' GROUP BY event_user_is_anonymous
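The imported revisions share the same revision_text_sha1 as the corresponding revisions on the source wiki, which the following query lets us check for a page that exists on both dewiki and enwiki: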
SELECT wiki_db, page_title, page_id, event_timestamp, event_entity, event_type, event_user_text, event_user_is_anonymous, event_user_is_permanent, revision_text_sha1
FROM wmf.mediawiki_history
WHERE snapshot = '2026-04'
AND wiki_db IN ('dewiki', 'enwiki')
AND page_title = 'Battle_for_Dream_Island'
AND event_timestamp < '2026-03-10'
ORDER BY event_timestamp DESC , wiki_db
LIMIT 50
Now that temporary accounts have rolled out, revisions made by temporary accounts and then imported are labeled as event_user_is_anonymous = TRUE and event_user_is_temporary = FALSE:
SELECT wiki_db, event_entity, event_type, event_user_is_permanent, event_user_is_anonymous , event_user_text FROM wmf.mediawiki_history WHERE snapshot = '2026-04' AND wiki_db = 'enwiki' AND event_user_text LIKE '%>~%' AND event_entity = 'revision' AND event_type = 'create' LIMIT 50
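The LIKE '%>~%' condition in the query above can be mirrored outside SQL when inspecting exported rows. The sketch below is illustrative only: the function name `user_kind` and its labels are hypothetical, and it assumes that temporary account names start with "~", as the query pattern implies.

```python
def user_kind(event_user_text):
    """Roughly bucket a user text by how it appears to have originated.

    Assumption: temporary account names begin with '~', so an imported
    temporary account looks like 'PREFIX>~NAME'.
    """
    if ">~" in event_user_text:
        return "imported_temporary"  # same condition as LIKE '%>~%'
    if ">" in event_user_text:
        return "imported_other"  # same condition as LIKE '%>%'
    if event_user_text.startswith("~"):
        return "local_temporary"
    return "other"


# Hypothetical user texts, for illustration only.
for text in ["en>~2025-12345-6", "en>Example", "~2025-12345-6", "Example"]:
    print(text, "->", user_kind(text))
```

Checking "imported_temporary" first matters, because every such name also satisfies the broader '%>%' condition.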
==Impact==
Looking at overall impact across all years, per the following query, dewiki seems to be affected the most, with more than 6 million revision-create rows in mediawiki_history (MWH) where event_user_text LIKE '%>%':
SELECT wiki_db, event_entity, event_type, COUNT(*) as count FROM wmf.mediawiki_history WHERE snapshot = '2026-04' AND event_user_text LIKE '%>%' AND event_entity = 'revision' AND event_type = 'create' GROUP BY wiki_db, event_entity, event_type ORDER BY count DESC
Here are the top five rows of that query’s output:
| wiki_db | event_entity | event_type | count |
| dewiki | revision | create | 6031772 |
| mlwiki | revision | create | 698182 |
| bhwiki | revision | create | 341711 |
| tewiki | revision | create | 289112 |
| newiki | revision | create | 219307 |
Looking at the monthly impact, we can see for example that in January 2026 dewiki was the most impacted, with 5,033 revision-create rows in MWH where event_user_text LIKE '%>%':
SELECT wiki_db, substr(event_timestamp,1,7) as month, event_entity, event_type, COUNT(*) as count FROM wmf.mediawiki_history WHERE snapshot = '2026-04' AND event_user_text LIKE '%>%' AND event_entity = 'revision' AND event_type = 'create' AND substr(event_timestamp,1,7) = '2026-01' GROUP BY wiki_db, substr(event_timestamp,1,7), event_entity, event_type ORDER BY count DESC
Here are the top five rows of that query’s output:
| wiki_db | month | event_entity | event_type | count |
| dewiki | 2026-01 | revision | create | 5033 |
| tcywiki | 2026-01 | revision | create | 240 |
| tewiki | 2026-01 | revision | create | 102 |
| siwiktionary | 2026-01 | revision | create | 69 |
| mlwiki | 2026-01 | revision | create | 47 |
Impact note: Many of these instances may occur on pages that were moved and whose original page was then deleted; on pages whose source page (the one translated from) was later deleted; or on translated pages (i.e. the new page) that were themselves subsequently deleted. This will affect how these edits show up, or don't show up, in various counts.
==Suggestions==
We might consider:
- Adding an event_user_is_cross_wiki field to make these instances explicit, as suggested in T425443
- Whether or not we should exclude instances where event_user_is_cross_wiki = TRUE in downstream tables, including the wikistats tables and analytics tables that have edit counts (e.g. geoeditors monthly, edits hourly, etc.)
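To make the two suggestions above concrete, here is a small Python sketch of how a derived flag could change an anonymous edit count. It is only an illustration under stated assumptions: event_user_is_cross_wiki does not exist in mediawiki_history today, the rule used to derive it (a ">" in event_user_text) is the same heuristic as the queries in this page, and the helper names and sample rows are made up.

```python
from collections import Counter


def add_cross_wiki_flag(row):
    """Return a copy of the row with a hypothetical event_user_is_cross_wiki field."""
    flagged = dict(row)
    flagged["event_user_is_cross_wiki"] = ">" in (row.get("event_user_text") or "")
    return flagged


def anonymous_edit_counts(rows, exclude_cross_wiki):
    """Count anonymous rows per wiki, optionally skipping cross-wiki rows."""
    counts = Counter()
    for row in map(add_cross_wiki_flag, rows):
        if not row["event_user_is_anonymous"]:
            continue
        if exclude_cross_wiki and row["event_user_is_cross_wiki"]:
            continue
        counts[row["wiki_db"]] += 1
    return counts


# Made-up rows shaped like a subset of mediawiki_history columns.
sample_rows = [
    {"wiki_db": "dewiki", "event_user_text": "en>Example", "event_user_is_anonymous": True},
    {"wiki_db": "dewiki", "event_user_text": "192.0.2.1", "event_user_is_anonymous": True},
    {"wiki_db": "dewiki", "event_user_text": "LocalUser", "event_user_is_anonymous": False},
    {"wiki_db": "enwiki", "event_user_text": "import>Example", "event_user_is_anonymous": True},
]

raw = anonymous_edit_counts(sample_rows, exclude_cross_wiki=False)
adjusted = anonymous_edit_counts(sample_rows, exclude_cross_wiki=True)
print("raw:", dict(raw))
print("adjusted:", dict(adjusted))
```

Keeping the flag as a separate column, rather than silently dropping rows, would let each downstream table decide whether to exclude imported revisions.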