Populate empty ar_rev_id fields:
- Count the rows in archive where ar_rev_id IS NULL. Let's call that number m. (e.g. enwiki has 508811 such rows, out of ~87793416 rows)
- Reserve m IDs (or m+k, where k is a small safety margin) in the revision table:
  - Make a note of max( max( rev_id ), max( ar_rev_id ) ), let's call it b.
  - Insert a row with rev_id = b+m+k into the revision table, then delete it again, to bump the auto-increment counter past the reserved range.
- For every row in archive where ar_rev_id IS NULL, set ar_rev_id to a unique ID between b+1 and b+m+k. This could be done via a temporary table, or programmatically.
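
The steps above could be sketched programmatically as follows. This is only an illustration, not the actual migration script: it uses Python's sqlite3 module as a stand-in for the production database, the tables are reduced to the columns involved, and the names populate_ar_rev_id and make_demo_db, the default k=10, and the use of ar_id as the row key are assumptions made for the example. Auto-increment behaviour differs between database engines, so the bump step should be verified on the real backend.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory stand-in for the revision and archive tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE revision (rev_id INTEGER PRIMARY KEY AUTOINCREMENT);
        CREATE TABLE archive (ar_id INTEGER PRIMARY KEY, ar_rev_id INTEGER);
        INSERT INTO revision (rev_id) VALUES (1), (2), (3);
        INSERT INTO archive (ar_id, ar_rev_id)
            VALUES (1, 5), (2, NULL), (3, 7), (4, NULL);
    """)
    return conn


def populate_ar_rev_id(conn, k=10):
    """Give every archive row with a NULL ar_rev_id a fresh, unique ID.

    Returns (m, b): the number of rows filled in and the old maximum ID.
    """
    cur = conn.cursor()

    # m: how many archive rows lack an ar_rev_id.
    m = cur.execute(
        "SELECT COUNT(*) FROM archive WHERE ar_rev_id IS NULL"
    ).fetchone()[0]

    # b: the highest ID in use in either table (0 if both are empty).
    max_rev = cur.execute("SELECT MAX(rev_id) FROM revision").fetchone()[0] or 0
    max_ar = cur.execute("SELECT MAX(ar_rev_id) FROM archive").fetchone()[0] or 0
    b = max(max_rev, max_ar)
    if m == 0:
        return m, b

    # Reserve b+1 .. b+m+k: insert and delete a placeholder row so that the
    # auto-increment counter moves past the reserved range.
    cur.execute("INSERT INTO revision (rev_id) VALUES (?)", (b + m + k,))
    cur.execute("DELETE FROM revision WHERE rev_id = ?", (b + m + k,))

    # Hand out b+1, b+2, ... to the NULL rows in a stable order.
    ar_ids = [row[0] for row in cur.execute(
        "SELECT ar_id FROM archive WHERE ar_rev_id IS NULL ORDER BY ar_id"
    )]
    cur.executemany(
        "UPDATE archive SET ar_rev_id = ? WHERE ar_id = ?",
        [(b + i, ar_id) for i, ar_id in enumerate(ar_ids, start=1)],
    )
    conn.commit()
    return m, b
```

On a table the size of enwiki's archive, the assignment would presumably need to run in batches rather than in one transaction; the sketch ignores that.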
Make ar_text and ar_flags unused:
- For each row in archive that has a non-null ar_text field, insert a row into the text table, copying ar_text to old_text and ar_flags to old_flags.
- Set ar_text_id to the old_id from the newly created text row.
- Set ar_text and ar_flags to the empty string everywhere.
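
A minimal sketch of the text migration above, again using Python's sqlite3 module as a stand-in for the real database. The function and helper names (migrate_archive_text, make_text_demo_db) and the reduced table definitions are assumptions for illustration. The extra condition ar_text_id IS NULL is not part of the steps above; it is added here only so that re-running the sketch does not create duplicate text rows.

```python
import sqlite3


def make_text_demo_db():
    """Build a tiny in-memory stand-in for the text and archive tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE text (
            old_id INTEGER PRIMARY KEY AUTOINCREMENT,
            old_text TEXT,
            old_flags TEXT
        );
        CREATE TABLE archive (
            ar_id INTEGER PRIMARY KEY,
            ar_text TEXT,
            ar_flags TEXT,
            ar_text_id INTEGER
        );
        INSERT INTO text (old_id, old_text, old_flags)
            VALUES (1, 'existing', 'utf-8');
        INSERT INTO archive (ar_id, ar_text, ar_flags, ar_text_id) VALUES
            (1, 'hello', 'utf-8', NULL),
            (2, NULL, NULL, 1),
            (3, 'world', 'gzip', NULL);
    """)
    return conn


def migrate_archive_text(conn):
    """Move inline archive text into the text table; return rows migrated."""
    cur = conn.cursor()

    # Rows that still carry their text inline. The ar_text_id check is an
    # assumption of this sketch, added to make it safe to re-run.
    rows = cur.execute(
        "SELECT ar_id, ar_text, ar_flags FROM archive "
        "WHERE ar_text IS NOT NULL AND ar_text_id IS NULL"
    ).fetchall()

    for ar_id, ar_text, ar_flags in rows:
        # Copy ar_text to old_text and ar_flags to old_flags.
        cur.execute(
            "INSERT INTO text (old_text, old_flags) VALUES (?, ?)",
            (ar_text, ar_flags),
        )
        # Point the archive row at the text row that was just created.
        cur.execute(
            "UPDATE archive SET ar_text_id = ? WHERE ar_id = ?",
            (cur.lastrowid, ar_id),
        )

    # Finally blank the now-unused columns on every archive row.
    cur.execute("UPDATE archive SET ar_text = '', ar_flags = ''")
    conn.commit()
    return len(rows)
```

Doing the copy, the pointer update and the blanking in one transaction means a failure part-way through leaves the archive rows untouched; on a large wiki this would likely have to be split into batches instead.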
See https://www.mediawiki.org/wiki/Multi-Content_Revisions/Content_Meta-Data