With T183375, undeletion no longer sets each undeleted revision's parent ID to the "expected" value given by the previous revision by ID (usually, but not always, also the previous revision by timestamp); instead, it uses the stored ar_parent_id value. For example, in the [[https://en.wikipedia.org/w/index.php?title=Ernest_Madu&action=history|history of Ernest Madu on Wikipedia]], [[https://en.wikipedia.org/w/index.php?title=Ernest_Madu&oldid=820864133|revision 820864133]] has parent ID 0 (with an "N" mark in the [[https://en.wikipedia.org/wiki/Special:Contributions/Andrewkazimi|user's contribs]]) instead of [[https://en.wikipedia.org/w/index.php?title=Ernest_Madu&oldid=820863993|820863993]]. However, this change alone can still cause problems.
## Flaws with the current interface
Some flaws include:
* Failure to distinguish page IDs
* Broken parent revisions
* "Ancestor" revisions being scattered randomly across different titles and mixed in with unrelated histories, with no easy way to rejoin all the revisions into the history of one single title
** This is the same problem that would occur when manually changing rev_page fields (with the current behavior, deleting A, moving B to A without redirect, undeleting a single revision for A, moving A back to B without redirect, and finally undeleting the rest of the revisions for A is effectively equivalent to simply changing the rev_page field for the single revision from A's page ID to B's page ID).
* The interface is very slow to load for pages with thousands of deleted revisions
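The rev_page equivalence described in the list above (delete A, move B to A, undelete one revision, move back, undelete the rest) can be illustrated with a toy model. This is only a sketch, not MediaWiki code: the `ToyWiki` class and its methods are hypothetical, and it assumes that undeleting to an existing title attaches the revisions to that page's ID, while undeleting to a nonexistent title recreates the page with the archived page ID.

```python
class ToyWiki:
    """Toy model of page, revision and archive rows (not MediaWiki code)."""

    def __init__(self):
        self.pages = {}     # title -> page ID of the live page
        self.rev_page = {}  # live revision ID -> rev_page
        self.archive = {}   # deleted revision ID -> (title, ar_page_id)
        self._next_page_id = 1

    def create(self, title, rev_ids):
        page_id = self._next_page_id
        self._next_page_id += 1
        self.pages[title] = page_id
        for rev_id in rev_ids:
            self.rev_page[rev_id] = page_id

    def delete(self, title):
        page_id = self.pages.pop(title)
        for rev_id in [r for r, p in self.rev_page.items() if p == page_id]:
            del self.rev_page[rev_id]
            self.archive[rev_id] = (title, page_id)

    def move(self, old, new):
        # Move without redirect: only the title changes, not the page ID.
        if new in self.pages:
            raise ValueError("target title already exists")
        self.pages[new] = self.pages.pop(old)

    def undelete(self, title, rev_ids):
        for rev_id in rev_ids:
            archived_title, ar_page_id = self.archive[rev_id]
            if archived_title != title:
                raise ValueError("revision is not archived under this title")
            del self.archive[rev_id]
            # Assumption: an existing page at this title absorbs the revision;
            # otherwise the page is recreated with the archived page ID.
            self.rev_page[rev_id] = self.pages.setdefault(title, ar_page_id)


wiki = ToyWiki()
wiki.create("A", [1, 2, 3])  # page ID 1
wiki.create("B", [4, 5])     # page ID 2
wiki.delete("A")
wiki.move("B", "A")
wiki.undelete("A", [2])      # lands in B's page, which currently sits at title A
wiki.move("A", "B")
wiki.undelete("A", [1, 3])   # recreates A with its original page ID
print(sorted(wiki.rev_page.items()))
```

In this model, revision 2 ends up with B's page ID while revisions 1 and 3 keep A's page ID, which is the same end state as editing the rev_page field of revision 2 directly.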
## Proposal
Overhaul the undelete feature so that pages behave more like files on a computer, using the steps below. This would also make selective undeletion, and undeletion of revisions into an existing page's history, obsolete. Revision deletion should be used in lieu of selective undeletion.
# Create a `pagearchive` table with a migration script as I have suggested at T161671#4199220.
# Create a script that fills in null pa_page_id (formerly ar_page_id) fields, similar to what T182678 did for ar_rev_id.
# Create a script that fixes duplicate pa_page_id fields, as well as those that duplicate an existing page's ID, similar to what T193180 did for ar_rev_id and rev_id.
# Add an option to the populateParentId script named "--fix-existing" or something similar that, if used, will also update the rev_parent_id field for all existing revisions that already have a parent ID.
# Make the populateParentId script also fill in null ar_parent_id fields in the archive table (thereby completing the modernization of legacy rows), and if the "--fix-existing" option is used, update the ar_parent_id field for all deleted revisions that already have a parent ID.
# Change the undelete feature (the PageArchive class) to accept a single title and a single page ID rather than an array (or list) of timestamps, in order to force "everything" to be preserved, including rev_page and rev_parent_id.
# Remove the $overrides parameter from the newRevisionFromArchiveRow function in the RevisionStore class, which should no longer be needed.
# Add an $unsuppress parameter to the newRevisionFromArchiveRow function that will be used when a suppressor restores a suppressed page with the "Remove restrictions on restored revisions" checkbox.
# Make page histories viewable for deleted page IDs by using the "curid" parameter.
# Make deleted revisions and their diffs viewable by using the "oldid" parameter (T20104).
# Make Special:DeletedContributions share the features of Special:Contributions (e.g. displaying size differences using ar_parent_id and ar_len, as well as "N" for deleted revisions with zero ar_parent_id).
# Change the Special:Undelete interface to display radio buttons with "View history" links for each deleted page ID rather than checkboxes for each deleted revision. The radio button corresponding to the page ID of the most recent deleted revision overall will be selected by default.
# When there is no existing page having the same title as the one you are trying to undelete, make choosing another title for the undeleted page optional. In this case, there will also be a checkbox (checked by default) for leaving a redirect at the original title. For existing pages, the "View or restore # deleted edits" link will still appear when viewing the history, but choosing another title will become mandatory. In this case, the existing page will be temporarily deleted so that the other page can be undeleted and moved to the chosen title without redirect. After that, the temporarily deleted page will immediately be undeleted.
# Restrict the import feature by only allowing imports to existing page titles if the revisions being imported are either all later than the page's current revision, all earlier than the page's first revision, or all fit between 2 consecutive revisions in the page's history. In the latter 2 cases, the first revision following the imported revisions will automatically have the rev_parent_id field changed to the ID of the latest imported revision. For any other import, the importer must choose another page title, and manually redirect that title to the original page title if the page already exists.
# Add a special page named "Special:SplitHistory" that allows an administrator to easily split the history of a page at a certain point. When splitting out the first n revisions in the history of page A, the new title B will take A's original page ID and page A will get a new page ID. The first revision that stays at page A will automatically have the rev_parent_id field changed to zero. When splitting out the last n revisions in the history of page A, page A will keep its original page ID and the new title B will get a new page ID. The page_latest field will automatically be updated for page A, and the first revision that gets moved to page B will automatically have the rev_parent_id field changed to zero.
# Add a special page named "Special:MergeAndMove" that allows an administrator to simultaneously merge the history of a page B into an older page A and move A to B (with or without redirect). This means that the rev_page field for each revision in the history of B will be changed to A's page ID, and the page_latest field for page A will be updated, before A is moved to B. The operation will only be allowed when page A has not been edited since the creation of page B. The first revision originally in the history of B will automatically have the rev_parent_id field changed to the same value as the original page_latest field for page A.
# The "MergeHistory" feature will continue to exist. However, merging A into B will only be allowed if at least one existing revision will remain in the history of A after the merge; otherwise, "Special:MergeAndMove" must be used instead. Also, the first remaining revision in the history of A will automatically have the rev_parent_id field changed to zero, while the first revision in the history of B following the merged revisions will automatically have the rev_parent_id field changed to the ID of the latest merged revision.
# Finally, in both Special:SplitHistory and Special:MergeHistory, revision IDs will be used to distinguish revisions having the same timestamp, so this would also solve T39465 and T183501.
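The import restriction in the steps above can be sketched as a small check. This is a minimal illustration, not existing MediaWiki code: the function name `classify_import` and its return values are hypothetical, timestamps are assumed to be 14-digit MediaWiki-style strings (which sort correctly as text), and rejecting imported timestamps that tie with an existing revision is an assumption of this sketch.

```python
def classify_import(existing, imported):
    """Classify an import against an existing page history.

    Returns "after", "before" or "between" when the import would be allowed
    under the proposed rule, or None when another title must be chosen.
    """
    if not existing or not imported:
        raise ValueError("both timestamp lists must be non-empty")
    lo, hi = min(imported), max(imported)
    # Any existing revision inside [lo, hi] means the imported revisions
    # would interleave with (or tie with) the existing history.
    if any(lo <= t <= hi for t in existing):
        return None
    if lo > max(existing):
        return "after"    # all later than the page's current revision
    if hi < min(existing):
        return "before"   # all earlier than the page's first revision
    return "between"      # all fit between two consecutive revisions


history = ["20200101000000", "20200301000000", "20200501000000"]
print(classify_import(history, ["20200601000000"]))
print(classify_import(history, ["20200201000000", "20200215000000"]))
print(classify_import(history, ["20200201000000", "20200401000000"]))
```

In the "before" and "between" cases, the first existing revision after the imported range is the one whose rev_parent_id would be updated.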
With this proposal, page histories would be kept as simple as possible, while also limiting the recalculation of size differences to one or two revisions at a time.
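To illustrate why only one or two size differences need recalculating, here is a small sketch. It assumes that a revision's size difference is its length minus its parent's length (or its full length when the parent ID is zero); the `size_diff` helper and the sample data are hypothetical.

```python
def size_diff(revisions, rev_id):
    """Size difference of a revision relative to its parent revision."""
    rev = revisions[rev_id]
    if rev["parent"] == 0:
        return rev["len"]  # no parent: the whole length counts as added
    return rev["len"] - revisions[rev["parent"]]["len"]


# Hypothetical history: revision ID -> length and parent ID.
revisions = {
    10: {"len": 100, "parent": 0},
    11: {"len": 150, "parent": 10},
    12: {"len": 120, "parent": 11},
    13: {"len": 200, "parent": 12},
}

before = {r: size_diff(revisions, r) for r in revisions}

# Splitting off revisions 10 and 11: the first revision that stays behind
# (12) gets its parent ID changed to zero.
revisions[12]["parent"] = 0

after = {r: size_diff(revisions, r) for r in revisions}
changed = [r for r in revisions if before[r] != after[r]]
print(changed)
```

Only revision 12 changes here, because every other revision still points at the same parent.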
## Problems to be solved
The above proposal will solve all of the following problems:
* Size differences being incorrect or outdated (e.g. [[https://en.wikipedia.org/wiki/Template_talk:Db-g1/Archive_1|Template talk:Db-g1/Archive 1]] or [[https://en.wikipedia.org/wiki/Gema_Switzerland|Gema Switzerland]] on Wikipedia; usually caused by imports done in 2015 or earlier or undeletions of revisions deleted in pre-1.5 versions of MediaWiki; see also T38976)
* Revisions with the same timestamp being inseparable
* Broken parent revisions (T186280 and T193211)
* "Contaminated" histories caused by mixing the histories of multiple pages together
## Unchanged behaviors
The following behaviors will not be changed:
* Undeletion will still preserve ar_parent_id as rev_parent_id and display the number of restored revisions in the log entry.
* Undeleting files will still be done by selecting checkboxes.
* Importing will still insert new revisions into the page's history.
* Special:MergeHistory will still update the rev_page field for some revisions in the source page's history.
## Actions that will require using other tools
The following actions will require using tools other than deletion or undeletion:
* Reverting a page to an older revision: Split off the later revisions using Special:SplitHistory, and then delete the target page. Selective undeletion will not make sense anymore.
* Merging deleted revisions to an existing page's history: (1) move the page to a temporary title (e.g. A (temp) for A) without redirect; (2) undelete the deleted history of A; (3) use Special:MergeAndMove to merge A's history with A (temp)'s history without redirect; and (4) move A (temp) back to A without redirect. If there are several deleted page IDs, start with the latest one and then repeat steps 2 and 3 for each earlier page ID before doing step 4.
## Original proposal
When undeleting, always set each undeleted revision's parent ID to the previous undeleted revision among the revisions that have the same ar_page_id value (which will usually be the same as the ar_parent_id value), or to zero if there is no such revision. There is one exception: if selective undeletion has been used to effectively revert a page to an older revision, and the page has not been edited since then, undeleting the later revisions (which necessarily updates the page_latest field in the page table) will leave the parent ID of the earliest undeleted revision as the page's current revision instead of changing it to zero. The exception will not apply if the page has been edited since the selective undeletion, in order to prevent "siblings" (two revisions with the same parent revision). This can be implemented with a variable that is initialized as an empty array and updated each time a revision is inserted: the key is the ar_page_id value, and the value is the ID to be used as the next parent ID. Each distinct ar_page_id value would then be treated as a separate history (although all of them would still appear together in a single page's history). There was also an even older proposal at T185167, which I closed because it was superseded by the above proposal.
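The array described above can be sketched as follows. This is a rough illustration only: the function name `assign_parent_ids` and the `seed` parameter are hypothetical, rows are assumed to be `(revision ID, ar_page_id)` pairs sorted oldest first, and `seed` is one possible way to model the exceptional case by pre-loading the page's current revision.

```python
def assign_parent_ids(rows, seed=None):
    """Map each undeleted revision ID to the parent ID it would receive.

    rows: iterable of (rev_id, ar_page_id), ordered oldest first.
    seed: optional {ar_page_id: rev_id} for the exceptional case, where the
          earliest undeleted revision continues from the current revision.
    """
    last_by_page = dict(seed or {})  # ar_page_id -> ID to use as next parent
    parents = {}
    for rev_id, ar_page_id in rows:
        # Zero when no earlier revision shares this ar_page_id.
        parents[rev_id] = last_by_page.get(ar_page_id, 0)
        last_by_page[ar_page_id] = rev_id
    return parents


rows = [(101, 7), (102, 7), (103, 9), (104, 7), (105, 9)]
print(assign_parent_ids(rows))
print(assign_parent_ids(rows, seed={7: 100}))
```

In the first call, revisions 101 and 103 start separate histories with parent ID 0, even though they are interleaved in one list. In the second call, revision 101 instead continues from revision 100.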