**Problem**
I was under the assumption that the `pageid` was a stable identifier (stable in the sense that it remains the same regardless of user action). However, that does not seem to be the case.
If I am a **permissioned** user and have a `pageid`, I can request the title:
`/api.php?action=query&format=json&pageids=4&formatversion=2`
```
lang=json
{
  "batchcomplete": true,
  "query": {
    "pages": [
      {
        "pageid": 4,
        "ns": 0,
        "title": "Gotham"
      }
    ]
  }
}
```
If I delete that page, I get a response that it's missing:
`/api.php?action=query&format=json&pageids=4&formatversion=2`
```
lang=json
{
  "batchcomplete": true,
  "query": {
    "pages": [
      {
        "pageid": 4,
        "missing": true
      }
    ]
  }
}
```
While it's in the deleted state, the `pageid` does not exist (even if you have permission to see all of the deleted revisions).
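A client can tell the two responses above apart by checking the `missing` flag. The following is only a minimal sketch: the helper name `title_for_pageid` and the two sample strings are made up for illustration, and the response shape is assumed to be exactly the `formatversion=2` output shown above.

```python
import json


def title_for_pageid(response_text, pageid):
    """Return the title for pageid, or None if the API reports it as missing."""
    data = json.loads(response_text)
    for page in data.get("query", {}).get("pages", []):
        if page.get("pageid") == pageid:
            # A hard-deleted page comes back with "missing": true and no title.
            if page.get("missing"):
                return None
            return page.get("title")
    raise KeyError(f"pageid {pageid} not present in response")


# Sample payloads copied from the two responses above.
EXISTING = '{"batchcomplete": true, "query": {"pages": [{"pageid": 4, "ns": 0, "title": "Gotham"}]}}'
DELETED = '{"batchcomplete": true, "query": {"pages": [{"pageid": 4, "missing": true}]}}'
```

Note that the deleted response is indistinguishable from a `pageid` that never existed, which is the core of the problem.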
If I //restore// the page, then all of a sudden, it's back again:
`/api.php?action=query&format=json&pageids=4&formatversion=2`
```
lang=json
{
  "batchcomplete": true,
  "query": {
    "pages": [
      {
        "pageid": 4,
        "ns": 0,
        "title": "Gotham"
      }
    ]
  }
}
```
However, there are many ways in which restoring //may not// result in the same `pageid`:
>>! In T183398#4273349, @Anomie wrote:
> The tricky part with trying to track pages across undeletion by the page_id is that you can get some unexpected situations:
>
> * Undeleting a subset of revisions, moving that page elsewhere, then undeleting the rest will assign a new page ID to the second batch even though they're at the old title.
> * Recreating the page at the same title will assign a new ID for the title.
> ** Then undeleting the old revisions will keep that new page ID.
The problem is that we can store the `pageid` or the `title` in a database, but there isn't a way to ensure that in the future it still refers to the same page. I realize this raises a question: if you change everything about a page, is it still the same page? What I mean here is what users consider to be the same page. If a page can be deleted, restored, and moved and still be the same page, then it should have a stable id throughout all of those processes.
**Solution**
We could change our page deletion strategy from a //hard// delete (where the page is removed from the table) to a //soft// delete. This would involve adding a `page_deleted` column that would be either a nullable datetime recording when the page was deleted, or a boolean field indicating whether or not the page is deleted. I think the former is better since it gives more information about the deletion.
This change would fix the API endpoints, as the page would no longer be missing (but the response should perhaps indicate that it has been deleted). If a user were to re-create the page, it would be recreated with the same id and its deleted status would be removed (although all of the existing revisions would continue to be deleted). Effectively, a deleted page is the same as saying //no revisions//.
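To make the soft-delete idea concrete, here is a rough sketch using an in-memory SQLite database. It is a toy model, not MediaWiki's actual schema or code: the table is reduced to three columns, namespaces and revisions are ignored, and the helper names (`make_db`, `create_page`, `delete_page`, `get_page`) and the `deleted` key in the result are all hypothetical.

```python
import sqlite3
from datetime import datetime, timezone


def make_db():
    """Create an in-memory toy `page` table with a nullable deletion timestamp."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE page ("
        " page_id INTEGER PRIMARY KEY,"
        " page_title TEXT NOT NULL UNIQUE,"
        " page_deleted TEXT"  # NULL means live; otherwise an ISO 8601 timestamp
        ")"
    )
    return db


def create_page(db, title):
    """Create a page, or revive a soft-deleted one under its original ID."""
    row = db.execute(
        "SELECT page_id, page_deleted FROM page WHERE page_title = ?", (title,)
    ).fetchone()
    if row is None:
        return db.execute(
            "INSERT INTO page (page_title) VALUES (?)", (title,)
        ).lastrowid
    page_id, deleted = row
    if deleted is None:
        raise ValueError(f"page {title!r} already exists")
    # Re-creation only clears the flag, so the ID stays the same.
    db.execute("UPDATE page SET page_deleted = NULL WHERE page_id = ?", (page_id,))
    return page_id


def delete_page(db, page_id):
    """Soft delete: keep the row and record when the deletion happened."""
    now = datetime.now(timezone.utc).isoformat()
    db.execute("UPDATE page SET page_deleted = ? WHERE page_id = ?", (now, page_id))


def get_page(db, page_id):
    """Return page info; a deleted page is still found, just flagged."""
    row = db.execute(
        "SELECT page_title, page_deleted FROM page WHERE page_id = ?", (page_id,)
    ).fetchone()
    if row is None:
        return {"pageid": page_id, "missing": True}
    return {"pageid": page_id, "title": row[0], "deleted": row[1]}
```

In this model, creating "Gotham", deleting it, and creating it again yields the same `page_id` each time, and `get_page` only reports `missing` for IDs that never existed.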
Alternatively, if we don't want to change the way that page deletion works, we could just abstract this in the API. The API would essentially also query the `archive` table for deleted pages. However, this adds a lot of overhead to API endpoints without actually fixing the underlying issue.
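For comparison, the API-level alternative might look roughly like the sketch below. Again this is a toy model under stated assumptions: the two tables are cut down to an ID and a title, the `deleted` key in the result is hypothetical, and the function names are invented for illustration.

```python
import sqlite3


def make_demo_db():
    """Toy database: one live page, one hard-deleted page kept only in archive."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT)")
    db.execute("CREATE TABLE archive (ar_page_id INTEGER, ar_title TEXT)")
    db.execute("INSERT INTO page VALUES (5, 'Metropolis')")
    # Two archived revisions of the deleted page with ID 4.
    db.execute("INSERT INTO archive VALUES (4, 'Gotham')")
    db.execute("INSERT INTO archive VALUES (4, 'Gotham')")
    return db


def lookup(db, page_id):
    """Look in `page` first, then fall back to `archive` for deleted pages."""
    row = db.execute(
        "SELECT page_title FROM page WHERE page_id = ?", (page_id,)
    ).fetchone()
    if row is not None:
        return {"pageid": page_id, "title": row[0]}
    # This second query is the extra overhead paid on every miss.
    row = db.execute(
        "SELECT ar_title FROM archive WHERE ar_page_id = ? LIMIT 1", (page_id,)
    ).fetchone()
    if row is not None:
        return {"pageid": page_id, "title": row[0], "deleted": True}
    return {"pageid": page_id, "missing": True}
```

This hides the gap in the response but does nothing about IDs changing on undeletion or re-creation.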
**Use Cases**
* {T181570}
**Work Around**
Store the `pageid` and the `title` and assume that one or the other hasn't changed.
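A sketch of how that workaround could be applied when reading a stored reference back. The `PageRef` type and the two lookup callables are hypothetical stand-ins for whatever storage and API calls the caller actually uses; the sketch trusts the `pageid` first and falls back to the `title`.

```python
from typing import Callable, NamedTuple, Optional


class PageRef(NamedTuple):
    """What we persist: both identifiers, since either may go stale."""
    pageid: int
    title: str


def resolve(
    ref: PageRef,
    title_by_id: Callable[[int], Optional[str]],
    id_by_title: Callable[[str], Optional[int]],
) -> Optional[PageRef]:
    """Return a refreshed reference, or None if neither identifier resolves."""
    title = title_by_id(ref.pageid)
    if title is not None:
        # The ID still resolves; the title may have changed (e.g. a move).
        return PageRef(ref.pageid, title)
    pageid = id_by_title(ref.title)
    if pageid is not None:
        # The title still resolves; the ID changed (e.g. page was recreated).
        return PageRef(pageid, ref.title)
    return None
```

If both identifiers have changed, or the title now belongs to an unrelated page, this gives a missing or wrong answer, which is why it is only a workaround.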