deleteOldRevisions.php deletes content text from current revisions
Open, Medium, Public, BUG REPORT

Description

Author: slboat

Description:
I ran the script and it deleted a lot of revisions, which is fine.

[root@li84-236 w]# php maintenance/deleteOldRevisions.php
Delete old revisions

Searching for active revisions...done.
Searching for inactive revisions...done.
[root@li84-236 w]# php maintenance/deleteOldRevisions.php --delete
Delete old revisions

Searching for active revisions...done.
Searching for inactive revisions...done.
43744 old revisions found.
Deleting...done.
Searching for active text records in revisions table...done.
Searching for active text records in archive table...done.
Searching for inactive text records...done.
39866 inactive items found.
Deleting...done.


Version: 1.22.7
Severity: major

Details

Reference
bz66615

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 3:10 AM
bzimport set Reference to bz66615.
bzimport added a subscriber: Unknown Object (MLST).

slboat wrote:

But after doing that, I found that a lot of pages had become empty. After dumping all the data, I can paste some examples:

<page>
  <title>Yum 被锁定</title>
  <ns>0</ns>
  <id>121</id>
  <revision>
    <id>37328</id>
    <parentid>21104</parentid>
    <timestamp>2013-04-11T06:19:52Z</timestamp>
    <contributor>
      <username>Sen</username>
      <id>1</id>
    </contributor>
    <minor/>
    <comment>Sen move[[Yum被锁定]]to[[Yum 被锁定]]</comment>
    <text xml:space="preserve" bytes="1283" />
    <sha1>fv944ixooe=</sha1>
    <model>wikitext</model>
    <format>text/x-wiki</format>
  </revision>
</page>

<page>
  <title>Ubuntu 添加在此打开终端</title>
  <ns>0</ns>
  <id>238</id>
  <revision>
    <id>33477</id>
    <parentid>33476</parentid>
    <timestamp>2013-03-02T17:22:09Z</timestamp>
    <contributor>
      <username>Sen</username>
      <id>1</id>
    </contributor>
    <minor/>
    <comment>Sen移动[[添加在此打开终端]]页面至[[Ubuntu 添加在此打开终端]]</comment>
    <text xml:space="preserve" bytes="1589" />
    <sha1>tozgbw2=</sha1>
    <model>wikitext</model>
    <format>text/x-wiki</format>
  </revision>
</page>

There are a bunch of these. As you can see, the content has become empty:

<text xml:space="preserve" bytes="1283" />

And in all of these, the last log entry is a page move.

Is there any pattern?
Both examples above seem to be page moves / redirects?

On a probably very unrelated note, Firefox browser console lists several errors on the link given in comment 2, for example

  • ReferenceError: rm2d_ki101 is not defined (whatever that weird stuff is)
  • SecurityError: The operation is insecure in MediaWiki:Gadget-ExtDiscus.js

slboat wrote:

Hmm, I use a counter on the server and Disqus.

I believe the problem occurs when the last log entry is a move. I found 345 blank pages.

I'd be interested to know where rm2d_ki101 comes from.

The JavaScript error console has nothing to do with this, IMHO.

When you move a page, it creates a new revision, but it doesn't create a new row in the text table, so you end up with 2 revisions that share the same text identifier.

If deleteOldRevisions.php isn't smart enough to take this into consideration, it may end up deleting the contents of the text table just because it's being referenced by one "old" revision, without checking first if it's also being used by a "current" revision.
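The suspected failure mode can be sketched with a toy database. This is only an illustration under stated assumptions, not the actual logic of deleteOldRevisions.php: the three tables are heavily simplified, and `purge_naive` and `purge_safe` are hypothetical helpers written for this example. Revision 3 plays the role of the page move that reuses the text row of revision 2.

```python
import sqlite3

def make_wiki():
    """Build a simplified, hypothetical wiki: one page, three revisions."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_latest INTEGER);
        CREATE TABLE revision (rev_id INTEGER PRIMARY KEY,
                               rev_page INTEGER, rev_text_id INTEGER);
        CREATE TABLE text (old_id INTEGER PRIMARY KEY, old_text TEXT);
        INSERT INTO text VALUES (1, 'first draft'), (2, 'final text');
        -- rev 3 is the move: a new revision that reuses text row 2
        INSERT INTO revision VALUES (1, 1, 1), (2, 1, 2), (3, 1, 2);
        INSERT INTO page VALUES (1, 3);
    """)
    return db

OLD_REVS = "rev_id NOT IN (SELECT page_latest FROM page)"

def purge_naive(db):
    """Delete every text row referenced by an old revision (the suspected bug)."""
    ids = [r[0] for r in db.execute(
        "SELECT DISTINCT rev_text_id FROM revision WHERE " + OLD_REVS)]
    db.execute("DELETE FROM revision WHERE " + OLD_REVS)
    db.executemany("DELETE FROM text WHERE old_id = ?", [(i,) for i in ids])

def purge_safe(db):
    """Delete old revisions first, then only text rows nothing references."""
    db.execute("DELETE FROM revision WHERE " + OLD_REVS)
    db.execute(
        "DELETE FROM text WHERE old_id NOT IN (SELECT rev_text_id FROM revision)")

def current_text(db, page_id):
    """Return the text of the page's latest revision, or None if it is gone."""
    row = db.execute("""
        SELECT old_text
        FROM page
        JOIN revision ON rev_id = page_latest
        LEFT JOIN text ON old_id = rev_text_id
        WHERE page_id = ?""", (page_id,)).fetchone()
    return row[0]

naive = make_wiki()
purge_naive(naive)
print(current_text(naive, 1))   # None: the shared text row was deleted

safe = make_wiki()
purge_safe(safe)
print(current_text(safe, 1))    # final text
```

In the naive variant the current revision survives but its text row does not, which would match a dump showing a correct `bytes` value with an empty `<text>` element.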

slboat wrote:

(In reply to Jesús Martínez Novo (Ciencia Al Poder) from comment #6)

The JavaScript error console has nothing to do with this, IMHO.

When you move a page, it creates a new revision, but it doesn't create a new
row in the text table, so you end up with 2 revisions that share the same text
identifier.

If deleteOldRevisions.php isn't smart enough to take this into
consideration, it may end up deleting the contents of the text table just
because it's being referenced by one "old" revision, without checking first
if it's also being used by a "current" revision.

Yes! You are correct, I believe that's why. That's a big problem: if someone runs this script without a backup, the blanked pages can never be recovered.

I wonder why no one has reported this issue; I think it has been around for a while.

I still haven't had time to test this, but if it can delete content from current pages, it's a major bug.

slboat wrote:

(In reply to Jesús Martínez Novo (Ciencia Al Poder) from comment #8)

I still haven't had time to test this, but if it can delete content
from current pages, it's a major bug.

Now I see the greatness of open source software!

If I can help in any way, please let me know!

Okay, I've tested it and I was unable to reproduce.

I created a page and made some edits. Then I moved the page (leaving a redirect) and ran deleteOldRevisions.php: the contents of the page were still there, and all the page history was deleted (CORRECT).

Then I performed some more page moves, this time without leaving a redirect, and ran deleteOldRevisions.php again. It behaved the same way: the page history was deleted (only the last item is preserved) and the page has the last version of the text.

I tested that on 1.22 and 1.23. I see your wiki now has 1.23. Was this happening on an earlier MediaWiki version?

slboat wrote:

I was running it on version 1.22. That's so weird...

I dumped the wiki beforehand, so here is the page before the script was run:

<source code = "mediawiki">

<page>
  <title>Sunny4836</title>
  <ns>0</ns>
  <id>107</id>
  <revision>
    <id>341</id>
    <timestamp>2012-04-14T03:59:01Z</timestamp>
    <contributor>
      <username>Sen</username>
      <id>1</id>
    </contributor>
    <comment>以“== 为何而玩 == 我是做软件开发的,底层方面,对这些东西都比较有兴趣, == 玩家梦想 == 最开始的需求是&lt;br /&gt; 我去大学里,...”为内容创建页面</comment>
    <text xml:space="preserve" bytes="802">== 为何而玩 ==

我是做软件开发的,底层方面,对这些东西都比较有兴趣,

玩家梦想

最开始的需求是&lt;br /&gt;
我去大学里,宿舍是用那个802.1X拔号的&lt;br /&gt;
但是IPAD没这个功能,拔不了&lt;br /&gt;
所以一直想找个能实现这个功能的无线路由&lt;br /&gt;
不过703N买回来,我也没研究到底能不能拔这个&lt;br /&gt;
2012.04.14 刷了OPENWRT,那个小标的ROM,不过我下载错了,是个精简版的,貌似没3G在里面,然后就下了个新的,一刷进去,就起不来了&lt;br /&gt;

折腾经历

2012-04-14 11:20:02 但是用PUTTY打开是乱码,不晓得怎么回事,我原来有USB转COM口的线,我的地线 跟TX RX接到COM口的母头
[[文件:down.jpg]]&lt;br /&gt;

[[Category:Openwrt]]
[[Category:Openwrt玩家]]</text>

    <sha1>ikboc34j49yqqf7uibaqn4lqoczv1</sha1>
    <model>wikitext</model>
    <format>text/x-wiki</format>
  </revision>
  <revision>
    <id>763</id>
    <parentid>341</parentid>
    <timestamp>2012-04-21T03:30:29Z</timestamp>
    <contributor>
      <username>Sen</username>
      <id>1</id>
    </contributor>
    <text xml:space="preserve" bytes="771">{{Openwrt玩家}}

为何而玩

我是做软件开发的,底层方面,对这些东西都比较有兴趣,

玩家梦想

最开始的需求是&lt;br /&gt;
我去大学里,宿舍是用那个802.1X拔号的&lt;br /&gt;
但是IPAD没这个功能,拔不了&lt;br /&gt;
所以一直想找个能实现这个功能的无线路由&lt;br /&gt;
不过703N买回来,我也没研究到底能不能拔这个&lt;br /&gt;
2012.04.14 刷了OPENWRT,那个小标的ROM,不过我下载错了,是个精简版的,貌似没3G在里面,然后就下了个新的,一刷进去,就起不来了&lt;br /&gt;

折腾经历

2012-04-14 11:20:02 但是用PUTTY打开是乱码,不晓得怎么回事,我原来有USB转COM口的线,我的地线 跟TX RX接到COM口的母头
[[文件:down.jpg]]&lt;br /&gt;</text>

    <sha1>8nogu5ho2775otcxccqtxgi1oszwg</sha1>
    <model>wikitext</model>
    <format>text/x-wiki</format>
  </revision>
  <revision>
    <id>1643</id>
    <parentid>763</parentid>
    <timestamp>2012-04-30T07:29:48Z</timestamp>
    <contributor>
      <username>Sen</username>
      <id>1</id>
    </contributor>
    <minor/>
    <comment>移动[[Opwrt玩家:sunny4836]]至[[Sunny4836]]</comment>
    <text xml:space="preserve" bytes="771">{{Openwrt玩家}}

为何而玩

我是做软件开发的,底层方面,对这些东西都比较有兴趣,

玩家梦想

最开始的需求是&lt;br /&gt;
我去大学里,宿舍是用那个802.1X拔号的&lt;br /&gt;
但是IPAD没这个功能,拔不了&lt;br /&gt;
所以一直想找个能实现这个功能的无线路由&lt;br /&gt;
不过703N买回来,我也没研究到底能不能拔这个&lt;br /&gt;
2012.04.14 刷了OPENWRT,那个小标的ROM,不过我下载错了,是个精简版的,貌似没3G在里面,然后就下了个新的,一刷进去,就起不来了&lt;br /&gt;

折腾经历

2012-04-14 11:20:02 但是用PUTTY打开是乱码,不晓得怎么回事,我原来有USB转COM口的线,我的地线 跟TX RX接到COM口的母头
[[文件:down.jpg]]&lt;br /&gt;</text>

    <sha1>8nogu5ho2775otcxccqtxgi1oszwg</sha1>
    <model>wikitext</model>
    <format>text/x-wiki</format>
  </revision>
</page>

</source>

slboat wrote:

And this is the page afterwards. Only the last revision is left, with its move comment.

The byte size is right, but the page is blank: <text xml:space="preserve" bytes="771" />

<page>
  <title>Sunny4836</title>
  <ns>0</ns>
  <id>107</id>
  <revision>
    <id>1643</id>
    <parentid>763</parentid>
    <timestamp>2012-04-30T07:29:48Z</timestamp>
    <contributor>
      <username>Sen</username>
      <id>1</id>
    </contributor>
    <minor/>
    <comment>移动[[Opwrt玩家:sunny4836]]至[[Sunny4836]]</comment>
    <text xml:space="preserve" bytes="771" />
    <sha1>8nogu5ho2775otcxccqtxgi1oszwg</sha1>
    <model>wikitext</model>
    <format>text/x-wiki</format>
  </revision>
</page>
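Dump entries like the one above could be located automatically. The following is a minimal sketch, assuming an XML export shaped like the excerpts in this thread; `find_blank_revisions` and the `SAMPLE` data are hypothetical and written only for this example. It reports every revision whose `<text>` element declares a non-zero `bytes` value but carries no content.

```python
import io
import xml.etree.ElementTree as ET

def local(tag):
    """Strip an XML namespace prefix such as '{http://...}page'."""
    return tag.rsplit("}", 1)[-1]

def child(elem, name):
    """Return the first direct child with the given local name, or None."""
    for c in elem:
        if local(c.tag) == name:
            return c
    return None

def find_blank_revisions(source):
    """Yield-style scan of a dump: returns (title, rev_id, declared_bytes)."""
    blanks = []
    for _, page in ET.iterparse(source, events=("end",)):
        if local(page.tag) != "page":
            continue
        title = child(page, "title")
        for rev in page:
            if local(rev.tag) != "revision":
                continue
            rev_id = child(rev, "id")
            text = child(rev, "text")
            if rev_id is None or text is None:
                continue
            declared = int(text.get("bytes", "0"))
            # Declared size is non-zero, but the element has no content.
            if declared > 0 and not (text.text or "").strip():
                blanks.append((title.text if title is not None else None,
                               int(rev_id.text), declared))
        page.clear()  # free memory when scanning large dumps
    return blanks

# Hypothetical two-page sample modelled on the excerpts in this thread.
SAMPLE = b"""<mediawiki>
<page>
  <title>Sunny4836</title>
  <id>107</id>
  <revision>
    <id>1643</id>
    <text xml:space="preserve" bytes="771" />
  </revision>
</page>
<page>
  <title>Intact page</title>
  <id>1</id>
  <revision>
    <id>2</id>
    <text xml:space="preserve" bytes="5">hello</text>
  </revision>
</page>
</mediawiki>"""

print(find_blank_revisions(io.BytesIO(SAMPLE)))
# [('Sunny4836', 1643, 771)]
```

For a real dump, the same function can be given an open file object or a file name instead of the in-memory sample.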

How was the last edit to those articles made?

I tested it by deleting the article and importing it with Special:Import. I then ran the script and still saw no issues.

slboat wrote:

That's very, very weird... Could it be a problem with my server? I will try to test it on my server again. Thank you very much for testing. This was my first time using this script, and as you can see it scared me, so I'm still very wary of running these scripts :)

There is no tool to check whether MediaWiki has lost content, so even if content is lost we can't know unless we happen to see it.

(In reply to sen from comment #14)

There is no tool to check whether MediaWiki has lost content, so even if
content is lost we can't know unless we happen to see it.

You could get a list of revisions with missing text using this database query:

select page_namespace, page_title, rev_id from page join revision on rev_page = page_id where not exists (select * from text where old_id = rev_text_id);
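To see what the query reports, it can be tried against a toy database first. This is a rough sketch, assuming a drastically simplified schema that keeps only the columns the query touches; the sample rows (including text id 500) are made up for illustration and `find_missing_text` is a hypothetical helper.

```python
import sqlite3

# The query from this thread, reformatted but otherwise unchanged.
MISSING_TEXT_QUERY = """
    SELECT page_namespace, page_title, rev_id
    FROM page
    JOIN revision ON rev_page = page_id
    WHERE NOT EXISTS (SELECT * FROM text WHERE old_id = rev_text_id)
"""

def find_missing_text(db):
    """Return (namespace, title, rev_id) for revisions whose text row is gone."""
    return db.execute(MISSING_TEXT_QUERY).fetchall()

def demo_db():
    """Build a tiny in-memory database with one damaged and one intact page."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE page (page_id INTEGER PRIMARY KEY,
                           page_namespace INTEGER, page_title TEXT);
        CREATE TABLE revision (rev_id INTEGER PRIMARY KEY,
                               rev_page INTEGER, rev_text_id INTEGER);
        CREATE TABLE text (old_id INTEGER PRIMARY KEY, old_text TEXT);
        INSERT INTO page VALUES (107, 0, 'Sunny4836'), (1, 0, 'Intact_page');
        -- revision 1643 points at text row 500, which does not exist
        INSERT INTO revision VALUES (1643, 107, 500), (2, 1, 10);
        INSERT INTO text VALUES (10, 'hello');
    """)
    return db

print(find_missing_text(demo_db()))
# [(0, 'Sunny4836', 1643)]
```

On a real wiki the query would be run directly in the database client; table prefixes, if the installation uses any, would need to be added to the table names.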

slboat wrote:

Thank you! If I find anything I will let you know. So I guess there is not enough evidence yet to treat this as a bug.

Aklapper changed the subtype of this task from "Task" to "Bug Report". Feb 15 2022, 9:39 PM
Aklapper removed a subscriber: wikibugs-l-list.