Deleted items contaminating search results
Closed, Resolved
Public

Assigned To
None
Priority
Normal
Author
MER-C
Subscribers
MER-C, wikibugs-l
Projects
Reference
bz13792
Description

On opening the URL listed in the URL field below (you might have to reload the page a few times to get the text to appear; this might be another bug), you'll find that the last five results, being:

[[Image:3 accelerators 17-59. PILCURE-MBT,MBTS,ZMBT,F,CBS,NS,MOR,DCBS,TMT,ZDMC,ZDC,ZDBC,SDBC,ZDBzC-.pdf]]

[[Image:Minnesota Educational Computing Consortium Quick Reference Guide for BASIC Language Version 3.1 MECC TIMESHARE SYSTEM Rev. 2 slash 78.pdf]]

[[Image:Partial unilateral ureteropelvic obstruction in neonatal pigs - Effect of acute inhibition of angiotensin II AT1-receptors on GFR and sodium handling.pdf]]

[[Image:Strategy for improving genetic aspects of fertility and hatchability in breeding lines of White Leghorns, and choosing hens for second cycle of production.pdf]]

[[Image:Buddha's teachings in a NUTSHELL(Explains why Buddha did not answer Questions pertaining to eternal God, NON-SOUL theory (anatta) and his basic teachings.pdf]]

... as well as various others, are deleted and have been so for quite a long time. Deleted items should not appear in search results.

The user interface search (URL: http://en.wikipedia.org/w/index.php?title=Special:Search&limit=500&offset=7000&ns6=1&redirs=1&search=.pdf ; once again, you might have to reload multiple times) is somewhat better behaved, in that it doesn't show these deleted items. However, when you view the page source you find comments such as:

<!-- missing page Image:3 accelerators 17-59. PILCURE-MBT,MBTS,ZMBT,F,CBS,NS,MOR,DCBS,TMT,ZDMC,ZDC,ZDBC,SDBC,ZDBzC-.pdf-->
<!-- missing page Image:Minnesota Educational Computing Consortium Quick Reference Guide for BASIC Language Version 3.1 MECC TIMESHARE SYSTEM Rev. 2 slash 78.pdf-->
<!-- missing page Image:Partial unilateral ureteropelvic obstruction in neonatal pigs - Effect of acute inhibition of angiotensin II AT1-receptors on GFR and sodium handling.pdf-->
<!-- missing page Image:Strategy for improving genetic aspects of fertility and hatchability in breeding lines of White Leghorns, and choosing hens for second cycle of production.pdf-->
<!-- missing page Image:Buddha's teachings in a NUTSHELL(Explains why Buddha did not answer Questions pertaining to eternal God, NON-SOUL theory (anatta) and his basic teachings.pdf-->


Version: unspecified
Severity: normal
URL: http://en.wikipedia.org/w/api.php?action=query&list=search&srwhat=text&srsearch=.pdf&srnamespace=6&sroffset=7000&srlimit=500
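
One way to confirm which of the reported titles are really gone is to pass them to the API with action=query&titles=... and look for entries flagged as missing. The following is only a minimal sketch, assuming the JSON reply lists pages under query.pages and marks nonexistent ones with a "missing" key; the function name and the sample data are made up for illustration and no request is actually sent.

```python
from typing import Any, Dict, List


def missing_titles(query_response: Dict[str, Any]) -> List[str]:
    """Return the titles that the reply flags as missing (no current page)."""
    pages = query_response.get("query", {}).get("pages", {})
    return sorted(
        page["title"]
        for page in pages.values()
        if "missing" in page and "title" in page
    )


# Hand-written sample shaped like an action=query&titles=... reply.
# The titles are placeholders, not real files.
sample = {
    "query": {
        "pages": {
            "-1": {"ns": 6, "title": "Image:Deleted example.pdf", "missing": ""},
            "123": {"pageid": 123, "ns": 6, "title": "Image:Existing example.pdf"},
        }
    }
}

print(missing_titles(sample))
```

Any title returned by this check that still appears in the search output would be a stale index entry of the kind described above.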

bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz13792.
MER-C created this task. Via Legacy, Apr 19 2008, 1:04 PM
MER-C added a comment. Via Conduit, Apr 19 2008, 1:09 PM

Stupid long PDF names. The files are:

http://en.wikipedia.org/wiki/Image:3 accelerators 17-59.PILCURE-MBT,MBTS,ZMBT,F,CBS,NS,MOR,DCBS,TMT,ZDMC,ZDC,ZDBC,SDBC,ZDBzC-.pdf
http://en.wikipedia.org/wiki/Image:Minnesota Educational Computing Consortium Quick Reference Guide for BASIC Language Version 3.1 MECC TIMESHARE SYSTEM Rev. 2 slash 78.pdf
http://en.wikipedia.org/wiki/Image:Partial unilateral ureteropelvic obstruction in neonatal pigs - Effect of acute inhibition of angiotensin II AT1-receptors on GFR and sodium handling.pdf
http://en.wikipedia.org/wiki/Image:Strategy for improving genetic aspects of fertility and hatchability in breeding lines of White Leghorns, and choosing hens for second cycle of production.pdf
http://en.wikipedia.org/wiki/Image:Buddha's teachings in a NUTSHELL(Explains why Buddha did not answer Questions pertaining to eternal God, NON-SOUL theory (anatta) and his basic teachings.pdf

MER-C added a comment. Via Conduit, Apr 19 2008, 1:15 PM

Disregard the above comment, Bugzilla is being annoying. Someone better give the Bugzilla devs a kick: https://bugzilla.mozilla.org/show_bug.cgi?id=40896 .

bzimport added a comment. Via Conduit, Apr 19 2008, 1:33 PM

rainman wrote:

This has been fixed with r32742, so newly deleted files won't show up in search results. However, since this is an old bug, the search index is full of old entries and needs a rebuild. We will shortly update the whole search backend and have this fully fixed.

bzimport added a comment. Via Conduit, Apr 19 2008, 2:29 PM

Bryan.TongMinh wrote:

Sorry for the previous mail, forgot to click the assign option.

Assigning to self.

bzimport added a comment. Via Conduit, Apr 19 2008, 5:26 PM

Bryan.TongMinh wrote:

Fixed in r33608. Broken titles are now silently skipped in API search results.
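
For readers unfamiliar with the approach, "silently skipped" can be pictured roughly as below. This is not the code from r33608, only a sketch of the general idea: each raw result is validated, and anything that does not yield a usable title is dropped without raising an error. The helper names are hypothetical, and the validity check is a simplified stand-in for MediaWiki's real title rules.

```python
from typing import Iterable, List, Optional

# Simplified stand-in for MediaWiki's title restrictions (assumption).
ILLEGAL_CHARS = "#<>[]{}|"


def parse_title(raw: str) -> Optional[str]:
    """Hypothetical validator: return a cleaned title, or None if unusable."""
    title = raw.strip()
    if not title or any(ch in title for ch in ILLEGAL_CHARS):
        return None
    return title


def usable_results(raw_titles: Iterable[str]) -> List[str]:
    """Keep valid titles in order; skip broken ones without reporting them."""
    results = []
    for raw in raw_titles:
        title = parse_title(raw)
        if title is None:
            continue  # broken title: skipped silently
        results.append(title)
    return results


print(usable_results(["Image:A.pdf", "", "Image:Bad[name].pdf", "  Image:B.pdf "]))
```

The trade-off of skipping silently is that a page of results may contain fewer entries than the requested limit.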

Aklapper added a comment. Via Conduit, Mar 26 2013, 11:24 AM

[Merging "MediaWiki extensions/Lucene Search" into "Wikimedia/lucene-search2", see bug 46542. You can filter bugmail for: search-component-merge-20130326 ]
