
Extend Special:Wantedpages to allow users to search for keywords
Open, Lowest, Public, Feature

Description

Author: xmlizer

Description:
At the moment we cannot search for words in wanted pages.
It would be a good option to add (search among wanted pages).


Version: unspecified
Severity: enhancement

Details

Reference
bz6

Event Timeline

bzimport raised the priority of this task from to Lowest. Nov 21 2014, 6:42 PM
bzimport set Reference to bz6.
bzimport added a subscriber: Unknown Object (MLST).

I'm not sure what's being requested exactly; search is meant to find pages that
*do* exist, while Wantedpages lists only pages that *don't* exist.

xmlizer wrote:

It means a feature like searching among redirects.
You look for a word that exists in the title of a wanted page (it gives more
precision than simply looking for the word in a page, especially for masked links
[[A|B]], and gives fewer false positives).
In such a case, you find the exact match of what presently exists as a potential
link.
Then you just have to click to create the article if you have the information.
It's more a tool for contributors than for simple "visitors".

Maybe a feature to add
(like the page size shown for pages that do exist) is the number of links (as
in wanted pages), to show the popularity of the link.

Okay, I see. It would probably make the most sense for it to be a limiting query option on Special:Wantedpages, like the name input on
Special:Imagelist. Since this is pretty distinct from the main search, I'm reassigning the module from Search to Special Pages.

It's not that many, and the cached Wantedpages stores the top 1000 so you can scroll past them easily. I don't think this is
much of a problem.

[At least some of them are simply due to erroneous links that haven't been cleared from the tables properly and could probably
be removed by a regeneration...]

Last comment belongs on bug 221.

xmlizer wrote:

Raising the priority.
I think it's a good tool for contributors.
Special:Wantedpages lists only those pages that are mentioned more than X times (I
don't remember the value),
so a link which appears only once could never be caught.
Moreover, it's a tool to catch bad links, because we could easily search for special
characters (",", "(", "/", entities) without needing special status (sysop or
so) or having to install anything special (mysql, etc.).

wikitech-l-54398 wrote:

Implementation

This patch(against HEAD) should implement the feature. I do not have commit
access to put it into CVS so someone who does please commit it.

attachment wantedpages.patch ignored as obsolete

xmlizer wrote:

Does this patch work?

jeluf wrote:

The patch does a full table scan of 2.2 Million lines for en. It would kill the
servers.
Removed "patch" keyword since the patch is not feasible.

timo wrote:

Would a solution be to generate an offline index and search through the index,
instead of doing a real-time full table scan?
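
A minimal sketch of the offline-index idea, assuming the list of wanted titles has already been dumped to memory by a periodic job. The names `build_index` and `search` and the sample titles are hypothetical illustrations, not part of MediaWiki or of the attached patch. It shows a plain inverted index from words to titles, so a keyword lookup does not need to scan every row.

```python
import re
from collections import defaultdict


def build_index(wanted_titles):
    """Map each lowercase word to the set of wanted titles containing it.

    Assumption: titles use underscores instead of spaces, as in page names.
    """
    index = defaultdict(set)
    for title in wanted_titles:
        # Underscores count as word characters in \w, so turn them into spaces first.
        for word in re.findall(r"\w+", title.replace("_", " ").lower()):
            index[word].add(title)
    return index


def search(index, keyword):
    """Return the wanted titles containing the keyword as a whole word, sorted."""
    return sorted(index.get(keyword.lower(), set()))


# Hypothetical sample data standing in for a periodically generated dump.
index = build_index(["Battle_of_Foo", "Foo_(band)", "Bar"])
print(search(index, "foo"))
```

The index would have to be rebuilt whenever the dump is regenerated, so results would be as stale as the dump itself.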

robchur wrote:

At present, Special:Wantedpages and several other query pages are (at least on
large sites like Wikipedia and Wikia, and wherever anybody else with half a
brain cell and a relatively high server load are concerned) cached in batches on
a periodic basis.

Other requests, such as namespace filtering, aren't always feasible to
implement, because there's no guarantee you'll get *any* results for a given
namespace cached...and some query pages are limited to the main namespace anyway.

In this case, though, we could *probably* add some kind of filter on the titles
without too much bother, although it would still require quite a bit of poking
of the QueryPage class, among other things.
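
As a rough illustration of such a title filter, here is a sketch that assumes the cached results are available as a list of (title, link count) pairs. The function `filter_cached`, its parameters, and the sample rows are hypothetical and do not reflect the actual QueryPage class. It only shows that filtering an already-cached batch is cheap compared with querying the link tables.

```python
def filter_cached(rows, needle, limit=50):
    """Return cached (title, link_count) rows whose title contains `needle`.

    Assumptions: titles use underscores for spaces, matching is
    case-insensitive, and results are ordered by link count, highest first.
    """
    needle = needle.replace(" ", "_").lower()
    matches = [(title, count) for title, count in rows if needle in title.lower()]
    # Most-linked titles first; ties broken alphabetically for stable output.
    matches.sort(key=lambda row: (-row[1], row[0]))
    return matches[:limit]


# Hypothetical cached batch.
rows = [("Foo_bar", 3), ("Baz", 10), ("Another_foo", 7)]
print(filter_cached(rows, "foo"))
```

A filter like this can only find titles that made it into the cached batch, so rarely linked titles would still be missed.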

xmlizer wrote:

It has to be resolved.
Please don't close bugs without explanation

*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*

Special:WantedPages is disabled on large sites like Wikipedia and Wikia. Smaller Wikis aren't going to find much use for this feature.

Wantedpages is wanted on small and large sites, and this is a valid enhancement request. The reasons for wantedpages being disabled on big projects are also separately raised bugs (e.g. bug 14786) so it does not make sense to mark this as WONTFIX when the cited reason is a problem which is still OPEN.

Is Andrew Miller's patch really unfeasible if we only enable such a filter provided miser mode is off? Or is the entire point to get this happening on some Wikimedia site some day once bug 14786 is fixed (having this type of filter on Wikipedia is unlikely to happen ever imho)

Just a thought, can Cirrus Search help here in any way? I'm asking without knowing the technical details, only because Cirrus Search is probably the only new factor in the potential solution of this report since the last comment, made in 2011.

I think so. How badly do we want this fixed? How much brain power is it worth?

(In reply to Nik Everett from comment #18)

I think so. How badly do we want this fixed? How much brain power is it
worth?

These are good questions for Dan. :)

My reason to kick this report is that it is the oldest open non-tracking bug we have. Bug number 6, filed almost a decade ago.

(In reply to Quim Gil from comment #17)

Just a thought, can Cirrus Search help here in any way? I'm asking without
knowing the technical details, only because Cirrus Search is probably the
only new factor in the potential solution of this report since the last
comment, made in 2011.

Not easily, methinks. None of these special page reports have any tie into search backends.

(In reply to Nik Everett from comment #18)

I think so. How badly do we want this fixed? How much brain power is it
worth?

I don't think it's all that important. See comment 1 from Brion:

(In reply to Brion Vibber from comment #1)

I'm not sure what's being requested exactly; search is meant to find pages that
*do* exist, while Wantedpages lists only pages that *don't* exist.

Also,

(In reply to Quim Gil from comment #19)

My reason to kick this report is that it is the oldest open non-tracking bug
we have. Bug number 6, filed almost a decade ago.

Maybe it's a sign we should WONTFIX it? Or just continue to ignore. It's not hurting anything.

Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:02 AM