Automatic category redirects
OpenPublic

Description

Author: p_simoons

Description:
Supposing category:A redirects to category:B.
Would it be feasible to automatically move all articles placed in cat:A into
cat:B instead?
Alternateively, would it be possible to create a Specialpage that lists all
categories that are redirects, so that a bot can do the moving?


Version: unspecified
Severity: enhancement
See Also: T7346

Older changes are hidden. Show older changes.
bzimport added a comment.Via ConduitMar 8 2007, 10:27 PM

fabartus wrote:

Endorse expediency request above with fervency! Moving/remaning would be nice
too, but per
[http://en.wikipedia.org/wiki/Wikipedia_talk:Categories_for_discussion#Propose_tagging_with_both_and_expanding_use_of_Cat_redirects_overall
this] the method of combined soft and hard redirects put together with the
proper linking at the time of saving a page would pretty well cover normal
editing objections to redirecting categories.

CONSIDER: There are multiple ways to phrase the equivalent page classification
in English, but note the three legged stool... most every language's wiki, no
matter what type project, one way or another connects with interwiki's to the
English Wikipedia articles pages (if only through their own article on the topic
in their Wikipedia), the commons category, and/or the Wikipedia category (Which
I labor mightily to synchronize, as much as possible, so I know them well).

How a language alliterates into English translation we native speakers would
find to be an awkward phrasing more often than not, AND VICE VERSA, so on the
commons, redirects of categories are tolerated specifically to cover such
'ambiguities', including redirects of foreign category names to the "Official"
English name (Unlike most namings on the commons, Principal Category names are
English by fiat... articles, images, etc. can all be other languages. So the
importance should be obvious, I hope!).

So too do we native speakers have our choice of how to state a category name...
(e.g. Countries in Europe, vs. Countries of Europe) the halls of the
en.Wikipedia (and I suspect all others!) CFD discussions are ankle deep in blood
from some of those debates! That in part exists because of different schemes in
related categorisation (Geography, Maps, and History all intersect Countries...
so comes complications, or a need to alias however belatedly! <g>) Which frankly
is a needless waste of time, were it easy to alias a name, and that name be the
one 'online' per this proposal.

In one sense these names issues are are trivialities, but they are important
trivialities, as recollection and modes of phrase formulation are inherently
personal things involving the way each of us thinks. So there is a natural
factionalism as others think like me, and some like him, and even when
compromises occasion, they involve a lot of work for someone... which is
hopefully the guys not thinking like me! <BSEG> Bottom line, aliasing would
prevent and eliminate a lot of relatively uselessly wasted man-hours renaming
things. The computers should be doing that drudgery, not we humans, save for you
developers... if you do it once, your effort pays back over and over for the many.

In sum, this matter has a inherently more important and higher priority than
convenience of one editor, but to multiples of the many of many's of editor's
across all nationalities! I guess I'm saying this has had far too low a priority
heretofore. It seems simple, so unimportant, but categories are fundamental to
organising the projects, hence the effects are vastly magnified. In one swell
foop, all the nit-picky (local language) name choices can be trivialized and one
uniform name emerge in each locale&mdash;yet still not only retain, but actually
enhance interconnectivity between sisters and within a given project. So please
do expedite both a determination on Rick's iterim proposal, and a full
implementation allowing name moves and the like.

Just cutting down the debate will free many daily man-hours each day on CFD, so
delay is frankly, costly and world wide costly at that. Best regards // ~~~~

bzimport added a comment.Via ConduitApr 22 2007, 2:12 PM

ayg wrote:

  1. Update categorylinks when a category is edited and becomes a redirect. ...
  2. Update categorylinks when a category is moved.

This might be an issue. I don't know if an UPDATE of, for the current worst case,
a couple hundred thousand rows is acceptable. An alternative would be to check the
redirect status at display time rather than on update, as we do for pages: retrieve
all pages that are in category X or anything that redirects to it. Of course that
means faster UPDATE and slower SELECT, which is generally the reverse of what's
wanted. We should ask Domas or Tim or someone what's best, I guess.

Things which still have to be done:

  1. Update categorylinks when a redirect category becomes a redirection to another category.
  2. Update categorylinks when a redirect category becomes a regular category which is not a redirection. I think that these require another field in categorylinks, "cl_original_to" (may be null, or maybe same to "cl_to" if not redirected?), which specifies the original target (which is now a redirection). If it's not added, it's not possible to update categorylinks because it's not *known* which pages are in this category. I don't know if this field should be added, these things should not be fixed, or there is another way to do it. Any ideas?

Short of reparsing every page in the category, this probably does require an
extra field, yes. Other schema updates might be necessary to make the updates
for large categories efficient. I think that whatever happens will be more
efficient than bots loading tons of pages and forcing them to be reparsed, though.
:)

bzimport added a comment.Via ConduitJun 13 2007, 12:09 AM

robchur wrote:

*** Bug 10236 has been marked as a duplicate of this bug. ***

Catrope added a comment.Via ConduitJun 25 2007, 3:33 PM

Created attachment 3823
Include redirected members in category view

The patch submitted earlier kind of scares me. Consider the following scenario:

  1. Page is categorized in Category:A
  2. Category:A becomes a redirect to Category:B
  3. Page is updated accordingly
  4. Category:A becomes a redirect to Category:C
  5. Page is NOT updated accordingly, since it is treated as a member of Category:B.

Someone suggested a new DB field to counter this, but that isn't necessary.

The attached patch fixes this bug in a simpler way, without the problem described above. When you view Category:B, the code will check if any other categories redirect to B. If Category:A redirects to Category:B, both A and B's members will show up when you view Category:B. As usual, double redirects won't work, i.e. if A redirects to B and B redirects to C, Category:C will show B and C's members, but not A's.

The attached patch makes moving categories easy, just remove NS_CATEGORY from the forbidden namespaces. Since everything is handled transparently through redirects (just like we do with normal pages), no problems should ensue.

Attached: category-redir.patch

bzimport added a comment.Via ConduitJun 25 2007, 3:40 PM

lowzl wrote:

A solution was proposed using the redirects table in bug 8685 ...

Catrope added a comment.Via ConduitJun 25 2007, 3:45 PM

Created attachment 3824
Include redirected members in API list=categorymembers

This patch does the same thing as my previous patch, with the difference that this one fixes listing category members in the MediaWiki API as opposed to the category page itself.

Attached: category-redir-api.patch

bzimport added a comment.Via ConduitJun 25 2007, 5:10 PM

ayg wrote:

*** Bug 8685 has been marked as a duplicate of this bug. ***

bzimport added a comment.Via ConduitJun 25 2007, 5:16 PM

ayg wrote:

I'm hardly an SQL expert, I'm afraid, but any particular reason you added an extra query rather than joining? I doubt it makes much difference, though, performance-wise.

I'm a bit alarmed that the change has to be made separately for the API, rather than both calling a general-purpose public method of something, but I guess that's a separate issue.

I'll take a look at this and hopefully commit something today or tomorrow. Although I notice a few bugs assigned to me that I've totally forgotten about, so let's hope this doesn't become one. :D

Catrope added a comment.Via ConduitJun 25 2007, 5:48 PM

(In reply to comment #24)

A solution was proposed using the redirects table in bug 8685 ...

(In reply to comment #27)

I'm hardly an SQL expert, I'm afraid, but any particular reason you added an
extra query rather than joining? I doubt it makes much difference, though,
performance-wise.

The JOIN suggestion suggested in bug 8685 didn't work (selected the wrong data from the page table), and since I'm not particularly good at writing complex SQL queries either, I decided to do it this way. I think my way may actually be faster, since the latter query is still a regular category lookup which is indexed. A complex JOIN statement wouldn't have the indexing benefit (correct me if I'm wrong).

I'm a bit alarmed that the change has to be made separately for the API, rather
than both calling a general-purpose public method of something, but I guess
that's a separate issue.

This is partly due to the fact the API provides much more information and filtering options than you'll ever need in a regular page. Also, the current code mixes DB code with UI code, which makes a lot of functions unusable for the API. Article.php and EditPage.php are good examples.

I'll take a look at this and hopefully commit something today or tomorrow.
Although I notice a few bugs assigned to me that I've totally forgotten about,
so let's hope this doesn't become one. :D

We're all humans, we all need breaks ;) take your time.

bzimport added a comment.Via ConduitJun 26 2007, 12:35 AM

ayg wrote:

Checking EXPLAIN shows that the query will use a filesort due to replacement of simple equality with a check for IN. I got the same trying the one-query join technique, adjusted to give correct results. Domas will probably kill me if I add a gratuitous filesort to every category, so I (we) will have to ask him why it's filesorting and how to stop it.

(By the way, more easily fixed but also significant, your check for rd_title='title' alone can't use the redirect table's indexes, because the index is on (rd_namespace, rd_title). You need to add 'rd_namespace' => NS_CATEGORY to the conditions for that query to be efficient.)

Catrope added a comment.Via ConduitJun 26 2007, 1:14 PM

(In reply to comment #29)

(By the way, more easily fixed but also significant, your check for
rd_title='title' alone can't use the redirect table's indexes, because the
index is on (rd_namespace, rd_title). You need to add 'rd_namespace' =>
NS_CATEGORY to the conditions for that query to be efficient.)

By all means do so. I know just enough about MySQL to get by, and have no idea how all those optimizations work.

Catrope added a comment.Via ConduitJun 30 2007, 7:54 AM

(In reply to comment #29)

Checking EXPLAIN shows that the query will use a filesort due to replacement of
simple equality with a check for IN. I got the same trying the one-query join
technique, adjusted to give correct results. Domas will probably kill me if I
add a gratuitous filesort to every category, so I (we) will have to ask him why
it's filesorting and how to stop it.

I don't really understand any of that (for instance, my query doesn't use IN AFAIK), but I understand it's a performance problem. How can that be solved?

bzimport added a comment.Via ConduitJul 1 2007, 2:30 AM

ayg wrote:

Your code contains an IN because you have 'cl_to' => $titles as a condition, with $titles an array, and that translates to (cl_to IN ($titles)). I don't know how it can be solved, try asking Domas or someone.

bzimport added a comment.Via ConduitJul 3 2007, 3:08 AM

ayg wrote:

After discussion with Domas, it seems that any attempt to check for redirects in the current schema will *probably* cause a filesort, or at least all the ones suggested did. We probably need a new field, cl_real_to or something, that has the redirect pre-resolved. When adding a category to a page, the actual target would be put in cl_to as now; then if it's a redirect, the redirect target would be put in cl_real_to, otherwise that would be a copy of cl_to (or it would be NULL, depending on which works better). Then cl_real_to would be used for displaying category pages in place of cl_to. Whenever a category is changed to a redirect, or the target of a category redirect is changed, categorylinks would be updated appropriately.

River pointed out that if cl_real_to is an id instead of a title, it will persist across renames of the category. But Rob pointed out that that only works if the category has an associated page. River then suggested a cat_id, which may or may not be going too far for this exercise. We can always stick updates for cl_real_to in the job queue, basically mimicking the current bot-update situation.

Catrope added a comment.Via ConduitJul 3 2007, 9:42 AM

(In reply to comment #33)
I think cl_real_to is the way to go. Queries would be indexed, you'd have something like WHERE cl_to='title' OR cl_real_to='title';. Updating cllinks would be the simple (but potentially massive) query UPDATE categorylinks SET cl_real_to='redirtarget' WHERE cl_to='redirname';.

bzimport added a comment.Via ConduitSep 29 2008, 5:46 PM

Eugene.Zelenko wrote:

*** Bug 15742 has been marked as a duplicate of this bug. ***

bzimport added a comment.Via ConduitNov 5 2008, 1:15 PM

catlow wrote:

Sorry, the technical stuff is over my head, but can someone explain what the chances are of this being fixed? What effect do the patches referred to have? Can we expect members of redirected categories to show up on the target category page?

bzimport added a comment.Via ConduitDec 13 2008, 9:42 PM

Sebastian wrote:

Just came here from http://en.wikipedia.org/wiki/Wikipedia_talk:Categories_for_discussion#Category_redirects and wanted to add a vote for this bug.

Disclaimer: It's a long time that I last was here, and I didn't find a "vote" feature, as the Mozilla db has. Also, I didn't spend much time to understand the discussion of this and related bugs, and I don't understand the difference to bug 710, which is supposedly fixed.

bzimport added a comment.Via ConduitDec 13 2008, 11:10 PM

ayg wrote:

(In reply to comment #37)

I didn't find a "vote"
feature, as the Mozilla db has.

Bottom right, "Vote for this bug" (Ctrl-F for "vote" would have found it).

I don't understand the difference
to bug 710, which is supposedly fixed.

Bug 710 is about redirects to a category working when you navigate directly to the page, the same way they work for other pages. Prior to that bug's resolution, I guess "#REDIRECT [[:Category:XYZ]]" would do nothing, or be buggy or something (it was before my time). This is about redirects actually including one category's contents in another.

bzimport added a comment.Via ConduitFeb 3 2009, 2:27 AM

PhiLiP.NPC wrote:

may fixed in r46706.

bzimport added a comment.Via ConduitFeb 7 2009, 5:02 PM

catlow wrote:

This seems not to be working (at least, I just tried it on English Wikipedia and it didn't work - I don't know if it's supposed to be live there yet).

Platonides added a comment.Via ConduitFeb 7 2009, 5:05 PM

Wikipedia is still on r46424, see Special:Version.

bzimport added a comment.Via ConduitFeb 12 2009, 12:47 PM

catlow wrote:

I'm glad this is going to be solved soon, but there is the problem of potential exploitation by vandals. I've filed a new bug (bug 17461) to address this.

bzimport added a comment.Via ConduitFeb 18 2009, 3:01 PM

catlow wrote:

Don't like to sound impatient, but when can we expect this fix to actually come live?

bzimport added a comment.Via ConduitFeb 19 2009, 5:01 PM

catlow wrote:

OK, it is live, thanks. There still seems to be a slight problem, though, in that you can't get a list of members of the redirected category specifically. I've raised this in a new bug (bug:17571).

bzimport added a comment.Via ConduitApr 3 2009, 10:10 PM

ipatrol6010 wrote:

It has been decided that the change will be made in the next mediawiki full release ( http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/RELEASE-NOTES?view=markup ) , so just be patient.☺

tstarling added a comment.Via ConduitMay 19 2009, 8:06 AM

Reverted, see CodeReview r46706.

bzimport added a comment.Via ConduitMay 19 2009, 9:50 AM

catlow wrote:

Why has this potentially helpful change been reverted? It seemed to be working well; the only problem was bug 17571, which surely can't be difficult to fix. We know that the tables don't get updated straight away when a category is changed to/from a redirect, and I presume we wouldn't want them to. Bots would handle emptying existing categories when they get redirected, exactly as they do now.

tstarling added a comment.Via ConduitMay 19 2009, 11:47 AM

(In reply to comment #47)

Why has this potentially helpful change been reverted? It seemed to be working
well; the only problem was bug 17571, which surely can't be difficult to fix.
We know that the tables don't get updated straight away when a category is
changed to/from a redirect, and I presume we wouldn't want them to. Bots would
handle emptying existing categories when they get redirected, exactly as they
do now.

The tables indeed were not updated straight away, in fact they were not updated at all, ever. You'd have to have a bot go through and edit every page in the category, every time the redirect status or redirect target changed.

It's possible to do these updates immediately, with negligible performance loss, and to retire the bots. But it would be much more difficult to implement that feature if the categorylinks table was significantly polluted with spurious links from r46706.

bzimport added a comment.Via ConduitMay 19 2009, 12:34 PM

catlow wrote:

I don't think it was ever envisaged that the tables would be updated automatically (I didn't think that was desirable anyway, since inappropriate redirects of large categories, and subsequent reversions, would cause lots of extra processing, of the sort that doesn't seem to happen when e.g. templates with categories get updated). But if you say it can be done, then we'll wait in eager anticipation...

Catrope added a comment.Via ConduitMay 19 2009, 1:49 PM

(In reply to comment #49)

I don't think it was ever envisaged that the tables would be updated
automatically (I didn't think that was desirable anyway, since inappropriate
redirects of large categories, and subsequent reversions, would cause lots of
extra processing, of the sort that doesn't seem to happen when e.g. templates
with categories get updated). But if you say it can be done, then we'll wait in
eager anticipation...

Templates with categories don't cause immediate updates because those updates are put in the job queue and executed later. Presumably, updating for category redirect changes would also use the job queue.

bzimport added a comment.Via ConduitMay 19 2009, 2:09 PM

ayg wrote:

Templates with categories don't cause immediate updates because those updates require reparsing of large numbers of pages. Category redirects don't, I don't see any reason why they should need the job queue. Except for really giant categories, maybe, where you'd want to batch the updates to not lag the slaves.

Platonides added a comment.Via ConduitMay 22 2009, 6:02 PM

Making a "normal" Category a category can be done straight away, but unredirecting a category requires reparsing all category members.

bzimport added a comment.Via ConduitMay 22 2009, 7:19 PM

ayg wrote:

Or adding an extra column to categorylinks. That seems like a better idea, unless un-redirecting is expected to be very rare.

Platonides added a comment.Via ConduitMay 22 2009, 10:55 PM

That's probably the way to go. What would be that column?

bzimport added a comment.Via ConduitMay 24 2009, 1:56 PM

ayg wrote:

cl_to_original or such, an unredirected variant of cl_to. Then if a redirect chain changes, you could do UPDATE categorylinks SET cl_to='New_redirect_target' WHERE cl_to_original IN ('Original_category1', 'Original_category2');. You'd want an index on cl_to_original, of course, so this is a pretty heavyweight addition to the table.

bzimport added a comment.Via ConduitJul 11 2009, 9:37 PM

ipatrol6010 wrote:

I think that the best solution is you place [[A]] into [[Category:Foo]] and Foo redirects to Bar so you see [[A]] in [[Category:Bar]] and clicking on the catlink to Foo leads you to Bar. For commons they can have "co-categories" where a member of one co-category is visible in all other co-categories. This can be done by having all the categories have [[;Category:Fu]] [[;Category:Fuz]] [[;Category:Faz]] [[;Category:(...)]]

bzimport added a comment.Via ConduitNov 17 2009, 12:34 PM

catlow wrote:

Hello, is anyone still working on this? Any progress lately? It all seemed to be going so well at one point...

Jidanni added a comment.Via ConduitApr 1 2010, 4:38 AM

Sorry my following observation is probably noted above, but I didn't check.

On [[Page A]] put "[[Category:C1]]".
Now on [[Category:C1]] put
"#REDIRECT [[Category:C2]]".

Note how Page A is not listed on Category:C2.
Instead the only way to hunt down Page A in the categories is to
visit Category:1&redirect=no !

Jidanni added a comment.Via ConduitApr 1 2010, 4:40 AM

(In reply to comment #58)

visit Category:1&redirect=no !

I meant Category;C1&redirect=no. The redirect=no part is not something the average user will know to try. So the category entry is effectively lost in this sense.

Peachey88 added a comment.Via ConduitApr 30 2011, 12:10 AM

*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*

Bawolff added a comment.Via ConduitNov 8 2011, 9:13 PM
  • Bug 32262 has been marked as a duplicate of this bug. ***
bzimport added a comment.Via ConduitNov 9 2011, 4:15 AM

sumanah wrote:

Adding the keywords that seem right -- if the patches still need reviewing, please change "reviewed" to "need-review".

Qgil added a comment.Via ConduitMar 23 2013, 6:24 PM

This feature request is being proposed at

http://www.mediawiki.org/wiki/Mentorship_programs/Possible_projects#Automatic_category_redirects

and I'm considering whether to add it or not to

https://www.mediawiki.org/wiki/Summer_of_Code_2013#Project_ideas

Question:

Is there a potential mentor willing to help potential students interested in
this project?

Is there a reasonable support from the MediaWiki core maintainers to incorporate this feature if it's developed and meets the quality criteria?

Without these qualifications in place we can't even consider the proposal for
GSOC 2013.

Bawolff added a comment.Via ConduitMar 23 2013, 6:26 PM

(In reply to comment #63)

Question:

Is there a potential mentor willing to help potential students interested in
this project?

Yes me :)

Is there a reasonable support from the MediaWiki core maintainers to
incorporate this feature if it's developed and meets the quality criteria?

I think so. Would require schema changes which is the only bit that could potentially be sticky.

Parent5446 added a comment.Via ConduitApr 23 2013, 6:38 AM

The more you know...

The current query for getting category members is:
SELECT ...
FROM page
INNER JOIN categorylinks

FORCE INDEX (cl_sortkey)
ON ((cl_from = page_id))

LEFT JOIN category

ON ((cat_title = page_title) AND page_namespace = '14')

WHERE cl_to = 'Test' AND cl_type = 'page'
ORDER BY cl_sortkey
LIMIT 201

And, true enough, if you change the cl_to check from a comparison to an IN operator, it triggers a filesort. *However*, if you instead move the contents of the WHERE clause into the INNER JOIN condition, then the filesort disappears. The resulting query is:

SELECT ...
FROM page
INNER JOIN categorylinks

FORCE INDEX (cl_sortkey)
ON ((cl_from = page_id) AND (cl_to IN ('Test')) AND (cl_type = 'page'))

LEFT JOIN category

ON ((cat_title = page_title) AND page_namespace = '14')

ORDER BY cl_sortkey
LIMIT 201

Now I'm not too much of an expert on databases, but theoretically this should produce the exact same results (since it's an INNER JOIN) but still be efficient (because the cl_sortkey index includes the cl_from and cl_to columns).

This would eliminate the need for any new columns and whatnot.

Qgil added a comment.Via ConduitApr 23 2013, 9:08 PM

Just a note to say that Liangent has applied to GSoC with a proposal related to this report. Good luck!

https://www.mediawiki.org/wiki/User:Liangent/cat-redir

Bawolff added a comment.Via ConduitApr 23 2013, 9:40 PM

Re comment 66:

If I have more than a single category in the IN condition when doing that, I get a filesort:

mysql> describe SELECT /* CategoryViewer::doCategoryQuery Bawolff */ page_id,page_title,page_namespace,page_len,page_is_redirect,cl_sortkey,cat_id,cat_title,cat_subcats,cat_pages,cat_files,cl_sortkey_prefix,cl_collation FROM page INNER JOIN categorylinks FORCE INDEX (cl_sortkey) ON ((cl_from = page_id) AND cl_to in ('Foo', 'se') and cl_type = 'page') LEFT JOIN category ON ((cat_title = page_title) AND page_namespace = '14') ORDER BY cl_sortkey LIMIT 2\G

  • 1. row ******* id: 1 select_type: SIMPLE table: categorylinks type: range

possible_keys: cl_sortkey

    key: cl_sortkey
key_len: 258
    ref: NULL
   rows: 559
  Extra: Using where; Using filesort
  • 2. row ******* id: 1 select_type: SIMPLE table: page type: eq_ref

possible_keys: PRIMARY

    key: PRIMARY
key_len: 4
    ref: wikidb.categorylinks.cl_from
   rows: 1
  Extra:
  • 3. row ******* id: 1 select_type: SIMPLE table: category type: eq_ref

possible_keys: cat_title

    key: cat_title
key_len: 257
    ref: wikidb.page.page_title
   rows: 1
  Extra:

3 rows in set (0.00 sec)

Parent5446 added a comment.Via ConduitApr 24 2013, 5:13 AM

Hmm, damn databases.

Parent5446 added a comment.Via ConduitApr 28 2013, 9:17 PM

Success! So the issue is that the cl_sortkey index on categorylinks puts the cl_to column before the cl_sortkey column, so when you add the "cl_to IN ...", it can no longer use the index to sort by cl_sortkey (from the ORDER BY clause).

After adding the following index:

ALTER TABLE categorylinks
ADD UNIQUE cl_newsort ( cl_type, cl_sortkey, cl_to, cl_from )

And then running the following query:

EXPLAIN EXTENDED SELECT cl_from
FROM categorylinks
INNER JOIN page ON
page_id = cl_from
LEFT JOIN category ON
cat_title = page_title AND
page_namespace = 14
WHERE
cl_type = 'page' AND
cl_to IN ( 'Foo', 'Test' )
ORDER BY cl_sortkey

I finally got no more filesort. (I was even able to get rid of the FORCE INDEX usage.) If somebody could please check this and make sure I'm still sane, and that MySQL isn't just inventing things to trick my mind, that'd be great.

Bawolff added a comment.Via ConduitApr 28 2013, 9:25 PM

I havent tested this, but I would guess that unless it is doing something very fancy with merging indecies, this would cause very large scans of the categorylinks table. (Since it wouldn't be able to skip to only results in the relavent category). filesort isnt the only way that a db query can be inefficient.

Parent5446 added a comment.Via ConduitApr 28 2013, 9:33 PM

(In reply to comment #71)

I havent tested this, but I would guess that unless it is doing something
very
fancy with merging indecies, this would cause very large scans of the
categorylinks table. (Since it wouldn't be able to skip to only results in
the
relavent category). filesort isnt the only way that a db query can be
inefficient.

Hmm, you're right. Now that I realize it, this would require scanning the entire cl_sortkey index (I think).

Qgil added a comment.Via ConduitMay 3 2013, 9:56 PM

Just a note to say that Andre Saboia has submitted a GSoC proposal related to this report: https://www.mediawiki.org/wiki/User:Anboia/Automatic_category_redirects

gerritbot added a comment.Via ConduitMay 24 2013, 7:44 AM

Related URL: https://gerrit.wikimedia.org/r/65176 (Gerrit Change I29a629a514f9568d0ee4d967c516dfd599dc11ba)

Aklapper added a comment.Via ConduitSep 26 2013, 2:37 PM

Tyler: The patch received a -1, do you plan to rework it?

Bawolff added a comment.Via ConduitSep 26 2013, 2:44 PM

If I ever have free time again (read: probably not for a while), I offer to help Tyler address some of the issues with the patch.

Parent5446 added a comment.Via ConduitSep 26 2013, 2:59 PM

(In reply to comment #76)

If I ever have free time again (read: probably not for a while), I offer to
help Tyler address some of the issues with the patch.

That would be great. I don't have much free time myself, although once I do I'll definitely work on it.

Bawolff added a comment.Via ConduitSep 26 2013, 3:00 PM

(In reply to comment #77)

(In reply to comment #76)
> If I ever have free time again (read: probably not for a while), I offer to
> help Tyler address some of the issues with the patch.

That would be great. I don't have much free time myself, although once I do
I'll definitely work on it.

yeah, somebody should make a graph of number of commits to mediawiki by volunteers vs when school semester starts.

Qgil added a comment.Via ConduitOct 24 2013, 8:35 PM

I'm delisting this project from https://www.mediawiki.org/wiki/Mentorship_programs/Possible_projects#Automatic_category_redirects since it looks like you are almost there.

Bawolff added a comment.Via ConduitOct 24 2013, 8:37 PM

Remove milestone 1.22 - Given that this has somewhat stalled due to lack of time on the part of interested parties, seems unlikely it could possibly make it to 1.22.

He7d3r awarded a token.Via WebNov 24 2014, 11:58 AM
Liuxinyu970226 removed a subscriber: Liuxinyu970226.Via WebNov 28 2014, 3:57 AM
reeves87 removed a subscriber: reeves87.Via EmailNov 28 2014, 12:52 PM
Cenarium added a comment.Via WebDec 11 2014, 6:54 PM

T77903 isn't a duplicate of this task, it's more of a relatively easy fix until we get those long standing difficult core problems solved, which may not be before a very long time.

Nemo_bis awarded a token.Via WebDec 12 2014, 8:00 AM
Kozuch awarded a token.Via WebDec 17 2014, 8:30 PM

This task was mentioned in https://www.mediawiki.org/w/index.php?title=Outreach_programs/Possible_projects&oldid=1404823#Very_raw_projects as a possible candidate for Google Summer of Code or similar programs. Do you think it is a good candidate?

Qgil added a comment.Via WebFeb 11 2015, 1:44 PM

Wikimedia will apply to Google Summer of Code and Outreachy on Tuesday, February 17. If you want this task to become a featured project idea, please follow these instructions.

Qgil added a comment.Via WebFeb 17 2015, 9:39 AM

@Parent5446, this task is assigned to you. Do you want to work on it or propose it to the next GSoC/Outreachy round?

Qgil moved this task to Need Discussion on the Possible-Tech-Projects workboard.Via WebFeb 17 2015, 9:39 AM
Niharika added a subscriber: Niharika.Via WebMar 6 2015, 5:58 AM

@Bawolff @Parent5446 Has this already been done? If not, is there interest in pushing this for upcoming GSoC round?

Nemo_bis added a comment.EditedVia WebMar 6 2015, 9:32 AM

Has this already been done?

No.

Bawolff added a comment.Via WebMar 6 2015, 11:54 AM

There is a partial patch from way back. See comments on gerrit.

Niharika added a comment.Via WebMar 6 2015, 12:43 PM

(In reply to comment #63)

Question:

Is there a potential mentor willing to help potential students interested in
this project?

Yes me :)

Does this still hold true @Bawolff?

Bawolff added a comment.Via WebMar 6 2015, 3:18 PM
In T5311#1095413, @NiharikaKohli wrote:

(In reply to comment #63)

Question:

Is there a potential mentor willing to help potential students interested in
this project?

Yes me :)

Does this still hold true @Bawolff?

Not really. Like if there was a student who really super wanted to do this idea and no other, maybe, but generally I'd rather not mentor this idea, this round.

Qgil added a comment.Via Bulk EditSep 23 2015, 9:10 AM

This is a message posted to all tasks under "Re-check in September 2015" at Possible-Tech-Projects. Outreachy-Round-11 is around the corner. If you want to propose this task as a featured project idea, we need a clear plan with community support, and two mentors willing to support it.

Qgil added a comment.Via Bulk EditSep 23 2015, 9:36 AM

This is a message sent to all Possible-Tech-Projects. The new round of Wikimedia Individual Engagement Grants is open until 29 Sep. For the first time, technical projects are within scope, thanks to the feedback received at Wikimania 2015, before, and after (T105414). If someone is interested in obtaining funds to push this task, this might be a good way.

TasneemLo moved this task to Missing Mentors on the Possible-Tech-Projects workboard.Via WebOct 6 2015, 12:14 PM
01tonythomas added a subscriber: 01tonythomas.Via WebOct 19 2015, 9:33 AM

This is the last call for Possible-Tech-Projects missing mentors. The application deadline for Outreachy-Round-11 is 2015-11-02. If this proposal doesn't have two mentors assigned by the end of Thursday, October 22, it will be moved as a candidate for the next round.

Interested in mentoring? Check the documentation for possible mentors.

01tonythomas added a comment.Via Bulk EditOct 23 2015, 10:08 AM

As previously mentioned, this task is moved to 'Recheck in February 2016' as it doesn't have two mentors assigned to it as of today, October 23 - 2015. The project will be included in the discussion of next iteration of GSoC/Outreachy, and is excluded from #Outreachy-11. Potential candidates are discouraged from submitting proposals to this task for #Outreachy-11 as it lacks mentors in this round.

Antanana added a subscriber: Antanana.Via WebOct 25 2015, 10:11 AM
Agabi10 added a subscriber: Agabi10.Via WebDec 28 2015, 1:34 AM
Billghost added a subscriber: Billghost.Via WebThu, Feb 11, 11:40 AM
Billghost removed a subscriber: Billghost.
Billghost added a subscriber: Billghost.Via WebThu, Feb 11, 5:37 PM

Add Comment