
duplicate entries for the same course at Special:Courses
Closed, Resolved, Public

Description

There are two entries for the same course at Special:Courses:

http://en.wikipedia.org/w/index.php?title=Special%3ACourses

Screenshot attached. The two "Molecular Biology" entries at the bottom both link to the same course:

http://en.wikipedia.org/wiki/Education_Program:Johns_Hopkins_University/Molecular_Biology_%28Spring,_2013%29


Version: master
Severity: minor

Attached:

Screenshot_from_2013-01-09_09:59:44.png (783×1 px, 246 KB)

Details

Reference
bz43782

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 1:30 AM
bzimport set Reference to bz43782.

Odd. Has this happened for any courses other than the one mentioned?

Not that I've noticed, but I'll start keeping an eye out.

I got some hints at the issue from the user who created that course:

"I got database errors when, for example, I accidentally reloaded a page in the middle of the wizard. And then couldn't find out how to re-enter the flow."

So something to do with a partially-complete course creation that got reset in the middle?

Creating a course twice would explain the issue. This should not be possible, since the title is unique. There was, however, a title-related bug in the course creation process. I fixed it quite a while back, but the fix is probably not deployed yet, or was not deployed at that point.

Assuming this is fixed now. Please reopen if you still run into it anywhere.

This course still shows up twice: http://en.wikipedia.org/wiki/Education_Program:Millersville_University_of_Pennsylvania/Intercultural_Communication_%28Spring_2013%29

See http://en.wikipedia.org/wiki/Special:Courses

It is listed once correctly, with 6 students and term 2013 Q1, and once incorrectly, with 0 students and term Spring 2013.

This is not an issue with the pager. Somehow a course with the same name managed to get created twice. I think this is fixed now.

Still stuck with the two courses now. The correct one has id 65 and the other one id 67.

Since the deletion API works with ids, I figured it safe to just remove the 67 one...

https://en.wikipedia.org/wiki/Special:Log/course

Think this is fixed now? :)

andrew.green.df wrote:

There are two entries for this course:
https://en.wikipedia.org/wiki/Education_Program:Graduate_Institute_of_International_and_Development_Studies/Gender_and_International_Affairs_%28Fall_2013%29

In the recent dump of EP tables, that course appears twice, with consecutive course IDs.

Three other courses also appear twice and have consecutive or nearly consecutive ids:

https://en.wikipedia.org/wiki/Education_Program:Oregon_Institute_of_Technology

https://en.wikipedia.org/wiki/Education_Program:Oregon_Institute_of_Technology/Electrochemistry_for_Renewable_Energy_Applications_%28CHE260,_Spring_2013%29

https://en.wikipedia.org/wiki/Education_Program:Rochester_Institute_of_Technology

https://en.wikipedia.org/wiki/Education_Program:Rochester_Institute_of_Technology/Hist_190:_American_Women%27s_History_%28Fall_2013%29

https://en.wikipedia.org/wiki/Education_Program:University_of_Sydney

https://en.wikipedia.org/wiki/Education_Program:University_of_Sydney/WRIT2002_Advanced_Writing_and_Research_%28S1%29

Here is a course that appears twice, but the ids are not consecutive:

https://en.wikipedia.org/wiki/Education_Program:Drake_University

https://en.wikipedia.org/wiki/Education_Program:Drake_University/Global_Youth_Studies_%28Fall_2013%29

There are a few other entries that probably are duplicates in that they refer to the same actual course, even though not all descriptive fields are identical.

Dumped table data is here:
https://bugzilla.wikimedia.org/attachment.cgi?id=13519

Thanks Andrew. I think what is happening in most cases is that instructors are pressing 'back' on their browsers or otherwise trying more than once to create the same course.

The consecutive ids are likely a coincidence: in most cases, no other courses were created in the meantime. According to the logs (which are incomplete for some courses, since a now-fixed bug resulted in no log or page history entries for a while), all the courses with consecutive ids were created a second time on the same day, from 0 to 10 minutes after the first creation.
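To make that kind of check repeatable, here is a minimal Python sketch that groups course rows by title and reports, for each duplicated title, whether the ids are consecutive and how far apart the creations were. The row layout (id, title, creation time in minutes) and the sample values are made up for illustration; they are not taken from the dump or from the real ep_courses schema.

```python
from collections import defaultdict

# Hypothetical sample rows: (course_id, course_title, created_at_minutes).
# These are invented values for illustration, not data from the dump.
rows = [
    (101, "Example University/Course A (Fall 2013)", 0),
    (102, "Example University/Course A (Fall 2013)", 7),
    (103, "Example University/Course B (Fall 2013)", 30),
    (110, "Example University/Course C (Fall 2013)", 45),
    (140, "Example University/Course C (Fall 2013)", 40000),
]


def find_duplicates(rows):
    """Return a report for every title that occurs more than once."""
    by_title = defaultdict(list)
    for course_id, title, created in rows:
        by_title[title].append((course_id, created))

    report = {}
    for title, entries in by_title.items():
        if len(entries) < 2:
            continue  # title is unique, nothing to report
        entries.sort()  # order by course id
        ids = [course_id for course_id, _ in entries]
        times = [created for _, created in entries]
        report[title] = {
            "ids": ids,
            # True only if every id directly follows the previous one.
            "consecutive": all(b - a == 1 for a, b in zip(ids, ids[1:])),
            "minutes_apart": max(times) - min(times),
        }
    return report


duplicates = find_duplicates(rows)
for title, info in duplicates.items():
    print(title, info)
```

Duplicates created minutes apart with consecutive ids would point at a re-submitted form, while a gap of weeks would point at a manual re-creation.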

The one with non-consecutive ids has creation events separated by a few weeks, and I suspect that the instructor manually added the same title and "re-created" their existing course (possibly in an effort to find their course page).

I just confirmed on test2.wikipedia that I can "create" the same course over and over again using the create form at Special:Courses (including by pressing back on the browser after submitting it), even though the existing content is loaded.

The interface message beneath the "Adding course..." page title also says (incorrectly): "There is no course with this name yet, but you can add it."

The correct behavior should be for the extension to recognize when someone is trying to "create" a course that already exists, and provide them with a link to the course instead of the create/edit form.
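As a rough illustration of that behavior (not the extension's actual PHP code), here is a minimal Python sketch. The in-memory dictionary, the function names, and the URL scheme are all hypothetical stand-ins.

```python
# In-memory stand-in for the course store, keyed by full course title.
courses = {}


def course_url(title):
    # Hypothetical URL scheme, loosely modeled on the links in this report.
    return "/wiki/Education_Program:" + title.replace(" ", "_")


def handle_add_course(title):
    """Create the course if it is new; otherwise point at the existing one."""
    if title in courses:
        # Do not show the create form again; link to the existing course.
        return ("exists", course_url(title))
    courses[title] = {"revisions": []}
    return ("created", course_url(title))


first = handle_add_course("Example University/Course A (Fall 2013)")
second = handle_add_course("Example University/Course A (Fall 2013)")
print(first)
print(second)
```

The second call returns the link instead of creating a second entry, so the store still holds one course.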

Actually, it's not even necessary to press create again... going back to the "create" form and pressing submit again is enough to create another id.

I also note that trying to 'create' an existing page results in a message at the top that "This course has been deleted. You can restore 4 revisions." The number of allegedly deleted revisions is incremented each time the course is re-created. However, the "restore" link results in this message:

"Failed to undelete course Columbia University/Lolcats: inter-disciplinary explorations (Squirrel Season 2012). It already exists."

See https://test2.wikipedia.org/w/index.php?title=Special:Log&page=Education+Program%3AColumbia+University%2FLolcats%3A+inter-disciplinary+explorations+%28Squirrel+Season+2012%29

andrew.green.df wrote:

Agreed about the problem and how to fix it!

I was testing it too before I saw your last two posts, and your tests are more complete than mine. In case it's useful, here is a rundown of what I did.

In a fresh MediaWiki-Vagrant install with the education role enabled:

1- Add an institution.
2- Fill out the form for adding a course on the institution page, then click on "Add course".
3- On the next page, add more info, click "Submit".
4- Click the browser's back button to return to the previous form.
5- Click "Submit" again.
6- Go to the institution page. You should see the same course twice. (If you didn't set the dates for the test course, you may have to set the filter to "Ended" status and click "Go".)

You can ssh into vagrant and look at the ep_courses table to see entries similar to those found in the production data.

Since I didn't mention it, just to make sure we agree about the fix... there are actually two failure modes that need to be fixed:

  1. Adding the same course name and term via one of the "Add course" forms (which should result in a message that the course already exists, with a link to it, and a note that the user can go there to edit it)
  2. Pressing 'back' to return to the 'creating course' form after pressing 'Submit', and then pressing 'Submit' again. Ideally, this would result in nothing happening if the content matches the already-submitted version, or saving a new revision of the existing course if the content is different. But simply doing nothing on the second 'submit' or reporting that 'this course has already been created' would also be fine.
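The "ideal" handling of the second case could look something like the following Python sketch. It is only an illustration under assumed names (the `submit_course` function and the in-memory store are hypothetical), not the extension's code.

```python
# In-memory stand-in for stored courses: title -> list of saved revisions.
course_store = {}


def submit_course(title, content):
    """Handle a submit of the course form, including repeated submits."""
    revisions = course_store.get(title)
    if revisions is None:
        # First submit: create the course with its initial revision.
        course_store[title] = [content]
        return "created"
    if revisions[-1] == content:
        # Same content re-submitted (e.g. after pressing 'back'): do nothing.
        return "unchanged"
    # Different content: save a new revision of the existing course.
    revisions.append(content)
    return "new revision"


title = "Example University/Course A (Fall 2013)"
results = [
    submit_course(title, "first description"),
    submit_course(title, "first description"),   # 'back' + 'Submit' again
    submit_course(title, "edited description"),  # 'back', edit, 'Submit'
]
print(results)
```

Keying on the title rather than minting a new id on every submit is what keeps the repeated submit from producing a second course.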

andrew.green.df wrote:

OK, agreed on both proposals for fixing this.

I'm working on the second of the two first, hope that's OK.

Also noticed there's a related problem for institutions. Try this: create an institution, then press the "back" button, and press submit to create the same institution again. Just like in the case of courses, the extension tries to create it again, but it fails because of constraints in the DB.

Change 91128 had a related patch set uploaded by AndyRussG:
Fix errors if course/org forms are re-submitted

https://gerrit.wikimedia.org/r/91128

Change 91128 merged by jenkins-bot:
Fix errors if course/org forms are re-submitted

https://gerrit.wikimedia.org/r/91128

The patch above closes the main entry point for duplicate courses: double-submitting through the 'create a course' workflow. It is still possible, however, to create duplicate courses by creating a course and changing its title before submission so that it collides with an existing course.

The bug should be completely resolved once we disable the renaming/retitling of course pages.
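For comparison, a database-level uniqueness constraint on the title would reject both a duplicate creation and a rename that collides with an existing course. The Python/SQLite sketch below uses a hypothetical `courses` table, not the real ep_courses schema, and it is not the fix adopted here (which is to disable renaming).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the UNIQUE constraint is the point of the example.
conn.execute(
    "CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT NOT NULL UNIQUE)"
)
conn.execute("INSERT INTO courses (title) VALUES (?)", ("Org/Course A (Fall 2013)",))
conn.execute("INSERT INTO courses (title) VALUES (?)", ("Org/Course B (Fall 2013)",))


def try_sql(sql, params):
    """Run a statement; return False if the UNIQUE constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# Creating a course whose title already exists is rejected.
insert_ok = try_sql(
    "INSERT INTO courses (title) VALUES (?)", ("Org/Course A (Fall 2013)",)
)
# Renaming course B so that it collides with course A is rejected too.
rename_ok = try_sql(
    "UPDATE courses SET title = ? WHERE title = ?",
    ("Org/Course A (Fall 2013)", "Org/Course B (Fall 2013)"),
)
count = conn.execute("SELECT COUNT(*) FROM courses").fetchone()[0]
print(insert_ok, rename_ok, count)
```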

As of 1.23wmf15, courses cannot be renamed/retitled, so it should not be possible to create new instances of this bug. Duplicate entries that already exist will not go away on their own, but they could be removed if necessary using the delete-by-courseID API feature. I don't think this is necessary, though.