Provide export path for enwiki Gather users
Closed, Resolved · Public

Description

Gather is about to be disabled, and a few users have invested a significant amount of work in creating Gather lists (some have 100+ items). If possible, provide a way for them to export their lists.

For public lists, just dump the data somewhere. For private lists, make sure only the owner can access them.

Event Timeline

Tgr created this task. Feb 25 2016, 7:15 AM
Restricted Application added subscribers: StudiesWorld, Aklapper. Feb 25 2016, 7:15 AM
Tgr added a comment. Mar 11 2016, 6:24 PM

Plan is to:

  • copy the enwiki gather database on some Labs machine
  • filter out suppressed or otherwise improper rows
  • add central user IDs
  • write a minimal web interface with an authenticate button and an export button after successful authentication (maybe separately for public/private; maybe separately for each collection; maybe also have an "export to wiki" button - depends on how fancy we want it to be)
  • authenticate via an OAuth identity-only provider
  • convert user's collections to some useful format
  • provide the lists as a download and/or create them as a user page (the latter one is T129626)
  • send out another massmessage so that users can find their way to the download page
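
The filtering and conversion steps above could be sketched roughly as follows. This is only an illustration, not the code that was actually run: the `Collection` fields (`owner`, `label`, `is_public`, `suppressed`, `titles`) are hypothetical stand-ins for whatever the Gather tables really contain, and wikitext is just one possible "useful format".

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    # Hypothetical shape of one Gather list; the real schema may differ.
    owner: str
    label: str
    is_public: bool
    suppressed: bool = False
    titles: list = field(default_factory=list)


def exportable(collections, owner, include_private=False):
    """Return the owner's collections that are safe to export.

    Suppressed rows are always dropped. Private collections are only
    included when the caller has already authenticated as the owner.
    """
    return [
        c for c in collections
        if c.owner == owner
        and not c.suppressed
        and (c.is_public or include_private)
    ]


def to_wikitext(collections):
    """Render collections as wikitext: one section per list, one link per item."""
    lines = []
    for c in collections:
        lines.append(f"== {c.label} ==")
        # The leading colon makes File: and Category: titles render as links
        # instead of embedding the file or categorizing the page.
        lines.extend(f"* [[:{title}]]" for title in c.titles)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

With this shape, `include_private=True` would only be passed after the OAuth identity check succeeds for that owner; the public export never needs it.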

Plan is to:

copy the enwiki gather database on some Labs machine
filter out suppressed or otherwise improper rows
add central user IDs

The remainder is 'for-private' only

write a minimal web interface with an authenticate button and an export button after successful authentication (maybe separately for public/private; maybe separately for each collection; maybe also have an "export to wiki" button - depends on how fancy we want it to be)

  • I think we can avoid fancy for now.

authenticate via an OAuth identity-only provider
convert user's collections to some useful format
provide the lists as a download and/or create them as a user page (the latter one is T129626)

  • will need an html or text download for the privates...

send out another mass message so that users can find their way to the download page

  • sounds right
Jhernandez raised the priority of this task from Lowest to High. Mar 11 2016, 9:17 PM

Sounds like a good test of OAuth.

Tgr added a comment. Edited · Mar 21 2016, 10:58 PM

staging.tgr_gather_user_requests in the analytics DB contains the list of usernames who requested their data (converted to DB format - first character in uppercase, spaces instead of underscores).
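
For reference, the DB-format conversion mentioned above amounts to something like the sketch below. The function name `to_db_format` is made up for illustration, and MediaWiki's real title normalization handles more cases (whitespace collapsing, some special characters), so treat this as an approximation.

```python
def to_db_format(username: str) -> str:
    """Approximate the stored username form: spaces, first character uppercase."""
    name = username.replace("_", " ").strip()
    # Slicing instead of indexing keeps this safe for an empty string.
    return name[:1].upper() + name[1:]
```

For example, `to_db_format("jkatz_(WMF)")` gives `"Jkatz (WMF)"`.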

TSV dump of all public Gather data:

Test run of the export bot: User:Jkatz_(WMF)/Gather_lists; code is in P2798.

looks great to me!

Tgr added a comment. Mar 23 2016, 11:32 AM

@JKatzWMF is it OK to send this out then? Do you want to write the text of the user talk message?

Tgr added a comment. Mar 23 2016, 5:00 PM

On a closer look, it still has problems:

  • encoding is messed up at least in the title (\u2013 appears literally)
  • nonfree images are causing problems. The script does filter them, so this is a problem with enwiki template metadata, but it should be improved before doing the export.
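
A literal `\u2013` in a title usually means a JSON-escaped string was written out without being decoded. The cause here is not confirmed, so the following is only a sketch under that assumption; `unescape_title` is a hypothetical helper, not part of the export script.

```python
import json
import re

# One or more consecutive \uXXXX escapes. Matching them as a run lets the
# JSON decoder combine surrogate pairs into a single character.
_ESCAPES = re.compile(r"(?:\\u[0-9a-fA-F]{4})+")


def unescape_title(raw: str) -> str:
    """Decode literal \\uXXXX sequences and leave the rest of the title untouched."""
    return _ESCAPES.sub(lambda m: json.loads('"' + m.group(0) + '"'), raw)
```

For example, a title stored as `1914\u20131918` (with a literal backslash) comes back as `1914–1918`. Decoding only the escape runs, rather than passing the whole title through `json.loads`, avoids breaking titles that contain quotes or other backslashes.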

User talk message
"Thank you again for using Wikipedia's beta Collections feature. Because you indicated an interest in keeping your public collection(s), we have transferred any collections you created to a new url. We apologize for the delay. You can now find them [link|here]. This new page is not accessible via any menu, so we recommend bookmarking the page if you plan to return. Thank you for experimenting with us!"

@Tgr In this limited instance, I hope non-free images can sit in the username space...this is just an export of an earlier tool, not a practice moving forward.

Tgr added a comment. Edited · Apr 4 2016, 7:14 PM

Public collections exported, talk messages sent.

@Tgr Where can I see the messages? I looked at a few users on the list and none of them had it on their talk pages.

Tgr added a comment. Apr 6 2016, 3:51 PM

https://en.wikipedia.org/wiki/User_talk:Aaarton

It was sent to the 83 users who have requested export (and had something to export).

Apparently the non-free image detection is still not quite right - that's tracked in T131896.

Excellent! Thanks for confirming.

Restricted Application added a subscriber: TerraCodes. Apr 19 2016, 6:22 PM
JKatzWMF closed this task as Resolved. Apr 19 2016, 6:24 PM
JKatzWMF claimed this task.

Public and private lists whose owners requested migration have all been exported. The public collections now live on talk pages, and private collections were emailed to their owners (only those who had an email address attached to their Wikipedia account, which was the vast majority).