Interaction Timeline V1: Wiki field autosuggestion should suggest big wikis first
Closed, Resolved | Public | 3 Estimated Story Points

Description

Problem

The auto-suggest for the wiki field does not list the wikis in an intuitive order.


Expected behavior

The wiki autosuggestion should be ordered by the size of the wiki (page count, user count, or something else similar) so the most common customers of this tool will be able to quickly select their choice.

Use the same ordering that is used at https://tools.wmflabs.org/siteviews/


Examples

Event Timeline

TBolliger renamed this task from Interaction Timeline V1: Wiki selection should be better to Interaction Timeline V1: Wiki field autosuggestion should suggest big wikis first. Nov 9 2017, 12:31 AM
TBolliger created this task.
czar added a subscriber: czar. Nov 17 2017, 12:23 AM
TBolliger set the point value for this task to 3.

Basically we'll just need to switch to using the domain as the label rather than the title of the site. I think that should fix it.

dmaza claimed this task. Dec 8 2017, 10:15 PM
dmaza moved this task from Ready to In progress on the Anti-Harassment (AHT Sprint 11) board.
dmaza closed this task as Resolved. Dec 13 2017, 8:52 PM
dmaza moved this task from Review to Done on the Anti-Harassment (AHT Sprint 11) board.

I don't see this on production yet...

kaldari reopened this task as Open. Dec 13 2017, 9:58 PM
kaldari added a subscriber: kaldari.

Still seems to not be ordering the wikis in a useful way.

FWIW, pageviews seems to use a manually sorted list: https://github.com/MusikAnimal/pageviews/blob/master/javascripts/shared/site_map.js

The downside is that it has to be maintained manually, but that's probably less of an issue for Interaction Timeline. On smaller (new) wikis, the tool will likely be unnecessary due to the small number of edits/articles/users.

Agreed. It doesn't have to be 100% correct, just within the logical ballpark.

Testing on production right now at https://tools.wmflabs.org/interaction-timeline/ I see unhelpful results:

It was discussed in planning that all we were gonna do was to order by domain name. There is no quick way to identify the big wikis to our knowledge.
The other option is to keep a hard-coded list. We could do that.

I see David's comment on Nov 17, but that's not what I understood the result would be. The description of the ticket states the problem and expected behavior, so we'll need to do more work on it.

It doesn't have to be perfect, just within the ballpark of 'more active = top of results'. We can use a hard-coded list or whatever https://tools.wmflabs.org/siteviews/ is using.

No problem. This is what siteviews has (link). I'll use the same.

That is actually a whitelist of wikis that support pageviews, copied from GitHub (I should parse that file and cache it, but I digress). It is not the full WMF wiki farm. I doubt users of the Interaction Timeline would use it on any of the remaining wikis, though.

If you wanted to be comprehensive, there's probably a MediaWiki API to get the full list of wikis, or you could query the meta_p.wiki table.

One of the problems with pulling from the whole list and sorting alphabetically is that wikimedia comes before wikipedia, so you get wikimedia chapter sites first, which you probably don't want. And you also end up getting wikibooks before wikipedia.
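
To illustrate the ordering problem described above, here is a minimal sketch. It assumes the list is sorted by project family first and language second; the sample domains and the `family_key` helper are made up for the example and are not taken from the tool's code.

```python
# Hypothetical sample of wiki domains (not the tool's real data).
domains = [
    "en.wikipedia.org",
    "de.wikipedia.org",
    "en.wikibooks.org",
    "no.wikimedia.org",  # a Wikimedia chapter site
]


def family_key(domain):
    """Sort key: project family first, then language code."""
    lang, family, _tld = domain.split(".")
    return (family, lang)


# Plain alphabetical order by family puts "wikibooks" and "wikimedia"
# ahead of "wikipedia", which is the unwanted result.
by_family = sorted(domains, key=family_key)
print(by_family)
```

With this sample, the wikibooks and chapter domains land ahead of both Wikipedias.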

The order of sites returned by the sitematrix API actually looks reasonable. I have no idea what's going on with the auto-suggest in the interface though. It seems to order results almost randomly. If you do end up keeping the sitematrix API, you'll want to disregard the entire "special" block from the results. That's the group for odd stuff like Ombudsmen Wiki, Project Grants Committee, Wikimedia Norway Internal Board, etc. Then you'll need to manually add back metawiki, commonswiki, wikidatawiki, and maybe specieswiki and sourceswiki. Should be fine to throw out the rest.
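
The filtering suggested above could look roughly like the sketch below. It works on a small hand-written dictionary, not a live API call, and assumes the response has numbered language groups (each with a `site` list) plus one list of special wikis under a `specials` key. The `usable_wikis` function and the sample data are hypothetical; check the key names against a real sitematrix response before relying on them.

```python
# Special wikis that the community actually uses and that should be kept.
KEEP_SPECIALS = {
    "metawiki", "commonswiki", "wikidatawiki", "specieswiki", "sourceswiki",
}


def usable_wikis(sitematrix):
    """Return dbnames in response order, dropping most special wikis."""
    wikis = []
    for key, group in sitematrix.items():
        if key in ("count", "specials"):
            continue  # not a language group
        wikis.extend(site["dbname"] for site in group.get("site", []))
    # Add back only the allow-listed special wikis.
    wikis.extend(
        site["dbname"]
        for site in sitematrix.get("specials", [])
        if site["dbname"] in KEEP_SPECIALS
    )
    return wikis


# Hand-made sample in the assumed response shape.
sample = {
    "count": 5,
    "0": {"code": "de", "site": [{"dbname": "dewiki"}]},
    "1": {"code": "en", "site": [{"dbname": "enwiki"}, {"dbname": "enwiktionary"}]},
    "specials": [{"dbname": "commonswiki"}, {"dbname": "ombudsmenwiki"}],
}
print(usable_wikis(sample))
```

In this sample, `ombudsmenwiki` is dropped while `commonswiki` is kept at the end of the list.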

Based on the list in siteviews, the only difference from sitematrix is that wikipedia(s) and wiktionaries are first in priority.
I think I might have come up with a solution without having to hardcode the list.
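
One way to get that priority without a hard-coded list of wikis is a stable sort on the project family, sketched below. This is an illustration only, not necessarily what the PR does; `FAMILY_PRIORITY`, `prioritize`, and the sample domains are assumptions.

```python
# Families that should come first; every other family shares the last rank.
FAMILY_PRIORITY = {"wikipedia": 0, "wiktionary": 1}


def prioritize(domains):
    """Move wikipedias, then wiktionaries, to the front.

    sorted() is stable, so the incoming (sitematrix) order is preserved
    within each rank.
    """
    def rank(domain):
        parts = domain.split(".")
        family = parts[1] if len(parts) > 2 else parts[0]
        return FAMILY_PRIORITY.get(family, len(FAMILY_PRIORITY))

    return sorted(domains, key=rank)


# Hypothetical input in sitematrix-like order (grouped by language).
ordered = prioritize([
    "en.wikipedia.org",
    "en.wiktionary.org",
    "en.wikibooks.org",
    "de.wikipedia.org",
    "de.wiktionary.org",
])
print(ordered)
```

Only the family ranking is hard-coded here; the order of languages still comes from the input.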

The reason the auto-suggest pops things in front of what you are searching for is that we are doing a fuzzy search and not a literal match. I'm changing that now.
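
The difference between the two matching modes can be sketched as follows. The fuzzy matcher here is a simple subsequence test, which is an assumption about how the fuzzy search behaves; the literal matcher is a plain substring test. Both function names and the candidate list are hypothetical.

```python
def fuzzy_match(query, candidate):
    """True if the query's characters appear in order anywhere in candidate."""
    remaining = iter(candidate)
    # "ch in remaining" consumes the iterator up to the first match,
    # so each query character must be found after the previous one.
    return all(ch in remaining for ch in query)


def literal_match(query, candidate):
    """True only if the query appears as a contiguous substring."""
    return query in candidate


candidates = ["es.wikinews.org", "en.wikipedia.org", "fr.wikipedia.org"]

fuzzy_hits = [c for c in candidates if fuzzy_match("en", c)]
literal_hits = [c for c in candidates if literal_match("en", c)]
print(fuzzy_hits)
print(literal_hits)
```

Typing "en" fuzzy-matches es.wikinews.org (the "e" in "es" followed by the "n" in "news") and lists it ahead of en.wikipedia.org, whereas the literal match returns only en.wikipedia.org.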

Here is the new ordered list of wikis:
https://gist.github.com/dayllanmaza/00289d436e34dbcb7dfc75c8cfcfd817

PR: https://github.com/wikimedia/InteractionTimeline/pull/24

@dmaza: That looks pretty reasonable. Don't forget to add commonswiki, metawiki, wikidatawiki, specieswiki, and sourceswiki (the "special" wikis that are actually used by the community).

@kaldari I'll add them in a bit. Didn't know they were not part of the "standard" list. Do you mind shedding some light on why there is this "categorization"?

@dmaza: In the sitematrix, all the wikis are organized either under a specific language or as "special". Wikis like commonswiki, metawiki, wikidatawiki, and sourceswiki are "special" because they are multilingual wikis. Specieswiki is English and Neo-Latin, but not specifically associated with English, as there's only 1 Wikispecies project.

Looks good on production.

TBolliger closed this task as Resolved. Jan 2 2018, 9:39 PM
czar added a comment. Apr 2 2018, 2:11 AM

This could be even simpler. What I was originally thinking: If editors A and B primarily edit in enwp, the field should autofill to enwp, even without the text prediction.

czar awarded a token. Apr 2 2018, 2:11 AM

Yeah, we liked this idea too, but we opted for 'big wikis first' in case someone starts with the wiki field, and not the users. Maybe in the future.