
Add a link to let people query for "All wikipedias"
Closed, Resolved · Public · 5 Estimated Story Points

Description

(From the user feedback interview.)

This would be useful. We don't want to do "All projects", just all wikipedias.

image.png (166×1 px, 38 KB)

Checking the "All wikipedias" checkbox will disable the input box above it, in JS.

Server-side, we will discard anything that input box supplies. We probably want to store a value like "all" in the ew_dbname field; otherwise we'll end up adding hundreds of rows for each event. This will probably change a few things in how event queries are done.
Also if someone unchecks "All wikipedias" and adds wikis manually, we should take that row out.
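
The server-side handling described above could be sketched roughly as follows. This is an illustration in Python only, not the tool's actual code: the function name `resolve_event_wikis`, the constant `ALL_WIKIPEDIAS`, and the shape of the submitted form data are all assumptions.

```python
ALL_WIKIPEDIAS = "all"  # hypothetical sentinel value stored in ew_dbname


def resolve_event_wikis(all_wikipedias_checked, submitted_wikis):
    """Return the ew_dbname values to store for an event (illustrative only)."""
    if all_wikipedias_checked:
        # Discard anything the (disabled) input box still sent and store
        # one sentinel row instead of hundreds of per-wiki rows.
        return [ALL_WIKIPEDIAS]
    # Checkbox unchecked: keep only the manually entered wikis, dropping
    # blanks, duplicates, and any stale sentinel row.
    kept = []
    for wiki in submitted_wikis:
        wiki = wiki.strip()
        if wiki and wiki != ALL_WIKIPEDIAS and wiki not in kept:
            kept.append(wiki)
    return kept
```

With this shape, unchecking the box and submitting wikis manually removes the sentinel row as a side effect of saving.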

In the event page UI, only show wikis that have data for pages created/improved. Wikis without any data should not be displayed.

Event Timeline

Niharika set the point value for this task to 5. (Mar 28 2018, 11:48 PM)

I thought this through, and while the checkbox seems OK, I had some concerns. If you have only the "All Wikipedias" checked, and no other wikis (which currently no others are supported anyway), when you browse to the form there will be an empty input. That looked weird to me.

So, I thought maybe we could have a "+ All Wikipedias" link, that when clicked will insert a wildcard *.wikipedia as the value (either creating a new input, or using the existing blank one). *.wikipedia will also show up in the autocompletion when typing in wikis.

Give it a try on the staging app (backend functionality isn't in place, so don't try to save :)

If there are existing Wikipedias in the list, they are left there. However when you save, it will ignore those and instead only save *.wikipedia, and when you return to the form you'll see only *.wikipedia listed.

There are other weird things, like clicking on "+ All Wikipedias" more than once will insert yet another input with *.wikipedia, but I think that's OK (the backend will remove duplicates). Hopefully it's still intuitive how it works.

How does this sound?
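
For reference, the save behaviour proposed above (duplicates removed, and individual Wikipedias ignored once the wildcard is present) might look something like this. It is a sketch under assumptions: `normalize_wikis` and `WIKIPEDIA_FAMILY` are hypothetical names, and the real backend may do this differently.

```python
WIKIPEDIA_FAMILY = "*.wikipedia"


def normalize_wikis(domains):
    """Deduplicate submitted domains; the wildcard replaces individual Wikipedias."""
    unique = []
    for domain in domains:
        domain = domain.strip()
        if domain and domain not in unique:
            unique.append(domain)
    if WIKIPEDIA_FAMILY in unique:
        # Individual Wikipedias are already covered by the wildcard, so drop them.
        unique = [
            d for d in unique
            if d == WIKIPEDIA_FAMILY or not d.endswith(".wikipedia")
        ]
    return unique
```

Clicking "+ All Wikipedias" several times would then still save a single *.wikipedia entry.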

> I thought this through, and while the checkbox seems OK, I had some concerns. If you have only the "All Wikipedias" checked, and no other wikis (which currently no others are supported anyway), when you browse to the form there will be an empty input. That looked weird to me.
>
> So, I thought maybe we could have a "+ All Wikipedias" link, that when clicked will insert a wildcard *.wikipedia as the value (either creating a new input, or using the existing blank one). *.wikipedia will also show up in the autocompletion when typing in wikis.

That looks good to me. It seems like something people will quickly get the hang of once they start using it.

> If there are existing Wikipedias in the list, they are left there. However when you save, it will ignore those and instead only save *.wikipedia, and when you return to the form you'll see only *.wikipedia listed.

How about adding a tooltip to that link, indicating what it does and subtly warning people that this may increase the time needed to compute their metrics?

> There are other weird things, like clicking on "+ All Wikipedias" more than once will insert yet another input with *.wikipedia, but I think that's OK (the backend will remove duplicates). Hopefully it's still intuitive how it works.

Yeah, I think that's okay too.

Niharika renamed this task from Add a checkbox to query "All wikipedias" to Add a link to let people query for "All wikipedias". (Apr 10 2018, 6:28 PM)

This is finally ready for review! Please assist me in testing at https://tools.wmflabs.org/grantmetrics-test. Deliberate attempts to break things are encouraged.

I used the account There'sNoTime with the date range 2018-01-01 to 2018-01-31. There should be stats on three wikis: en.wikipedia, es.wikipedia and simple.wikipedia. You can try it out with other stewards too; most of these accounts do a lot of cross-wiki work.

How this works on the backend is pretty complicated. I'll try to explain, should you be interested, and this understanding may help with testing:

  • There are now three types of wikis, but all are stored in the event_wiki table. The types are:
    • "family wikis", such as *.wikipedia, *.wiktionary (though currently only wikipedias are allowed)
    • "child wikis" which are part of a family. So if *.wikipedia exists on an event, and there is a fr.wikipedia stored as an event_wiki for that same event, it's considered a child of the *.wikipedia record.
    • "orphan wikis" which are not part of a family. So say *.wikipedia exists on the event, and also fr.wikipedia and de.wiktionary. Only de.wiktionary is considered an orphan, if that makes sense. So if there are no family wikis on an event, and say there is fr.wikipedia and en.wikipedia, they are considered orphans and not children (since there is no associated family wiki).
    • For the record, I originally had considered creating separate models for each type of wiki, but this turned out to be more complicated. Instead we have the above semi-hacky way of determining the wiki types, relying on methods in the models that look for *. in the domain name, etc.
  • When you add a family wiki to an event, there initially are no records for the child wikis. Instead, we create them when generating the stats; that way you see per-wiki stats only for the child wikis that actually have stats.
  • Only child and orphan wikis are shown on the event page, e.g. en.wikipedia and fr.wikipedia. There are no stats shown for *.wikipedia itself, but since we only support Wikipedias right now, the "Totals" row is effectively the same as the stats for *.wikipedia.
  • If orphan wikis exist on the event (say en.wikipedia, fr.wikipedia), and you update the event to have a family wiki (*.wikipedia), the original two wikis and all of their associated stats are erased. Regenerating the stats will restore them if there are any stats to report for those wikis.
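
To make the three types concrete, here is a minimal Python sketch of the classification rule described in the list above. It is not the code from the models: `is_family` and `classify_wikis` are hypothetical names, and the only thing taken from the description is the idea of looking for `*.` in the domain name.

```python
def is_family(domain):
    """A family wiki is a wildcard domain such as '*.wikipedia'."""
    return domain.startswith("*.")


def classify_wikis(domains):
    """Map each domain on an event to 'family', 'child', or 'orphan'."""
    # Suffixes of the families present on this event,
    # e.g. "*.wikipedia" contributes ".wikipedia".
    family_suffixes = {d[1:] for d in domains if is_family(d)}
    kinds = {}
    for domain in domains:
        if is_family(domain):
            kinds[domain] = "family"
        elif any(domain.endswith(suffix) for suffix in family_suffixes):
            # Its family is also on the event, so it is a child.
            kinds[domain] = "child"
        else:
            # No matching family on the event.
            kinds[domain] = "orphan"
    return kinds
```

Note that the same domain can be a child on one event and an orphan on another, since the type depends on whether a matching family wiki exists on that event.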

I've added a lot of tests for the new methods in the models, but once we allow non-Wikipedias (T190461), T186917 will become especially important.


A few other related changes that I've made:

  • Remove all stats and clear the "updated at" attribute when an event is edited (285684c). This is a necessary consequence of how the "family wiki" functionality works (since "orphan" and would-be "child" wikis need to be erased once a family wiki is added). I think this is OK because after updating an event, the stats will be stale anyway. Also, some things like the number of participants are always current, while the other stats are not. I think if an event is updated, and there are existing participants, we might fire off a job to generate the stats automatically. Saving that for a different PR.
  • If no statistics have been generated, redirect to the event page when attempting to view the revision browser. Again because of the way "family wikis" work on the backend, individual wikis might be erased after updating the event, in which case the revision browser will error out because it doesn't have any wikis to query.
  • Reflect job status when on event page, see T188368 (a14665e)
  • Show error message when the "Calculate totals"/"Update stats" button fails.
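
The first two changes above could be pictured roughly like this. This is a simplified Python sketch under assumptions, not the actual models: the `Event` class, its fields, and the method names are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Event:
    wikis: list = field(default_factory=list)
    stats: dict = field(default_factory=dict)
    updated_at: Optional[datetime] = None  # None means stats were never generated

    def update_wikis(self, wikis):
        """Editing the event makes existing stats stale, so wipe them."""
        self.wikis = list(wikis)
        self.stats.clear()
        self.updated_at = None

    def revision_browser_target(self):
        # Without generated stats there may be no individual wikis to query,
        # so send the user to the event page instead of the revision browser.
        return "revisions" if self.updated_at is not None else "event"
```

For example, an event with stats for en.wikipedia that is edited to use *.wikipedia would lose its stats and redirect to the event page until the stats are regenerated.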

Pull request, to be merged once we're satisfied everything is stable: https://github.com/wikimedia/grantmetrics/pull/69

@Niharika You said you were able to produce a 500 error when adding multiple *.wikipedia's, via the "All Wikipedias" link? Could you give me an exact 1, 2, 3 of the steps you took? I still cannot reproduce :(

@MusikAnimal Here are the steps:

  1. Go to 'Create a new event'
  2. Enter a title for event
  3. Hit "All Wikipedias" thrice
  4. Save --- you see the 500 error.

I'm on Safari though it shouldn't matter. I just did it once more.

Ah, that's it! I was trying to repro when updating an existing event.

@Niharika That bug should be fixed now :)

Have you noticed any other weirdness?

Nothing else yet. I'll be giving it another whirl today and if everything looks good, we can move it to prod.

Niharika moved this task from In progress to Done on the Grant-Metrics board.