Create data export page
Closed, Resolved · Public · 8 Estimated Story Points

Description

This is a slightly tricky thing to do. Here's the proposed mockup - link.
Essentially, for each edit made during the event window by the event participants on the given wikis, we'd need to fetch:

  • timestamp
  • wiki
  • article
  • username
  • edit summary
  • link to diff

We need to offer them download links (as csv and wikitext table markup) for this data.
Currently, when we compute statistics (page created/improved), we don't keep track of the edits that were made while doing that.
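
To make the two export formats concrete, here is a minimal sketch of serializing the fields listed above to CSV and to a wikitext table. It is written in Python purely for illustration and is not the Grant Metrics implementation; the `FIELDS` names, the `to_csv`/`to_wikitext` helpers, and the choice to wrap only the edit summary in `<nowiki>` are all assumptions.

```python
import csv
import io

# Hypothetical column names, taken from the field list in the description.
FIELDS = ["timestamp", "wiki", "article", "username", "edit_summary", "diff_link"]


def to_csv(rows):
    """Serialize revision dicts to CSV text, header row first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_wikitext(rows):
    """Serialize revision dicts to a MediaWiki table, one cell per line."""
    lines = ['{| class="wikitable sortable"', "! " + " !! ".join(FIELDS)]
    for row in rows:
        lines.append("|-")
        for field in FIELDS:
            value = str(row[field])
            if field == "edit_summary":
                # Edit summaries are free text; <nowiki> keeps pipes and
                # markup inside them from breaking the table. A summary that
                # itself contains "</nowiki>" is not handled in this sketch.
                value = "<nowiki>" + value + "</nowiki>"
            lines.append("| " + value)
    lines.append("|}")
    return "\n".join(lines)
```

The CSV writer quotes any value containing a comma, so edit summaries survive a round trip through a spreadsheet.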

We could either keep track of this data when we do the total calculation and store it in a database, or generate it on the fly when the user hits "Export data" and show them a progress bar.
In the latter case, we can hold on to the data in PHP after showing them the pretty version and let them download the wikitext/CSV versions when they click download.
More ideas?

Event Timeline

Niharika added a subscriber: MusikAnimal.

@MusikAnimal I want to hear your thoughts on this.

Yeah that's a doozy! I would fetch the data when requested. For the view it should go reasonably fast with pagination, at least if we're only looking at a handful of wikis. Downloading all the data may take longer but I think in that case it's acceptable to have to wait. Do you know how many revisions we'd typically see for any given event?
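
As a rough sketch of the paginated view, assuming the revisions have already been fetched into a list: the `paginate` helper and the default page size of 50 are hypothetical, and a real implementation would more likely push the limit and offset into the database query than slice in memory.

```python
def paginate(revisions, page, per_page=50):
    """Return (items on the given 1-based page, total number of pages)."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    # Ceiling division; an empty result set still counts as one (empty) page.
    total_pages = max(1, -(-len(revisions) // per_page))
    start = (page - 1) * per_page
    return revisions[start:start + per_page], total_pages
```

Requesting a page past the end returns an empty list rather than raising, which keeps the view code simple.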

That's hard to estimate. Say we have 50 participants for an event and every participant makes 10 edits per day for about 10 days: that's ~5,000 revisions (50 × 10 × 10).
Of course that number will vary widely with when it's requested, participant size, event size etc. I'm just making a guess here. :)

We currently don't have a way to concretely estimate these numbers because Wikimetrics doesn't provide us any, AFAICT.

I think it's acceptable to have to wait too. We should fetch when requested.

Niharika set the point value for this task to 8. (Feb 6 2018, 11:56 PM)
Niharika moved this task from Needs Discussion to Up Next (May 6-17) on the Community-Tech board.

@Niharika Regarding the "Download as wikitext" feature, are we sure we want to export every revision? That would result in a lot of markup, which could potentially be too much for a wiki page. You might instead produce a table showing only the per-wiki statistics and totals, or fall back to that only when there are more than N total revisions.

You're right that there should be a sane limit but I don't know what it should be right now. I'd say for v1 we do all the revisions and after beta testing we figure out the limit. I'm adding this as a question for Sati.
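
For reference, the fallback being discussed could look roughly like the sketch below. The threshold `MAX_WIKITEXT_REVISIONS` is a made-up placeholder (the limit N is still undecided here), and `should_summarize` and `summarize_per_wiki` are hypothetical helpers, not part of Grant Metrics.

```python
from collections import Counter

# Placeholder value only; the thread leaves the real limit N open.
MAX_WIKITEXT_REVISIONS = 1000


def should_summarize(rows, limit=MAX_WIKITEXT_REVISIONS):
    """True when there are too many revisions to export row by row."""
    return len(rows) > limit


def summarize_per_wiki(rows):
    """Build a wikitext table of revision counts per wiki, plus a total row."""
    counts = Counter(row["wiki"] for row in rows)
    lines = ['{| class="wikitable"', "! wiki !! revisions"]
    for wiki, count in sorted(counts.items()):
        lines += ["|-", f"| {wiki} || {count}"]
    lines += ["|-", f"! Total !! {len(rows)}", "|}"]
    return "\n".join(lines)
```

For v1 the check would simply never trigger; once beta testing suggests a sensible limit, only the constant needs to change.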

Another question... how do we navigate to the "event data" page? I'm thinking next to the "Calculate" button there is a "View revisions" or "View full data" button, which makes it clear you'll be seeing all the revisions within the Grant Metrics interface. This also bundles all the stats-related actions together, since they'll all be disabled while calculations are in progress (otherwise exporting or pagination could suddenly give you different data, whilst hogging up our quota).

Ah, sorry for not making that clear. The event data page is what you see when you click on the "Export" button on the event stats page - the one next to "Calculate totals". Although I like your wording for it better. We should reword "Export" to "View all data" and keep the icon we have on it.

Ready for review! Everything seems to be going surprisingly fast, even the export options, but I haven't done any sort of load testing.

"Download as CSV" downloads an actual file, but for the wiki table I had it open the markup in a new tab. That way they can copy/paste to the wiki. On that note, we could even copy it to the clipboard for them, if we wanted!

Relevant commits: https://github.com/wikimedia/grantmetrics/compare/10f38f7695950bfa95c10867d3b193b164964d48...85459446c1a2a7689dd8a791d0f137a9411d50b7

This looks great! Thank you! Danny and I just tested it in a meeting and I was surprised to see both the CSV and the wikitable working perfectly. Having it open in a new tab is perfect. That's how I imagined it working. We can skip copying it to clipboard directly for now and maybe add that later when asked for.
I created two tickets for follow-up work: T187392: BUG: Pagination doesn't actually work on data page, and T187394: Aesthetic changes to 'view all data' page.