
[Spike 4hrs] Investigate grafana dashboard for PDF's
Closed, Resolved (Public)

Description

Timebox: 4hrs
Question: What are these graphs displaying and how are they displaying it?

background

We would like to track the changes in PDF rendering since the launch of Electron on all projects. These are currently tracked at https://grafana.wikimedia.org/dashboard/db/mediawiki-electronpdfservice?orgId=1; however, certain portions of the board don't quite make sense.

acceptance criteria

Event Timeline

phuedx renamed this task from [Spike] Investigate graphana board for PDF's to [Spike] Investigate grafana dashboard for PDF's. Jul 18 2017, 4:35 PM
Jdlrobson updated the task description.
Jdlrobson updated the task description.
ovasileva renamed this task from [Spike] Investigate grafana dashboard for PDF's to [Spike 4hrs] Investigate grafana dashboard for PDF's. Jul 19 2017, 12:43 PM

I'm curious, this task specifies a 4 hour window, which 4 hours in particular is the task talking about?

Provide a description of the variables in each graph in https://grafana.wikimedia.org/dashboard/db/mediawiki-electronpdfservice?orgId=1

Below is the description of each graph:

Selection screen views
The graph shows the number of clicks on the "Download as PDF" link in the sidebar. To be more precise, it's the number of times a user lands on 'Special:ElectronPdf' by clicking on the above-mentioned link.

Redirect selections
After the user lands on 'Special:ElectronPdf' (by clicking on the 'Download as PDF' link), they see two options for downloading as PDF. The following events are captured depending on which option is selected:

  • redirect_to_electron - When the user clicks on the 'Single column' option and clicks on download.
  • redirect_to_collection - When the user clicks on the 'Two column' option and clicks on download.
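
As a rough illustration of the two events above, here is a minimal Python sketch of one counter per option. Only the metric names `redirect_to_electron` and `redirect_to_collection` come from this task; the `record_selection` helper, the option labels, and the in-memory `Counter` are assumptions made for the example and are not the extension's actual code.

```python
from collections import Counter

# Hypothetical in-memory stand-in for the stats backend that feeds the
# dashboard; the real extension does not use this structure.
redirects = Counter()


def record_selection(option: str) -> None:
    """Count one download request for the chosen layout option."""
    key = {
        "single_column": "redirect_to_electron",
        "two_column": "redirect_to_collection",
    }[option]
    redirects[key] += 1


# Three made-up selections on Special:ElectronPdf.
for choice in ["single_column", "single_column", "two_column"]:
    record_selection(choice)

print(dict(redirects))
```

Each download click increments exactly one of the two counters, which is why the "Redirect selections" graph can be read as Electron vs Collection.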

Selection screen views (per wiki)
Similar to the "Selection screen views" board, but the data is split by wiki.

Redirect selections (per wiki)
Similar to the "Redirect selections" board, but the data is split by wiki.

PDF rendering over time total daily (all renderings)
The graph shows the total number of downloads of single article PDFs generated by the ElectronPdfService and OCG; and also the downloads of books generated by the Collection extension.

  • Electron Renders - On 'Special:ElectronPdf' when the user downloads a PDF using the 'Single column' option this metric is incremented by one.
  • OCG Renders - On 'Special:ElectronPdf' when the user downloads a PDF using the 'Two column' option this metric is incremented by one.
  • Collection Renders - When the user enables the book creator from the sidebar and, after adding pages to a book, clicks on "Show book" at the top of the page, the user is taken to the "Manage your book" page. On the right, there is a "Download" section. When the user clicks on the "Download" button in the "Download" section, the user is taken to a page where the render progress is shown. On that page, when the rendering is finished and the "Download the file" link is visible, this metric is incremented by one.

PDF rendering overtime daily (ElectronPdfService Extension only)
The graph shows the number of single-article PDF downloads made via the 'Download as PDF' link in the sidebar, which takes the user to the 'Special:ElectronPdf' page.

  • Electron Renders - When the user downloads a PDF using the 'Single column' option this metric is incremented by one.
  • OCG Renders - When the user downloads a PDF using the 'Two column' option this metric is incremented by one.

Investigate whether it's possible to get the following:

  • OCG renders before and after the switch
  • Collection renders before and after the switch
  • Electron renders after the switch

Yes, the current renders that are being recorded are visualized in the "PDF rendering over time total daily (all renderings)" graph (and other graphs too). OCG renders will decrease to zero when we sunset the feature. All renders will use Electron, but we'll still be able to distinguish between renders generated by the ElectronPdfService and Collection extensions.

I'm curious, this task specifies a 4 hour window, which 4 hours in particular is the task talking about?

It's a reminder for us not to spend more than 4 hours on the task and bring up any issues if the task is not resolved within that time.

@bmansurov this makes sense now that I have read the description of Spike! thanks.

however, certain portions of the board don't quite make sense

What about the board doesn't make sense? Does @bmansurov's investigation clarify the board at all?

@bmansurov

Provide a description of the variables in each graph in https://grafana.wikimedia.org/dashboard/db/mediawiki-electronpdfservice?orgId=1

Below is the description of each graph:

Selection screen views
The graph shows the number of clicks on the "Download as PDF" link in the sidebar. To be more precise, it's the number of times a user lands on 'Special:ElectronPdf' by clicking on the above-mentioned link.

How are we measuring this? For example, for 7/30 at 16:00 we have 1.27K per minute and 5.25K per hour, which doesn't quite make sense.

Redirect selections
After the user lands on 'Special:ElectronPdf' (by clicking on the 'Download as PDF' link), they see two options for downloading as PDF. The following events are captured depending on which option is selected:

  • redirect_to_electron - When the user clicks on the 'Single column' option and clicks on download.
  • redirect_to_collection - When the user clicks on the 'Two column' option and clicks on download.

Similarly, how often are we measuring here? For 7/30 at 16:00 we have 1.27K clicks per minute but only 576 Electron redirects per ??, which doesn't seem likely. Should that be 5.76K/hr?

Selection screen views (per wiki)
Similar to the "Selection screen views" board, but the data is split by wiki.

Redirect selections (per wiki)
Similar to the "Redirect selections" board, but the data is split by wiki.

PDF rendering over time total daily (all renderings)
The graph shows the total number of downloads of single article PDFs generated by the ElectronPdfService and OCG; and also the downloads of books generated by the Collection extension.

  • Electron Renders - On 'Special:ElectronPdf' when the user downloads a PDF using the 'Single column' option this metric is incremented by one.
  • OCG Renders - On 'Special:ElectronPdf' when the user downloads a PDF using the 'Two column' option this metric is incremented by one.
  • Collection Renders - When the user enables the book creator from the sidebar and, after adding pages to a book, clicks on "Show book" at the top of the page, the user is taken to the "Manage your book" page. On the right, there is a "Download" section. When the user clicks on the "Download" button in the "Download" section, the user is taken to a page where the render progress is shown. On that page, when the rendering is finished and the "Download the file" link is visible, this metric is incremented by one.

PDF rendering overtime daily (ElectronPdfService Extension only)
The graph shows the number of single-article PDF downloads made via the 'Download as PDF' link in the sidebar, which takes the user to the 'Special:ElectronPdf' page.

  • Electron Renders - When the user downloads a PDF using the 'Single column' option this metric is incremented by one.
  • OCG Renders - When the user downloads a PDF using the 'Two column' option this metric is incremented by one.

What is the difference between these two? For example, on 7/29 at 0:00 we have 43.4 Electron and 2.7 OCG on the total daily graph, and 43.4 Electron and 4.1 OCG on the ElectronPdfService-only graph.


Investigate whether it's possible to get the following:

  • OCG renders before and after the switch
  • Collection renders before and after the switch
  • Electron renders after the switch

Yes, the current renders that are being recorded are visualized in the "PDF rendering over time total daily (all renderings)" graph (and other graphs too). OCG renders will decrease to zero when we sunset the feature. All renders will use Electron, but we'll still be able to distinguish between renders generated by the ElectronPdfService and Collection extensions.

How are we measuring this? For example, for 7/30 at 16:00 we have 1.27K per minute and 5.25K per hour, which doesn't quite make sense.

I suppose you're looking at the daily range, where each point represents 15 minutes. If you zoom in, you'll see per-minute data that's not summed over 15 minutes. That's why 1.27K is about 4 times less than 5.25K.
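
To make the zoom-level effect concrete, here is a minimal Python sketch of summing per-minute counts into wider buckets. The data and the `sum_into_buckets` helper are hypothetical, and the sketch assumes the dashboard sums points when zoomed out; it illustrates the general effect only and does not try to reproduce the 1.27K and 5.25K figures.

```python
def sum_into_buckets(per_minute, bucket_minutes):
    """Sum consecutive per-minute counts into fixed-size buckets."""
    return [
        sum(per_minute[i:i + bucket_minutes])
        for i in range(0, len(per_minute), bucket_minutes)
    ]


# Hypothetical data: a steady 10 events per minute for one hour.
per_minute = [10] * 60

zoomed_in = sum_into_buckets(per_minute, 1)    # one point per minute
zoomed_out = sum_into_buckets(per_minute, 15)  # one point per 15 minutes

print(zoomed_in[0], zoomed_out[0])  # 10 150
```

The same underlying traffic produces a larger value per point once the points cover a longer interval, so values read at different zoom levels are not directly comparable.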

Similarly, how often are we measuring here? For 7/30 at 16:00 we have 1.27K clicks per minute but only 576 Electron redirects per ??, which doesn't seem likely. Should that be 5.76K/hr?

This is also per minute, but you have to zoom in. Also, the two series are not per minute vs per hour; they are Collection vs Electron.

What is the difference between these two? For example, on 7/29 at 0:00 we have 43.4 Electron and 2.7 OCG on the total daily graph, and 43.4 Electron and 4.1 OCG on the ElectronPdfService-only graph.

By "these two" do you mean the last two graphs? In "PDF rendering over time total daily (all renderings)" the OCG metric is logged using the OCG service, while in "PDF rendering overtime daily (ElectronPdfService Extension only)" data is logged by ElectronPdfService.

Shouldn't they display the same results though?

Not necessarily, because although we're measuring the same thing, we are doing so at different times and places. For example, the OCG metric gets incremented when a request is made to render. It ignores requests to re-render a collection. As for Electron, the metric is incremented when a request is made. Whether the actual request reaches the OCG backend (where the other metric is measured) is another question. There may be other discrepancies between these two methods.
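
A minimal Python sketch of why two counters placed at different points can disagree. The `RenderRequest` type, its fields, and the `count_renders` helper are hypothetical; they only model the two cases mentioned above (a request that never reaches the backend, and a re-render that the backend ignores) and do not reflect the actual extension or OCG service code.

```python
from dataclasses import dataclass


@dataclass
class RenderRequest:
    reaches_backend: bool  # False if the request is lost before the backend
    is_rerender: bool      # True if it re-renders an existing collection


def count_renders(requests):
    """Return (extension-side count, backend-side count)."""
    extension_count = 0  # incremented as soon as the request is made
    backend_count = 0    # incremented only for new renders the backend sees
    for req in requests:
        extension_count += 1
        if req.reaches_backend and not req.is_rerender:
            backend_count += 1
    return extension_count, backend_count


# Three made-up requests: one normal, one lost in transit, one re-render.
requests = [
    RenderRequest(reaches_backend=True, is_rerender=False),
    RenderRequest(reaches_backend=False, is_rerender=False),
    RenderRequest(reaches_backend=True, is_rerender=True),
]

print(count_renders(requests))  # (3, 1)
```

Both counters describe "renders", but they count different subsets of the same requests, so their graphs need not match.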

@bmansurov - makes sense, thank you. Closing this for now, although I do have one last question - do you think these discrepancies in the measurement method should cause this much variation? It makes me question the accuracy of either graph.

I've created a new graph to compare the two OCG renders. I think the discrepancy is also coming from the fact that the OCG renders via electron only take single article renders into account. The other measure takes both collection and electron renders. The difference can be thought of as the number of collection renders.

Edit: I misspoke above. I've added more fields to the graph. Basically, the electron renders are only coming from single page articles, while OCG renders are coming from both electron and books.