
Transferring data from Analytics API to Databox
Closed, Declined · Public

Description

Hello!

I have an urgent request that I hope can be marked as high priority.

I would like to transfer data for our Wikipedia project from the Analytics API to Audience Engagement's new tool, Databox. We are interested in the following data:

  • pageviews
  • pageviews by country
  • unique devices
  • edits
  • new pages
  • new registered users

Integrating the Analytics API with Databox would have a tremendous impact on the entire communications department. Wikistats and Superset are very helpful for visualizations, but we need to integrate the Analytics API because:

  1. The clients (Directors & Managers) will have one place to view all of their analytics instead of having to go to multiple sources. Databox allows us to integrate many different sources (i.e. having social media analytics, web traffic analytics, edits data, etc. all in one place).
  2. Databox has features that let us set up goal tracking to measure our performance against a target (e.g. a gas gauge indicator showing how close we are to accomplishing our goal of increasing readership). We can also set up daily and weekly scorecards. Furthermore, attaching goal tracking to the data coming from the Analytics API will hold us more accountable for the goals we have set under our team OKRs.
  3. It will also help us with storytelling. We will be able to tell a better story by having all of our data (social media, web traffic, campaign results, etc.) displayed in one place.

Please let me know if you have any questions!

Event Timeline

CGlenn updated the task description.
CGlenn added a subscriber: mpopov.

Hi @CGlenn, thank you for opening this task and letting us know about its priority. We will triage this task during our next Board Refinement meeting on Tuesday 7/21 and provide you an update.

@CGlenn - this is a big request and is not something I believe Product Analytics will be able to support in FY20-21, even as I acknowledge the business value in having information more consolidated and easy to access. Below are a couple of the implications and barriers I'm weighing. I would be happy to meet with you and directors/managers to discuss further or explore possible alternatives.

We cannot import any Foundation-internal data into a third-party tool without full Privacy, Security, and Legal reviews. From the list you provided, it sounds like you're thinking about the public APIs (https://wikimedia.org/api/rest_v1/), which don't carry the same burden -- but since our reports in Superset rely on Foundation-internal data, I wanted to call this out.
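For reference, the metrics in the original request do map onto the public Wikimedia REST API mentioned above, which requires no authentication. The sketch below is a minimal illustration, not a supported pipeline: it builds the public `/metrics/pageviews/aggregate` and `/metrics/unique-devices` endpoint URLs that a third-party tool like Databox (or any HTTP client) could poll. The project name and date range are example values.

```python
from datetime import date

# Public Wikimedia Analytics REST API (no authentication required).
API_BASE = "https://wikimedia.org/api/rest_v1/metrics"

def pageviews_url(project, start, end,
                  access="all-access", agent="user", granularity="daily"):
    """URL for aggregate pageviews; timestamps use the YYYYMMDDHH format."""
    return (f"{API_BASE}/pageviews/aggregate/{project}/{access}/{agent}/"
            f"{granularity}/{start:%Y%m%d}00/{end:%Y%m%d}00")

def unique_devices_url(project, start, end,
                       access_site="all-sites", granularity="daily"):
    """URL for unique devices; timestamps use the YYYYMMDD format."""
    return (f"{API_BASE}/unique-devices/{project}/{access_site}/"
            f"{granularity}/{start:%Y%m%d}/{end:%Y%m%d}")

# Example: July 2020 pageviews for English Wikipedia.
url = pageviews_url("en.wikipedia.org", date(2020, 7, 1), date(2020, 7, 31))
```

Edits, new pages, and new registered users have analogous public endpoints under `/metrics/edits/aggregate`, `/metrics/edited-pages/new`, and `/metrics/registered-users/new`; pageviews by country is served monthly via `/metrics/pageviews/top-by-country`. The maintenance concerns below apply to any pipeline built on these.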

Pipelines built on API data will require testing, maintenance, and updates, both from a technical perspective (APIs change, systems require updates) and from a data governance perspective. The Analytics Engineering and Product Analytics teams both dedicate resources to setting up and maintaining Superset, making rich datasets available, monitoring data quality, ensuring datasets are consistent with key data definitions, and training staff across the Foundation. Adding a separate reporting tool with "duplicated" data adds substantial complexity to maintaining our data ecosystem and keeping datasets consistent.

Thank you for this feedback, Kate!

I will share this with my team.

LGoto triaged this task as Medium priority. Aug 4 2020, 5:17 PM

Hey @CGlenn! I'm closing this task, but do let me know if we need to circle back on it and if your team wants a follow-up discussion.

@kzimmerman Thank you for closing the task! I will reach out if this conversation comes back up. :)