
Investigate behaviour tracking software options [SPIKE] [2hr]
Closed, Resolved · Public · Spike

Description

In parallel to designing personas for the library (T265000) we want to come to an understanding of how users are using the Library Card platform today, and get a baseline for some data we're interested in tracking across the design improvements.

We want to understand which software we should use to achieve this.

We did, at one time, install Matomo (T209831).

  • Is Matomo still a good choice for collecting this data, or would we be better served with something else?

Event Timeline

Restricted Application changed the subtype of this task from "Task" to "Spike". · Oct 8 2020, 10:03 AM

Additional context: we chose Matomo previously because it is a widely used package with reasonably good documentation, and it can be configured for either JavaScript tracking or log parsing.

Noting that once we've implemented this software we can request data analytics support for understanding what we're seeing.

Looking through the options, I still think Matomo is probably the best choice. Its architecture and feature set are very similar to those of other popular open source alternatives such as Open Web Analytics and PostHog, but it looks like the analytics engineering team has direct experience with Matomo, and is already running an instance on behalf of small project sites:
https://wikitech.wikimedia.org/wiki/Analytics/Systems/Matomo
I wonder if we might be able to request access to the instance they are already running? They have notes for their folks on how to add new teams.

Also, while there are differences between the alternatives, we haven't really identified any special analytics needs that would push one above the others.

> it looks like the analytics engineering team has direct experience with Matomo, and is already running an instance on behalf of small project sites: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Matomo

Oh, neat. I logged in and it's being used for Event Metrics, another VPS tool, so it should be feasible to use it for Library Card. I'll enquire.

I spoke briefly with @elukey on IRC, who explained that it would be helpful to have a more accurate estimate of our current daily visitors, to understand if that Matomo instance will work for our scale - it's designed to be used for very low traffic sites, which we may fit, at least for now.

I estimated ~100 users per day based on us having 50 users per day going through the proxy part of the tool, and some unknown-to-me number browsing the site elsewhere. @jsn.sherman would it be possible to get a more accurate estimate for the tool's usage in the past, say, week? Either daily users or page requests per time period?

We then unfortunately have the as-yet unknown effects of rolling out the user notification. T271962 should help us estimate the degree to which we expect our daily pageviews to change.

Based on looking through requests, I'd say that we have about 50 daily users, because that count lines up nicely with daily hits to /oauth/callback/ (which is requested as part of site login). That number is relatively consistent over several weeks, indicating a steady rate of users logging back in after session expiration and new user activity. We currently see between 3k and 9k daily hits, excluding API requests; that figure doesn't include the beacon either. Given their listed ceiling of 10k daily requests per site, my expectation is that we would exceed their capacity if the Echo notification grows the user base significantly, but we can wait for the data to come back from T271962.
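
For anyone wanting to reproduce the estimate above, here is a rough sketch of the counting approach: tally daily hits to /oauth/callback/ as a proxy for daily users, and tally all non-API hits per day. The log line layout (combined-log style) and the "/api/" prefix are assumptions for illustration; the real Library Card logs may look different.

```python
import re
from collections import Counter

# Assumed combined-log-style line, e.g.
# 1.2.3.4 - - [18/Jan/2021:10:00:00 +0000] "GET /partners/ HTTP/1.1" 200 512
# The actual server log format is not stated in this task.
LOG_LINE = re.compile(
    r'\[(?P<day>\d{2}/\w{3}/\d{4}):[^\]]*\]\s+"(?P<method>[A-Z]+)\s+(?P<path>\S+)'
)


def daily_counts(lines, login_prefix="/oauth/callback/", api_prefix="/api/"):
    """Return (logins_per_day, hits_per_day) as Counters keyed by date string.

    login_prefix is the path from the comment above; api_prefix is a
    hypothetical placeholder for however API requests are identified.
    """
    logins = Counter()
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that don't look like request entries
        day, path = match.group("day"), match.group("path")
        if path.startswith(api_prefix):
            continue  # API requests are excluded from the hit count
        hits[day] += 1
        if path.startswith(login_prefix):
            logins[day] += 1
    return logins, hits
```

Feeding this an open log file (for example `daily_counts(open(path))`, with `path` pointing at wherever the logs live) would give per-day numbers to compare against the ~50 logins and 3k to 9k hits mentioned above.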

I expect the notification to at least double our user base (a more useful estimate is probably 5-10x) so with our current rate of daily hits it seems like we should use our own instance.
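
To make the arithmetic behind this explicit, here is a small sketch that scales the current daily hit range by a growth factor and compares it with the shared instance's listed ceiling. It assumes hits grow linearly with the user base, which is a simplification; the figures are the ones quoted in this thread.

```python
# Figures quoted in this thread.
DAILY_HITS_RANGE = (3_000, 9_000)   # current daily hits, excluding API requests
SHARED_INSTANCE_CEILING = 10_000    # listed daily requests per site


def projected_hits(hits_range, factor):
    """Scale the (low, high) daily hit range by a user-base growth factor.

    Assumes hits scale linearly with users, which may not hold in practice.
    """
    low, high = hits_range
    return low * factor, high * factor


def ceiling_check(hits_range, factor, ceiling=SHARED_INSTANCE_CEILING):
    """Classify the projected range against the ceiling."""
    low, high = projected_hits(hits_range, factor)
    if high <= ceiling:
        return "fits"
    if low <= ceiling:
        return "may exceed"
    return "exceeds"


if __name__ == "__main__":
    for factor in (1, 2, 5, 10):
        print(factor, projected_hits(DAILY_HITS_RANGE, factor),
              ceiling_check(DAILY_HITS_RANGE, factor))
```

Under that assumption, even a doubling puts the busier days over 10k, and at 5x the quietest days are over it as well.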

@jsn.sherman What do we need to consider for this?

  • Setting up the instance
  • Adding some tracking to Library Card to link it

Anything else/more specific?

Would a TOS amendment be required?

> @jsn.sherman What do we need to consider for this?
>
>   • Setting up the instance
>   • Adding some tracking to Library Card to link it
>
> Anything else/more specific?

Those are the technical tasks, yep.
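
As a rough illustration of the "adding some tracking" piece, this sketch builds a request URL for Matomo's HTTP Tracking API, which is one way a server-side integration could report a page view. The instance URL, site ID, and page values are hypothetical placeholders, and this isn't necessarily how Library Card would integrate; the usual JavaScript tracking snippet is the other option.

```python
from urllib.parse import urlencode


def build_tracking_url(matomo_base, site_id, page_url, action_name):
    """Build a Matomo HTTP Tracking API request URL for a single page view.

    idsite and rec=1 are the required tracking parameters; url and
    action_name describe the page being viewed. All argument values are
    supplied by the caller and are placeholders in the example below.
    """
    params = {
        "idsite": site_id,
        "rec": 1,
        "url": page_url,
        "action_name": action_name,
        "apiv": 1,
    }
    return f"{matomo_base.rstrip('/')}/matomo.php?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical instance and page; replace with real values once known.
    print(build_tracking_url(
        "https://matomo.example.org/",
        1,
        "https://example.org/partners/",
        "Partners",
    ))
```

Each such request counts toward the instance's daily request total, which matters for the capacity question above.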

> Would a TOS amendment be required?

This is exactly where we stalled before. We got those technical steps done and the thing deployed in staging, but then it wasn't clear what we needed to do in terms of informing users.

Besides the technical pieces, we need to work with Legal to understand the TOS question, and get this set up and configured on Staging.

I've emailed Legal asking what we need to consider in terms of data privacy/retention/acceptance.

Moving this to Done pending Legal's response. I'll file follow-up tasks when we have that.