Enable the TagManager plugin for Matomo
Closed, Resolved (Public)

Assigned To
Authored By
BTullis
Oct 27 2023, 2:15 PM

Description

We have received a request from the WMF-Communications team to enable the TagManager plugin on our Matomo instance.

The initial use-case is for managing the tracking data of the wikimediafoundation.org website, although it might be used more widely across other sites on Matomo.
The TagManager functionality is already built into our current version of Matomo, so it is only a matter of enabling the feature.

This ticket is tracking that request and any changes made as a result.

In addition to the TagManager plugin, we have an outstanding request to enable the Marketing Campaigns Reporting plugin, under T319013: Enable the Marketing Campaigns Reporting plugin for matomo.
We should review whether or not that is still desirable and whether it should be expedited.

As a side-note, I have been sharing information with @SCampos-WMF about the Metrics Platform and pointing out that this is likely to be the preferred framework for instrumenting all of the Foundation's websites, over time. Given this FAQ entry and this ticket: T318832: Modify the JS Client to be used non MW powered sites, right now might be a good opportunity to start thinking about getting wikimediafoundation.org onboarded with the Metrics Platform.

In the meantime, the effort to enable the Tag Manager functionality in Matomo is low and the benefit to the communications team could be large.

image.png (653×1 px, 107 KB)

Event Timeline

Change 969341 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Enable the TagManager plugin functionality on Matomo

https://gerrit.wikimedia.org/r/969341

The patch is ready to go. Before merging, I'm going to ask whether anyone has concerns about enabling the plugin.

Adding Security-Team to request their review.
Has your team any concerns about our enabling the TagManager plugin for Matomo?
Thanks.

sbassett added a subscriber: sbassett.

Has the code changed significantly in the last 6 months or so? If not, this would likely only warrant a privacy review - I'll tag Privacy Engineering so it gets in their queue.

No, the TagManager plugin was already a part of the Matomo codebase when it was originally installed. Thanks, we will wait on the privacy review.

Hi, all — I’ll share here a joint privacy review of the two proposed changes: enabling the TagManager and the Marketing Campaign Reporting plugins, as well as a succinct privacy risk assessment of the self-hosted Matomo instance, although it wasn’t specifically requested.

  • T349910 aims at enabling the TagManager plugin for the Foundation’s Matomo instance. Enabling the Tag Manager plugin is not introducing any new privacy risk per se. Once enabled, the plugin will function as a "black box" allowing and containing all the analytics tags that will be created by additional plugins such as Marketing Campaigns Reporting.
  • T319013 would enable the Marketing Campaigns plugin on that self-hosted Matomo instance, with the direct aim of tracking analytics information for the SoundLogo website using UTM parameters. The parameters proposed (utm_campaign, utm_source, and utm_medium) are not expected to store any identifying information. The only parameter of concern is utm_source, as it relates to the URL from which visitors came. However, the plugin’s documentation suggests that full URLs will not be processed. Instead, only the referrer’s domain will be captured. Furthermore, an initial inspection of the plugin’s codebase indicates that the information collected as tags (or UTM parameters) will not be shared with any external parties; data will remain within the Foundation’s premises.
  • The privacy risk assessment of the self-hosted Matomo instance helped surface some good signals. The Foundation uses a self-hosted instance to gather statistics on a small number of its non-MediaWiki websites, including Wikipedia15, Wikimedia Foundation, Techblog, and the Developer Portal. The instance is restricted to staff and NDA’ed users, and protected by two authentication layers: LDAP credentials and a Matomo-specific username and password combination. By default, analytics data is only stored within the Foundation infrastructure. Additionally, the instance implements IP address anonymization with a 16-bit mask, making it difficult to infer even city-level information, though it’s still possible to determine the visitor’s country. While the Matomo instance collects some identifying information (ID, country details inferred from the masked IPs, User Agent, full referrer URL), the independence from third parties, access control, and IP masking help reduce the overall risk to a LOW level.
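
To illustrate the 16-bit IP mask mentioned in the last point, here is a minimal sketch of what zeroing the lowest 16 bits does to an IPv4 address. This is not Matomo's own code; the function name `anonymize_ipv4` and the sample address (from a documentation range) are assumptions made for illustration, and IPv6 handling is deliberately left out.

```python
import ipaddress


def anonymize_ipv4(address: str, masked_bits: int = 16) -> str:
    """Zero the lowest `masked_bits` bits of an IPv4 address.

    Hypothetical helper for illustration only, not Matomo's implementation.
    """
    if not 0 <= masked_bits <= 32:
        raise ValueError("masked_bits must be between 0 and 32")
    ip = ipaddress.IPv4Address(address)
    # Build a mask that keeps the high bits and clears the low ones.
    keep_mask = (0xFFFFFFFF << masked_bits) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(int(ip) & keep_mask))


# Sample address taken from the 198.51.100.0/24 documentation range.
print(anonymize_ipv4("198.51.100.23"))  # 198.51.0.0
```

With the last two octets removed, the remaining prefix typically still maps to a country, but rarely to anything as precise as a city.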

In light of the above, the privacy risk associated with the two proposed changes was categorized as LOW[1]. That being said, it is recommended that the AppSec team double check the codebase of the MarketingCampaignReporting plugin, if capacity allows it.

[1] Internal threat modeling documentation : https://docs.google.com/spreadsheets/d/1p1KxIHLBdbJzrEBNqPAK-UwqJgeQI6XPX2i8giX2ywQ/edit#gid=32059142
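
For readers less familiar with the UTM parameters discussed in the review, the following sketch shows how the three proposed parameters could be read from a landing-page URL, and how a referrer can be reduced to its domain only. It is not the plugin's code; `extract_utm`, `referrer_domain`, and the example URLs are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

# The three parameters named in the privacy review.
UTM_KEYS = ("utm_campaign", "utm_source", "utm_medium")


def extract_utm(url: str) -> dict:
    """Return only the UTM parameters present in the URL's query string."""
    query = parse_qs(urlsplit(url).query)
    return {key: query[key][0] for key in UTM_KEYS if key in query}


def referrer_domain(referrer_url: str) -> str:
    """Keep the referrer's host name and drop the path and query."""
    return urlsplit(referrer_url).hostname or ""


landing = ("https://example.org/?utm_campaign=launch"
           "&utm_source=newsletter&utm_medium=email&other=1")
print(extract_utm(landing))
print(referrer_domain("https://news.example.com/path/article?id=7"))
```

None of the three values identifies a visitor on its own, and reducing the referrer to its domain discards the path and query string.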

Change 969341 merged by Btullis:

[operations/puppet@production] Enable the TagManager plugin functionality on Matomo

https://gerrit.wikimedia.org/r/969341

BTullis moved this task from Blocked / Waiting to Done on the Data-Platform-SRE board.

@sguebo_WMF thanks so much for your input. I've gone ahead and enabled the TagManager plugin, so I can mark this ticket as resolved.

I'll make reference to your comments above on T319013: Enable the Marketing Campaigns Reporting plugin for matomo and, as you suggest, seek a security review.
We may well be planning a Matomo version upgrade as well in the near future, so perhaps these can be rolled into one review.

Looking at the TagManager and MarketingCampaignsReporting plugins for Matomo, the AppSec team could likely perform some basic third-party / vendor reviews of the code, but that's about the most we are able to do for massive vendor codebases. Those requests would need to be filed separately via our Application Security Reviews process.

Change 975058 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Configure Matomo's TagManager to write to existing tmpdir

https://gerrit.wikimedia.org/r/975058

Change 975058 merged by Btullis:

[operations/puppet@production] Configure Matomo's TagManager to write to existing tmpdir

https://gerrit.wikimedia.org/r/975058

Change 975311 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Fix an issue with the matomo TagManager configuration

https://gerrit.wikimedia.org/r/975311

Change 975311 merged by Btullis:

[operations/puppet@production] Fix an issue with the matomo TagManager configuration

https://gerrit.wikimedia.org/r/975311

I'm reopening this ticket, since we have noticed that the plugin does not yet function correctly.

When a tag is created, a new JS file is generated on the Matomo server, which is supposed to be downloaded by the client.
Here is a snippet from a page containing a new Tag, in preview mode.

image.png (124×969 px, 34 KB)

However, when I try to download that file with curl I get a 302 redirect to the IDM system.

image.png (169×1 px, 38 KB)

I believe that we need to update the Apache configuration to permit this type of request without redirecting to the IDM.

BTullis triaged this task as Medium priority. Nov 22 2023, 10:51 AM
BTullis moved this task from Done to In Progress on the Data-Platform-SRE board.

Change 976686 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Add another public endpoint to our matomo installation

https://gerrit.wikimedia.org/r/976686

I have added the required configuration to our apache2 site template for matomo in
976686: Add another public endpoint to our matomo installation | https://gerrit.wikimedia.org/r/c/operations/puppet/+/976686

The requirement is listed here: https://matomo.org/faq/on-premise/how-to-configure-matomo-for-security/#other-tips

Change 976686 merged by Btullis:

[operations/puppet@production] Add another public endpoint to our matomo installation

https://gerrit.wikimedia.org/r/976686

Oh, it looks like that's not working. I deployed that change, but it seems that I am still receiving 302 redirects for that URL.

btullis@marlin:~$ curl https://piwik.wikimedia.org/js/container_HBF2fCC3.js
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="https://idp.wikimedia.org/login?service=https%3a%2f%2fpiwik.wikimedia.org%2fjs%2fcontainer_HBF2fCC3.js">here</a>.</p>
</body></html>

This is the log entry.

2023-11-22T13:52:11	616	2620:0:861:103:10:64:32:137	-/302	286	GET	http://piwik.wikimedia.org/js/container_HBF2fCC3.js	-	text/html	-	<my IP address>, 10.80.0.14	curl/8.2.1--	-	-	2620:0:861:103:10:64:32:137	ce3cda02-1b91-4e48-9e7e-0ac74635460f	91.135.7.94

I'll continue to investigate.
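
As a rough aid for repeating this check, here is a small sketch that decides whether a response is a redirect to the IdP login page and, if so, recovers the originally requested URL from the `service` parameter. The function name `sso_redirect_target` is hypothetical, and it assumes the redirect target matches the link shown in the 302 response body above.

```python
from urllib.parse import parse_qs, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def sso_redirect_target(status, location, idp_host="idp.wikimedia.org"):
    """Return the requested URL if this is a redirect to the IdP, else None.

    Hypothetical helper; assumes the IdP passes the original URL in a
    percent-encoded `service` query parameter, as in the curl output above.
    """
    if status not in REDIRECT_CODES or not location:
        return None
    parts = urlsplit(location)
    if parts.hostname != idp_host:
        return None
    service = parse_qs(parts.query).get("service")
    return service[0] if service else None


location = ("https://idp.wikimedia.org/login?service="
            "https%3a%2f%2fpiwik.wikimedia.org%2fjs%2fcontainer_HBF2fCC3.js")
print(sso_redirect_target(302, location))
# https://piwik.wikimedia.org/js/container_HBF2fCC3.js
```

A publicly retrievable container file should produce `None` here, since the response would be a 200 rather than a redirect to the login page.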

Change 976750 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Fix Matomo TagManager functionality

https://gerrit.wikimedia.org/r/976750

Change 976750 merged by Btullis:

[operations/puppet@production] Fix Matomo TagManager functionality

https://gerrit.wikimedia.org/r/976750

There has been an improvement, but it's still not working correctly.
Here's a screenshot from the page with the preview container embedded.

image.png (1×1 px, 207 KB)

We can see that the first JavaScript container has been retrieved and has executed, but there is no CSS and I'm not sure that it would function correctly.
For comparison, here is another screenshot of what a working container preview should look like.
image.png (2×3 px, 2 MB)

I can see that there are at least two things going wrong.
The first is that the Content-Security-Policy seems to have something to say about an additional resource.
image.png (596×1 px, 289 KB)

The second is that the plugin attempts to retrieve some specific URLs from Matomo and is redirected to the single-sign-on mechanism.
I can update the Apache configuration to allow those, but the CSP is outside of our control, I believe.

Change 977057 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Matomo: permit public retrieval of specific CSS and JS files

https://gerrit.wikimedia.org/r/977057

Hi @SCampos-WMF

I've tested the settings in https://gerrit.wikimedia.org/r/977057 manually, and they seem to be working, but the stylesheet is still blocked on the site's CSP.

image.png (300×1 px, 112 KB)

The current security policy seems to be:

default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval' https://piwik.wikimedia.org https://stats.wp.com https://pixel.wp.com https://www.youtube.com https://player.vimeo.com http://localhost https://localhost http://localhost:8080; frame-src 'self' https://www.youtube.com https://player.vimeo.com; style-src 'self' 'unsafe-inline'; img-src 'self' data: https://piwik.wikimedia.org https://wikipedia.org https://upload.wikimedia.org; font-src 'self' data:; connect-src 'self' wss://public-api.wordpress.com https://*.wikipedia.org

I think that you would need to add https://piwik.wikimedia.org to the style-src section in order to have this work, but I believe that it is outside of our control.

It's also notable that the request to https://pixel.wp.com is blocked because it's not permitted by the img-src policy, but I don't believe that's relevant to this particular ticket. That's a different mechanism from a third party.
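
To make the two CSP observations easier to check, here is a minimal sketch that splits the policy quoted above into directives and tests whether a given origin is literally listed. The helpers `parse_csp` and `lists_source` are hypothetical, and the matching is deliberately simplified: a real browser also evaluates 'self', wildcards, schemes, and ports, so treat this only as a quick sanity check.

```python
# Policy string copied from the comment above.
POLICY = (
    "default-src 'self'; "
    "script-src 'self' 'unsafe-inline' 'unsafe-eval' https://piwik.wikimedia.org "
    "https://stats.wp.com https://pixel.wp.com https://www.youtube.com "
    "https://player.vimeo.com http://localhost https://localhost http://localhost:8080; "
    "frame-src 'self' https://www.youtube.com https://player.vimeo.com; "
    "style-src 'self' 'unsafe-inline'; "
    "img-src 'self' data: https://piwik.wikimedia.org https://wikipedia.org "
    "https://upload.wikimedia.org; "
    "font-src 'self' data:; "
    "connect-src 'self' wss://public-api.wordpress.com https://*.wikipedia.org"
)


def parse_csp(policy: str) -> dict:
    """Split a CSP string into {directive: [source expressions]}."""
    directives = {}
    for part in policy.split(";"):
        tokens = part.split()
        if tokens:
            # If a directive is repeated, keep the first occurrence.
            directives.setdefault(tokens[0].lower(), tokens[1:])
    return directives


def lists_source(directives: dict, directive: str, source: str) -> bool:
    """Literal check only; falls back to default-src if the directive is absent."""
    sources = directives.get(directive, directives.get("default-src", []))
    return source in sources


csp = parse_csp(POLICY)
print(lists_source(csp, "script-src", "https://piwik.wikimedia.org"))  # True
print(lists_source(csp, "style-src", "https://piwik.wikimedia.org"))   # False
print(lists_source(csp, "img-src", "https://pixel.wp.com"))            # False
```

This matches what the browser console reports: scripts from piwik.wikimedia.org load, while the stylesheet and the pixel.wp.com image request are refused.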

Apart from the style sheet in preview mode though, I think that the TagManager functionality should now be working. Please feel free to let me know how you get on with testing it and updating the CSP of the site(s).

Hey @BTullis, thanks for addressing this issue! I'll generate a ticket and share it with our technical partners to address it in our current Sprint. I'll keep you updated on the progress.

Change 977057 merged by Btullis:

[operations/puppet@production] Matomo: permit public retrieval of specific CSS and JS files

https://gerrit.wikimedia.org/r/977057

BTullis moved this task from In Progress to Done on the Data-Platform-SRE board.