
Instrument the landing page
Closed, Resolved | Public | 1 Story Points | Oct 30 2018

Description

Placeholder to track work on instrumenting events in the landing page

Spec
URL: bienvenida.wikimedia.org

Details

Due Date
Oct 30 2018, 7:00 AM

Event Timeline

Prtksxna updated the task description. Oct 17 2018, 5:18 AM

@atgo we have most of the events down for instrumentation. To collect data about the visitor we need to know which demographics need to be captured, e.g. device info, country etc. Would you list them here so we can make sure we can get those?

Other events are marked on InVision.

Prtksxna changed Due Date from Sep 30 2018, 7:00 AM to Oct 30 2018, 7:00 AM. Oct 18 2018, 1:53 AM

@Nirzar

  • country
  • device (mobile/desktop)
  • browser
  • We'll also want to know where they came from (referrer). We could pass a custom parameter if useful. There will be multiple channels for the video and we'd want to see differences between them.

@Nuria could we get a Piwik instance for this microsite?

Apart from the demographic stuff that @atgo mentioned we also need to track clicks to the four links that would be on the website (test instance), ie on:

  • Lee más
  • es.wikipedia.org
  • Apple app store
  • Google play store

Would this be possible using event tracking? Could you guide us on how this can be achieved?

Just let us know what the domain/name of the site is, when it's up, and we can make you a little tracking code. Also, do you have any estimates of traffic?

Milimetric moved this task from Operational Excellence to Incoming on the Analytics board.

As for traffic estimates, industry standards x our view targets for the videos say about 100k clickthroughs during the campaign.

Nuria added a comment. Oct 22 2018, 8:06 PM

@atgo: 100K clickthrough over what period of time? What is the top level site?

~1 month. This is on es.wiki - I'm not sure if that's what you mean by top
level site.

Nuria added a comment. Oct 22 2018, 8:20 PM

@atgo, let me understand, we thought this was a microsite, like http://transparency.wikimedia.org or http://reserach.wikimedia.org which is what we normally measure with piwik, is that not the case? Is this a specific page in a wiki?

Yes, it's supposed to be a page from es.wiki. I believe the URL we'd
settled on was http://es.wikipedia.org/bienvenida

Nuria added a comment. Oct 22 2018, 8:54 PM

I see, I think there is a misunderstanding: piwik is not used on wikis, it is designed for low-traffic sites like http://transparency.wikimedia.org , what we call "microsites".

If the page you are creating is hosted in es.wikipedia.org it will be captured by the regular workflow of pageviews and its pageviews will be visible in the pageview tool. An example of querying for a different page in es.wikipedia: https://tools.wmflabs.org/pageviews/?project=es.wikipedia.org&platform=all-access&agent=user&range=last-month&pages=Ronaldo

The data that will be collected long term includes the (parsed) user agent, country and city of origin, and whether the referrer is internal or external.
The short-term data (held for 90 days) will include more specific referrers. The detailed data per page is available via command line tools; aggregated views per day will be available in the tool I linked above as soon as the page is up (well, allow one day of processing).

Now, if you want to instrument any links on those pages, your team will need to do it via eventlogging. Hopefully this makes sense.

@Nirzar @Prtksxna see above. Do you have what you need?

Milimetric added a comment. Edited Oct 24 2018, 6:22 PM

EDIT: never mind, disregard, wasn't in the loop. Piwik it is then :)

Well, but http://es.wikipedia.org/bienvenida is not a normal wiki URL, like http://es.wikipedia.org/wiki/PageWithPageviews. So it won't be tracked by the pageview tool, it would only be available in webrequest. I see two kinds of data being requested here. One set from @atgo that would all be available in webrequest:

country, device, browser, referer

And another set from @Prtksxna that includes link click tracking. That would not be possible from webrequest, it would only work if instrumented with piwik or eventlogging. To me, 100k requests for the month sounds small, and maybe it would be ok to use piwik (btw, they renamed it to matomo recently). But if that's a concern from the ops side, which Nuria knows better, and this data is absolutely necessary, then eventlogging is the only alternative.

Nuria added a comment. Oct 24 2018, 6:25 PM

Clarifying: Talked to @Prtksxna and team is actually working on a static micro-site, url pending.

Hey @Nuria, the URL is going to be bienvenida.wikimedia.org.

Prtksxna updated the task description. Oct 30 2018, 11:34 PM

Ok, @Prtksxna let us know when it is live and we can test the snippet. I just created a site:

<!-- Matomo -->
<script type="text/javascript">

var _paq = _paq || [];
/* tracker methods like "setCustomDimension" should be called before "trackPageView" */
_paq.push(['trackPageView']);
_paq.push(['enableLinkTracking']);
(function() {
  var u="//piwik.wikimedia.org/";
  _paq.push(['setTrackerUrl', u+'piwik.php']);
  _paq.push(['setSiteId', '18']);
  var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
  g.type='text/javascript'; g.async=true; g.defer=true; g.src=u+'piwik.js'; s.parentNode.insertBefore(g,s);
})();

</script>
<!-- End Matomo Code -->

Thanks @Nuria! Could you point me to some documentation on how to login to and see the data in Piwik?

Ottomata claimed this task. Nov 1 2018, 4:31 PM
Ottomata reassigned this task from Ottomata to Nuria.
Ottomata moved this task from Incoming to Smart Tools for Better Data on the Analytics board.
Ottomata added a project: Analytics-Kanban.
Ottomata added a subscriber: Ottomata.
Isaac added subscribers: Dzahn, Isaac. Nov 1 2018, 8:23 PM

Hey @Prtksxna or @Dzahn -- Isaac chiming in here from Research. We've been talking with @atgo about estimating readership from the campaign (both short- and long-term). I wanted to clarify something:

Will user-agent + client-IP information be collected on the bienvenida landing page, and/or will a wprov parameter be passed when clicking through to any of the es-wiki links? This information would be necessary for analyzing what pages are visited by readers reached through this campaign and estimating how many of them stick around.
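
As an illustration of the wprov option mentioned above, here is a minimal sketch of tagging the outgoing es-wiki links with a provenance parameter. The helper name `withProvenance` and the value `bnv1` are hypothetical; the actual wprov value would have to be agreed on with the teams that analyze webrequest data.

```javascript
// Append (or overwrite) a wprov query parameter on an outgoing link.
// Assumption: 'bnv1' below is a made-up provenance code for illustration only.
function withProvenance(href, wprov) {
  const url = new URL(href);
  url.searchParams.set('wprov', wprov);
  return url.toString();
}

const tagged = withProvenance('https://es.wikipedia.org/wiki/Wikipedia:Portada', 'bnv1');
console.log(tagged);
```

With a parameter like this, pageviews reached from the landing page could be filtered by the query string even when no referrer is sent.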

Thanks!

Dzahn added a comment. Edited Nov 1 2018, 11:11 PM

Here's an update on progress for the micro site: T207816#4714643

There are some limited Apache logs on the backend that do appear to have user-agents and client IPs but i don't know (yet) if those would be accessible to the analytics tools used.

I can't speak for what Piwik collects exactly, but i know the Piwik javascript is in the content repo.

Nuria added a comment. Edited Nov 1 2018, 11:51 PM

@Isaac: piwik provides a limited set of web stats; you can see them by accessing http://piwik.wikimedia.org. After LDAP authentication there is a user/password that your team knows about, which will give you access to the stats for http://resercah.wikimedia.org, which are identical to the ones that will be harvested for this site.

This information would be necessary for analyzing what pages are visited by readers reached through this campaign and estimating how many of them stick around.

In order to calculate how many clicks on, say, Spanish Wikipedia come from the "bienvenida" site you would not need piwik data; referrers should be sufficient if they are set up correctly.

Isaac added a comment. Nov 2 2018, 4:56 PM

Thanks @Dzahn @Nuria. Yeah, looks like piwik has the data if needed and good point about paying attention to the referrer within webrequests. the landing page also invites readers to install one of the mobile apps, which might make some of these analytics more difficult, but i suppose that's a good challenge to have if many people are downloading the app :)

Nuria moved this task from Next Up to In Progress on the Analytics-Kanban board. Nov 2 2018, 4:56 PM
Nuria set the point value for this task to 1.

YAY! Thank you!

Nuria added a subscriber: Dbrant. Nov 2 2018, 6:22 PM

@Isaac we probably already have an apps install report per referrer, I would contact @Dbrant on that (for android installs)

@atgo: we will wait for at least a day to make sure data is flowing into piwik, looks like snippets are set up.

@Prtksxna I think the site needs referrer meta tags:

<meta name="referrer" content="origin">
or by link:

<a href="http://example.com" referrerpolicy="origin">

See: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referrer-Policy

The one we use in wikipedia pages is "origin-when-cross-origin"
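
To make the difference between the two policies concrete, here is a small sketch that computes the Referer value a browser would send under each of them. It covers only `origin` and `origin-when-cross-origin` as described on the MDN page linked above; the function name `refererFor` and the example URLs are illustrative assumptions.

```javascript
// Compute the Referer a browser would send when navigating from pageUrl
// to targetUrl. Only the two policies discussed here are handled.
function refererFor(policy, pageUrl, targetUrl) {
  const page = new URL(pageUrl);
  const target = new URL(targetUrl);
  const originOnly = page.origin + '/';

  // The full referrer never includes the fragment or credentials.
  page.hash = '';
  page.username = '';
  page.password = '';
  const full = page.href;

  switch (policy) {
    case 'origin':
      return originOnly; // origin only, for every request
    case 'origin-when-cross-origin':
      // full URL for same-origin requests, origin only otherwise
      return page.origin === target.origin ? full : originOnly;
    default:
      throw new Error('policy not covered by this sketch: ' + policy);
  }
}

console.log(refererFor('origin',
  'https://bienvenida.wikimedia.org/index.html',
  'https://es.wikipedia.org/wiki/Wikipedia:Portada'));
```

Under either policy a click from the landing page to es.wikipedia.org is cross-origin, so the wiki would only see the landing page's origin, which is enough to attribute the pageview to the campaign site.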

Change 471495 had a related patch set uploaded (by Prtksxna; owner: Prtksxna):
[wikimedia/campaigns/eswiki-2018@master] fix: add <meta> tag for referrer

https://gerrit.wikimedia.org/r/471495

Change 471495 merged by Prtksxna:
[wikimedia/campaigns/eswiki-2018@master] fix: add <meta> tag for referrer

https://gerrit.wikimedia.org/r/471495

Prtksxna updated the task description. Nov 3 2018, 2:43 PM
Prtksxna removed a project: Patch-For-Review.
Nuria added a comment. Nov 5 2018, 4:00 PM

There was no data in the report; I corrected what I think was a typo. Let's wait 1 day to see.

@Nuria I'm having trouble logging into Piwik. I'm using the same
credentials that I do for Turnilo / Wikitech. Is that right?

Nuria added a comment. Nov 7 2018, 4:01 PM

@atgo: piwik requires the same user/pw as turnilo. You should see a pop up with the usual ldap user/pw box.

If that works, piwik requires an additional user/password (same for everyone) and you can ping me about that one.

Thanks @Nuria. Confirming that I have access and see data flowing through. Just sent this to you by email, but commenting here as well for posterity.

One question. How do I see aggregate clicks to each of the instrumented links? I can see in the "Visitors in Real-Time" widget that it's tracking clicks out, but I'm not sure how to dig through that data.

Nuria added a comment. Nov 7 2018, 10:59 PM

@atgo: i think you need to talk to @Prtksxna there are 2 visits on the site as of now so there is not much data to look at. Not sure if site is active for users yet.

Nuria added a comment. Nov 7 2018, 11:06 PM

Data for events appears on "events" widget which is currently empty.
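
For context: in Matomo the Events report is only filled by explicit `trackEvent` calls, while clicks recorded through `enableLinkTracking` show up as outlinks. A minimal sketch of how the four links could be instrumented as events follows. The `data-track-id` attribute, the link ids and the category/action strings are assumptions for illustration, not the site's actual markup.

```javascript
// Hypothetical ids for the four landing-page links, mapped to event names.
const TRACKED_LINKS = {
  'lee-mas': 'Lee más',
  'eswiki': 'es.wikipedia.org',
  'app-store': 'Apple app store',
  'play-store': 'Google play store',
};

// Build the command Matomo's async queue expects:
// ['trackEvent', category, action, name]
function buildClickEvent(linkId) {
  if (!Object.prototype.hasOwnProperty.call(TRACKED_LINKS, linkId)) {
    return null; // not one of the instrumented links
  }
  return ['trackEvent', 'Landing page', 'click', TRACKED_LINKS[linkId]];
}

// Queue an event for a click; in a browser `queue` is the global _paq array.
function trackClick(queue, linkId) {
  const cmd = buildClickEvent(linkId);
  if (cmd) queue.push(cmd);
  return cmd !== null;
}

// Browser wiring, skipped when there is no DOM (for example under Node).
if (typeof document !== 'undefined') {
  window._paq = window._paq || [];
  document.querySelectorAll('a[data-track-id]').forEach(function (a) {
    a.addEventListener('click', function () {
      trackClick(window._paq, a.getAttribute('data-track-id'));
    });
  });
}
```

Keeping the event construction separate from the DOM wiring makes it possible to check the event names without loading the page.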

Dzahn added a comment. Nov 8 2018, 5:29 AM

Not sure if site is active for users yet.

Technically it's live. Nothing should hold users back besides.. knowing the URL exists.

Nuria added a comment. Nov 8 2018, 3:55 PM

Sounds fine, traffic is just real small, < 10 users per day.

That traffic is probably mostly me :P

We aren't promoting the page yet. That'll go live next week (likely the
14th). @Prtksxna is on leave this week returning next, so I'm sure I'll get
concrete answers on Monday.

Cheers

Nuria moved this task from In Progress to Paused on the Analytics-Kanban board. Nov 9 2018, 11:49 PM

Same as @atgo, I am able to see the outlink actions in the individual report, but the Outlinks report says that there is no data -

Is this just because there is very little data? How can I make sure that I haven't made a mistake somewhere? 🙇🏽

Nuria added a comment. Nov 13 2018, 6:00 PM

Once data comes in you can see how your instrumentation is working @Prtksxna

@Dzahn sometimes I'm getting a Bugzilla page instead of the Bienvenida page. See this screenshot:

Right now that's the only page I'm getting. Have tested across multiple devices.

@atgo Try again now. It should be better.

@Dzahn still not working after clearing cache on my machine. Any chance
it's office wifi that's caching and causing the problem? I could follow up
with IT.

fixed for real after bblack restarted a backend server and also purged the cached contents

atgo closed this task as Resolved. Nov 13 2018, 11:57 PM

Thank you!

Change 473306 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] bienvenida: add cache-control headers with max-age 1 hour

https://gerrit.wikimedia.org/r/473306

Change 473306 merged by Dzahn:
[operations/puppet@production] bienvenida: add cache-control headers with max-age 1 hour

https://gerrit.wikimedia.org/r/473306

@Nuria @Milimetric - wondering if there's any way to see behavior of visitors coming from a specific source (in this case a social media platform). We'd want to know, for example, if YouTube visitors clicked through consistently but Facebook visitors didn't so that we could modify our campaign strategy.

I could imagine doing this through aggregated sessions or through a custom link (like we can do with wprov). Do either of those work here?

Nuria added a comment. Nov 14 2018, 3:20 AM

@atgo: by default piwik provides stats on the website visitors are coming from, if their browser is sending the referrer

@Nuria Yeah, I see that. But can I then see the connection between those people and how they behave when they're on the site?

@atgo: anything you want to see beyond vanilla analytics you need to instrument for. Please see: https://matomo.org/docs/tracking-campaigns/
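
As a sketch of what the linked campaign-tracking docs describe, each channel could get its own landing-page URL carrying Matomo's `pk_campaign` and `pk_kwd` parameters. The helper name `campaignUrl` and the campaign/keyword values below are made up for illustration.

```javascript
// Build a per-channel landing-page URL with Matomo campaign parameters.
// Assumption: 'bienvenida-video' and the channel names are hypothetical values.
function campaignUrl(baseUrl, campaign, keyword) {
  const url = new URL(baseUrl);
  url.searchParams.set('pk_campaign', campaign);
  if (keyword) {
    url.searchParams.set('pk_kwd', keyword);
  }
  return url.toString();
}

['youtube', 'facebook'].forEach(function (channel) {
  console.log(campaignUrl('https://bienvenida.wikimedia.org/', 'bienvenida-video', channel));
});
```

Visits arriving through such URLs would then be grouped by campaign and keyword, which allows comparing, say, YouTube and Facebook visitors.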

Thanks @Nuria. It sounds like @Prtksxna isn't sure if he's instrumented correctly.

Same as @atgo, I am able to see the outlink actions in the individual report, but the Outlinks report says that there is no data -

Is this just because there is very little data? How can I make sure that I haven't made a mistake somewhere? 🙇🏽

Looks like @Prtksxna resolved this, just closing the loop.

@Nuria one question remains - it looks like the totals are updated on a delay. Prateek has looked for documentation but not found it. Do you know if this is normal?

Nuria added a comment. Nov 16 2018, 5:29 PM

Piwik data is updated once a day.

Nuria added a comment. Nov 17 2018, 1:04 AM

I can see data climbing up but your event widget is empty, please take a look. Bulk of traffic comes from fb mobile

Isaac added a comment. Nov 20 2018, 8:59 PM

Checking in here on conversions from the landing page. I ran a quick query to see how things were going from yesterday. Found very few pageviews to es-wiki w/ bienvenida.wikimedia as the referrer. Any sense from those who have access to Piwik/Matomo of how many people are reaching the bienvenida landing page and whether what we're seeing here matches the logs there?

SELECT date, COUNT(*) as requests FROM
    (SELECT CONCAT(year,'-',LPAD(month,2,'0'),'-',LPAD(day,2,'0'), ':', LPAD(hour,2,'0')) AS date
    FROM webrequest
    WHERE year = 2018 AND month = 11 AND day = 19 AND is_pageview = true
        AND referer LIKE "%bienvenida.wikimedia%" and pageview_info['project'] = 'es.wikipedia') as reqlist
GROUP BY date
ORDER BY date
LIMIT 1000;
date             requests
2018-11-19:03    1
2018-11-19:07    1
2018-11-19:15    2
2018-11-19:16    1
2018-11-19:17    5
2018-11-19:20    1
2018-11-19:23    1

Nuria added a comment. Nov 20 2018, 9:08 PM

Traffic to the site is very small, about 300 people per day max.

Isaac added a comment. Nov 20 2018, 9:23 PM

thanks @Nuria! so that puts it at about 4% from this snapshot that are clicking through the landing page to Wikipedia (mixture of the main page and the Música de América Latina article). Hopefully some small percentage downloading the apps too though haven't looked at data there.
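
The 4% figure can be reproduced from the numbers in this thread: the hourly counts in the query output sum to 12 pageviews, divided by roughly 300 daily visitors. A small sketch of that arithmetic, assuming the 300/day estimate as the denominator:

```javascript
// Hourly request counts copied from the query output above (2018-11-19).
const hourlyRequests = [1, 1, 2, 1, 5, 1, 1];

// Approximate daily visitors to the landing page, per the estimate above.
const dailyVisitors = 300;

// Share of landing-page visitors that show up as es-wiki pageviews.
function clickThroughRate(requests, visitors) {
  if (visitors <= 0) throw new RangeError('visitors must be positive');
  const clicks = requests.reduce(function (sum, n) { return sum + n; }, 0);
  return clicks / visitors;
}

const rate = clickThroughRate(hourlyRequests, dailyVisitors);
console.log((rate * 100).toFixed(1) + '%'); // 12 / 300 -> "4.0%"
```

Since 300 is a per-day maximum, this is a rough lower bound on the rate for that day rather than an exact conversion figure.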

I've asked the social media folks to give me the data for clicks that
they're seeing on those platforms to compare to what we're seeing in Piwik.

leila added a subscriber: leila. Nov 26 2018, 4:54 PM