
Add fromWelcomeTemplate=1 query parameter to Czech and Korean welcome templates
Closed, Resolved, Public

Description

In the work on T205754: [EPIC] Growth: Understanding first day, we are adding logging to page views. We would like to understand when users are clicking on links provided by the Czech welcome template and Korean welcome template.

What we would like to do is append a query parameter to all links that are included on those welcome templates. The query parameter will be the same for each link, and it should read fromWelcomeTemplate=1. This way, in our EventLogging data for the Page Views schema, we'll be able to clearly see when users are visiting help resources from these links.

As an example, in wikitext, this looks like [{{fullurl:Speciální:Userlogin|fromWelcomeTemplate=1}} vytvoření účtu]. That renders to https://cs.wikipedia.org/w/index.php?title=Speci%C3%A1ln%C3%AD:Userlogin&fromWelcomeTemplate=1
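For readers less familiar with wikitext, the rendering above can be approximated in JavaScript. This is only a rough sketch: the `fullUrl` helper is hypothetical, and its encoding rules (spaces become underscores, colons stay unescaped) are assumptions chosen to reproduce the example, not MediaWiki's actual implementation.

```javascript
// Hypothetical helper approximating {{fullurl:Title|query}} on cs.wikipedia.org.
// Assumption: simplified encoding rules, not MediaWiki's real ones.
function fullUrl(title, query) {
	var encoded = encodeURIComponent(title.replace(/ /g, '_'))
		// Keep the namespace separator readable, as in the rendered example
		.replace(/%3A/g, ':');
	var url = 'https://cs.wikipedia.org/w/index.php?title=' + encoded;
	return query ? url + '&' + query : url;
}

console.log(fullUrl('Speciální:Userlogin', 'fromWelcomeTemplate=1'));
```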

@revi and @Urbanecm please let me know if you will need more information from us on this.

Event Timeline

@kostajh -- could you fill this out in the next couple days so that we can ask our ambassadors to get started? They might need a little time to discuss with their communities.

@MMiller_WMF sure thing. Do you happen to have links to their welcome templates?

The kowiki one provided by Urbanecm is the correct one.

</workmode>

kostajh renamed this task from Placeholder task for modifying welcome templates on Czech/Korean wiki to include a query parameter to Add fromWelcomeTemplate=1 query parameter to Czech and Korean welcome templates. Oct 17 2018, 1:42 AM
kostajh removed kostajh as the assignee of this task.
kostajh updated the task description.

There is one little problem that didn't come to my mind during the check-in. On cs, we substitute templates that communicate with users (this policy was established so that user talk page archives really are archives and do not change along with the current template). Hence, if I modify the template, it will affect only users who are welcomed after I make the modification. There are two ways to deal with it:

A) Accept it as a fact and take the date of welcoming into account when analyzing the data.
B) Install site-wide JavaScript to add the query parameter on the fly (should be pretty easy to write).

Which solution should I take?

Korean Wikipedia exclusively uses the internal link ([[Blah|Meh]]), does this mean we have to convert it to [https://(snipped) Meh]?

PS: Please put pings or to-dos for me in the comments instead of in the description (email does not provide a diff for description updates), thanks!

Trizek-WMF added a subscriber: Trizek-WMF.

(Claiming task to make Phabricator happy, while I'm coordinating the effort.)

@Urbanecm

There is one little problem that didn't come to my mind during the check-in. On cs, we substitute templates that communicate with users (this policy was established so that user talk page archives really are archives and do not change along with the current template). Hence, if I modify the template, it will affect only users who are welcomed after I make the modification.

That is OK, we will only analyze data after this change is implemented (and after the PageViews schema goes live).

Hi @revi

PS: Please put pings or to-dos for me in the comments instead of in the description (email does not provide a diff for description updates), thanks!

Will do!

Korean Wikipedia exclusively uses the internal link ([[Blah|Meh]]), does this mean we have to convert it to [https://(snipped) Meh]?

Yes, sorry for the inconvenience. From looking at https://en.wikipedia.org/wiki/Help:Wikitext#Links_and_URLs I could not see a way to add a query parameter when links are formed like [[Blah|Meh]]. You'll need to do something like what is described in the task description. See https://en.wikipedia.org/wiki/Help:Wikitext#Variables for more info. Please let us know if you need help or have questions.

I know how :) I just wanted to check whether I need to add a span tag to make it look like a plainlinks link.

I know how :) I just wanted to check whether I need to add a span tag to make it look like a plainlinks link.

<div class="plainlinks">[https://(snipped) Meh]</div> is indeed the solution.

@Urbanecm

There is one little problem that didn't come to my mind during the check-in. On cs, we substitute templates that communicate with users (this policy was established so that user talk page archives really are archives and do not change along with the current template). Hence, if I modify the template, it will affect only users who are welcomed after I make the modification.

That is OK, we will only analyze data after this change is implemented (and after the PageViews schema goes live).

Ok then.

Btw, is there a reason not to use the referer? We can give you a list of pages linked from the template, and you can count views with the referer set to a talk page as welcome template clicks.

Btw, is there a reason not to use the referer? We can give you a list of pages linked from the template, and you can count views with the referer set to a talk page as welcome template clicks.

We opted not to include referer because it would be difficult to sanitize properly for sensitive namespaces. For example, if the user ends up on your Help Desk and they came from their User Talk page, then it's not a problem to include referer in our schema, but if they came from /wiki/SomeSensitivePage we would not want to include that in the referer.

The query parameters on the URLs help us verify that people are clicking the links in the template and not navigating to the welcome template links through the menu or search bar.
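As a minimal sketch of that verification step: an analysis script could check each logged URL for the marker parameter. The `cameFromWelcomeTemplate` function and the idea that the full URL is available as a string are assumptions for illustration; the actual Page Views schema fields are not described in this task.

```javascript
// Hypothetical check on a logged page-view URL (assumes the full URL is available).
function cameFromWelcomeTemplate(loggedUrl) {
	var params = new URL(loggedUrl).searchParams;
	// Only an exact "fromWelcomeTemplate=1" counts as a welcome template click
	return params.get('fromWelcomeTemplate') === '1';
}

console.log(cameFromWelcomeTemplate(
	'https://cs.wikipedia.org/w/index.php?title=Speci%C3%A1ln%C3%AD:Userlogin&fromWelcomeTemplate=1'
));
```

A visit that reaches the same page through the menu or search bar carries no such parameter, so the function returns false for it.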

We opted not to include referer because it would be difficult to sanitize properly for sensitive namespaces. For example, if the user ends up on your Help Desk and they came from their User Talk page, then it's not a problem to include referer in our schema, but if they came from /wiki/SomeSensitivePage we would not want to include that in the referer.

Understood. Theoretically, you could store not the referer itself but some value derived from it, such as the source page, and store it only when it is in a whitelisted namespace. But the choice is up to you, I was just wondering.
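The whitelisting idea could look roughly like the sketch below. Everything in it is hypothetical: the `deriveSource` name, the allowlist contents (3 is the user talk namespace number used elsewhere in this task), and the choice to keep only the namespace number rather than the page title.

```javascript
// Hypothetical allowlist of namespaces that are safe to record; 3 = user talk.
var ALLOWED_SOURCE_NAMESPACES = [3];

// Returns a value derived from the source page that is safe to store,
// or null when the source namespace is not on the allowlist.
function deriveSource(sourceNamespaceNumber) {
	if (ALLOWED_SOURCE_NAMESPACES.indexOf(sourceNamespaceNumber) === -1) {
		return null;
	}
	return { sourceNamespace: sourceNamespaceNumber };
}

console.log(deriveSource(3));
console.log(deriveSource(0));
```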

The query parameters on the URLs help us verify that people are clicking the links in the template and not navigating to the welcome template links through the menu or search bar.

Ehm, how? The current solution doesn't let you see the difference between a link with the query parameter followed from the template page itself and one followed from a substituted template on a user talk page. If you want it to work like that, it'll require more logic than I can see in the task description.

In that case, a link would have to look something like this: <span class="plainlinks">[https://cs.wikipedia.org/wiki/ExamplePage{{#ifeq:{{NAMESPACENUMBER}}|3|?fromWelcomeTemplate=1}} ExamplePage]</span>. If you want it to work with substitution, it would have to be something like <span class="plainlinks">[https://cs.wikipedia.org/wiki/ExamplePage{{su<includeonly />bst:#ifeq:{{su<includeonly />bst:NAMESPACENUMBER}}|2|?exampleQueryParameter=1}} ExamplePage]</span> for a single link, which takes more than 1.5 rows on my 1080px screen. That's way too long for one link that usually takes as much space as the page title plus 4 characters (the non-query equivalent of the wikitext I just wrote is [[ExamplePage]]).
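The conditional in the first wikitext example can be restated in JavaScript to make the logic easier to follow. This is an illustrative sketch only; `welcomeLinkUrl` is a hypothetical name, and the function mirrors `{{#ifeq:{{NAMESPACENUMBER}}|3|?fromWelcomeTemplate=1}}` rather than anything MediaWiki provides.

```javascript
// Mirrors {{#ifeq:{{NAMESPACENUMBER}}|3|?fromWelcomeTemplate=1}}:
// the parameter is added only when the link is rendered on a user talk page.
function welcomeLinkUrl(pageTitle, namespaceNumber) {
	var base = 'https://cs.wikipedia.org/wiki/' + pageTitle;
	return namespaceNumber === 3 ? base + '?fromWelcomeTemplate=1' : base;
}

console.log(welcomeLinkUrl('ExamplePage', 3));
console.log(welcomeLinkUrl('ExamplePage', 10));
```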

I think there should be a better way than this. Using this solution would make the template far more complicated than MediaWiki templates usually are. Maybe something like the external link solution, which allows query parameters to be passed easily?

To be discussed in our next meeting: this will ease our conversation.

To be discussed in our next meeting: this will ease our conversation.

Decision: Those who were present at that meeting (me, @MMiller_WMF, @Trizek-WMF, @kostajh and @aripstra) have no problem with a solution other than modifying the template, as long as it gets the query parameter to the server.

Additional question: Do we want to track only links from the Czech Czech (doubling is intentional) welcome template? We also have a Czech English welcome template and a Czech Russian welcome template (welcome templates written in English/Russian and used on cs).

As I said at the meeting, this is the JavaScript-based solution I propose to use. Steps to verify that it works:

  1. Make sure you're logged in to the wiki
  2. Go to Special:MyPage/common.js
  3. Copy and paste the code below and save the page
  4. Go to https://cs.wikipedia.org/wiki/Diskuse_s_wikipedistou:Martin_Urbanec_(test) and check. It might not work the first time you try it; in that case, please empty your cache (use Ctrl+Shift+R).

The only difference is that the query parameter is fromWelcomeTemplateTest=1.

JS
// T206882 - add query parameter to links in welcome template clicked on user talk page
if (mw.config.get('wgNamespaceNumber') === 3) {
	// Only internal links are tagged; external links are left untouched
	$('.welcome-template a[href]:not(.external)').each(function () {
		// Use "&" when the URL already has a query string, "?" otherwise
		var separator = this.href.indexOf('?') === -1 ? '?' : '&';
		this.href += separator + 'fromWelcomeTemplateTest=1';
	});
}
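The snippet above appends the parameter by string concatenation. For links that may already carry a query string or a fragment, a variant based on the standard URL API might be more robust. This is a sketch under assumptions: `addQueryParam` is a hypothetical helper, it expects an absolute URL, and the URL API was not available in every browser the wiki supported at the time.

```javascript
// Hypothetical helper: add or overwrite one query parameter on an absolute URL.
// Note: existing parameters may be re-encoded when the URL is serialized.
function addQueryParam(href, name, value) {
	var url = new URL(href);
	url.searchParams.set(name, value);
	return url.toString();
}

console.log(addQueryParam('https://cs.wikipedia.org/wiki/ExamplePage#Section', 'fromWelcomeTemplate', '1'));
```

With this approach the parameter lands before the fragment instead of inside it, and a second `?` is never added to a URL that already has a query string.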

Once we decide to use it, -revi or I (both of us should have access) can paste it into MediaWiki:Common.js on-wiki, make sure the correct query parameter is used, and save it. The code will then be loaded and executed for all wiki users (both logged-in and anonymous). Once this task is over, the change will simply be reverted and the query parameter will disappear for good.

I don't feel comfortable using my volunteer account for work reasons (already discussed with Benoit and Marshall this week), and even if I did, Korean Wikipedia currently has a problem with its IntAdmin policy and as a result there is no IntAdmin. (And I'm not allowed to use Steward access there because kowiki is my home wiki.)

@revi: Ah, I didn't know that. Thank you for the clarification. This would be doable by somebody with staff-level permission then. Even in that case, it depends on the actual solution that is chosen; this is just a proof of concept. I didn't feel comfortable complicating the welcome template more than necessary :).

This would be doable by somebody with staff-level permission then.

That can only be done after informing the community in any case.

I didn't feel comfortable complicating the welcome template more than necessary :).

Well, subst-ing a template on a user talk page makes that page very complicated when you edit it. So, as I told you, adding more things to the message will not really complicate it more. :)
The best solution would be to transclude that template, and I still haven't understood why it is not possible (like it is done on ko.wp, so there is no need to use that JS on that wiki).

This would be doable by somebody with staff-level permission then.

That can only be done after informing the community in any case.

Of course, but I was talking about the technical solution only.

I didn't feel comfortable complicating the welcome template more than necessary :).

Well, subst-ing a template on a user talk page makes that page very complicated when you edit it. So, as I told you, adding more things to the message will not really complicate it more. :)

Well, if you have a very lengthy template, it can be very difficult to edit, more so than it would be without the additions that make it lengthy. Imagine somebody adding a link _after_ the template modification. They would need to be aware of this, which is one more complication. Nobody would think to add a link in an unusual way like this, and nobody examines the history before changing a page.

The best solution would be to transclude that template, and I still haven't understood why it is not possible (like it is done on ko.wp, so there is no need to use that JS on that wiki).

If you have 3 months or more (and perhaps funds to pay for the time I would spend organizing the discussion) and if you accept the risk of consensus refusing the change, then it can be done. It is very difficult to change such a widely accepted consensus.

If you substitute a template, its code is copied into the page it was used on. When the template is later changed, the places it was substituted into are not touched. The point is that welcome templates are in talk page archives (currently substituted, but you propose transclusion, so let's imagine they are transcluded). The user who inserted the template wanted to show the newcomer what the template contained at the time, not what it will contain in the future. So the archive wouldn't be a precise record of the conversation.

Also, imagine a newbie wanting to look at the template again after it has changed. I think it can (and will) be confusing for them.

I remember a similar discussion volunteer-me was part of on fr.wp. IIRC, we decided not to subst the template because we are sure of one point: any improvement made to the template is made for the best, and is better than having an old subst-ed, out-of-date message for returning users. We have a bot distributing that template, subst-ing it because of a parameter. This is something we need to fix; it has been on our to-do list for a while.

Anyway, I understand your reasons as well. I'm pretty sure that opening the conversation just to know opinions (not decide) about the distribution of the template unsubsted would be okay. When people have to decide about something that is done but should change for good reasons, they can surprise you sometimes. :)

Quoting mail sent to @MMiller_WMF and follow-up conversation which happened yesterday (and including my (in-Phab) reply)

Could we please move this discussion into Phabricator? We are getting concerns from WMF engineers about user privacy and why we are trying to track external links to begin with. It would be helpful to have some context available for others.

Sure, moved.

Kosta

On Oct 18, 2018, at 6:48 PM, Marshall Miller <mmiller@wikimedia.org> wrote:

+ other team members (Morten, this is coming from the check-in call with Martin this morning, in which we discussed whether it is worthwhile to instrument the "Kurzy" button at the top of Czech Wikipedia, which means "Courses". Fortunately, he has already instrumented that button, but not with User IDs. He sent us over the counts.)

Just for context: this is built into (almost?) every web server. WMCZ's web server logs every web request it responds to, together with the IP address, URL and a few other fields, so if I count the number of requests to nastenka.wikimedia.cz/?source=cs.wikipedia-menu, I have the requested data. I'm not intentionally "spying" on Wikipedia users, and I won't do so without written permission from WMF in advance :). This just uses something I have to have anyway.

Thanks for putting this together, Martin. Here's how I'm thinking about it:
Over July 2018 - October 2018, about 4,000 new accounts were created in Czech Wikipedia.
During that time period, 531 distinct IP addresses clicked. Some caveats:
There may be some IP addresses that have many clicks on Kurzy because many people are all at a course together, at the same IP address.
Many of them are probably not logged in.

Probably not, anonymous users should not see this link.

Some of them are probably just aimless readers, who click on random stuff, and have no intention of editing.
Given all that, we do not know whether 10% of the 531 clicks are new editors, or 90%.

It was included so that instructors have a replacement for the old Courses link inserted by MediaWiki-extensions-EducationProgram, which is nowadays a deprecated extension.

So do we think the number of people clicking on that link is significant? I think maybe yes. We would have to record with usernames to find out.

Of course, there are two ways:

  1. Don't use an extra solution (which could be overkill if we only want to track this link) and simply add another query parameter, "username". I will be able to give you the data. This would be easier, but I'm not sure it's legally acceptable, as it would mean an external organization (the chapter, as the redirect is controlled by WMCZ) is collecting data about Wikipedia users.
  2. Use something WMF-side, which would probably be more acceptable privacy-wise and probably easier to extend to other links if desired.

This all may not be compelling enough for us to pursue this particular instrumentation. But let's discuss for a moment.

On Thu, Oct 18, 2018 at 10:25 AM, Martin Urbanec <martin.urbanec@wikimedia.cz> wrote:
Hey Marshall,

as you requested in our meeting, I have gathered the data you asked for. Here it is (everything is from July 2018-October 2018):
Clicks total: 721
Clicks per day: approx. 6.8
Number of unique IP addresses which clicked at least once on the link in total: 531
The above, but per day: 5.1
I should note that 554 of the total clicks were performed while the user was on the main page, which is 76 %. Hypothesis: newbies mistake it for a list of courses they can attend.

Martin
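As a quick check of the arithmetic in the email above, the main-page share can be recomputed from the two reported counts. The `sharePercent` helper is hypothetical and only illustrates the calculation; the inputs are the figures quoted in the email.

```javascript
// Figures quoted in the email above
var totalClicks = 721;
var mainPageClicks = 554;

// Percentage of `part` in `whole`, rounded to one decimal place
function sharePercent(part, whole) {
	return Math.round((part / whole) * 1000) / 10;
}

console.log(sharePercent(mainPageClicks, totalClicks) + ' %'); // 76.8 %
```

The result, 76.8 %, is consistent with the "76 %" quoted in the email if that figure was truncated rather than rounded.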

@Urbanecm -- thank you for producing these counts and the explanation. I didn't realize that anonymous users don't see the link. Since we're still thinking about whether it's important to learn more about the usage of these external links, I think that's all the counts we'll need for now. If we decide we want to learn more, we'll be careful to think through privacy implications together, and with others at WMF.

I remember a similar discussion volunteer-me was part of on fr.wp. IIRC, we decided not to subst the template because we are sure of one point: any improvement made to the template is made for the best, and is better than having an old subst-ed, out-of-date message for returning users. We have a bot distributing that template, subst-ing it because of a parameter. This is something we need to fix; it has been on our to-do list for a while.

It seems both wikis dealt with the same issue in the past, but with different outcomes. This is something I quite like about the Wikimedia movement: everybody does the same thing, building the best encyclopedia ever, but by A LOT of different means, in different places and so on.

Anyway, I understand your reasons as well. I'm pretty sure that opening the conversation just to know opinions (not decide) about the distribution of the template unsubsted would be okay. When people have to decide about something that is done but should change for good reasons, they can surprise you sometimes. :)

I don't think it will change, simply because of fairly high conservativeness. It's something that would have to be implemented as a social solution, not a technical one. I mean, patrollers themselves would have to welcome users in a different way.

revi moved this task from Incoming to Doing on the User-revi board.

For the Korean Wikipedia, the change was deployed as of Sunday (the 28th).

</workhat>

For the record: I'm waiting for the instruction to deploy this change. Let me know when I should do it.

@Urbanecm -- we think you should do this, unless you have any additional questions for us.

Thanks, @revi and @Urbanecm!

Okay, @kostajh -- could you please check these changes out and make sure they are as you expected?

Here is a welcome template in Korean Wikipedia.

And here is the link to Martin's work.

Trizek-WMF reopened this task as Open.

@Trizek-WMF: Why reopening?

@kostajh: My code checks if you're on a user talk page, you can check if it is working in https://cs.wikipedia.org/wiki/Diskuse_s_wikipedistou:Martin_Urbanec_(test).

@Trizek-WMF: Why reopening?

Because it's in the Code Review column and I missed that detail. We can close it once it has been reviewed, unless Marshall thinks that's not necessary.

@kostajh: My code checks if you're on a user talk page, you can check if it is working in https://cs.wikipedia.org/wiki/Diskuse_s_wikipedistou:Martin_Urbanec_(test).

@Urbanecm looks good, but could you please change fromWelcomeTemplateTest=1 to fromWelcomeTemplate=1? thanks!

@kostajh Ah, sorry, I was testing the code with this parameter so as not to accidentally log anything wrongly. Changed.

The code has been reviewed. :)