I don't see this as an urgent priority, although planning it as a small piece of work for a future quarter would be fine. We could then share this with the mailing lists and contacts we have at places where people are employing these sorts of algorithms in their own code.
This sort of algorithm is in use at several prominent high-scale media properties, but people are recreating the work for their specific cases, as opposed to having one easy-to-call API that reflects this line of thinking. The idea was to expose something that, given a title, produces the correct revision. I strongly agree that it should also take into consideration whether that last correct revision is reportedly non-damaging (and scrub further backwards if it is not), as sometimes humans can't keep up with the backlog.
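To make the "given a title, produce the correct revision" idea concrete, here is a minimal sketch of the walk-back step only. Everything in it is an assumption for illustration: the `Revision` fields, the `accepted` flag (standing in for whatever "correct" means for a given consumer), and the `reported_damaging` flag (standing in for whatever damage report is available). It is not an existing API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Revision:
    # All fields are hypothetical and exist only for this illustration.
    rev_id: int
    accepted: bool            # meets the caller's notion of a "correct" revision
    reported_damaging: bool   # some report flags this revision as damaging


def last_good_revision(history: Sequence[Revision]) -> Optional[Revision]:
    """Return the newest revision that is accepted and not reported as damaging.

    `history` is assumed to be ordered newest first. The loop keeps scrubbing
    backwards past revisions that are unaccepted or reported as damaging.
    """
    for rev in history:
        if rev.accepted and not rev.reported_damaging:
            return rev
    # Nothing in the supplied history qualifies.
    return None
```

A caller would look up the page history for a title, pass it in newest first, and fall back to its own policy when `None` comes back.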
Thu, Feb 21
@BBlack, regarding https://gerrit.wikimedia.org/r/490120: I checked in with @Pginer-WMF today. Pau said deploying this a day or two before ExternalGuidance is activated with enwiki as the source wiki (for Indonesian) would be ideal.
Wed, Feb 20
Okay, @JMinor, based on @TheDJ's feedback it seems the issue may surface in fleeting edge cases. Of course, some would say making those errors visible is a feature, not a bug. Anyway, this is your area for prioritization.
Thanks. I'm not sure if something changed in a Scribunto module or somewhere in extension land, but it doesn't seem like it's really turning up on enwiki source, at least - there are some Village Pump discussions on this.
Thu, Feb 14
Hi all - I was aware of this task but hadn't been following it. But it was brought to my attention as having some momentum, so here I am! I have some information I can dredge up that I think may help shed some light on some paths forward. I also want to check in with some product and design people to get a sense of forthcoming product interventions in the area of interactive or, for that matter, materialized graphs.
Wed, Feb 13
For those following along, I ran a query to get a sense of how often the "Desktop" link is used globally via Google Translate. On 11 February 2019 there were only 89 such requests globally, about 2/3 of them with enwiki as the source wiki. This figure is not a perfect predictor of desktop user behavior, since the mobile treatment will be new for desktop users who use enwiki as the source wiki. But it probably suggests that, in addition to the rationale @Pginer-WMF provides about no longer showing broken stuff, the mobile read view is okay for consumptive purposes in general.
Thanks, @santhosh !
Tue, Feb 12
@BBlack ^ would you please review the enwiki VCL patch? We'll only want to merge it after ExternalGuidance has been tested with simplewiki and @Pginer-WMF has given the green light, but I figured it best if we go through review ahead of that.
@santhosh ^ would you please review and verify it has the intended effect? I need to reset my Vagrant stuff, but figured this was simple-looking enough to post a patch (we'll see if I'm right!).
Mon, Feb 11
Heads up @chelsyx: for simplewiki access via the Google Translate proxy the traffic pattern is now mobile web based even for desktop UAs. The same will happen with enwiki when we make that change later. I thought I should make this clear for any intervention analysis.
@santhosh and @Gilles the footer list containing the "Desktop" link and other list items places the dot character between elements using an li::after pseudo-element. Do you think we should just use JS to remove the "Desktop" <li> instead of using a CSS rule? Setting the opacity to 0 like the other hidden elements would leave the dot character for any preceding bullets in place, which looks unusual because it leaves a dot at the end of the list. If we use JS is there a preferred segment of the JS code to do so to avoid any performance issues?
Sun, Feb 10
Fri, Feb 8
Jan 18 2019
Heads up @phuedx . @BBlack and I spoke yesterday and we'll go with a simpler patch instead of the fuller refactor, given the plan to have the Varnish stuff in maintenance mode and switch to ATS (i.e., don't fix it if it ain't broken).
Okay, @BBlack, now it's ready for review.
@BBlack hold that thought, one more condition to add.
@BBlack patch posted for your review ^. Would you please review and let me know on the patch if anything should be added?
Jan 9 2019
Hi @BBlack , any suggestion here?
Jan 8 2019
@Tbayer what do you have in mind? Heads up, T208795 captures the first concrete case where the full transcoding indeed goes all the way through the Wikimedia servers and stuff is already counted as a pageview but there's an X-Analytics key-value made available for query purposes.
Jan 7 2019
Paraphrasing a dialogue with @BBlack: immediate edge-side HTTP redirects based on header/regex might be feasible without fragmenting caches/backends.
Dec 5 2018
Nov 28 2018
Nov 15 2018
Nov 11 2018
Nov 9 2018
Nov 7 2018
@Nuria thanks. You understood the question well. Okay, so my read of sessionInSample and randomTokenMatch is that the populationSize values between different schemas would need to have a common base value so that they divide cleanly in order to guarantee intersection, as it's a divisor in a modulo calculation. Do I have that right?
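To illustrate the divisibility question, here is a small sketch. It assumes the sampling check has the shape "integer derived from the session token, modulo populationSize, equals zero"; the function name `in_sample`, the use of the first 8 hex characters, and the synthetic tokens are assumptions for illustration, not the actual client code.

```python
from typing import Iterable, Set


def in_sample(token: str, population_size: int) -> bool:
    """Assumed check: a token-derived integer is a multiple of population_size."""
    value = int(token[:8], 16)  # first 8 hex characters as an integer (assumption)
    return value % population_size == 0


def sampled(tokens: Iterable[str], population_size: int) -> Set[str]:
    """Collect the tokens that fall into the 1-in-population_size sample."""
    return {t for t in tokens if in_sample(t, population_size)}


# Synthetic, evenly spread tokens purely for demonstration.
tokens = [format(i, "08x") for i in range(3000)]

one_in_10 = sampled(tokens, 10)
one_in_100 = sampled(tokens, 100)   # 100 is a multiple of 10
one_in_15 = sampled(tokens, 15)     # 15 is not a multiple of 10

# Every session in the 1:100 sample is also in the 1:10 sample.
nested = one_in_100 <= one_in_10
# Some sessions in the 1:15 sample are missing from the 1:10 sample.
not_nested = not (one_in_15 <= one_in_10)
```

Under that assumption, a schema sampled at 1:100 is fully contained in one sampled at 1:10, while 1:15 and 1:10 overlap only on multiples of 30.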
Nov 6 2018
The question of whether you can sample events per session with stickiness is a different one, and the answer to that is yes: you can do that as of today deterministically and decide that event1 and event2 are always going to be sampled for session "25". Session here means "identifier assigned to your browser until you close it down". This identifier is sent in eventlogging events but it is not sent in general requests. It will be reset when you restart your browser.
- IE11+ (6.8%)
- Safari 5.1-11.2 (1.7%)
- iOS Safari 8-11.3
- After a brief peek at pageviews_daily in Turnilo, this looks like ~0.7%
Yes, we discussed collision avoidance as part of T201124 and increased the length of mw.user.sessionId() to a value that should be safe for all foreseeable scenarios (see in particular T201124#4521002). I'm not quite sure what salting and hashing have to do with that though.
- For unique device:
- (an example that can include both scenarios you mentioned above) Any kind of experiment or data collection that requires asking the same unique device multiple questions across a period of time. For example, when we want to learn about how users "learn" on Wikipedia, we need to be able to interfere with their experience on Wikipedia in multiple stages of their interaction and ask them questions. Not being able to say which unique device has answered the first batch of questions is a blocker for this line of research.
Thanks for the review. The User-Agent field is that of the end user's device.
Nov 5 2018
Nov 2 2018
Oct 31 2018
Follow up here: Kosta and I spoke, and we don't need the token, as logging should take place on a per-user basis, not just on a per-session basis. So the key will be constructed by hashing two non-sensitive items. This is an okay approach in my view given the requirements.
@Bawolff I added a question in the patchset about getToken(). Basically, although the cost of computing a rainbow table to reverse engineer the hashed values of getToken() in case of someone spilling Redis keynames is moderately high, I wanted to check whether there's even a risk if an attacker does so. If there is, I'm thinking we should instead take just a portion of the token (I'm working from the assumption this is tied to something fixed between the client and the server - a cookie issued post login) and the user's numerical ID, concatenate those, and then hash the concatenated value to set the keyname - that would still be basically collision free for keynaming purposes.
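For reference, the keynaming idea floated above could be sketched roughly as follows. This is only an illustration of "portion of the token plus numerical user ID, concatenated, then hashed"; the function name, the `example` prefix, the separator, the 8-character portion, and the choice of SHA-256 are all assumptions and are not taken from the patch.

```python
import hashlib


def redis_key_name(user_id: int, token: str,
                   prefix: str = "example", token_chars: int = 8) -> str:
    """Build a key name from the user ID and only a portion of the token.

    Hypothetical helper: the prefix, separator, portion length, and hash
    function are illustrative choices, not the patch's actual values.
    """
    # Only the first token_chars characters of the token are used, so a leaked
    # key name cannot be reversed into the full token.
    material = f"{user_id}:{token[:token_chars]}".encode("utf-8")
    digest = hashlib.sha256(material).hexdigest()
    return f"{prefix}:{digest}"
```

The same user ID and token always yield the same key name, and distinct user IDs yield distinct hash inputs, which is what keeps the names effectively collision free.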
Oct 30 2018
@leila to clarify, which of the following do you desire?
Oct 23 2018
Oct 21 2018
@Tbayer @Neil_P._Quinn_WMF @chelsyx @mpopov @nettrom_WMF curious about your thinking here for session overlap between events that are sent at the global (perhaps per-project, if we need that) default and those that are oversampled for the sessions.
Oct 18 2018
@phuedx do you think it might be sensible to simply make sendBeacon a prerequisite at this point for client-side event logging?
@Ottomata I agree with @phuedx on your question: opt-in (eventually via SCS) makes sense. After all, feature teams or feature clusters that want session sampling as the norm could follow some convention of their own to make it simple.
Oct 12 2018
I'm interested in filling out the TODOs here.
Oct 11 2018
The understanding from @Tbayer is correct about this task being separate from the question on retention beyond 90 days.
Oct 10 2018
Sep 26 2018
Sep 14 2018
Aug 29 2018
This appears to be related to fully qualified links in the page. Links to [[articles]] are automatically rewritten for mdot if on mdot already IIRC.
Aug 2 2018
That's fine. I have some suggestions on the content on https://wikimediafoundation.org/technology/ and https://www.mediawiki.org/wiki/New_Developers#Choose_a_software_project , but I'll take that to an email thread.
Jul 30 2018
Thanks, @Rfarrand for closing. Further summary posted at https://www.mediawiki.org/w/index.php?title=Wikimedia_Developer_Summit%2F2018%2FKnowledge_as_a_Service&type=revision&diff=2840141&oldid=2699409
Jul 27 2018
Jul 26 2018
Jul 20 2018
Oh, interesting: https://developer.apple.com/videos/play/wwdc2018/204/
Jun 25 2018
Marking this as resolved. Thanks @mforns for the review.
Jun 21 2018
Jun 19 2018
Jun 14 2018
Jun 12 2018
Jun 11 2018
May 31 2018
Thanks, all. It's working for me now.
May 30 2018
May 10 2018
Nah, I'm good. I seemingly set this to Declined after thinking about it, but never actually submitted the change. Thanks for checking.
May 2 2018
Understood it's low priority. Just adding another URL to the heap: https://www.mediawiki.org/wiki/Thread:Project:Support_desk/Adding_custom_style_for_special_pages/reply_(2)
May 1 2018
Any update here?
Apr 18 2018
The lack of tagging did not appear to be related to any config blob change on the portal at Zero:-OPERA (the last change was too far back for that to be plausible).
Apr 12 2018
It's too tall an order for today - I'll need to be tracked down should this be resurfaced.
Hi all, I apologize for not replying on this ticket. In practice while I know of good solutions here, it's moot given current resourcing and projected projects. Now, if this becomes an area for further prioritization as part of security/privacy initiatives, I'm happy to consult.
Apr 5 2018
The other night I woke up in the middle of the night thinking that I need to follow up here. Thanks @Rfarrand for the follow up on Phabricator!
Apr 4 2018
I can't speak to all of the particulars, but I've added Roan and Joe to this task. Roan and Corey have been in discussion about technical architecture matters concerning push architecture, as I understand it. Web push is presently part of the FY 18-19 plan.
Apr 3 2018
@Dbrant was there a patch or something in support libraries that would have fixed this?