
Implement Last-Access cookie [34 pts] {bear}
Closed, Declined; Public

Description

  • Code changes
    • VCL code to set the cookie
    • VCL code to modify the X-Analytics header to be sent to Kafka
  • testing varnish changes on beta labs.
    • Involves getting permits for several machines on beta labs and talking to release team
    • testing X-Analytics headers on the cluster to make sure the counter is passed through
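
The header step in the list above could look roughly like the sketch below. It is written in Python rather than VCL, purely to illustrate the logic. It assumes the X-Analytics header is a semicolon-separated list of key=value pairs; the key name `last_access` and the sample values are hypothetical and do not come from this task.

```python
def add_to_x_analytics(existing: str, key: str, value: str) -> str:
    """Return the header value with key=value added.

    Assumes a semicolon-separated key=value format (an assumption,
    not confirmed by this task). Any earlier value for the same key
    is dropped so the key appears only once.
    """
    pairs = [p.strip() for p in existing.split(";")]
    pairs = [p for p in pairs if p and not p.startswith(key + "=")]
    pairs.append(f"{key}={value}")
    return ";".join(pairs)


if __name__ == "__main__":
    # "a=1" and "last_access" are made-up sample values.
    print(add_to_x_analytics("a=1", "last_access", "12-Mar-2015"))
```

A check on the cluster would then amount to confirming that the added pair is still present in the header once the request reaches Kafka.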

https://wikitech.wikimedia.org/wiki/Analytics/Unique_clients/Last_access_solution#How_will_we_be_counting:_Plain_English

Event Timeline

kevinator renamed this task from Implement Last Visited cookie to Implement Last Visited cookie [34 pts] {bear}.
kevinator raised the priority of this task from to High.
kevinator updated the task description.
kevinator set Security to None.
kevinator added subscribers: kevinator, Aklapper.

I think this idea needs wider discussion. Adding a durable tracking token to all visitors is a major shift in stance for the Foundation. In several jurisdictions I believe we would be obligated by local law to display a prominent opt-in banner before setting such a tracking beacon. In an opt-out scenario this would also require a durable token to keep the tracker from being reinserted. Depending on how you view this, having an opt-out cookie is just as bad as having a unique tracker in the first place. If nothing else it would single the visitor out as different from the norm and thus provide some additional entropy for fingerprinting.

I'd like to see an open discussion of this story including myself, @tstarling, @csteipp, and a representative from WMF Legal at least before anyone goes too far into implementation.

When I talked to @Halfak, he was thinking they would set a cookie that was not unique, that expired at the end of the month. So everyone during the month gets the cookie, and they count how many times they saw someone without the cookie.
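
A minimal sketch of that counting idea, assuming each request is represented by a dict of the cookies it carried. The cookie name `seen_this_month` is hypothetical; nothing in this task fixes a name.

```python
from typing import Iterable, Mapping

COOKIE_NAME = "seen_this_month"  # hypothetical name, for illustration only


def count_monthly_uniques(requests: Iterable[Mapping[str, str]]) -> int:
    """Count requests that arrive without the monthly cookie.

    Each such request is treated as a client's first visit of the
    month; the server would answer it by setting the cookie, so the
    same client is not counted again until the cookie expires.
    """
    return sum(1 for cookies in requests if COOKIE_NAME not in cookies)


if __name__ == "__main__":
    sample = [{}, {COOKIE_NAME: "1"}, {}, {"other": "x"}]
    print(count_monthly_uniques(sample))
```

Note that in this simple form a client that refuses cookies would be counted on every request, so the raw count is an upper bound rather than an exact figure.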

I'd prefer we don't set cookies for anons, but (imho) this is better than trying to track uniqueness on the backend and potentially be storing click paths of users.

The parent (T88647) does say "unique identifiers will not be used for this report".

The parent (T88647) does say "unique identifiers will not be used for this report".

So it does. I should read things more closely apparently. I had a voice conversation that I twisted a bit in my head apparently.

@Halfak and @kevinator my apologies for introducing confusion.

Will the intended cookie just be a flag that expires at the next month boundary or will it actually include a date/timestamp of when the cookie was set? If you just want monthly unique counting per wiki it seems like you could be as terse as Set-Cookie: v=1; Expires=Sun, 01 Mar 2015 00:00:00 GMT; Path=/; HttpOnly.
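
As a rough illustration of the flag-only variant, the sketch below builds a Set-Cookie response header whose expiry falls on the next month boundary in UTC. The cookie name `v` and value `1` follow the terse example in the comment above; the function names are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def next_month_boundary(now: datetime) -> datetime:
    """Return midnight UTC on the first day of the month after `now`."""
    if now.month == 12:
        year, month = now.year + 1, 1
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=timezone.utc)


def flag_cookie_header(now: datetime, name: str = "v", value: str = "1") -> str:
    """Build a Set-Cookie header for a non-unique flag cookie."""
    # usegmt=True renders the HTTP date with a trailing "GMT".
    expires = format_datetime(next_month_boundary(now), usegmt=True)
    return f"Set-Cookie: {name}={value}; Expires={expires}; Path=/; HttpOnly"


if __name__ == "__main__":
    print(flag_cookie_header(datetime(2015, 2, 12, tzinfo=timezone.utc)))
```

Because every client receives the same name and value, the cookie itself carries no identifying information; only its presence or absence is meaningful.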

Ok, it took me a little bit to wrap my head around what y'all are doing with this, and I was kind of skeptical. Assuming I understand this correctly, this sounds like a very clever solution that preserves anonymity. @csteipp, I'd love to talk to you at some point about what your concerns are (verbally if you aren't comfortable documenting it here).

In general, it seems as though the only possible change to privacy is that, if the client computer is seized or compromised, someone would have a record of whether or not they visited one of our sites, and which one it was they visited. Nothing about the specific page, or how many times they visited, or when they visited at a granularity finer than monthly. On the server side, the only thing we get out of this is unique visitors.

This seems pretty clever and awesome to me. Great idea!

In general, it seems as though the only possible change to privacy is that, if the client computer is seized or compromised, someone would have a record of whether or not they visited one of our sites, and which one it was they visited. Nothing about the specific page, or how many times they visited, or when they visited at a granularity finer than monthly. On the server side, the only thing we get out of this is unique visitors.

Yep, that's the only real impact I see on a person's privacy as a result of this.

On the backend, we need to be vigilant about not storing data that can be used to do unique identification. If this cookie allows us to drop any cookies/headers/UAs that we're storing, I think it's a step in the right direction.

kevinator renamed this task from Implement Last Visited cookie [34 pts] {bear} to Implement Last Access cookie [34 pts] {bear}. (Mar 12 2015, 3:45 PM)
kevinator renamed this task from Implement Last Access cookie [34 pts] {bear} to Implement Last-Access cookie [34 pts] {bear}.

@RobLa-WMF:

Ok, it took me a little bit to wrap my head around what y'all are doing with this, and I was kind of skeptical. Assuming I understand this correctly, this sounds like a very clever solution that preserves anonymity.

We think so; credit for the cleverness of the solution goes to @Halfak.

Change 196009 had a related patch set uploaded (by Nuria):
Adding a Last-Access cookie to text and mobile requests

https://gerrit.wikimedia.org/r/196009

Since this ticket has a bit wider audience: can you guys take a look at the lower-level questions about the cookie/data work in: https://phabricator.wikimedia.org/T92435#1231133 ?

Change 196009 merged by BBlack:
Adding a Last-Access cookie to text and mobile requests

https://gerrit.wikimedia.org/r/196009

kevinator claimed this task.

This task was further broken down, and is now invalid. Look for tasks marked {bear} for related project tasks.