<Org-Wide Impact> Google Chrome User-Agent Deprecation Impact
Closed, Resolved · Public

Assigned To

Authored By

	• sdkim
	Nov 4 2021, 6:42 PM

Description

Request Status: Ad-Hoc Request
Request Type: External Dependency Change

Request Title: Google Chrome User-Agent Deprecation

Request Description: Google Chrome is changing the way it shares user-agents for increased privacy of users. You can read more about it here: https://www.chromestatus.com/feature/5704553745874944. Google Chrome has released Client Hints to provide device information. This first release “is intended to allow for developers to experiment and provide feedback”: https://groups.google.com/a/chromium.org/g/blink-dev/c/-2JIRNMWJ7s/m/u-YzXjZ8BAAJ
Indicate Priority Level: High
Main Requestors: Anti-Harassment
Ideal Delivery Date: April
Stakeholders: AHT, Analytics, Extensions, Technology, etc.

Request Documentation

Document Type	Required?	Document/Link
Related PHAB Tickets	Yes	T242825: Deal with Google Chrome User-Agent deprecation
Product One Pager	No	<add link here>
Product Requirements Document (PRD)	No	<add link here>
Product Roadmap	No	<add link here>
Product Planning/Business Case	No	<add link here>
Product Brief	No	<add link here>
Other Links	No	https://github.com/WICG/ua-client-hints

Related Objects

Status	Subtype	Assigned	Task
Resolved		• DAbad	T295073 <Org-Wide Impact> Google Chrome User-Agent Deprecation Impact
Resolved		kostajh	T242825 Deal with Google Chrome User-Agent deprecation
Resolved		Tchanders	T258591 Technical investigation into an experiment for using client hints in CheckUser [8H]
Declined		None	T258592 Investigate how users perform actions logged in CheckUser, ahead of UA deprecation [8H]
Duplicate		None	T265057 SPIKE: consider problems to data pipelines as a result of reduced user agent entropy in Google Chrome
Declined		None	T296835 Update EmbedPlayer to remove the use of Navigator.userAgent
Resolved		• brooke	T296837 Investigate OGV library use of deprecated navigator.userAgent
Resolved		kostajh	T296838 Investigate and update videoJs library from using deprecated navigator.userAgent
Declined		None	T296839 [S] Update deprecated Navigator.userAgent from MultimediaViewer
Resolved		Seddon	T296840 [S] Investigate and update jQuery.lazyload library from UploadWizard from using deprecated navigator.appVersion
Resolved		kostajh	T299397 Measure user-agent client hints already sent in browsers requests
Resolved		• EChetty	T299401 VarnishKafka to propagate user agent client hints headers to webrequest
Resolved		JAllemandou	T299402 Add user agent client hints to the `webrequest` table
Open		None	T274633 Record browser fingerprints instead of pure UA for checkuser
Resolved		Volker_E	T298035 Prepare OOUI for Google Chrome userAgent deprecation
Resolved		Catrope	T298044 Prepare ResourceLoader for Google Chrome userAgent deprecation
Resolved		JAllemandou	T304850 Add CU-UA high entropy hints to Hive webrequest tables
Resolved		phuedx	T301238 UA_CH - Implement getting High Entropy Hints into webrequest logs
Resolved	BUG REPORT	phuedx	T316760 Error with Permissions-Policy header: Unrecognized origin: 'intake-analytics.wikimedia.org'
Resolved		mforns	T336084 [SPIKE] Model impact of User-Agent deprecation on top line metrics

Event Timeline


Restricted Application added a subscriber: Aklapper. · Nov 4 2021, 6:42 PM

• sdkim updated the task description. · Nov 4 2021, 6:43 PM

This project impacts any team that collects browser/device data or works with such data. Anti-Harassment Tools is one of the teams impacted -- there are likely many more including Research, Data-Engineering, Product-Analytics etc.

Google is deprecating the user agent, which is one attribute of user requests that we use. We would need to migrate away from it.

• DAbad renamed this task from Google Chrome User-Agent Deprecation to <Org-Wide Impact> Google Chrome User-Agent Deprecation Impact.Dec 16 2021, 4:35 PM

• DAbad claimed this task.

• DAbad triaged this task as High priority.

• DAbad updated the task description.

• DAbad set Due Date to Feb 28 2022, 5:00 AM.

2021-12-16 - Analytics Discussion

Attendees: Olja, Emil, Desiree, Danny
Expect that all of analytics will be impacted in some way or form
To assess impact work will be broken down as follows:
- Metrics Platform: (1) Reviews what new method is and how it will work, (2) Assesses impact on instrumentation
- Data Engineering - identify what would be potentially impacted within the Analytics stack

• DAbad added a subtask: T242825: Deal with Google Chrome User-Agent deprecation.Dec 16 2021, 4:40 PM

• DAbad added subscribers: • EChetty, lbowmaker, • STH.

• DAbad changed the task status from Open to In Progress.Dec 16 2021, 4:46 PM

• DAbad moved this task from Backlog to Investigate on the Foundational Technology Requests board.

Here are some search results for user agent code in MediaWiki that could possibly impact frontend features. Skins and Extensions seem to touch a few teams, and could use an audit. Thanks!

OOUI (ticketed here)
Skins
Extensions

dr0ptp4kt subscribed.Dec 20 2021, 6:22 PM

MusikAnimal reopened subtask T298044: Prepare ResourceLoader for Google Chrome userAgent deprecation as Open.Dec 21 2021, 12:15 AM

MusikAnimal reopened subtask T298035: Prepare OOUI for Google Chrome userAgent deprecation as Open.

Volker_E closed subtask T298035: Prepare OOUI for Google Chrome userAgent deprecation as Resolved.Dec 21 2021, 12:44 AM

MW impact:

Spoke with Cindy Cicalese: no obvious impact. There is one place in the core session code that gets the user agent, but it does not seem to base any logic on the result. Check with Gergo Tisza.
Gergo Tisza analysis - don't think the auth framework uses user agents in any way (though it logs them and checkusers and the Security team might use the data when investigating login- or signup-related abuse). In MediaWiki in general, it's used for legacy browser detection in various places. The most prominent is probably ResourceLoader which uses the UA to check whether to load JS at all for a given request, and whether to use local storage for the module cache.

MGerlach subscribed.Jan 17 2022, 1:11 PM

JAllemandou removed a subtask: T299397: Measure user-agent client hints already sent in browsers requests.Jan 18 2022, 1:42 PM

MGerlach unsubscribed.Jan 18 2022, 3:02 PM

Analytics Impact:

Currently being assessed by Data Engineering & Metrics Platform team. Current solutions being evaluated include:

Instrumenting Page views in a similar way that we currently are doing with Virtual Page Views
Replacing the User-Agent with the data we get from the User-Agent Client Hints.

The primary difficulty with using User-Agent Client Hints is that some requests will not include them, meaning we would need to do some preflight checks before sending the events off (introducing latency). However, if we instrument instead, browsers that don't support JS would not be able to send back data (we still need to determine the proportion of our users using browsers that don't support JS). We are currently evaluating which option would cause the fewest downstream effects.

Currently working on getting the Client Hints data to establish the relationship between the User-Agent and User-Agent Client Hints. This will allow us to determine whether any data loss will occur and to find any field/value changes in the Client Hints data. Ideally this just involves a simple change to the varnish configuration to send the new headers back to DE for analysis.

January 19, 2022 Steering Committee:
update on progress
impact: Metrics Platform team will be shifting to focus on this effort
Olja D. - user hints implementation has 2 options: pre-flight (latency penalty) or post (loss of data on first request)
Mark B. - would likely impact SLOs
Kate C. - perhaps performance should be included
Greg G. - fundraising and advancement generally want to be informed

Action Items:

Emil & Olja submit to tech forum

Catrope closed subtask T298044: Prepare ResourceLoader for Google Chrome userAgent deprecation as Resolved.Jan 24 2022, 6:22 PM

Tks4Fish subscribed.Jan 25 2022, 4:30 PM

DerHexer subscribed.Jan 25 2022, 4:40 PM

January 25, 2022 Stewards/Tech Team Meetup: Anti-Harassment Tools and T&S
Q&A on userAgent Changes:

Is there a timeline?

The goal is to have experiments set up and running and pulling data by the end of this week. Evolving situation which will require feedback from the community and internal dependencies.

Are mobile device models going to be requested?

It would depend on whether we require high entropy headers. But yes, it is being included in evaluations, which cover whether we need to request additional info from mobile units and what the prioritization will be (data integrity, performance, etc.).
- High entropy headers = headers we would need to explicitly request. Low entropy headers are sent by default.
- In India, there is a huge pool of users in small ranges, so the mobile unit is going to be the difference in saying person A is not person B. That information (i.e. like a last name that tells apart two people with the same given name) would be very good for CheckUser.
  - Frequently uses update history to judge if it’s the same user. This information is helpful to CheckUser.

At a high level, what information is coming from the low entropy headers?

Basically looking at browser information and current version and OS and whether it’s on mobile and/or other attached models. Some info in html as well.
- Sounds like a lot of the info from userAgent. Concern is what we’re losing to high entropy and performance and the payoff.

How are we engaging with the community?

Engaging with the advocacy team and technical engagement to develop comms for what is impacted that the community maintains (actively being worked on). They are proactively trying to give people a heads-up, looking at repos etc.
Ideally try to minimize impact of change, since we would try to be logging in similar ways to minimize changes. When and how we can make the changes so that it doesn’t affect as many. Looking at plans, but have started engaging folks.

dmaza subscribed.Jan 25 2022, 4:54 PM

JJMC89 subscribed.Jan 25 2022, 5:20 PM

@DAbad for posterity, can you please define MU, MR and what "IE last name" is (to avoid confusing with a human user's last name)?

In T295073#7649747, @Huji wrote:

@DAbad for posterity, can you please define MU, MR and what "IE last name" is (to avoid confusing with a human user's last name)?

This was used as an example (two humans with the same given name)

GeneralNotability subscribed.Jan 26 2022, 12:46 AM

Mayakp.wiki subscribed.Jan 26 2022, 7:19 PM

Ameisenigel subscribed.Jan 27 2022, 1:23 PM

• DAbad moved this task from Investigate to Parking Lot (Reviewed but Not Prioritized) on the Foundational Technology Requests board.Feb 1 2022, 5:07 PM

• DAbad moved this task from Parking Lot (Reviewed but Not Prioritized) to Work In Progress on the Foundational Technology Requests board.

In T295073#7649815, @jrbs wrote:

In T295073#7649747, @Huji wrote:

@DAbad for posterity, can you please define MU, MR and what "IE last name" is (to avoid confusing with a human user's last name)?

This was used as an example (two humans with the same given name)

Edited the note to add some clarity

A quick update here.

We have 3 new fields in the wmf.webrequest table: ch_ua, ch_ua_mobile and ch_ua_platform, which come from the low entropy client hints. Some preliminary analysis suggests that these will not be able to replicate the information we get from the user agent, so we will need to continue experimenting with what data we can get from the client. This will be confirmed by the end of this week.
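
To illustrate what the three low entropy fields carry, here is a minimal Python sketch of reading them. It assumes the stored values are the raw `Sec-CH-UA`, `Sec-CH-UA-Mobile` and `Sec-CH-UA-Platform` header strings as Chrome sends them; the function names are hypothetical, and the regex is a simplification rather than a full structured-field parser.

```python
import re
from typing import List, Tuple

# Matches one `"brand";v="version"` item of a Sec-CH-UA header value.
# Simplified: escaped characters are matched but not unescaped.
_BRAND_ITEM = re.compile(r'"((?:[^"\\]|\\.)*)"\s*;\s*v="((?:[^"\\]|\\.)*)"')


def parse_ch_ua(value: str) -> List[Tuple[str, str]]:
    """Return (brand, major version) pairs from a Sec-CH-UA value."""
    return _BRAND_ITEM.findall(value)


def parse_ch_ua_mobile(value: str) -> bool:
    """Sec-CH-UA-Mobile is a structured-field boolean: ?1 is true, ?0 is false."""
    return value.strip() == "?1"


def parse_ch_ua_platform(value: str) -> str:
    """Sec-CH-UA-Platform is a quoted string such as "Windows"."""
    return value.strip().strip('"')


if __name__ == "__main__":
    sample = '"Chromium";v="98", " Not A;Brand";v="99", "Google Chrome";v="98"'
    print(parse_ch_ua(sample))
    print(parse_ch_ua_mobile("?0"), parse_ch_ua_platform('"Windows"'))
```

Note that the brand list only exposes major versions and deliberately includes a placeholder brand, which is one reason these fields alone may not reproduce everything the full user agent string provided.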

The next step will be to put in a plan to start capturing high entropy hints to determine if this will replicate the information we get from the current user agent. This solution (if selected as the way forward) will result in some data loss which will need to be evaluated during the experimentation process.

If you all need a CU to test/provide feedback, I would be happy to volunteer!

Another update here.

We have put together a patch to collect the high entropy hints, specifically Sec-CH-UA-Bitness, Sec-CH-UA-Full-Version-List, Sec-CH-UA-Full-Version, Sec-CH-UA-Platform, Sec-CH-UA, Sec-CH-UA-Arch, Sec-CH-UA-Platform-Version, Sec-CH-UA-Mobile, and Sec-CH-UA-Model (in addition to the low entropy hints), and add them to the webrequest table. Once deployed we will evaluate the solution to determine if we can reach data equivalency and determine what the data loss might be.
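
For context on how high entropy hints are requested: a server opts in by listing the hint names in an `Accept-CH` response header, and the browser includes them on subsequent requests. The sketch below is not the actual patch; it is a hypothetical Python helper that builds such a header value from the hints named above, assuming that `Sec-CH-UA`, `Sec-CH-UA-Mobile` and `Sec-CH-UA-Platform` are the ones sent by default and so do not need to be requested.

```python
from typing import Iterable

# Hints sent by default (low entropy); assumed not to need an opt-in.
LOW_ENTROPY_HINTS = ("Sec-CH-UA", "Sec-CH-UA-Mobile", "Sec-CH-UA-Platform")

# The hints listed in the update above, in the same order.
COLLECTED_HINTS = (
    "Sec-CH-UA-Bitness",
    "Sec-CH-UA-Full-Version-List",
    "Sec-CH-UA-Full-Version",
    "Sec-CH-UA-Platform",
    "Sec-CH-UA",
    "Sec-CH-UA-Arch",
    "Sec-CH-UA-Platform-Version",
    "Sec-CH-UA-Mobile",
    "Sec-CH-UA-Model",
)


def accept_ch_value(hints: Iterable[str]) -> str:
    """Build an Accept-CH header value containing only the hints that
    must be explicitly requested (everything not sent by default)."""
    return ", ".join(h for h in hints if h not in LOW_ENTROPY_HINTS)


if __name__ == "__main__":
    print("Accept-CH: " + accept_ch_value(COLLECTED_HINTS))
```

Because the opt-in arrives in a response, the first request from a client cannot carry the requested hints, which is the "post" trade-off (loss of data on first request) raised in the steering committee notes.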

• EChetty added a subtask: T301238: UA_CH - Implement getting High Entropy Hints into webrequest logs.Feb 21 2022, 3:00 PM

BTullis subscribed.Feb 23 2022, 5:18 PM

Quick update:

After some delay with testing and resourcing, the patch is ready to be deployed. Once it has landed we can report back on our findings.

TheresNoTime subscribed.Mar 3 2022, 7:32 PM

Blablubbs subscribed.Mar 15 2022, 1:55 PM

mpopov subscribed.Mar 16 2022, 9:00 PM

JAllemandou closed subtask T304850: Add CU-UA high entropy hints to Hive webrequest tables as Resolved.Apr 5 2022, 8:46 AM

Huji mentioned this in T305930: Normalize cu_changes table.Apr 12 2022, 6:25 PM

• EChetty closed subtask T301238: UA_CH - Implement getting High Entropy Hints into webrequest logs as Resolved.Apr 13 2022, 3:09 PM

Quick update:

The patch was finally deployed and we started collecting data as of April 5th in the wmf webrequest table.
The goal now is to perform and validate some analysis over a week's worth of data to estimate data loss and scope the required data changes.

Reedy merged a task: T306287: Audit usage of navigator.userAgent, navigator.appVersion, and navigator.platform.Apr 16 2022, 2:07 AM

Reedy added a subscriber: Toad4707.

• EChetty moved this task from Work In Progress to Sign-off on the Foundational Technology Requests board.Nov 4 2022, 1:48 PM

RoySmith subscribed.Nov 8 2022, 4:24 PM

SBisson subscribed.Nov 16 2022, 5:56 PM

Following a discussion that happened here: https://phabricator.wikimedia.org/T257893#8573050 - This work has been paused until further notice.

• EChetty moved this task from Sign-off to QA/Review on the Foundational Technology Requests board.Feb 2 2023, 4:10 PM

The patch was reverted: https://phabricator.wikimedia.org/T257893#8588610

The next step is to decide how to proceed.

Mayakp.wiki mentioned this in T310846: Improve Bot Detection Heuristics.Apr 26 2023, 10:48 PM

Checking in on the status of this issue. @Mayakp.wiki detected a large spike in pageviews that were being tagged as automated but look pretty clearly like human traffic (see T310846#8809323). The cause seems to be that Chrome's implementation of the more generic user-agent is finally rolling out in a substantial way (timeline) and so is breaking at least the bot detection pipelines in pretty significant ways. It seems the UA hints were dropped because it wasn't clear that we should be using them or that they would be of much benefit. It is likely worth revisiting this conversation or considering alternatives, though.

kostajh subscribed.May 3 2023, 12:38 PM

@IsaacJ indeed, I believe the bot detection was one consequence we were not expecting; I think this raises the Analytics impact significantly.

UPDATE: A quick initial check showed us that while user agent string entropy is dropping fast, it does not significantly affect our custom actor signature entropy, as most of that is based on IP. We are still investigating.
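
As a rough illustration of the kind of check described in the update above, the following Python sketch computes Shannon entropy over observed values. The actual actor signature definition is not given in this thread; here it is assumed, purely for illustration, to be the pair (IP, user agent), and the IPs and user agent strings are made-up placeholders.

```python
import math
from collections import Counter
from typing import Hashable, Iterable


def shannon_entropy(values: Iterable[Hashable]) -> float:
    """Shannon entropy, in bits, of the empirical distribution of values."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Placeholder (ip, user_agent) pairs before and after UA reduction.
    before = [
        ("10.0.0.1", "UA/98.0.4758.1"),
        ("10.0.0.2", "UA/98.0.4758.2"),
        ("10.0.0.3", "UA/98.0.4758.1"),
        ("10.0.0.4", "UA/98.0.4758.2"),
    ]
    after = [(ip, "UA/98.0.0.0") for ip, _ in before]

    for label, rows in (("before", before), ("after", after)):
        ua_bits = shannon_entropy(ua for _, ua in rows)
        sig_bits = shannon_entropy(rows)  # hypothetical signature: (ip, ua)
        print(label, "ua:", ua_bits, "signature:", sig_bits)
```

In this toy data the user agent entropy falls to zero once the strings are reduced, while the signature entropy is unchanged because the IPs already distinguish every actor. Real traffic with many users behind shared IPs would behave less cleanly.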

@DAbad: Resetting Due Date set for this open task as it passed a while ago.

Aklapper removed Due Date.May 8 2023, 9:14 AM

Aklapper removed subscribers: • STH, • EChetty, • sdkim.

Mayakp.wiki mentioned this in T336084: [SPIKE] Model impact of User-Agent deprecation on top line metrics.May 23 2023, 10:49 PM

kostajh mentioned this in T337819: Create project tag for google-chrome-user-agent-deprecation.May 31 2023, 8:10 AM

kostajh mentioned this in T337821: Create project tag for client-hints.May 31 2023, 8:13 AM

@DAbad should we merge this task into T242825: Deal with Google Chrome User-Agent deprecation? AIUI they have the same scope, except that this one is tagged with Foundational Technology Requests. There is some conversation happening here that is not happening in T242825; it'd be nice to reduce fragmentation if it's not necessary.

Aklapper mentioned this in Google-Chrome-User-Agent-Deprecation.Jun 3 2023, 11:37 AM

Aklapper mentioned this in http-client-hints.Jun 3 2023, 11:40 AM

kostajh added a project: Google-Chrome-User-Agent-Deprecation.Jun 3 2023, 5:26 PM

Dreamy_Jazz mentioned this in T325306: Provide aggregated user device data per-country.Jun 30 2023, 5:50 PM

nshahquinn-wmf subscribed.Jul 12 2023, 11:18 PM

Iflorez mentioned this in T341743: Can we see chatgpt accesses in logs?.Jul 17 2023, 11:58 PM

dr0ptp4kt unsubscribed.Jul 28 2023, 3:16 PM

Tgr mentioned this in T345245: Mitigate phase-out of third-party cookies across MediaWiki in production.Aug 30 2023, 12:29 PM

I propose to close this and T242825: Deal with Google Chrome User-Agent deprecation. Related tasks can be tracked in Google-Chrome-User-Agent-Deprecation.

WDoranWMF closed subtask T336084: [SPIKE] Model impact of User-Agent deprecation on top line metrics as Resolved.Nov 20 2023, 6:34 PM

kostajh closed subtask T242825: Deal with Google Chrome User-Agent deprecation as Resolved.Feb 2 2024, 8:04 AM

In T295073#9198735, @kostajh wrote:

I propose to close this and T242825: Deal with Google Chrome User-Agent deprecation. Related tasks can be tracked in Google-Chrome-User-Agent-Deprecation.

Marking this as resolved.

Dreamy_Jazz mentioned this in T359312: Create cu_useragent table.Mar 6 2024, 11:32 AM

Dreamy_Jazz mentioned this in T361139: Normalise the user agent column in CheckUser result tables.Mar 27 2024, 6:04 PM
