
XiaoXiao-WMF
Disabled

User Details

User Since
Nov 27 2023, 7:13 PM (125 w, 18 h)
Roles
Disabled
LDAP User
Unknown
MediaWiki User
XiaoXiao-WMF [ Global Accounts ]

Recent Activity

Apr 7 2025

XiaoXiao-WMF closed T381031: WE4.2.10 Add more browser signals to client hints pipeline to generate unique device identifier as Resolved.
Apr 7 2025, 2:14 PM · Research, OKR-Work, Research-engineering, CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse
XiaoXiao-WMF added a comment to T383361: Enable collection of JS features: fonts and canvas.

Closing the hypothesis ticket. This is an ongoing effort.

Apr 7 2025, 2:14 PM · Product Safety and Integrity, CheckUser, WE4.2 Anti-abuse
XiaoXiao-WMF closed T383061: Algorithm creation, a subtask of T381031: WE4.2.10 Add more browser signals to client hints pipeline to generate unique device identifier, as Resolved.
Apr 7 2025, 2:13 PM · Research, OKR-Work, Research-engineering, CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse
XiaoXiao-WMF closed T383061: Algorithm creation as Resolved.
Apr 7 2025, 2:13 PM · WE4.2 Anti-abuse, CheckUser, Trust and Safety Product Team
XiaoXiao-WMF closed T383060: Dataset creation, a subtask of T381031: WE4.2.10 Add more browser signals to client hints pipeline to generate unique device identifier, as Resolved.
Apr 7 2025, 2:13 PM · Research, OKR-Work, Research-engineering, CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse
XiaoXiao-WMF closed T383060: Dataset creation as Resolved.
Apr 7 2025, 2:13 PM · CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse
XiaoXiao-WMF added a comment to T381031: WE4.2.10 Add more browser signals to client hints pipeline to generate unique device identifier.

The hypothesis is completed as per the scope in Q3. Here is the report: https://docs.google.com/document/d/1mmEgAJP3fh3H8p9LnWbUTrrPxew9Ola-og0JgUax404/edit?tab=t.0

Apr 7 2025, 2:12 PM · Research, OKR-Work, Research-engineering, CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse
XiaoXiao-WMF added a comment to T384855: FY2024-25 WE4.3.11.

Left to do: the MR from T386839 needs to be resolved accordingly.

Apr 7 2025, 2:10 PM · Research
XiaoXiao-WMF closed T386840: Refine hashing algorithm and incorporate it in data search & discovery dashboard, a subtask of T384855: FY2024-25 WE4.3.11, as Resolved.
Apr 7 2025, 2:09 PM · Research
XiaoXiao-WMF closed T386840: Refine hashing algorithm and incorporate it in data search & discovery dashboard as Resolved.
Apr 7 2025, 2:09 PM · Research, Research-engineering
XiaoXiao-WMF added a comment to T386840: Refine hashing algorithm and incorporate it in data search & discovery dashboard.

Demo: https://gitlab.wikimedia.org/repos/research/research-datasets/-/blob/mnz/webrequest-viz/notebooks/webrequest_clustering.ipynb?ref_type=heads

Apr 7 2025, 2:09 PM · Research, Research-engineering

Mar 28 2025

XiaoXiao-WMF updated subscribers of T384855: FY2024-25 WE4.3.11.

Copying Asana update:

Mar 28 2025, 4:23 PM · Research

Mar 25 2025

XiaoXiao-WMF added a comment to T389645: Discrepancy between `revert_risk` scores retrieved from Liftwing and `risk_observatory.revert_risk_predictions`.

Same as T388258: as I understand from @Kgraessle, 4.2.11 is unblocked by directly consuming Liftwing output, so this is no longer a blocker.

Mar 25 2025, 6:22 PM · Research-engineering
XiaoXiao-WMF added a comment to T388453: Make the revert risk predictions datasets available for analysis.

As I understand from @Kgraessle, 4.2.11 is unblocked by directly consuming Liftwing output. This task will be resolved once wikidiff is fixed, but it is no longer a blocker.

Mar 25 2025, 6:20 PM · Data-Engineering-Radar, Machine-Learning-Team, Data-Engineering, Essential-Work

Mar 24 2025

XiaoXiao-WMF added a comment to T388453: Make the revert risk predictions datasets available for analysis.

This blocks 4.2.11.

Mar 24 2025, 3:06 PM · Data-Engineering-Radar, Machine-Learning-Team, Data-Engineering, Essential-Work
XiaoXiao-WMF closed T386842: Hashing algorithm experimentation, a subtask of T384855: FY2024-25 WE4.3.11, as Resolved.
Mar 24 2025, 1:28 PM · Research
XiaoXiao-WMF closed T386842: Hashing algorithm experimentation as Resolved.
Mar 24 2025, 1:28 PM · Research, Research-engineering
XiaoXiao-WMF closed T386843: Traffic detection algorithm, a subtask of T384855: FY2024-25 WE4.3.11, as Resolved.
Mar 24 2025, 1:27 PM · Research
XiaoXiao-WMF closed T386843: Traffic detection algorithm as Resolved.
Mar 24 2025, 1:27 PM · Research, Research-engineering
XiaoXiao-WMF added a comment to T386842: Hashing algorithm experimentation.

Experimental code, refactored: https://gitlab.wikimedia.org/xiaoxiao/web_scraping/-/tree/scraping_and_traffic

Mar 24 2025, 1:27 PM · Research, Research-engineering
XiaoXiao-WMF added a comment to T386843: Traffic detection algorithm.

Experimental code: https://gitlab.wikimedia.org/xiaoxiao/web_scraping/-/tree/scraping_and_traffic

Mar 24 2025, 1:27 PM · Research, Research-engineering

Mar 19 2025

XiaoXiao-WMF changed the status of T388453: Make the revert risk predictions datasets available for analysis from Open to In Progress.
Mar 19 2025, 7:34 PM · Data-Engineering-Radar, Machine-Learning-Team, Data-Engineering, Essential-Work

Mar 13 2025

XiaoXiao-WMF updated the task description for T388721: Support for FY2024-25 4.3.11 - webrequest based scraping detection.
Mar 13 2025, 12:25 PM · Data-Engineering, Essential-Work
XiaoXiao-WMF added a parent task for T388721: Support for FY2024-25 4.3.11 - webrequest based scraping detection: T384855: FY2024-25 WE4.3.11.
Mar 13 2025, 12:24 PM · Data-Engineering, Essential-Work
XiaoXiao-WMF added a subtask for T384855: FY2024-25 WE4.3.11: T388721: Support for FY2024-25 4.3.11 - webrequest based scraping detection.
Mar 13 2025, 12:24 PM · Research
XiaoXiao-WMF updated subscribers of T388721: Support for FY2024-25 4.3.11 - webrequest based scraping detection.
Mar 13 2025, 12:22 PM · Data-Engineering, Essential-Work

Feb 25 2025

XiaoXiao-WMF created T387226: Members of https://ldap.toolforge.org/group/project-recommendation-api not added to project-bastion.
Feb 25 2025, 3:39 PM · Cloud-VPS, User-aborrero, cloud-services-team

Feb 19 2025

XiaoXiao-WMF updated the task description for T386839: Create data pipeline for the hashing algorithm.
Feb 19 2025, 6:15 PM · Research, Research-engineering
XiaoXiao-WMF changed the status of T386843: Traffic detection algorithm from Open to In Progress.
Feb 19 2025, 4:05 PM · Research, Research-engineering
XiaoXiao-WMF changed the status of T386843: Traffic detection algorithm, a subtask of T384855: FY2024-25 WE4.3.11, from Open to In Progress.
Feb 19 2025, 4:05 PM · Research
XiaoXiao-WMF created T386843: Traffic detection algorithm.
Feb 19 2025, 4:05 PM · Research, Research-engineering
XiaoXiao-WMF added a comment to T386842: Hashing algorithm experimentation.

Sample code: https://gitlab.wikimedia.org/xiaoxiao/web_scraping/-/tree/experimental?ref_type=heads

Feb 19 2025, 4:03 PM · Research, Research-engineering
XiaoXiao-WMF changed the status of T386842: Hashing algorithm experimentation, a subtask of T384855: FY2024-25 WE4.3.11, from Open to In Progress.
Feb 19 2025, 4:03 PM · Research
XiaoXiao-WMF changed the status of T386842: Hashing algorithm experimentation from Open to In Progress.
Feb 19 2025, 4:03 PM · Research, Research-engineering
XiaoXiao-WMF created T386842: Hashing algorithm experimentation.
Feb 19 2025, 4:02 PM · Research, Research-engineering
XiaoXiao-WMF changed the status of T384855: FY2024-25 WE4.3.11 from Open to In Progress.
Feb 19 2025, 4:01 PM · Research
XiaoXiao-WMF changed the status of T386840: Refine hashing algorithm and incorporate it in data search & discovery dashboard, a subtask of T384855: FY2024-25 WE4.3.11, from Open to In Progress.
Feb 19 2025, 4:01 PM · Research
XiaoXiao-WMF changed the status of T386840: Refine hashing algorithm and incorporate it in data search & discovery dashboard from Open to In Progress.
Feb 19 2025, 4:01 PM · Research, Research-engineering
XiaoXiao-WMF created T386840: Refine hashing algorithm and incorporate it in data search & discovery dashboard.
Feb 19 2025, 3:59 PM · Research, Research-engineering
XiaoXiao-WMF changed the status of T386839: Create data pipeline for the hashing algorithm, a subtask of T384855: FY2024-25 WE4.3.11, from Open to In Progress.
Feb 19 2025, 3:57 PM · Research
XiaoXiao-WMF changed the status of T386839: Create data pipeline for the hashing algorithm from Open to In Progress.
Feb 19 2025, 3:57 PM · Research, Research-engineering
XiaoXiao-WMF updated subscribers of T384855: FY2024-25 WE4.3.11.
Feb 19 2025, 3:54 PM · Research
XiaoXiao-WMF updated subscribers of T386839: Create data pipeline for the hashing algorithm.
Feb 19 2025, 3:54 PM · Research, Research-engineering
XiaoXiao-WMF added a project to T386839: Create data pipeline for the hashing algorithm: Research-engineering.
Feb 19 2025, 3:53 PM · Research, Research-engineering
XiaoXiao-WMF created T386839: Create data pipeline for the hashing algorithm.
Feb 19 2025, 3:53 PM · Research, Research-engineering

Feb 13 2025

XiaoXiao-WMF edited projects for T379543: Update the research team's DAGs to use miniforge instead of miniconda, added: Research (FY2024-25-Research-January-March); removed Research.
Feb 13 2025, 3:21 PM · Data-Platform-SRE (2025.02.10 - 2025.02.28), Research (FY2024-25-Research-January-March), Research-engineering
XiaoXiao-WMF added a project to T380874: Incremental HTML wiki content dataset to support "Who are moderators": Research-Freezer.
Feb 13 2025, 1:52 PM · Data-Engineering, Research
XiaoXiao-WMF added a comment to T380874: Incremental HTML wiki content dataset to support "Who are moderators".

DP decided to not prioritize it in Q3. Moving to freezer.

Feb 13 2025, 1:52 PM · Data-Engineering, Research

Feb 6 2025

XiaoXiao-WMF updated the task description for T384855: FY2024-25 WE4.3.11.
Feb 6 2025, 4:19 PM · Research

Jan 31 2025

XiaoXiao-WMF updated subscribers of T384855: FY2024-25 WE4.3.11.
Jan 31 2025, 3:50 PM · Research
XiaoXiao-WMF updated the task description for T384855: FY2024-25 WE4.3.11.
Jan 31 2025, 3:49 PM · Research

Jan 28 2025

XiaoXiao-WMF edited projects for T382068: Relforge embedding experimentation, added: Research; removed Research (FY2024-25-Research-January-March).
Jan 28 2025, 5:27 PM · Research, Research-engineering
XiaoXiao-WMF assigned T376204: TempAccount updates to research pipelines to fkaelin.
Jan 28 2025, 5:25 PM · Research (FY2024-25-Research-January-March), Research-engineering

Jan 27 2025

XiaoXiao-WMF created T384855: FY2024-25 WE4.3.11.
Jan 27 2025, 4:34 PM · Research

Jan 14 2025

XiaoXiao-WMF closed T377496: Phase 1: LLM inference - base metrics, a subtask of T377159: [SDS 1.2.1 B] Test existing AI models for internal use-cases, as Resolved.
Jan 14 2025, 2:23 PM · Research
XiaoXiao-WMF closed T377496: Phase 1: LLM inference - base metrics, a subtask of T377498: Phase 2: Article categorization metrics, fine-tuning metrics, optimization tooling, as Resolved.
Jan 14 2025, 2:23 PM · Research-engineering, Research
XiaoXiao-WMF closed T377496: Phase 1: LLM inference - base metrics as Resolved.

Research Engineering steps are complete. Results have been added to the report.

Jan 14 2025, 2:23 PM · Research-engineering, Research

Jan 9 2025

XiaoXiao-WMF triaged T383361: Enable collection of JS features: fonts and canvas as High priority.
Jan 9 2025, 7:10 PM · Product Safety and Integrity, CheckUser, WE4.2 Anti-abuse
XiaoXiao-WMF added a comment to T383361: Enable collection of JS features: fonts and canvas.

@kostajh please triage and adjust the ticket with deadlines, etc.

Jan 9 2025, 7:10 PM · Product Safety and Integrity, CheckUser, WE4.2 Anti-abuse
XiaoXiao-WMF created T383361: Enable collection of JS features: fonts and canvas.
Jan 9 2025, 7:07 PM · Product Safety and Integrity, CheckUser, WE4.2 Anti-abuse

Jan 8 2025

XiaoXiao-WMF added a project to T381031: WE4.2.10 Add more browser signals to client hints pipeline to generate unique device identifier: Research-engineering.
Jan 8 2025, 7:47 PM · Research, OKR-Work, Research-engineering, CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse

Jan 7 2025

XiaoXiao-WMF assigned T383061: Algorithm creation to MunizaA.
Jan 7 2025, 2:29 PM · WE4.2 Anti-abuse, CheckUser, Trust and Safety Product Team

Jan 6 2025

XiaoXiao-WMF changed the status of T381031: WE4.2.10 Add more browser signals to client hints pipeline to generate unique device identifier from Open to In Progress.
Jan 6 2025, 9:28 PM · Research, OKR-Work, Research-engineering, CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse
XiaoXiao-WMF changed the status of T383061: Algorithm creation, a subtask of T381031: WE4.2.10 Add more browser signals to client hints pipeline to generate unique device identifier, from Open to In Progress.
Jan 6 2025, 9:26 PM · Research, OKR-Work, Research-engineering, CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse
XiaoXiao-WMF changed the status of T383061: Algorithm creation from Open to In Progress.
Jan 6 2025, 9:26 PM · WE4.2 Anti-abuse, CheckUser, Trust and Safety Product Team
XiaoXiao-WMF changed the status of T383060: Dataset creation, a subtask of T381031: WE4.2.10 Add more browser signals to client hints pipeline to generate unique device identifier, from Open to In Progress.
Jan 6 2025, 5:04 PM · Research, OKR-Work, Research-engineering, CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse
XiaoXiao-WMF changed the status of T383060: Dataset creation from Open to In Progress.
Jan 6 2025, 5:04 PM · CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse
XiaoXiao-WMF updated the task description for T383060: Dataset creation.
Jan 6 2025, 5:04 PM · CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse
XiaoXiao-WMF assigned T383060: Dataset creation to fkaelin.
Jan 6 2025, 4:24 PM · CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse
XiaoXiao-WMF removed a project from T383060: Dataset creation: Research (FY2024-25-Research-January-March).
Jan 6 2025, 4:17 PM · CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse
XiaoXiao-WMF edited projects for T383061: Algorithm creation, added: WE4.2 Anti-abuse; removed Research (FY2024-25-Research-January-March).
Jan 6 2025, 4:16 PM · WE4.2 Anti-abuse, CheckUser, Trust and Safety Product Team
XiaoXiao-WMF removed a project from T383061: Algorithm creation: WE4.2 Anti-abuse.
Jan 6 2025, 4:15 PM · WE4.2 Anti-abuse, CheckUser, Trust and Safety Product Team
XiaoXiao-WMF updated the task description for T383061: Algorithm creation.
Jan 6 2025, 3:08 PM · WE4.2 Anti-abuse, CheckUser, Trust and Safety Product Team
XiaoXiao-WMF created T383061: Algorithm creation.
Jan 6 2025, 3:06 PM · WE4.2 Anti-abuse, CheckUser, Trust and Safety Product Team
XiaoXiao-WMF created T383060: Dataset creation.
Jan 6 2025, 3:01 PM · CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse

Dec 13 2024

XiaoXiao-WMF added a comment to T377498: Phase 2: Article categorization metrics, fine-tuning metrics, optimization tooling.

Stretch goals may not be completed by end of Q2 - will continue in Q3.

Dec 13 2024, 7:01 PM · Research-engineering, Research
XiaoXiao-WMF assigned T377498: Phase 2: Article categorization metrics, fine-tuning metrics, optimization tooling to MunizaA.
Dec 13 2024, 7:01 PM · Research-engineering, Research
XiaoXiao-WMF assigned T382070: Deploy pipeline under DSE namespace to fkaelin.
Dec 13 2024, 6:58 PM · Data-Platform-SRE (2025.07.05 - 2025.07.25), Research-engineering, Research
XiaoXiao-WMF renamed T381031: WE4.2.10 Add more browser signals to client hints pipeline to generate unique device identifier from "WE4.2.10 Add more browser signals to client hints pipeline to generate unique device identifier locality-sensitive hash" to "WE4.2.10 Add more browser signals to client hints pipeline to generate unique device identifier".
Dec 13 2024, 6:57 PM · Research, OKR-Work, Research-engineering, CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse
XiaoXiao-WMF triaged T382070: Deploy pipeline under DSE namespace as High priority.
Dec 13 2024, 6:56 PM · Data-Platform-SRE (2025.07.05 - 2025.07.25), Research-engineering, Research
XiaoXiao-WMF edited projects for T381031: WE4.2.10 Add more browser signals to client hints pipeline to generate unique device identifier, added: Research (FY2024-25-Research-January-March); removed Research.
Dec 13 2024, 6:55 PM · Research, OKR-Work, Research-engineering, CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse
XiaoXiao-WMF edited projects for T382072: Offline pipelines, added: Research, Research-engineering; removed Research (FY2024-25-Research-January-March).
Dec 13 2024, 4:05 PM · Research-engineering, Research

Dec 12 2024

XiaoXiao-WMF created T382072: Offline pipelines.
Dec 12 2024, 2:36 PM · Research-engineering, Research
XiaoXiao-WMF created T382070: Deploy pipeline under DSE namespace.
Dec 12 2024, 2:25 PM · Data-Platform-SRE (2025.07.05 - 2025.07.25), Research-engineering, Research
XiaoXiao-WMF edited projects for T382068: Relforge embedding experimentation, added: Research (FY2024-25-Research-January-March); removed Research.
Dec 12 2024, 2:14 PM · Research, Research-engineering
XiaoXiao-WMF created T382068: Relforge embedding experimentation.
Dec 12 2024, 2:14 PM · Research, Research-engineering

Dec 5 2024

XiaoXiao-WMF added a comment to T381031: WE4.2.10 Add more browser signals to client hints pipeline to generate unique device identifier.

Update ext.checkUser.clientHints to obtain list of fonts and generate a canvas fingerprint

Have we exhausted all avenues of passive fingerprinting? Canvas and font fingerprinting feel like a massive overreach in terms of violating a user's privacy in a way that a user cannot explicitly opt out of (other than by ceasing to edit Wikipedia).

Dec 5 2024, 6:17 PM · Research, OKR-Work, Research-engineering, CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse

Dec 3 2024

XiaoXiao-WMF removed a project from T381031: WE4.2.10 Add more browser signals to client hints pipeline to generate unique device identifier: ml-model-requests.
Dec 3 2024, 1:50 PM · Research, OKR-Work, Research-engineering, CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse

Nov 28 2024

XiaoXiao-WMF moved T376204: TempAccount updates to research pipelines from Backlog to FY2024-25-Research-January-March on the Research board.
Nov 28 2024, 8:04 PM · Research (FY2024-25-Research-January-March), Research-engineering
XiaoXiao-WMF moved T380752: Migrate Relforge to Opensearch from Backlog to Watching on the Research board.
Nov 28 2024, 7:59 PM · Data-Platform-SRE (2025.03.22 - 2025.04.11), Patch-For-Review, Discovery-Search (2025.03.01 - 2025.03.21), Research
XiaoXiao-WMF moved T360794: Event stream with latest revision HTML & parent revision HTML diff from Backlog to Watching on the Research board.
Nov 28 2024, 7:57 PM · Data-Engineering (Q4 FS25/26 April 1st - June 30st), Patch-For-Review, Research, Event-Platform
XiaoXiao-WMF added a project to T360794: Event stream with latest revision HTML & parent revision HTML diff: Research.
Nov 28 2024, 7:56 PM · Data-Engineering (Q4 FS25/26 April 1st - June 30st), Patch-For-Review, Research, Event-Platform

Nov 27 2024

XiaoXiao-WMF created T381031: WE4.2.10 Add more browser signals to client hints pipeline to generate unique device identifier.
Nov 27 2024, 7:18 PM · Research, OKR-Work, Research-engineering, CheckUser, Trust and Safety Product Team, WE4.2 Anti-abuse

Nov 26 2024

XiaoXiao-WMF added a parent task for T360794: Event stream with latest revision HTML & parent revision HTML diff: T380874: Incremental HTML wiki content dataset to support "Who are moderators".
Nov 26 2024, 5:48 PM · Data-Engineering (Q4 FS25/26 April 1st - June 30st), Patch-For-Review, Research, Event-Platform
XiaoXiao-WMF added a subtask for T380874: Incremental HTML wiki content dataset to support "Who are moderators": T360794: Event stream with latest revision HTML & parent revision HTML diff.
Nov 26 2024, 5:48 PM · Data-Engineering, Research
XiaoXiao-WMF updated the task description for T380874: Incremental HTML wiki content dataset to support "Who are moderators".
Nov 26 2024, 5:48 PM · Data-Engineering, Research

Nov 20 2024

XiaoXiao-WMF changed the status of T377266: DSE kubernetes namespace for llm-inference from Open to In Progress.
Nov 20 2024, 3:10 PM · Data-Platform-SRE (2024.11.30 - 2024.12.20), Research-engineering, Data-Platform, Research
XiaoXiao-WMF removed a project from T372707: research code hand-over and resolve requests/comments from research engineers: Research.
Nov 20 2024, 3:09 PM · Research-Freezer, Epic, Wikidata data quality and trust, Wikidata, address-knowledge-gaps, Knowledge-Integrity

Nov 19 2024

XiaoXiao-WMF moved T379288: Plan for access control with opensearch from Backlog to Watching on the Research board.
Nov 19 2024, 3:31 PM · Discovery-Search
XiaoXiao-WMF added a comment to T379543: Update the research team's DAGs to use miniforge instead of miniconda.

@BTullis Can you please comment on the urgency/timeline if you have any for updating the DAG?

Nov 19 2024, 3:31 PM · Data-Platform-SRE (2025.02.10 - 2025.02.28), Research (FY2024-25-Research-January-March), Research-engineering