achou (AikoChou)
Machine Learning Engineer

User Details

User Since
Feb 15 2022, 2:51 PM (198 w, 2 d)
Availability
Available
IRC Nick
aiko
LDAP User
Unknown
MediaWiki User
AChou-WMF [ Global Accounts ]

Recent Activity

Mon, Dec 1

achou moved T410663: Upgrade AMD GPU + torch version of ML Labs machines from Unsorted to Ready To Go on the Machine-Learning-Team board.
Mon, Dec 1, 4:52 PM · Essential-Work, Machine-Learning-Team
achou added a project to T410663: Upgrade AMD GPU + torch version of ML Labs machines: Essential-Work.
Mon, Dec 1, 4:51 PM · Essential-Work, Machine-Learning-Team
achou moved T409414: Configure Lift Wing isvc Integration with Cassandra from In Progress to 2025-2026 Q2 Done on the Machine-Learning-Team board.
Mon, Dec 1, 4:50 PM · Machine-Learning-Team
achou moved T409469: Enable ChangeProp to consume mediawiki.page_content_change.v1 from In Progress to 2025-2026 Q2 Done on the Machine-Learning-Team board.
Mon, Dec 1, 4:50 PM · Data-Engineering, serviceops, Machine-Learning-Team
achou moved T408533: Initial task generation and ingestion to Cassandra and Search weight tags from In Progress to 2025-2026 Q2 Done on the Machine-Learning-Team board.
Mon, Dec 1, 4:50 PM · Discovery-Search (2025.10.20 - 2025.12.31), Machine-Learning-Team
achou moved T411082: Remove old GPUs from ml-serve1001 from In Progress to 2025-2026 Q2 Done on the Machine-Learning-Team board.
Mon, Dec 1, 4:50 PM · SRE, DC-Ops, ops-eqiad, Machine-Learning-Team
achou closed T409469: Enable ChangeProp to consume mediawiki.page_content_change.v1, a subtask of T408538: Create a Revise Tone Task Generator in LiftWing, as Declined.
Mon, Dec 1, 1:31 PM · Patch-For-Review, Machine-Learning-Team
achou closed T409469: Enable ChangeProp to consume mediawiki.page_content_change.v1 as Declined.
Mon, Dec 1, 1:31 PM · Data-Engineering, serviceops, Machine-Learning-Team
achou updated the task description for T409469: Enable ChangeProp to consume mediawiki.page_content_change.v1.
Mon, Dec 1, 1:31 PM · Data-Engineering, serviceops, Machine-Learning-Team
achou assigned T394778: Build and push images to the docker registry from ml-lab to DPogorzelski-WMF.
Mon, Dec 1, 1:04 PM · Patch-For-Review, Machine-Learning-Team
achou moved T394778: Build and push images to the docker registry from ml-lab from Blocked to In Progress on the Machine-Learning-Team board.
Mon, Dec 1, 1:01 PM · Patch-For-Review, Machine-Learning-Team
achou reassigned T409414: Configure Lift Wing isvc Integration with Cassandra from achou to DPogorzelski-WMF.
Mon, Dec 1, 12:59 PM · Machine-Learning-Team
achou closed T409414: Configure Lift Wing isvc Integration with Cassandra, a subtask of T408341: Q1 FY2025-26 Goal: Task generation engine for Revise Tone task, as Resolved.
Mon, Dec 1, 12:59 PM · OKR-Work, Goal, Machine-Learning-Team
achou closed T409414: Configure Lift Wing isvc Integration with Cassandra as Resolved.

Thanks for everyone's help. This task is resolved. :)

Mon, Dec 1, 12:59 PM · Machine-Learning-Team

Fri, Nov 28

achou updated the task description for T408533: Initial task generation and ingestion to Cassandra and Search weight tags.
Fri, Nov 28, 12:43 PM · Discovery-Search (2025.10.20 - 2025.12.31), Machine-Learning-Team
achou closed T408533: Initial task generation and ingestion to Cassandra and Search weight tags as Resolved.
Fri, Nov 28, 12:40 PM · Discovery-Search (2025.10.20 - 2025.12.31), Machine-Learning-Team
achou closed T408533: Initial task generation and ingestion to Cassandra and Search weight tags, a subtask of T408341: Q1 FY2025-26 Goal: Task generation engine for Revise Tone task, as Resolved.
Fri, Nov 28, 12:40 PM · OKR-Work, Goal, Machine-Learning-Team
achou added a comment to T408533: Initial task generation and ingestion to Cassandra and Search weight tags.

[...]
it was designed to support large datasets but I suspect that if your dataset is relatively small (<100000 pages) you can use the same strategy you used to push the initial test data via event-gate?

Fri, Nov 28, 12:39 PM · Discovery-Search (2025.10.20 - 2025.12.31), Machine-Learning-Team
achou updated the task description for T408533: Initial task generation and ingestion to Cassandra and Search weight tags.
Fri, Nov 28, 12:33 PM · Discovery-Search (2025.10.20 - 2025.12.31), Machine-Learning-Team
achou added a comment to T408341: Q1 FY2025-26 Goal: Task generation engine for Revise Tone task.

Weekly Report

Fri, Nov 28, 12:11 PM · OKR-Work, Goal, Machine-Learning-Team

Wed, Nov 26

achou moved T409388: Test liftwing wikidata revert risk API for scale and latency from Unsorted to Watching on the Machine-Learning-Team board.
Wed, Nov 26, 2:33 PM · Machine-Learning-Team, Wikimedia Enterprise
achou moved T411082: Remove old GPUs from ml-serve1001 from Unsorted to In Progress on the Machine-Learning-Team board.
Wed, Nov 26, 2:30 PM · SRE, DC-Ops, ops-eqiad, Machine-Learning-Team

Tue, Nov 25

achou created P85614 transformers-4.57.2 AttributeError (revise_tone_task_generator).
Tue, Nov 25, 1:54 PM
achou moved T406179: Q2 FY2025-26 Goal: Host Wikidata Revert Risk model on LiftWing from In Progress to Current Quarter Goals on the Machine-Learning-Team board.
Tue, Nov 25, 1:31 PM · OKR-Work, Goal, Wikimedia Enterprise - Content Integrity, Wikimedia Enterprise, Wikidata, Lift-Wing, Machine-Learning-Team
achou moved T405647: eqiad row C/D Machine Learning host migrations from In Progress to 2025-2026 Q2 Done on the Machine-Learning-Team board.
Tue, Nov 25, 11:26 AM · Machine-Learning-Team, SRE, DC-Ops, ops-eqiad

Fri, Nov 21

achou added a comment to T408341: Q1 FY2025-26 Goal: Task generation engine for Revise Tone task.

Weekly Report

Fri, Nov 21, 3:48 PM · OKR-Work, Goal, Machine-Learning-Team
achou created P85434 reference-risk extended_output.
Fri, Nov 21, 12:43 PM

Thu, Nov 20

achou created P85413 Revise tone weighted tag.
Thu, Nov 20, 12:41 PM

Tue, Nov 18

achou updated subscribers of T408533: Initial task generation and ingestion to Cassandra and Search weight tags.

Hi @pfischer @dcausse, ML team wants to follow up on the initial ingestion process. As you mentioned before, the Search platform team has a manual script for this purpose. Can the ML team execute this on our end (e.g., in statbox)? Or can only the Search team execute it?

Tue, Nov 18, 1:02 PM · Discovery-Search (2025.10.20 - 2025.12.31), Machine-Learning-Team
achou moved T404722: Incorporate notebook into Tone-Check data generation ml-pipeline from In Progress to 2025-2026 Q2 Done on the Machine-Learning-Team board.
Tue, Nov 18, 8:28 AM · Essential-Work, Machine-Learning-Team
achou moved T406217: Export retrained Tone-check model to an S3 bucket from In Progress to Ready To Go on the Machine-Learning-Team board.
Tue, Nov 18, 8:27 AM · Patch-For-Review, Machine-Learning-Team
achou moved T396495: Build model training pipeline for tone check using WMF ML Airflow instance from In Progress to Ready To Go on the Machine-Learning-Team board.
Tue, Nov 18, 8:27 AM · Data-Platform-SRE (2025.11.07 - 2025.11.28), Essential-Work, Editing-team (Tracking), Machine-Learning-Team
achou closed T404722: Incorporate notebook into Tone-Check data generation ml-pipeline, a subtask of T396495: Build model training pipeline for tone check using WMF ML Airflow instance, as Resolved.
Tue, Nov 18, 8:27 AM · Data-Platform-SRE (2025.11.07 - 2025.11.28), Essential-Work, Editing-team (Tracking), Machine-Learning-Team
achou closed T404722: Incorporate notebook into Tone-Check data generation ml-pipeline as Resolved.
Tue, Nov 18, 8:27 AM · Essential-Work, Machine-Learning-Team
achou closed T408607: AI/ML Infrastructure Request: Assistance in Rolling out Revert Risk to wikis that don't have damaging/goodfaith models, a subtask of T408388: WE 1.3.4 Roll out Revert Risk Filters to Wikis that don't have damaging/goodfaith Edit Models, as Resolved.
Tue, Nov 18, 8:25 AM · OKR-Work, Machine-Learning-Team, MediaWiki-extensions-ORES, PersonalDashboard, MediaWiki-Recent-changes, Moderator-Tools-Team
achou closed T408607: AI/ML Infrastructure Request: Assistance in Rolling out Revert Risk to wikis that don't have damaging/goodfaith models as Resolved.
Tue, Nov 18, 8:25 AM · Patch-For-Review, MediaWiki-Recent-changes, PersonalDashboard, Moderator-Tools-Team, Machine-Learning-Team
achou moved T408388: WE 1.3.4 Roll out Revert Risk Filters to Wikis that don't have damaging/goodfaith Edit Models from Unsorted to Watching on the Machine-Learning-Team board.
Tue, Nov 18, 8:24 AM · OKR-Work, Machine-Learning-Team, MediaWiki-extensions-ORES, PersonalDashboard, MediaWiki-Recent-changes, Moderator-Tools-Team
achou moved T408516: DIMM_A2 errors for ml-serve2001 from Unsorted to 2025-2026 Q2 Done on the Machine-Learning-Team board.
Tue, Nov 18, 8:23 AM · SRE, DC-Ops, ops-codfw, Machine-Learning-Team
achou moved T409438: Enable revertrisk filters in thwiki from Unsorted to Watching on the Machine-Learning-Team board.
Tue, Nov 18, 8:22 AM · Patch-For-Review, Moderator-Tools-Team (Kanban), OKR-Work, Machine-Learning-Team, MediaWiki-extensions-ORES, PersonalDashboard, MediaWiki-Recent-changes
achou added a project to T409866: Iterate on Annotool functionality to support more use cases: Essential-Work.
Tue, Nov 18, 8:21 AM · Essential-Work, Machine-Learning-Team
achou added a comment to T407843: Introduce re-try mechanisms for MW API requests in LiftWing models.

T363725 might be related, though that task focuses on handling redirects.

Tue, Nov 18, 8:15 AM · Essential-Work, Machine-Learning-Team
achou moved T408702: Promote dpogorzelski from ops-limited to ops from Blocked to 2025-2026 Q2 Done on the Machine-Learning-Team board.
Tue, Nov 18, 8:09 AM · SRE, SRE-Access-Requests, Machine-Learning-Team
achou moved T405647: eqiad row C/D Machine Learning host migrations from Unsorted to In Progress on the Machine-Learning-Team board.
Tue, Nov 18, 8:05 AM · Machine-Learning-Team, SRE, DC-Ops, ops-eqiad
achou moved T409414: Configure Lift Wing isvc Integration with Cassandra from Unsorted to In Progress on the Machine-Learning-Team board.
Tue, Nov 18, 8:02 AM · Machine-Learning-Team
achou moved T409469: Enable ChangeProp to consume mediawiki.page_content_change.v1 from Unsorted to In Progress on the Machine-Learning-Team board.
Tue, Nov 18, 8:02 AM · Data-Engineering, serviceops, Machine-Learning-Team
achou moved T409850: Cassandra role & grants for Lift Wing isvc integration from Unsorted to Watching on the Machine-Learning-Team board.
Tue, Nov 18, 8:02 AM · Data-Persistence, Machine-Learning-Team
achou assigned T407155: [SPIKE] Define process for validating Tone Check model eval data for languages staff members do not speak to gkyziridis.
Tue, Nov 18, 8:02 AM · Machine-Learning-Team, EditCheck, VisualEditor
achou moved T407155: [SPIKE] Define process for validating Tone Check model eval data for languages staff members do not speak from Unsorted to In Progress on the Machine-Learning-Team board.
Tue, Nov 18, 8:01 AM · Machine-Learning-Team, EditCheck, VisualEditor
achou placed T403599: Setup & experiments for MI300x GPUs used for LiftWing up for grabs.
Tue, Nov 18, 8:01 AM · Machine-Learning-Team
achou moved T403697: Experiment with amd-smi and the new AMD GPUs MI300x from Unsorted to In Progress on the Machine-Learning-Team board.
Tue, Nov 18, 8:01 AM · Machine-Learning-Team
achou assigned T403599: Setup & experiments for MI300x GPUs used for LiftWing to DPogorzelski-WMF.
Tue, Nov 18, 8:00 AM · Machine-Learning-Team
achou moved T403599: Setup & experiments for MI300x GPUs used for LiftWing from Unsorted to In Progress on the Machine-Learning-Team board.
Tue, Nov 18, 7:59 AM · Machine-Learning-Team
achou closed T392283: Q1 FY2025-26 Goal: Apply the Tone Check model to published articles, to learn whether we can build a pool of high-quality structured tasks for new editors, a subtask of T396162: [EPIC] Revise Tone: Structured Task (WE1.1.2, FY25-26), as Resolved.
Tue, Nov 18, 7:59 AM · Patch-For-Review, Revise-Tone-Structured-Task, OKR-Work, Epic, EditCheck, Growth-Structured-Tasks, Growth-Team
achou closed T392283: Q1 FY2025-26 Goal: Apply the Tone Check model to published articles, to learn whether we can build a pool of high-quality structured tasks for new editors as Resolved.

A continuation of this task is T408341: Q1 FY2025-26 Goal: Task generation engine for Revise Tone task

Tue, Nov 18, 7:59 AM · OKR-Work, Goal, Machine-Learning-Team
achou added a comment to T409469: Enable ChangeProp to consume mediawiki.page_content_change.v1.

Hi, thanks all for the input. :) Due to our tight timeline, ML team has decided to move forward with Option D for now.

That said, we can later define a set of performance expectations (eg for latency), which will then help us to assess whether any of the other options would provide sufficient benefit to justify any additional efforts. Thoughts?

I agree! We should follow up on this and revisit the topic in the future. ML team would really like to see this work happen, as we will have other similar use cases that could benefit from mediawiki.page_content_change.v1 in Kafka main.

Tue, Nov 18, 7:49 AM · Data-Engineering, serviceops, Machine-Learning-Team

Mon, Nov 17

achou created P85341 typing_extensions dependency conflict error.
Mon, Nov 17, 11:25 AM

Fri, Nov 14

achou added a comment to T408341: Q1 FY2025-26 Goal: Task generation engine for Revise Tone task.

Weekly Report

Fri, Nov 14, 3:59 PM · OKR-Work, Goal, Machine-Learning-Team

Thu, Nov 13

achou added a comment to T408533: Initial task generation and ingestion to Cassandra and Search weight tags.

After meeting with @Michael today, we agreed to first enable Testwiki for more controlled experimentation with both the update pipeline and the Newcomer Task integration. This means we will (1) load the initial Testwiki dataset to staging Cassandra and Search weight tags, and (2) enable the Revise Tone Task Generator on Lift Wing for Testwiki.
cc @BWojtowicz-WMF

Thu, Nov 13, 2:48 PM · Discovery-Search (2025.10.20 - 2025.12.31), Machine-Learning-Team
achou added projects to T409469: Enable ChangeProp to consume mediawiki.page_content_change.v1: serviceops, Data-Engineering.
Thu, Nov 13, 1:37 PM · Data-Engineering, serviceops, Machine-Learning-Team
achou added a comment to T408538: Create a Revise Tone Task Generator in LiftWing.

@BWojtowicz-WMF We have the initial dataset for frwiki. We can use this dataset to test our new service.
Once the Cassandra <-> Lift Wing connection is built, we can load this data to staging Cassandra from Lift Wing. Then we can use test events to trigger Lift Wing updates in Cassandra and verify that our Cassandra integration works.

Thu, Nov 13, 8:39 AM · Patch-For-Review, Machine-Learning-Team
achou added a comment to T408533: Initial task generation and ingestion to Cassandra and Search weight tags.

We have the initial dataset for frwiki.

Thu, Nov 13, 8:27 AM · Discovery-Search (2025.10.20 - 2025.12.31), Machine-Learning-Team
achou created P85300 aya-llm error.
Thu, Nov 13, 8:14 AM
achou updated subscribers of T409469: Enable ChangeProp to consume mediawiki.page_content_change.v1.

Hi @Joe! The Machine Learning and Growth teams are collaborating on a GrowthExperiments newcomer task for revising tone (associated hypotheses are WE1.1.2 & WE1.1.17).

Thu, Nov 13, 8:11 AM · Data-Engineering, serviceops, Machine-Learning-Team

Wed, Nov 12

achou added a comment to T409850: Cassandra role & grants for Lift Wing isvc integration.

With respect to GRANTs, is it safe to assume that MODIFY is sufficient? There is no requirement to do reads here, is there?

When the service starts, Lift Wing will validate whether the target table exists, so we'll need SELECT as well. @BWojtowicz-WMF, is that correct?
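
The permission question above can be sketched as a small helper that builds the CQL `GRANT` statements for a role needing both MODIFY (writes) and SELECT (the startup table check). This is only an illustration: the keyspace, table, and role names are hypothetical placeholders, not the names used in the task.

```python
# Sketch only: keyspace, table, and role names below are hypothetical.
ALLOWED_PERMISSIONS = {"SELECT", "MODIFY"}


def grant_statements(keyspace, table, role, permissions=("SELECT", "MODIFY")):
    """Return one CQL GRANT statement per requested table-level permission."""
    statements = []
    for permission in permissions:
        if permission not in ALLOWED_PERMISSIONS:
            raise ValueError(f"unsupported permission: {permission}")
        statements.append(
            f"GRANT {permission} ON TABLE {keyspace}.{table} TO {role};"
        )
    return statements


if __name__ == "__main__":
    for statement in grant_statements(
        "example_keyspace", "example_table", "example_role"
    ):
        print(statement)
```

If the service only ever wrote to the table, passing `permissions=("MODIFY",)` would be enough; the SELECT grant is included here because of the table-existence check described above.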

Wed, Nov 12, 12:31 PM · Data-Persistence, Machine-Learning-Team

Tue, Nov 11

achou added a comment to T409414: Configure Lift Wing isvc Integration with Cassandra.

@Eevans i guess we can just start with a set of shared credentials and split later if needed

These clusters are managed as multi-tenant, so what I'm trying to establish here is if this is logically one tenant, or many (two currently). If what is writing to Cassandra is some piece of shared infrastructure or service (presumably a single code repository), then that would be one tenant (one set of credentials). If each project contains the code that manages connections to Cassandra, those are separate tenants, and we should create a role for each.

Tue, Nov 11, 12:41 PM · Machine-Learning-Team

Mon, Nov 10

achou added a comment to T409414: Configure Lift Wing isvc Integration with Cassandra.

@DPogorzelski-WMF The service that will connect to Cassandra is the revise-tone-task-generator that @BWojtowicz-WMF is working on in T408538. Currently, it is only deployed in the experimental namespace on ml-staging. We're considering either creating a new namespace for this service or deploying it under the edit-check namespace.

Mon, Nov 10, 3:54 PM · Machine-Learning-Team
achou updated subscribers of T409469: Enable ChangeProp to consume mediawiki.page_content_change.v1.

Option A would require some talk with SRE but given the size of the topic and the current /srv usage in main-eqiad / codfw I don't see any big opposition in having the stream hosted there (especially if we advertise that ML will not need to query the mediawiki API as direct consequence for the use case). It would probably be the most clean and reliable option in my opinion.

Mon, Nov 10, 1:02 PM · Data-Engineering, serviceops, Machine-Learning-Team
achou added a comment to T409414: Configure Lift Wing isvc Integration with Cassandra.

regarding 2. would flipping egress to true here be sufficient? https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/charts/kserve-inference/values.yaml#43 or perhaps a specific policy in GlobalNetworkPolicies in ml-serve.yaml?

Mon, Nov 10, 12:22 PM · Machine-Learning-Team

Fri, Nov 7

achou added a comment to T408341: Q1 FY2025-26 Goal: Task generation engine for Revise Tone task.

Weekly Report

Fri, Nov 7, 4:45 PM · OKR-Work, Goal, Machine-Learning-Team
achou updated subscribers of T409469: Enable ChangeProp to consume mediawiki.page_content_change.v1.

LiftWing is only in eqiad (right?)

LiftWing is in both eqiad and codfw

Fri, Nov 7, 11:58 AM · Data-Engineering, serviceops, Machine-Learning-Team

Thu, Nov 6

achou added a parent task for T408129: Provision Cassandra + Data Gateway resources for Tone Check: T408341: Q1 FY2025-26 Goal: Task generation engine for Revise Tone task.
Thu, Nov 6, 5:46 PM · Cassandra, OKR-Work, Goal, Machine-Learning-Team
achou added a subtask for T408341: Q1 FY2025-26 Goal: Task generation engine for Revise Tone task: T408129: Provision Cassandra + Data Gateway resources for Tone Check.
Thu, Nov 6, 5:46 PM · OKR-Work, Goal, Machine-Learning-Team
achou added a subtask for T396162: [EPIC] Revise Tone: Structured Task (WE1.1.2, FY25-26): T408341: Q1 FY2025-26 Goal: Task generation engine for Revise Tone task.
Thu, Nov 6, 5:45 PM · Patch-For-Review, Revise-Tone-Structured-Task, OKR-Work, Epic, EditCheck, Growth-Structured-Tasks, Growth-Team
achou added a parent task for T408341: Q1 FY2025-26 Goal: Task generation engine for Revise Tone task: T396162: [EPIC] Revise Tone: Structured Task (WE1.1.2, FY25-26).
Thu, Nov 6, 5:44 PM · OKR-Work, Goal, Machine-Learning-Team
achou added a parent task for T409414: Configure Lift Wing isvc Integration with Cassandra: T408341: Q1 FY2025-26 Goal: Task generation engine for Revise Tone task.
Thu, Nov 6, 5:42 PM · Machine-Learning-Team
achou added a subtask for T408341: Q1 FY2025-26 Goal: Task generation engine for Revise Tone task: T409414: Configure Lift Wing isvc Integration with Cassandra.
Thu, Nov 6, 5:42 PM · OKR-Work, Goal, Machine-Learning-Team
achou added a subtask for T408538: Create a Revise Tone Task Generator in LiftWing: T409469: Enable ChangeProp to consume mediawiki.page_content_change.v1.
Thu, Nov 6, 5:35 PM · Patch-For-Review, Machine-Learning-Team
achou added a parent task for T409469: Enable ChangeProp to consume mediawiki.page_content_change.v1: T408538: Create a Revise Tone Task Generator in LiftWing.
Thu, Nov 6, 5:35 PM · Data-Engineering, serviceops, Machine-Learning-Team
achou created T409469: Enable ChangeProp to consume mediawiki.page_content_change.v1.
Thu, Nov 6, 5:35 PM · Data-Engineering, serviceops, Machine-Learning-Team
achou added a comment to T409414: Configure Lift Wing isvc Integration with Cassandra.

@DPogorzelski-WMF Yes, Cassandra is on the prod network, and @Eevans should be able to provide more info about this.

Thu, Nov 6, 1:03 PM · Machine-Learning-Team
achou added a comment to T409414: Configure Lift Wing isvc Integration with Cassandra.

Hi @klausman, I'd love to hear your thoughts on what we need to do to make this Cassandra integration work.

Thu, Nov 6, 11:00 AM · Machine-Learning-Team
achou created T409414: Configure Lift Wing isvc Integration with Cassandra.
Thu, Nov 6, 10:49 AM · Machine-Learning-Team

Wed, Nov 5

achou moved T405324: Create a notebook for revise tone structured task generation logic from In Progress to 2025-2026 Q2 Done on the Machine-Learning-Team board.
Wed, Nov 5, 10:33 AM · Machine-Learning-Team
achou closed T405324: Create a notebook for revise tone structured task generation logic as Resolved.

Done. This is the notebook that demonstrates how to generate tasks.

Wed, Nov 5, 10:32 AM · Machine-Learning-Team
achou added a comment to T401021: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task.

@dcausse Thanks a lot! I found it was also missing $schema. (eventgate complained about it)
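
A missing `$schema` field like the one mentioned above can be caught before an event is sent. The following is a minimal sketch of such a pre-flight check; the function name and the example schema URI are hypothetical, and the real validation is whatever eventgate performs against the declared schema.

```python
# Sketch only: a local pre-flight check, not eventgate's own validation.
def missing_fields(event, required=("$schema",)):
    """Return the required top-level fields that are absent or empty."""
    return [
        field
        for field in required
        if field not in event or event[field] in (None, "")
    ]


if __name__ == "__main__":
    # The schema URI below is a made-up placeholder.
    event = {"page_id": 1}
    print(missing_fields(event))
    event["$schema"] = "/hypothetical/schema/1.0.0"
    print(missing_fields(event))
```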

Wed, Nov 5, 9:46 AM · Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Persistence-Design-Review, Revise-Tone-Structured-Task, OKR-Work, Machine-Learning-Team, Growth-Team, Data-Persistence
achou added a comment to T408533: Initial task generation and ingestion to Cassandra and Search weight tags.

I've collected articles in English (en), French (fr), Arabic (ar), and Japanese (ja), then generated paragraph data using Spark.

  • Article topics
"Culture.Biography.Biography*",
"Culture.Biography.Women",
"Culture.Sports",
  • Data cleaning
    1. Sections to skip
"en": [
    'See also',
    'References',
    'External links',
    'Further reading',
    'Notes',
    'Additional sources',
    'Sources',
    'Bibliography'
],
"fr": [
    'Notes et références',
    'Annexes',
    'Bibliographie',
    'Articles connexes',
    'Liens externes',
    'Voir aussi',
    'Notes',
    'Références'
],
"ar": [
    'وصلات خارجية',
    'قراءة موسَّعة',
    'الهوامش',
    'انظر أيضاً',
    'الاستشهاد بالمصادر',
    'انظر أيضًا',
    'مراجع',
],
"ja": [
    '脚注',
    '参考文献',
    '関連項目',
],
    2. Prefixes for links/files/categories to remove
"en": ("file:", "image:", "category:"),
"fr": ("fichier:", "image:", "catégorie:"),
"ar": ("صورة" ,"ملف" ,"تصنيف"),
"ja": ("file:", "image:", "category:"),
    3. Paragraphs/plaintext to skip when they start with:
    • *: list items
    • |: table or template leftovers
    • <blockquote>
    • <ref>
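
The cleaning rules above can be sketched in plain Python as follows. The original processing used Spark; this version only illustrates the filtering logic on in-memory data, uses a subset of the section lists, and assumes exact title matching and case-insensitive prefix matching, neither of which is confirmed by the comment. All function names are hypothetical.

```python
# Sketch only: subset of the rules listed above, applied to in-memory data.
SECTIONS_TO_SKIP = {
    "en": {"See also", "References", "External links", "Further reading",
           "Notes", "Additional sources", "Sources", "Bibliography"},
    "fr": {"Notes et références", "Annexes", "Bibliographie",
           "Articles connexes", "Liens externes", "Voir aussi",
           "Notes", "Références"},
}
LINK_PREFIXES = {
    "en": ("file:", "image:", "category:"),
    "fr": ("fichier:", "image:", "catégorie:"),
}
SKIP_STARTS = ("*", "|", "<blockquote>", "<ref>")


def keep_section(lang, title):
    """True unless the section title is on the skip list for this language."""
    return title.strip() not in SECTIONS_TO_SKIP.get(lang, set())


def is_removable_link(lang, target):
    """True for file/image/category link targets (assumed case-insensitive)."""
    return target.strip().lower().startswith(LINK_PREFIXES.get(lang, ()))


def keep_paragraph(text):
    """Drop empty paragraphs, list items, table leftovers, quotes, and refs."""
    stripped = text.lstrip()
    return bool(stripped) and not stripped.startswith(SKIP_STARTS)


def clean_sections(lang, sections):
    """sections is a list of (title, [paragraph, ...]) pairs."""
    kept = []
    for title, paragraphs in sections:
        if keep_section(lang, title):
            kept.extend(p for p in paragraphs if keep_paragraph(p))
    return kept
```

For example, calling `clean_sections("en", ...)` on an article with a "Career" section and a "References" section would keep only the prose paragraphs from "Career".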
Wed, Nov 5, 9:23 AM · Discovery-Search (2025.10.20 - 2025.12.31), Machine-Learning-Team

Nov 4 2025

achou updated the task description for T401021: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task.
Nov 4 2025, 7:51 PM · Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Persistence-Design-Review, Revise-Tone-Structured-Task, OKR-Work, Machine-Learning-Team, Growth-Team, Data-Persistence
achou updated the task description for T401021: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task.
Nov 4 2025, 7:50 PM · Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Persistence-Design-Review, Revise-Tone-Structured-Task, OKR-Work, Machine-Learning-Team, Growth-Team, Data-Persistence
achou claimed T408533: Initial task generation and ingestion to Cassandra and Search weight tags.
Nov 4 2025, 7:03 PM · Discovery-Search (2025.10.20 - 2025.12.31), Machine-Learning-Team
achou added a comment to T401021: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task.

Given our tight timeline, we'd like to have Cassandra and the Data Gateway ready this week, so we can begin integrating with Lift Wing soon. I need to make the final call to move things forward. I've read through both of your points, and they're all valid.

Nov 4 2025, 4:36 PM · Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Persistence-Design-Review, Revise-Tone-Structured-Task, OKR-Work, Machine-Learning-Team, Growth-Team, Data-Persistence

Nov 3 2025

achou renamed T408538: Create a Revise Tone Task Generator in LiftWing from Create a Tone Suggestion Generator in LiftWing to Create a Revise Tone Task Generator in LiftWing.
Nov 3 2025, 1:17 PM · Patch-For-Review, Machine-Learning-Team
achou added a comment to T401021: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task.

I think so, yes. If you have specific mock data in mind, a csv-formatted file should work.

Nov 3 2025, 10:25 AM · Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Persistence-Design-Review, Revise-Tone-Structured-Task, OKR-Work, Machine-Learning-Team, Growth-Team, Data-Persistence
achou created P84613 kafkacat produce tone suggestions to cirrussearch.
Nov 3 2025, 10:17 AM

Oct 31 2025

achou added a comment to T408341: Q1 FY2025-26 Goal: Task generation engine for Revise Tone task.

Weekly Report

Oct 31 2025, 4:10 PM · OKR-Work, Goal, Machine-Learning-Team
achou added a comment to T401021: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task.

@Eevans I also wanted to follow up on the next step for this task.

Oct 31 2025, 3:38 PM · Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Persistence-Design-Review, Revise-Tone-Structured-Task, OKR-Work, Machine-Learning-Team, Growth-Team, Data-Persistence
achou added a comment to T401021: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task.

We won't use a different source unit, so I think including page is unnecessary.

Could we go with page_paragraph_tone_scores? This is specifically representing paragraphs belonging to MediaWiki pages, and the primary key very explicitly includes page_id.

Oct 31 2025, 12:02 PM · Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Persistence-Design-Review, Revise-Tone-Structured-Task, OKR-Work, Machine-Learning-Team, Growth-Team, Data-Persistence

Oct 30 2025

achou added a comment to T401021: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task.

It’s good that we’re discussing this! I've learned a lot :)

Oct 30 2025, 7:01 PM · Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Persistence-Design-Review, Revise-Tone-Structured-Task, OKR-Work, Machine-Learning-Team, Growth-Team, Data-Persistence

Oct 29 2025

achou added a comment to T401021: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task.

Trying to address the blocking ones:

Oct 29 2025, 5:51 PM · Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Persistence-Design-Review, Revise-Tone-Structured-Task, OKR-Work, Machine-Learning-Team, Growth-Team, Data-Persistence
achou moved T408538: Create a Revise Tone Task Generator in LiftWing from Ready To Go to In Progress on the Machine-Learning-Team board.
Oct 29 2025, 11:43 AM · Patch-For-Review, Machine-Learning-Team
achou moved T408533: Initial task generation and ingestion to Cassandra and Search weight tags from Ready To Go to In Progress on the Machine-Learning-Team board.
Oct 29 2025, 11:43 AM · Discovery-Search (2025.10.20 - 2025.12.31), Machine-Learning-Team