Jan 30 2023

EChetty moved T323597: [M] Exclude date format topics from section topics pipeline from Backlog to Structured Data (Tracking) on the Data Pipelines board.
Jan 30 2023, 5:12 PM · Data Pipelines, Structured-Data-Backlog (Current Work), Section-Topics
EChetty moved T327458: Document Traffic Datasets in Datahub from Next Up to Blocked/Paused on the Data Pipelines (Sprint 07) board.
Jan 30 2023, 5:11 PM · Data Pipelines (Sprint 11), Data-Catalog
EChetty reassigned T324995: Include EU Registered Country in the canonical country database from nshahquinn-wmf to mforns.
Jan 30 2023, 5:10 PM · Product-Analytics (Kanban), Data Pipelines (Sprint 07), Data-Engineering-Planning
EChetty moved T326193: Airflow upgrade (refactor deb creation + version bump + switch to PostgreSQL) from In Progress to Blocked/Paused on the Data Pipelines (Sprint 07) board.
Jan 30 2023, 5:10 PM · Data Pipelines
EChetty moved T309552: Update Airflow DAGs code to make it compatible with version V2.3.4 of Airflow from In Review to Done on the Data Pipelines (Sprint 07) board.
Jan 30 2023, 5:06 PM · Data Pipelines (Sprint 07), Data-Engineering-Planning
EChetty moved T309996: [Airflow] Build Druid Operator from In Review to Ready to Deploy on the Data Pipelines (Sprint 07) board.
Jan 30 2023, 5:02 PM · Data Pipelines (Sprint 08), Data-Engineering-Planning
EChetty moved T327969: null shown in the user profile dropdown in datahub from Next Up to Blocked/Paused on the Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-07)) board.
Jan 30 2023, 1:43 PM · Data-Platform-SRE
EChetty moved T327884: Datahub user records are not being created after login from Blocked/Paused to In Progress on the Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-07)) board.
Jan 30 2023, 1:36 PM · Data-Platform-SRE

Jan 26 2023

EChetty triaged T328049: Investigate the effects of IP Masking on Data Eng systems as High priority.
Jan 26 2023, 5:21 PM · Data Pipelines (Sprint 12)
EChetty updated the task description for T328049: Investigate the effects of IP Masking on Data Eng systems.
Jan 26 2023, 5:21 PM · Data Pipelines (Sprint 12)
EChetty created T328049: Investigate the effects of IP Masking on Data Eng systems.
Jan 26 2023, 5:20 PM · Data Pipelines (Sprint 12)
EChetty moved T318832: Modify the JS Client to be used non MW powered sites from Ready to Deploy to Sign Off on the Metrics Platform Backlog (Metrics Platform Kanban) board.
Jan 26 2023, 3:39 PM · Metrics Platform Backlog, MW-1.41-notes (1.41.0-wmf.4; 2023-04-10)
EChetty moved T325857: Requesting Kerberos identity for Hxi-ctr from Backlog to Ops Week on the Data-Engineering-Planning board.
Jan 26 2023, 2:06 PM · Shared-Data-Infrastructure, Data-Engineering-Planning
EChetty edited projects for T325857: Requesting Kerberos identity for Hxi-ctr, added: Data-Engineering-Planning; removed Data-Engineering.
Jan 26 2023, 2:05 PM · Shared-Data-Infrastructure, Data-Engineering-Planning
EChetty changed the status of T328026: ***New Tasks Above*** from Open to Stalled.
Jan 26 2023, 2:05 PM · Data-Engineering
EChetty created T328026: ***New Tasks Above***.
Jan 26 2023, 2:04 PM · Data-Engineering
EChetty moved T326598: Ingest feature Hive schema into datahub from Estimated/Discussed to Data Catalog on the Data-Engineering-Planning board.
Jan 26 2023, 2:02 PM · Data-Engineering, Data-Catalog
EChetty moved T326598: Ingest feature Hive schema into datahub from Backlog to Estimated/Discussed on the Data-Engineering-Planning board.
Jan 26 2023, 2:02 PM · Data-Engineering, Data-Catalog

Jan 25 2023

EChetty moved T324995: Include EU Registered Country in the canonical country database from Blocked/Paused to In Progress on the Data Pipelines (Sprint 07) board.
Jan 25 2023, 5:06 PM · Product-Analytics (Kanban), Data Pipelines (Sprint 07), Data-Engineering-Planning
EChetty moved T325185: [Airflow] Implement a NotebookOperator from Backlog to To be discussed /To be estimated on the Data Pipelines board.
Jan 25 2023, 1:52 PM · Data-Engineering, Data Pipelines
EChetty closed T302514: Modify HiveToDruid Job , a subtask of T302263: The network_internal druid load job fails if data is not present, as Declined.
Jan 25 2023, 1:51 PM · Data-Engineering, Data-Engineering-Kanban
EChetty closed T302514: Modify HiveToDruid Job as Declined.
Jan 25 2023, 1:51 PM · Data-Engineering-Planning, Platform Engineering
EChetty moved T323642: Spark Streaming Dumps POC: Backfill metadata table from Backlog to Sprint 07 on the Event-Platform board.
Jan 25 2023, 1:49 PM · Data-Engineering, Data Pipelines
EChetty moved T323642: Spark Streaming Dumps POC: Backfill metadata table from Backlog to Event Platform on the Data-Engineering-Planning board.
Jan 25 2023, 1:49 PM · Data-Engineering, Data Pipelines
EChetty moved T327027: Massive spike in pageviews for a few enwiki pages beginning with "Index" from Backlog to Pipelines on the Data-Engineering-Planning board.
Jan 25 2023, 1:49 PM · Product-Analytics, Data Pipelines, Data-Engineering-Planning, Pageviews-Anomaly
EChetty edited projects for T327027: Massive spike in pageviews for a few enwiki pages beginning with "Index", added: Data-Engineering-Planning; removed Data-Engineering.
Jan 25 2023, 1:47 PM · Product-Analytics, Data Pipelines, Data-Engineering-Planning, Pageviews-Anomaly
EChetty moved T327799: Datahub errors in staging-codfw from Next Up to In Progress on the Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-07)) board.
Jan 25 2023, 1:41 PM · Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-07)), Data-Engineering
EChetty raised the priority of T323783: Add an-presto10[06-15] to the presto cluster from Medium to High.
Jan 25 2023, 1:41 PM · Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-07)), Data-Engineering-Planning
EChetty added a comment to T323783: Add an-presto10[06-15] to the presto cluster.

TB: End of this sprint. (End of next week)

Jan 25 2023, 1:40 PM · Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-07)), Data-Engineering-Planning
EChetty changed the point value for T323783: Add an-presto10[06-15] to the presto cluster from 1 to 5.
Jan 25 2023, 1:38 PM · Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-07)), Data-Engineering-Planning
EChetty moved T311738: [Iceberg] Debianize and install iceberg support for Spark, Presto, and optionally Hive from In Progress to Blocked/Paused on the Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-07)) board.
Jan 25 2023, 1:37 PM · Shared-Data-Infrastructure, Data Pipelines, Data-Engineering-Planning
EChetty moved T327884: Datahub user records are not being created after login from In Progress to Blocked/Paused on the Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-07)) board.
Jan 25 2023, 1:36 PM · Data-Platform-SRE
EChetty moved T324011: SPIKE: Spin up a Test Trino instance (Evaluate Trino) from In Progress to Blocked/Paused on the Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-07)) board.
Jan 25 2023, 1:36 PM · Data-Platform-SRE
EChetty moved T324995: Include EU Registered Country in the canonical country database from In Review to Blocked/Paused on the Data Pipelines (Sprint 07) board.
Jan 25 2023, 12:53 PM · Product-Analytics (Kanban), Data Pipelines (Sprint 07), Data-Engineering-Planning

Jan 23 2023

EChetty moved T326195: Edit puppet code to provide Airflow the PostgreSQL connection from In Review to In Progress on the Data Pipelines (Sprint 07) board.
Jan 23 2023, 4:43 PM · Data Pipelines (Sprint 11)
EChetty moved T318299: Superset Date Filter fix needed from In Progress to In Review on the Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-07)) board.
Jan 23 2023, 1:41 PM · Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-07)), Data-Engineering, Product-Analytics (Kanban)
EChetty moved T324017: Set up Spark SQL Server from EQ2 Kanban (Sprints 04-07) to Estimated/Discussed on the Shared-Data-Infrastructure board.
Jan 23 2023, 1:32 PM · Data-Platform-SRE
EChetty moved T324019: SPIKE- DBT and Airflow from EQ2 Kanban (Sprints 04-07) to Estimated/Discussed on the Shared-Data-Infrastructure board.
Jan 23 2023, 1:32 PM · Shared-Data-Infrastructure
EChetty moved T327262: DSE Experiment - User Story 4 (Machine Learning Use Case) from To be discussed to EQ2 Kanban (Sprints 04-07) on the Shared-Data-Infrastructure board.
Jan 23 2023, 1:31 PM · Shared-Data-Infrastructure, Epic
EChetty moved T327259: Support PersistentVolumeClaim objects on dse-k8s cluster from To be discussed to EQ2 Kanban (Sprints 04-07) on the Shared-Data-Infrastructure board.
Jan 23 2023, 1:31 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05)
EChetty moved T327258: DSE Experiment - User Story 2 (Make Compute available) from To be discussed to EQ2 Kanban (Sprints 04-07) on the Shared-Data-Infrastructure board.
Jan 23 2023, 1:31 PM · Shared-Data-Infrastructure, Epic
EChetty moved T327257: DSE Experiment - User Story 1 (Address Kerberos) from To be discussed to EQ2 Kanban (Sprints 04-07) on the Shared-Data-Infrastructure board.
Jan 23 2023, 1:31 PM · Shared-Data-Infrastructure, Epic

Jan 19 2023

EChetty moved T315580: Upgrade Puppet code to make Airflow configuration files compatible with version 2.5.0 from In Review to In Progress on the Data Pipelines (Sprint 07) board.
Jan 19 2023, 5:06 PM · Data Pipelines (Sprint 11), Vuln-VulnComponent, SecTeam-Processed, Data-Engineering-Planning
EChetty moved T324995: Include EU Registered Country in the canonical country database from Ready to Deploy to In Review on the Data Pipelines (Sprint 07) board.
Jan 19 2023, 5:05 PM · Product-Analytics (Kanban), Data Pipelines (Sprint 07), Data-Engineering-Planning
EChetty moved T324995: Include EU Registered Country in the canonical country database from Done to Ready to Deploy on the Data Pipelines (Sprint 07) board.
Jan 19 2023, 5:05 PM · Product-Analytics (Kanban), Data Pipelines (Sprint 07), Data-Engineering-Planning
EChetty moved T324995: Include EU Registered Country in the canonical country database from Ready to Deploy to Done on the Data Pipelines (Sprint 07) board.
Jan 19 2023, 5:05 PM · Product-Analytics (Kanban), Data Pipelines (Sprint 07), Data-Engineering-Planning
EChetty moved T324995: Include EU Registered Country in the canonical country database from In Review to Ready to Deploy on the Data Pipelines (Sprint 07) board.
Jan 19 2023, 5:05 PM · Product-Analytics (Kanban), Data Pipelines (Sprint 07), Data-Engineering-Planning
EChetty moved T325240: Re-enable package cycle check in Java MPC from Next Up to In Progress on the Metrics Platform Backlog (Metrics Platform Kanban) board.
Jan 19 2023, 3:34 PM · Metrics Platform Backlog (Metrics Platform Kanban)
EChetty moved T318934: Extend the Java MPC to detect session inactivity from In Progress to In Review on the Metrics Platform Backlog (Metrics Platform Kanban) board.
Jan 19 2023, 3:34 PM · Patch-For-Review, Metrics Platform Backlog (Metrics Platform Kanban)
EChetty moved T318934: Extend the Java MPC to detect session inactivity from Blocked/Paused to In Progress on the Metrics Platform Backlog (Metrics Platform Kanban) board.
Jan 19 2023, 3:34 PM · Patch-For-Review, Metrics Platform Backlog (Metrics Platform Kanban)
EChetty moved T318934: Extend the Java MPC to detect session inactivity from In Progress to Blocked/Paused on the Metrics Platform Backlog (Metrics Platform Kanban) board.
Jan 19 2023, 3:34 PM · Patch-For-Review, Metrics Platform Backlog (Metrics Platform Kanban)

Jan 18 2023

EChetty moved T326330: Update sqoop for CheckUser table from Ready to Ready to Deploy on the Data Pipelines (Sprint 07) board.
Jan 18 2023, 5:13 PM · Data Pipelines (Sprint 07), Data-Engineering-Planning, Patch-For-Review
EChetty moved T324483: [Migration] Pageview - Learning from Next Up to In Progress on the Data Pipelines (Sprint 07) board.
Jan 18 2023, 5:12 PM · Data Pipelines (Sprint 08)
EChetty set the point value for T327074: Update wmf.webrequest table to use a new column for referer data. to 3.
Jan 18 2023, 5:11 PM · Patch-For-Review, Data Pipelines (Sprint 08), Metrics Platform Backlog, Foundational Technology Requests
EChetty moved T326195: Edit puppet code to provide Airflow the PostgreSQL connection from In Progress to In Review on the Data Pipelines (Sprint 07) board.
Jan 18 2023, 5:09 PM · Data Pipelines (Sprint 11)
EChetty moved T327267: Create a DSE Kubernetes cluster with support for persistent storage from Ceph from Backlog to Epics on the Shared-Data-Infrastructure board.
Jan 18 2023, 12:18 PM · Data-Platform-SRE, Epic, Foundational Technology Requests
EChetty moved T327262: DSE Experiment - User Story 4 (Machine Learning Use Case) from Backlog to To be discussed on the Shared-Data-Infrastructure board.
Jan 18 2023, 12:18 PM · Shared-Data-Infrastructure, Epic
EChetty moved T327259: Support PersistentVolumeClaim objects on dse-k8s cluster from Backlog to To be discussed on the Shared-Data-Infrastructure board.
Jan 18 2023, 12:18 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05)
EChetty moved T327258: DSE Experiment - User Story 2 (Make Compute available) from Backlog to To be discussed on the Shared-Data-Infrastructure board.
Jan 18 2023, 12:18 PM · Shared-Data-Infrastructure, Epic
EChetty moved T327257: DSE Experiment - User Story 1 (Address Kerberos) from Backlog to To be discussed on the Shared-Data-Infrastructure board.
Jan 18 2023, 12:18 PM · Shared-Data-Infrastructure, Epic
EChetty added a parent task for T324660: Install Ceph Cluster for Data Engineering: T327259: Support PersistentVolumeClaim objects on dse-k8s cluster.
Jan 18 2023, 12:15 PM · Data-Platform-SRE, Epic
EChetty added a subtask for T327259: Support PersistentVolumeClaim objects on dse-k8s cluster: T324660: Install Ceph Cluster for Data Engineering.
Jan 18 2023, 12:15 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05)
EChetty added a parent task for T318712: Enable spark jobs on the dse-k8s cluster via the spark-operator: T327258: DSE Experiment - User Story 2 (Make Compute available).
Jan 18 2023, 12:14 PM · Data-Platform-SRE, Foundational Technology Requests, Epic
EChetty added a subtask for T327258: DSE Experiment - User Story 2 (Make Compute available): T318712: Enable spark jobs on the dse-k8s cluster via the spark-operator.
Jan 18 2023, 12:14 PM · Shared-Data-Infrastructure, Epic
EChetty changed the status of T327267: Create a DSE Kubernetes cluster with support for persistent storage from Ceph from Open to In Progress.
Jan 18 2023, 12:14 PM · Data-Platform-SRE, Epic, Foundational Technology Requests
EChetty added projects to T327267: Create a DSE Kubernetes cluster with support for persistent storage from Ceph: Shared-Data-Infrastructure, Epic.
Jan 18 2023, 12:11 PM · Data-Platform-SRE, Epic, Foundational Technology Requests
EChetty added a parent task for T327257: DSE Experiment - User Story 1 (Address Kerberos): T327267: Create a DSE Kubernetes cluster with support for persistent storage from Ceph.
Jan 18 2023, 12:11 PM · Shared-Data-Infrastructure, Epic
EChetty added a parent task for T327258: DSE Experiment - User Story 2 (Make Compute available): T327267: Create a DSE Kubernetes cluster with support for persistent storage from Ceph.
Jan 18 2023, 12:11 PM · Shared-Data-Infrastructure, Epic
EChetty added a parent task for T327259: Support PersistentVolumeClaim objects on dse-k8s cluster: T327267: Create a DSE Kubernetes cluster with support for persistent storage from Ceph.
Jan 18 2023, 12:11 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05)
EChetty added a parent task for T327262: DSE Experiment - User Story 4 (Machine Learning Use Case): T327267: Create a DSE Kubernetes cluster with support for persistent storage from Ceph.
Jan 18 2023, 12:11 PM · Shared-Data-Infrastructure, Epic
EChetty added subtasks for T327267: Create a DSE Kubernetes cluster with support for persistent storage from Ceph: T327257: DSE Experiment - User Story 1 (Address Kerberos), T327258: DSE Experiment - User Story 2 (Make Compute available), T327259: Support PersistentVolumeClaim objects on dse-k8s cluster, T327262: DSE Experiment - User Story 4 (Machine Learning Use Case).
Jan 18 2023, 12:11 PM · Data-Platform-SRE, Epic, Foundational Technology Requests
EChetty claimed T327267: Create a DSE Kubernetes cluster with support for persistent storage from Ceph.
Jan 18 2023, 12:11 PM · Data-Platform-SRE, Epic, Foundational Technology Requests
EChetty created T327267: Create a DSE Kubernetes cluster with support for persistent storage from Ceph.
Jan 18 2023, 12:10 PM · Data-Platform-SRE, Epic, Foundational Technology Requests
EChetty changed the status of T318712: Enable spark jobs on the dse-k8s cluster via the spark-operator from Open to In Progress.
Jan 18 2023, 11:56 AM · Data-Platform-SRE, Foundational Technology Requests, Epic
EChetty changed the status of T321702: <APP: Commons> Wikimedia Israel GLAMs Analytics Dashboard Support from Open to In Progress.
Jan 18 2023, 11:56 AM · API Platform, Foundational Technology Requests
EChetty updated the task description for T327262: DSE Experiment - User Story 4 (Machine Learning Use Case).
Jan 18 2023, 11:55 AM · Shared-Data-Infrastructure, Epic
EChetty created T327262: DSE Experiment - User Story 4 (Machine Learning Use Case).
Jan 18 2023, 11:54 AM · Shared-Data-Infrastructure, Epic
EChetty created T327259: Support PersistentVolumeClaim objects on dse-k8s cluster.
Jan 18 2023, 11:50 AM · Data-Platform-SRE (2024.04.15 - 2024.05.05)
EChetty created T327258: DSE Experiment - User Story 2 (Make Compute available).
Jan 18 2023, 11:48 AM · Shared-Data-Infrastructure, Epic
EChetty created T327257: DSE Experiment - User Story 1 (Address Kerberos).
Jan 18 2023, 11:46 AM · Shared-Data-Infrastructure, Epic

Jan 17 2023

EChetty moved T324485: [Airflow] Migrate Druid loading Oozie jobs - Parent task from Next Up to In Progress on the Data Pipelines (Sprint 07) board.
Jan 17 2023, 5:10 PM · Data Pipelines (Sprint 14)
EChetty moved T327074: Update wmf.webrequest table to use a new column for referer data. from Ready to In Progress on the Data Pipelines (Sprint 07) board.
Jan 17 2023, 5:08 PM · Patch-For-Review, Data Pipelines (Sprint 08), Metrics Platform Backlog, Foundational Technology Requests
EChetty moved T309769: Expanding External Referrer Tracking from In Progress to In Review on the Data Pipelines (Sprint 07) board.
Jan 17 2023, 5:08 PM · Data Pipelines (Sprint 08), Metrics Platform Backlog, Foundational Technology Requests
EChetty moved T326195: Edit puppet code to provide Airflow the PostgreSQL connection from In Review to In Progress on the Data Pipelines (Sprint 07) board.
Jan 17 2023, 5:08 PM · Data Pipelines (Sprint 11)
EChetty moved T311229: Drop MediaViewer and MultimediaViewer* tables from Sprint 07 to Next Up (revisit every 2 sprints) on the Data Pipelines board.
Jan 17 2023, 5:01 PM · Data-Engineering, Data Pipelines
EChetty moved T326330: Update sqoop for CheckUser table from To be prioritised to Sprint 07 on the Data Pipelines board.
Jan 17 2023, 5:00 PM · Data Pipelines (Sprint 07), Data-Engineering-Planning, Patch-For-Review
EChetty added a comment to T323662: NEW FEATURE REQUEST: Dataset with active and non-active Wikis.

Do we have an existing definition of active we want to use here?

Jan 17 2023, 1:04 PM · Data-Engineering, Data Pipelines
EChetty moved T325306: Provide aggregated user device data per-country from To be prioritised to Discussed (Radar) on the Data Pipelines board.
Jan 17 2023, 12:53 PM · Data-Engineering
EChetty set the point value for T326195: Edit puppet code to provide Airflow the PostgreSQL connection to 3.
Jan 17 2023, 11:50 AM · Data Pipelines (Sprint 11)
EChetty moved T325181: Present "Notebooks in Airflow" solution to PA and discuss ownership of different steps from To be prioritised to Discussed (Radar) on the Data Pipelines board.
Jan 17 2023, 11:49 AM · Data-Engineering, Product-Analytics
EChetty claimed T325181: Present "Notebooks in Airflow" solution to PA and discuss ownership of different steps.
Jan 17 2023, 11:49 AM · Data-Engineering, Product-Analytics
EChetty triaged T325181: Present "Notebooks in Airflow" solution to PA and discuss ownership of different steps as High priority.
Jan 17 2023, 11:48 AM · Data-Engineering, Product-Analytics
EChetty set the point value for T327073: Write Airflow DAG to move the webrequest load job to airflow. to 3.
Jan 17 2023, 11:48 AM · Data Pipelines (Sprint 11), Patch-For-Review
EChetty moved T324482: [Migration] Oozie Migration jobs for Pageviews from To be prioritised to Sprint 07 on the Data Pipelines board.
Jan 17 2023, 11:48 AM · Data Pipelines (sprint 10), Patch-For-Review
EChetty set the point value for T324486: [Migration] migrate simple oozie jobs to 5.
Jan 17 2023, 11:47 AM · Data-Engineering, Data Pipelines
EChetty moved T311229: Drop MediaViewer and MultimediaViewer* tables from To be prioritised to Sprint 07 on the Data Pipelines board.
Jan 17 2023, 11:47 AM · Data-Engineering, Data Pipelines
EChetty moved T325103: Prune raw HDFS FSImages stored on HDFS from To be prioritised to Sprint 07 on the Data Pipelines board.
Jan 17 2023, 11:47 AM · Data-Engineering, Data Pipelines
EChetty moved T311229: Drop MediaViewer and MultimediaViewer* tables from To be discussed /To be estimated to To be prioritised on the Data Pipelines board.
Jan 17 2023, 11:47 AM · Data-Engineering, Data Pipelines
EChetty triaged T323662: NEW FEATURE REQUEST: Dataset with active and non-active Wikis as Medium priority.
Jan 17 2023, 11:47 AM · Data-Engineering, Data Pipelines
EChetty moved T323662: NEW FEATURE REQUEST: Dataset with active and non-active Wikis from To be discussed /To be estimated to To be prioritised on the Data Pipelines board.
Jan 17 2023, 11:47 AM · Data-Engineering, Data Pipelines