
Snwachukwu (Sandra Ebele Nwachukwu)
User

User Details

User Since
Jan 6 2022, 11:29 AM (111 w, 1 d)
Availability
Available
LDAP User
Snwachukwu
MediaWiki User
SNwachukwu (WMF)

Recent Activity

Tue, Feb 6

Snwachukwu added a comment to T354552: [Maintenance] Migrate ReportUpdater browser queries to Airflow.

We added the following:

Tue, Feb 6, 7:42 PM · Data-Engineering (Sprint 9), Patch-For-Review

Sun, Feb 4

Snwachukwu moved T354692: [Data Quality] Implement basic data quality metrics for MW history from Next Up to In progress on the Data-Engineering (Sprint 8) board.
Sun, Feb 4, 12:15 AM · Data-Engineering (Sprint 9)
Snwachukwu moved T354552: [Maintenance] Migrate ReportUpdater browser queries to Airflow from In progress to In Review on the Data-Engineering (Sprint 8) board.
Sun, Feb 4, 12:14 AM · Data-Engineering (Sprint 9), Patch-For-Review

Jan 15 2024

Snwachukwu added a comment to T354552: [Maintenance] Migrate ReportUpdater browser queries to Airflow.

The suggested approach is to use Spark to run the queries and save the results on the cluster. However, Spark saves its output files in a folder, and we don't want a separate folder for each query result. We want to put all the output files (reports) in one location, which is already rsynced to the report server. Thus we would use our hdfsarchive operator to move the generated output from the Spark output path to the final destination.
To start, we would migrate the queries in the browser folder first.
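
The move step described above can be illustrated with a minimal sketch. It uses the local filesystem as a stand-in for HDFS and is not the actual hdfsarchive operator; the function name archive_single_output and all paths are hypothetical, and it assumes the Spark job was coalesced so that it wrote exactly one part-* file.

```python
import shutil
from pathlib import Path


def archive_single_output(spark_output_dir, destination_file):
    """Move the single part-* file of a Spark output folder to one final file.

    Hypothetical helper for illustration only: it works on local paths,
    whereas the real job would act on HDFS.
    """
    src = Path(spark_output_dir)
    # Spark writes data as part-* files next to marker files such as _SUCCESS.
    parts = sorted(
        p for p in src.iterdir() if p.is_file() and p.name.startswith("part-")
    )
    if len(parts) != 1:
        raise ValueError(
            f"expected exactly one part file in {src}, found {len(parts)}"
        )
    dest = Path(destination_file)
    # All reports share one destination folder, so create it only if missing.
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(parts[0]), str(dest))
    return dest
```

Failing when the folder holds zero or several part files keeps the sketch from silently publishing an incomplete report.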

Jan 15 2024, 7:26 PM · Data-Engineering (Sprint 9), Patch-For-Review

Jan 10 2024

Snwachukwu moved T354552: [Maintenance] Migrate ReportUpdater browser queries to Airflow from Next Up to In progress on the Data-Engineering (Sprint 7) board.
Jan 10 2024, 7:54 PM · Data-Engineering (Sprint 9), Patch-For-Review
Snwachukwu moved T352670: [Iceberg Migration] Migrate browser_general tables to Iceberg from In progress to In Review on the Data-Engineering (Sprint 7) board.
Jan 10 2024, 7:54 PM · Data-Engineering (Sprint 8)
Snwachukwu claimed T354552: [Maintenance] Migrate ReportUpdater browser queries to Airflow.
Jan 10 2024, 2:06 PM · Data-Engineering (Sprint 9), Patch-For-Review

Jan 8 2024

Snwachukwu moved T352670: [Iceberg Migration] Migrate browser_general tables to Iceberg from Next Up to In progress on the Data-Engineering (Sprint 6) board.
Jan 8 2024, 7:37 PM · Data-Engineering (Sprint 8)
Snwachukwu added a comment to T352670: [Iceberg Migration] Migrate browser_general tables to Iceberg.

Updated a patch containing 2 HQL files, which are required to create and to update the Iceberg versions of the browser_general tables, respectively.

Jan 8 2024, 7:36 PM · Data-Engineering (Sprint 8)

Jan 2 2024

Snwachukwu claimed T352670: [Iceberg Migration] Migrate browser_general tables to Iceberg.
Jan 2 2024, 6:54 PM · Data-Engineering (Sprint 8)

Jun 22 2023

Snwachukwu moved T337335: Upgrade Presto to release that aligns with Iceberg 1.2.1 from In Review to Ready to Deploy on the Data Pipelines (Sprint 14) board.
Jun 22 2023, 4:13 PM · Data Pipelines (Sprint 14), Data-Platform-SRE
Snwachukwu moved T337335: Upgrade Presto to release that aligns with Iceberg 1.2.1 from In Progress to In Review on the Data Pipelines (Sprint 14) board.
Jun 22 2023, 4:13 PM · Data Pipelines (Sprint 14), Data-Platform-SRE
Snwachukwu moved T318346: Add Python Linter Checks to CI from In Review to Ready to Deploy on the Data Pipelines (Sprint 14) board.
Jun 22 2023, 4:07 PM · Patch-For-Review, Data Pipelines (Sprint 14), Data-Engineering-Planning

Jun 21 2023

Snwachukwu moved T335308: Migrate Refine to Spark 3 from Ready to Deploy to Done on the Data Pipelines (Sprint 14) board.
Jun 21 2023, 4:02 PM · Data Pipelines (Sprint 14)

Jun 19 2023

Snwachukwu added a comment to T335308: Migrate Refine to Spark 3.

RefineSanitize was still failing on Friday, so Ben and Joseph helped revert it to Spark 2 by rolling the refinery jar version back from v0.2.16 to v0.1.15 (https://gerrit.wikimedia.org/r/930765), so that it could keep working over the weekend.

Jun 19 2023, 2:50 PM · Data Pipelines (Sprint 14)

Jun 12 2023

Snwachukwu claimed T336745: Split Cassandra Airflow dags by dataset.
Jun 12 2023, 2:13 PM · Data Engineering and Event Platform Team (Sprint 0), Data Pipelines (Sprint 14)
Snwachukwu moved T335308: Migrate Refine to Spark 3 from In Review to Ready to Deploy on the Data Pipelines (Sprint 14) board.
Jun 12 2023, 2:04 PM · Data Pipelines (Sprint 14)

Jun 7 2023

Snwachukwu closed T338343: Wikimedia-event-utilities jenkins build failure, a subtask of T337421: Fix wikimedia-event-utilities Guava dependencies issues, as Resolved.
Jun 7 2023, 5:21 PM · Event-Platform (Sprint 14 B), Data-Engineering, Data Pipelines
Snwachukwu closed T338343: Wikimedia-event-utilities jenkins build failure as Resolved.
Jun 7 2023, 5:21 PM · Continuous-Integration-Config, Patch-For-Review, ci-test-error, Event-Platform (Sprint 14 B), Data-Engineering, Data Pipelines
Snwachukwu added a comment to T338343: Wikimedia-event-utilities jenkins build failure.

Thank you @hashar

Jun 7 2023, 5:20 PM · Continuous-Integration-Config, Patch-For-Review, ci-test-error, Event-Platform (Sprint 14 B), Data-Engineering, Data Pipelines
Snwachukwu created T338343: Wikimedia-event-utilities jenkins build failure.
Jun 7 2023, 4:34 PM · Continuous-Integration-Config, Patch-For-Review, ci-test-error, Event-Platform (Sprint 14 B), Data-Engineering, Data Pipelines

May 24 2023

Snwachukwu added a comment to T335308: Migrate Refine to Spark 3.

I am currently getting the error below when running Refine jobs with Spark 3. I have not found a solution yet, but will update once I have one.

May 24 2023, 5:02 PM · Data Pipelines (Sprint 14)

May 16 2023

Snwachukwu created T336771: HDFS utils on Airflow to handle actions on hdfs files.
May 16 2023, 2:45 PM · Data-Engineering, Data Pipelines

May 11 2023

Snwachukwu claimed T335308: Migrate Refine to Spark 3.
May 11 2023, 4:00 PM · Data Pipelines (Sprint 14)

May 10 2023

Snwachukwu moved T334104: [Airflow] Migrate pageview-related Druid loading Oozie jobs from In Review to Ready to Deploy on the Data Pipelines (Sprint 12) board.
May 10 2023, 3:48 PM · Data Pipelines (Sprint 12), Patch-For-Review

Apr 17 2023

Snwachukwu moved T331028: NEW FEATURE REQUEST: <Update webrequest derived tables to use the new column for referer data> from Ready to Deploy to Done on the Data Pipelines (Sprint 11) board.
Apr 17 2023, 5:28 PM · Data Pipelines (Sprint 11), Product-Analytics
Snwachukwu moved T334224: Add referer_data column to pageview_hourly and pageview_daily druid tables from Ready to Deploy to Done on the Data Pipelines (Sprint 11) board.
Apr 17 2023, 5:28 PM · Data Pipelines (Sprint 11), Product-Analytics

Apr 14 2023

kzimmerman awarded T331028: NEW FEATURE REQUEST: <Update webrequest derived tables to use the new column for referer data> a Like token.
Apr 14 2023, 9:11 PM · Data Pipelines (Sprint 11), Product-Analytics

Apr 13 2023

Snwachukwu added a comment to T334493: analytics/refinery deployment broken at refinery-deploy-to-hdfs.

I got a similar error when deploying analytics/refinery:

Apr 13 2023, 5:31 PM · Data-Platform-SRE
Snwachukwu moved T334104: [Airflow] Migrate pageview-related Druid loading Oozie jobs from Next Up to In Progress on the Data Pipelines (Sprint 11) board.
Apr 13 2023, 2:43 PM · Data Pipelines (Sprint 12), Patch-For-Review
Snwachukwu moved T331028: NEW FEATURE REQUEST: <Update webrequest derived tables to use the new column for referer data> from In Progress to In Review on the Data Pipelines (Sprint 11) board.
Apr 13 2023, 2:43 PM · Data Pipelines (Sprint 11), Product-Analytics
Snwachukwu claimed T334104: [Airflow] Migrate pageview-related Druid loading Oozie jobs.
Apr 13 2023, 2:06 PM · Data Pipelines (Sprint 12), Patch-For-Review

Apr 12 2023

Snwachukwu set the point value for T334224: Add referer_data column to pageview_hourly and pageview_daily druid tables to 3.
Apr 12 2023, 3:22 PM · Data Pipelines (Sprint 11), Product-Analytics

Apr 11 2023

Snwachukwu moved T334120: Update Hive pageview_hourly with referer_name from Ready to Deploy to Done on the Data Pipelines (Sprint 11) board.
Apr 11 2023, 7:28 PM · Data Pipelines (Sprint 11), Product-Analytics

Apr 6 2023

Snwachukwu moved T334224: Add referer_data column to pageview_hourly and pageview_daily druid tables from Next Up to In Review on the Data Pipelines (Sprint 11) board.
Apr 6 2023, 2:48 PM · Data Pipelines (Sprint 11), Product-Analytics
Snwachukwu created T334224: Add referer_data column to pageview_hourly and pageview_daily druid tables.
Apr 6 2023, 2:40 PM · Data Pipelines (Sprint 11), Product-Analytics
Snwachukwu moved T334120: Update Hive pageview_hourly with referer_name from Next Up to In Review on the Data Pipelines (Sprint 11) board.
Apr 6 2023, 1:29 PM · Data Pipelines (Sprint 11), Product-Analytics

Apr 5 2023

Snwachukwu updated the task description for T334120: Update Hive pageview_hourly with referer_name.
Apr 5 2023, 5:09 PM · Data Pipelines (Sprint 11), Product-Analytics
Snwachukwu created T334120: Update Hive pageview_hourly with referer_name.
Apr 5 2023, 5:08 PM · Data Pipelines (Sprint 11), Product-Analytics

Mar 30 2023

Snwachukwu moved T330203: mediawiki-wikitext job migration from Ready to Deploy to Done on the Data Pipelines (sprint 10) board.
Mar 30 2023, 6:34 PM · Data Pipelines (sprint 10)
Snwachukwu moved T330200: mediawiki-history-check-denormalize job migration from Ready to Deploy to Done on the Data Pipelines (sprint 10) board.
Mar 30 2023, 5:58 PM · Data Pipelines (sprint 10)

Mar 21 2023

Snwachukwu moved T330203: mediawiki-wikitext job migration from Next Up to In Progress on the Data Pipelines (sprint 10) board.
Mar 21 2023, 5:11 PM · Data Pipelines (sprint 10)

Feb 22 2023

Snwachukwu moved T330200: mediawiki-history-check-denormalize job migration from Next Up to In Progress on the Data Pipelines (Sprint 11) board.
Feb 22 2023, 3:25 PM · Data Pipelines (sprint 10)
Snwachukwu moved T330201: mediawiki-history-denormalize job migration from Next Up to In Progress on the Data Pipelines (Sprint 11) board.
Feb 22 2023, 3:25 PM · Data Pipelines (sprint 10)
Snwachukwu claimed T330203: mediawiki-wikitext job migration.
Feb 22 2023, 3:20 PM · Data Pipelines (sprint 10)
Snwachukwu claimed T330202: mediawiki-history-reduced job migration.
Feb 22 2023, 3:20 PM · Data Pipelines (Sprint 14)
Snwachukwu claimed T330201: mediawiki-history-denormalize job migration.
Feb 22 2023, 3:19 PM · Data Pipelines (sprint 10)

Feb 21 2023

Snwachukwu claimed T330200: mediawiki-history-check-denormalize job migration.
Feb 21 2023, 6:50 PM · Data Pipelines (sprint 10)

Feb 16 2023

Snwachukwu moved T327074: Update wmf.webrequest table to use a new column for referer data. from In Review to Ready to Deploy on the Data Pipelines (Sprint 08) board.
Feb 16 2023, 1:43 PM · Patch-For-Review, Data Pipelines (Sprint 08), Metrics Platform Backlog, Foundational Technology Requests
Snwachukwu moved T329307: Update the Regular Expression to remove github.io from Referrer Tracking. from In Review to Ready to Deploy on the Data Pipelines (Sprint 08) board.
Feb 16 2023, 1:43 PM · Data Pipelines (Sprint 08), Metrics Platform Backlog, Foundational Technology Requests

Feb 15 2023

Snwachukwu moved T329307: Update the Regular Expression to remove github.io from Referrer Tracking. from Next Up to In Review on the Data Pipelines (Sprint 08) board.
Feb 15 2023, 2:56 PM · Data Pipelines (Sprint 08), Metrics Platform Backlog, Foundational Technology Requests

Feb 14 2023

Snwachukwu moved T327074: Update wmf.webrequest table to use a new column for referer data. from In Progress to In Review on the Data Pipelines (Sprint 08) board.
Feb 14 2023, 3:44 PM · Patch-For-Review, Data Pipelines (Sprint 08), Metrics Platform Backlog, Foundational Technology Requests
Snwachukwu moved T326658: Document Impact of Jan 8&9 Traffic Data Loss from In Progress to In Review on the Data Pipelines (Sprint 08) board.
Feb 14 2023, 3:44 PM · Data Pipelines (Sprint 08), SRE, Traffic

Feb 13 2023

Snwachukwu added a comment to T326658: Document Impact of Jan 8&9 Traffic Data Loss.

See wikitech documentation here.

Feb 13 2023, 2:42 PM · Data Pipelines (Sprint 08), SRE, Traffic

Feb 9 2023

Snwachukwu added a comment to T326658: Document Impact of Jan 8&9 Traffic Data Loss.

Here is a Google Doc documenting the data loss.

Feb 9 2023, 9:47 PM · Data Pipelines (Sprint 08), SRE, Traffic

Feb 8 2023

Snwachukwu moved T326658: Document Impact of Jan 8&9 Traffic Data Loss from Ready to In Progress on the Data Pipelines (Sprint 08) board.
Feb 8 2023, 3:14 PM · Data Pipelines (Sprint 08), SRE, Traffic

Feb 6 2023

Snwachukwu added a comment to T327074: Update wmf.webrequest table to use a new column for referer data..

Regarding the new column, I would like to get suggestions on the name to use for the new field. I am thinking of referer_data. Does anyone have a better name?

Feb 6 2023, 2:49 PM · Patch-For-Review, Data Pipelines (Sprint 08), Metrics Platform Backlog, Foundational Technology Requests
Snwachukwu added a comment to T327074: Update wmf.webrequest table to use a new column for referer data..

@Mayakp.wiki I ran an analysis on the UDF that will be used to populate the new field and posted the result in the parent ticket T309769, where there is a comment thread on it.

Feb 6 2023, 2:46 PM · Patch-For-Review, Data Pipelines (Sprint 08), Metrics Platform Backlog, Foundational Technology Requests

Jan 31 2023

KinneretG awarded T309769: Expanding External Referrer Tracking a Like token.
Jan 31 2023, 1:27 PM · Data Pipelines (Sprint 08), Metrics Platform Backlog, Foundational Technology Requests

Jan 26 2023

Snwachukwu added a comment to T309769: Expanding External Referrer Tracking.

I ran the UDF on a day's data and extracted the top 1000 referers for that day to show the impact of the GetRefererDataUDF on referers. You can check the spreadsheet and a short doc on it.

Jan 26 2023, 3:43 PM · Data Pipelines (Sprint 08), Metrics Platform Backlog, Foundational Technology Requests

Jan 23 2023

Snwachukwu added a comment to T326658: Document Impact of Jan 8&9 Traffic Data Loss.

Traffic: can you please confirm whether there were cases of pages served in eqsin but not reported in the webrequest logs?

Jan 23 2023, 4:59 PM · Data Pipelines (Sprint 08), SRE, Traffic
Snwachukwu added a project to T326658: Document Impact of Jan 8&9 Traffic Data Loss: Traffic.
Jan 23 2023, 3:37 PM · Data Pipelines (Sprint 08), SRE, Traffic
Snwachukwu added a comment to T326721: Strip 2FA from Wikitech account of Snwachukwu.

@taavi. done

Jan 23 2023, 1:33 PM · User-bd808, cloud-services-team, Trust-and-Safety
Snwachukwu added a comment to T326721: Strip 2FA from Wikitech account of Snwachukwu.

@bd808 and @Platonides, I now have access to the cloud bastion. Here is the result.

Jan 23 2023, 11:11 AM · User-bd808, cloud-services-team, Trust-and-Safety

Jan 19 2023

Snwachukwu added a comment to T326721: Strip 2FA from Wikitech account of Snwachukwu.

@Platonides Here is the result when I run on a production host.

Jan 19 2023, 5:30 PM · User-bd808, cloud-services-team, Trust-and-Safety
Snwachukwu added a comment to T327074: Update wmf.webrequest table to use a new column for referer data..

@Mayakp.wiki We are introducing a new column of a struct data type to the wmf.webrequest table. It will contain the same data as the existing referer_class column, as well as the referer's name. However, the referer_class column won't be removed now. It will only be removed after all the downstream consumers have been changed.

Jan 19 2023, 11:28 AM · Patch-For-Review, Data Pipelines (Sprint 08), Metrics Platform Backlog, Foundational Technology Requests

Jan 18 2023

Snwachukwu added a comment to T326721: Strip 2FA from Wikitech account of Snwachukwu.

I have not previously SSHed into any Cloud or Toolforge instance. Is there another verification method?

Jan 18 2023, 3:00 PM · User-bd808, cloud-services-team, Trust-and-Safety

Jan 17 2023

Snwachukwu added a comment to T326721: Strip 2FA from Wikitech account of Snwachukwu.

I have been unable to log in to my Wikitech account and make an important edit because of this issue. I would appreciate any form of assistance, as this is urgent.

Jan 17 2023, 8:39 AM · User-bd808, cloud-services-team, Trust-and-Safety
Snwachukwu triaged T326721: Strip 2FA from Wikitech account of Snwachukwu as High priority.
Jan 17 2023, 8:36 AM · User-bd808, cloud-services-team, Trust-and-Safety

Jan 16 2023

Snwachukwu created T327074: Update wmf.webrequest table to use a new column for referer data..
Jan 16 2023, 2:46 PM · Patch-For-Review, Data Pipelines (Sprint 08), Metrics Platform Backlog, Foundational Technology Requests

Jan 12 2023

Snwachukwu added a project to T326721: Strip 2FA from Wikitech account of Snwachukwu: wikitech.wikimedia.org.
Jan 12 2023, 4:49 PM · User-bd808, cloud-services-team, Trust-and-Safety

Jan 11 2023

Snwachukwu added a comment to T326721: Strip 2FA from Wikitech account of Snwachukwu.

@Aklapper I am unable to 'ssh bastion.wmcloud.org' or 'ssh login.toolforge.org'.

Jan 11 2023, 12:45 PM · User-bd808, cloud-services-team, Trust-and-Safety
Snwachukwu added a project to T326721: Strip 2FA from Wikitech account of Snwachukwu: wikitech.wikimedia.org.
Jan 11 2023, 11:25 AM · User-bd808, cloud-services-team, Trust-and-Safety
Snwachukwu created T326721: Strip 2FA from Wikitech account of Snwachukwu.
Jan 11 2023, 11:24 AM · User-bd808, cloud-services-team, Trust-and-Safety

Dec 23 2022

nshahquinn-wmf awarded T309769: Expanding External Referrer Tracking a 100 token.
Dec 23 2022, 12:15 AM · Data Pipelines (Sprint 08), Metrics Platform Backlog, Foundational Technology Requests

Dec 20 2022

Snwachukwu added a comment to T309769: Expanding External Referrer Tracking.
  • Indeed this will alter the referer_class field as some rows previously labelled as external will now be labelled as external (media sites) class.
Dec 20 2022, 7:20 AM · Data Pipelines (Sprint 08), Metrics Platform Backlog, Foundational Technology Requests

Dec 13 2022

Snwachukwu added a comment to T309769: Expanding External Referrer Tracking.

In the current patch we have updated our referer classifier to include an "external (media sites)" class to represent the list of sites to track. This is in addition to the previous classes: unknown, internal, external (search engine) and external. The classifier will also identify the name of the site if it is a search engine or a media site (e.g. YouTube, Facebook, etc.).
Next steps:

  1. Test for performance and optimise to include caching if necessary.
  2. Create a new UDF that will identify the names of the search engines and media sites by using the referer classifier.
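
As a rough illustration of the classes listed above, here is a minimal Python sketch of such a classifier. It is not the code in the patch: the domain tables, the function names and the exact matching rules are all assumptions made for the example.

```python
from urllib.parse import urlparse

# Hypothetical lookup tables, for illustration only.
INTERNAL_DOMAINS = ("wikipedia.org", "wikimedia.org")
SEARCH_ENGINES = {"google.com": "Google", "bing.com": "Bing"}
MEDIA_SITES = {"youtube.com": "YouTube", "facebook.com": "Facebook"}


def _matches(host, domain):
    # True for the domain itself and for any of its subdomains.
    return host == domain or host.endswith("." + domain)


def _lookup(host, table):
    # Return the site name for the first matching domain, or None.
    for domain, name in table.items():
        if _matches(host, domain):
            return name
    return None


def classify_referer(referer):
    """Return a (referer_class, site_name) pair for a referer URL."""
    host = urlparse(referer or "").hostname
    if not host:
        return ("unknown", None)
    host = host.lower()
    if any(_matches(host, d) for d in INTERNAL_DOMAINS):
        return ("internal", None)
    name = _lookup(host, SEARCH_ENGINES)
    if name:
        return ("external (search engine)", name)
    name = _lookup(host, MEDIA_SITES)
    if name:
        return ("external (media sites)", name)
    return ("external", None)
```

Returning the class and the name together mirrors the idea that one classification pass can feed both the class field and the site name.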
Dec 13 2022, 5:06 PM · Data Pipelines (Sprint 08), Metrics Platform Backlog, Foundational Technology Requests

Nov 15 2022

Snwachukwu added a comment to T306895: Write dedicated cassandra authorization code to read password from file when loading.

We now have a custom AuthConfFactory that will be passed as a parameter to the job using spark.cassandra.auth.conf.factory. This should be deployed today. @Ottomata @Eevans The next step would be to have the file containing the Cassandra password loaded to HDFS.

Nov 15 2022, 1:28 PM · Data Pipelines (Sprint 04), Data-Engineering-Planning, Patch-For-Review, Cassandra

Nov 14 2022

Snwachukwu moved T306895: Write dedicated cassandra authorization code to read password from file when loading from In Review to Ready to Deploy on the Data Pipelines (Sprint 04) board.
Nov 14 2022, 1:41 PM · Data Pipelines (Sprint 04), Data-Engineering-Planning, Patch-For-Review, Cassandra

Nov 1 2022

Snwachukwu moved T306895: Write dedicated cassandra authorization code to read password from file when loading from In Progress to In Review on the Data Pipelines (Sprint 03) board.
Nov 1 2022, 5:01 PM · Data Pipelines (Sprint 04), Data-Engineering-Planning, Patch-For-Review, Cassandra

Sep 30 2022

Snwachukwu moved T318949: Update projectview dags to be backward compatible with HdfsArchiver Operator from In Progress to In Review on the Data Pipelines (Sprint 02) board.
Sep 30 2022, 12:15 PM · Data Pipelines (Sprint 02)

Sep 29 2022

Snwachukwu set the point value for T318949: Update projectview dags to be backward compatible with HdfsArchiver Operator to 2.
Sep 29 2022, 4:41 PM · Data Pipelines (Sprint 02)
Snwachukwu created T318949: Update projectview dags to be backward compatible with HdfsArchiver Operator.
Sep 29 2022, 4:41 PM · Data Pipelines (Sprint 02)

Sep 8 2022

Snwachukwu edited projects for T310542: [Airflow] Refactor HDFSArchiveOperator to run in Skein, added: Data Pipelines (Sprint 01); removed Data Pipelines.
Sep 8 2022, 4:08 PM · Data Pipelines (Sprint 02), Data-Engineering-Planning

Sep 5 2022

Snwachukwu moved T317054: Add HdfsArchiver job to hdfs-tools from Ready to Deploy to In Review on the Data Pipelines (Sprint 00) board.
Sep 5 2022, 4:23 PM · Data Pipelines (Sprint 02), Data-Engineering-Planning
Snwachukwu moved T317054: Add HdfsArchiver job to hdfs-tools from In Review to Ready to Deploy on the Data Pipelines (Sprint 00) board.
Sep 5 2022, 4:15 PM · Data Pipelines (Sprint 02), Data-Engineering-Planning
Snwachukwu moved T317054: Add HdfsArchiver job to hdfs-tools from Ready to In Review on the Data Pipelines (Sprint 00) board.
Sep 5 2022, 4:15 PM · Data Pipelines (Sprint 02), Data-Engineering-Planning
Snwachukwu added a comment to T317054: Add HdfsArchiver job to hdfs-tools.

There were 2 fixes made in this repo:

Sep 5 2022, 4:12 PM · Data Pipelines (Sprint 02), Data-Engineering-Planning
Snwachukwu updated the task description for T317054: Add HdfsArchiver job to hdfs-tools.
Sep 5 2022, 4:08 PM · Data Pipelines (Sprint 02), Data-Engineering-Planning
Snwachukwu created T317054: Add HdfsArchiver job to hdfs-tools.
Sep 5 2022, 4:05 PM · Data Pipelines (Sprint 02), Data-Engineering-Planning

Aug 30 2022

Snwachukwu added a comment to T315613: Add Sound Logo site to Matomo dashboard and provide Communications department account with access.

Sure @JArguello-WMF, I can take it. I will sync with @BTullis.

Aug 30 2022, 4:13 PM · Data-Engineering-Planning, WMF-Communications

Aug 29 2022

Snwachukwu moved T315580: Upgrade Puppet code to make Airflow configuration files compatible with version 2.5.0 from In Progress to In Review on the Data Pipelines (Sprint 00) board.
Aug 29 2022, 4:16 PM · Data Pipelines (Sprint 11), Vuln-VulnComponent, SecTeam-Processed, Data-Engineering-Planning
Snwachukwu edited projects for T315580: Upgrade Puppet code to make Airflow configuration files compatible with version 2.5.0, added: Data Pipelines (Sprint 00); removed Data Pipelines.
Aug 29 2022, 4:16 PM · Data Pipelines (Sprint 11), Vuln-VulnComponent, SecTeam-Processed, Data-Engineering-Planning

Aug 25 2022

Snwachukwu moved T310542: [Airflow] Refactor HDFSArchiveOperator to run in Skein from In Progress to Blocked/Paused on the Data Pipelines (Sprint 00) board.
Aug 25 2022, 4:03 PM · Data Pipelines (Sprint 02), Data-Engineering-Planning

Aug 24 2022

Snwachukwu updated the task description for T316120: Repurpose Refinery Tools module.
Aug 24 2022, 3:38 PM · Data-Engineering-Planning
Snwachukwu created T316120: Repurpose Refinery Tools module.
Aug 24 2022, 2:38 PM · Data-Engineering-Planning

Aug 22 2022

Snwachukwu added a comment to T310542: [Airflow] Refactor HDFSArchiveOperator to run in Skein.

The HdfsArchiver Operator fails to run on Skein because Scala 2.12.10 is not installed on the workers yet. For now, Scala 2.12.10 is provided by the Spark 3 assembly, which can only be found on an-launcher.

Aug 22 2022, 12:33 PM · Data Pipelines (Sprint 02), Data-Engineering-Planning

Aug 18 2022

Snwachukwu added a project to T315580: Upgrade Puppet code to make Airflow configuration files compatible with version 2.5.0: Data-Engineering-Planning.
Aug 18 2022, 4:11 PM · Data Pipelines (Sprint 11), Vuln-VulnComponent, SecTeam-Processed, Data-Engineering-Planning
Snwachukwu added a subtask for T309552: Update Airflow DAGs code to make it compatible with version V2.3.4 of Airflow: T315580: Upgrade Puppet code to make Airflow configuration files compatible with version 2.5.0.
Aug 18 2022, 4:05 PM · Data Pipelines (Sprint 07), Data-Engineering-Planning
Snwachukwu added a parent task for T315580: Upgrade Puppet code to make Airflow configuration files compatible with version 2.5.0: T309552: Update Airflow DAGs code to make it compatible with version V2.3.4 of Airflow.
Aug 18 2022, 4:05 PM · Data Pipelines (Sprint 11), Vuln-VulnComponent, SecTeam-Processed, Data-Engineering-Planning