ntsako (Ntsako)
User

User Details

User Since
Jan 4 2022, 7:07 PM (32 w, 2 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
NMaphophe (WMF)

Recent Activity

Tue, Aug 2

ntsako moved T305475: Milestone: Ingest and Transform Input Data from Up Next to In Development on the Equity-Landscape board.
Tue, Aug 2, 3:27 PM · Equity-Landscape, Data-Engineering

Tue, Jul 26

ntsako moved T313709: UDF to calculate the average of two or more column values from Backlog to In Development on the Equity-Landscape board.
Tue, Jul 26, 2:22 PM · Equity-Landscape, Data-Engineering

Mon, Jul 25

ntsako renamed T313709: UDF to calculate the average of two or more column values from UDF to calculate the average of two column values to UDF to calculate the average of two or more column values.
Mon, Jul 25, 10:55 AM · Equity-Landscape, Data-Engineering
ntsako created T313709: UDF to calculate the average of two or more column values.
Mon, Jul 25, 9:42 AM · Equity-Landscape, Data-Engineering

Jul 11 2022

ntsako added a comment to T309282: World Bank Data.

Update World Bank series pulled from:

Jul 11 2022, 10:39 AM · Equity-Landscape, Data-Engineering

Jul 8 2022

ntsako added a comment to T306400: Develop hive folder plan.

Note: remember to format the numbers in the grants CSV and make sure that the data never overlaps between lines when the spreadsheet is updated.

Jul 8 2022, 10:43 AM · Equity-Landscape

Jun 22 2022

ntsako closed T310713: Get PySpark to with Airflow, a subtask of T306625: Extract + Transformation Raw Data into Input Metrics, as Resolved.
Jun 22 2022, 3:10 PM · Equity-Landscape, Data-Engineering
ntsako closed T310713: Get PySpark to with Airflow as Resolved.
Jun 22 2022, 3:10 PM · Equity-Landscape, Data-Engineering
ntsako renamed T310713: Get PySpark to with Airflow from Setup PySpark environment to Get PySpark to with Airflow.
Jun 22 2022, 3:06 PM · Equity-Landscape, Data-Engineering
KCVelaga_WMF awarded T310713: Get PySpark to with Airflow a Party Time token.
Jun 22 2022, 10:28 AM · Equity-Landscape, Data-Engineering
ntsako added a comment to T310713: Get PySpark to with Airflow.

Got PySpark to work with Airflow

Jun 22 2022, 10:20 AM · Equity-Landscape, Data-Engineering

Jun 15 2022

ntsako added a comment to T306400: Develop hive folder plan.

Data currently loaded on

Jun 15 2022, 3:42 PM · Equity-Landscape
ntsako added a comment to T309279: Population input metrics.

Loaded data onto

Jun 15 2022, 2:34 PM · Equity-Landscape, Data-Engineering
ntsako added a comment to T309278: Overall Engagement input metric.

Loaded data onto:

Jun 15 2022, 2:33 PM · Equity-Landscape, Data-Engineering
ntsako added a comment to T309275: Affiliates input metric.

The Lua table is not to be used at the moment; use the CSV for now.

Jun 15 2022, 2:33 PM · Data-Engineering, Equity-Landscape
ntsako added a comment to T309274: Editorship Input Metrics.

Added table:

Jun 15 2022, 2:32 PM · Equity-Landscape, Data-Engineering
ntsako moved T310712: Load country data from Backlog to In Development on the Equity-Landscape board.
Jun 15 2022, 2:29 PM · Equity-Landscape, Data-Engineering
ntsako moved T310713: Get PySpark to with Airflow from Backlog to In Development on the Equity-Landscape board.
Jun 15 2022, 2:29 PM · Equity-Landscape, Data-Engineering
ntsako created T310713: Get PySpark to with Airflow.
Jun 15 2022, 2:28 PM · Equity-Landscape, Data-Engineering
ntsako added a comment to T310712: Load country data.

Country data loaded on

SELECT *
  FROM ntsako.country_meta_data
Jun 15 2022, 2:27 PM · Equity-Landscape, Data-Engineering
ntsako created T310712: Load country data.
Jun 15 2022, 2:27 PM · Equity-Landscape, Data-Engineering

Jun 7 2022

ntsako added a comment to T309275: Affiliates input metric.

Affiliate leadership data loaded on

SELECT *
  FROM ntsako.affiliate_leadership
WHERE year = 2021
Jun 7 2022, 3:00 PM · Data-Engineering, Equity-Landscape
ntsako added a comment to T309275: Affiliates input metric.

Lua data loaded on

SELECT * 
  FROM ntsako.organizational_info
 WHERE year = 2021;
Jun 7 2022, 2:58 PM · Data-Engineering, Equity-Landscape

Jun 6 2022

ntsako moved T309278: Overall Engagement input metric from Up Next to In Development on the Equity-Landscape board.
Jun 6 2022, 2:04 PM · Equity-Landscape, Data-Engineering

Jun 1 2022

ntsako changed the status of T309275: Affiliates input metric, a subtask of T306625: Extract + Transformation Raw Data into Input Metrics, from Open to In Progress.
Jun 1 2022, 2:52 PM · Equity-Landscape, Data-Engineering
ntsako changed the status of T309275: Affiliates input metric from Open to In Progress.
Jun 1 2022, 2:52 PM · Data-Engineering, Equity-Landscape
ntsako moved T309275: Affiliates input metric from Up Next to In Development on the Equity-Landscape board.
Jun 1 2022, 2:52 PM · Data-Engineering, Equity-Landscape

May 31 2022

ntsako added a comment to T309279: Population input metrics.

Overall Engagement (Percentile) and Total Population Presence*Growth Percentile depend on Affiliate and Overall Engagement being completed.

May 31 2022, 5:07 PM · Equity-Landscape, Data-Engineering
ntsako moved T305473: Milestone: Input Data Models Complete. from Up Next to In Development on the Equity-Landscape board.
May 31 2022, 2:04 PM · Equity-Landscape, Data-Engineering

May 27 2022

ntsako moved T306618: Editorship Metrics Transformation from Backlog to Up Next on the Equity-Landscape board.
May 27 2022, 2:45 PM · Equity-Landscape, Data-Engineering
ntsako moved T306622: Overall Engagement Metric (Transformation) from Backlog to Up Next on the Equity-Landscape board.
May 27 2022, 2:45 PM · Equity-Landscape, Data-Engineering
ntsako moved T305477: Milestone: Dashboard Interaction Map Complete from Backlog to Up Next on the Equity-Landscape board.
May 27 2022, 2:44 PM · Equity-Landscape, Data-Engineering
ntsako moved T305477: Milestone: Dashboard Interaction Map Complete from Up Next to Backlog on the Equity-Landscape board.
May 27 2022, 2:44 PM · Equity-Landscape, Data-Engineering
ntsako moved T306614: Transformations Flowchart from Up Next to Backlog on the Equity-Landscape board.
May 27 2022, 2:44 PM · Equity-Landscape, Data-Engineering
ntsako moved T306618: Editorship Metrics Transformation from Up Next to Backlog on the Equity-Landscape board.
May 27 2022, 2:44 PM · Equity-Landscape, Data-Engineering
ntsako moved T306624: Population Metrics Transformation from In Development to Backlog on the Equity-Landscape board.
May 27 2022, 2:44 PM · Equity-Landscape, Data-Engineering
ntsako moved T306624: Population Metrics Transformation from Backlog to In Development on the Equity-Landscape board.
May 27 2022, 2:43 PM · Equity-Landscape, Data-Engineering
ntsako moved T309275: Affiliates input metric from Backlog to Up Next on the Equity-Landscape board.
May 27 2022, 2:43 PM · Data-Engineering, Equity-Landscape
ntsako moved T309278: Overall Engagement input metric from Backlog to Up Next on the Equity-Landscape board.
May 27 2022, 2:43 PM · Equity-Landscape, Data-Engineering
ntsako moved T309273: Readership input metrics from Backlog to In Development on the Equity-Landscape board.
May 27 2022, 2:43 PM · Data-Engineering, Equity-Landscape
ntsako moved T309274: Editorship Input Metrics from Backlog to In Development on the Equity-Landscape board.
May 27 2022, 2:43 PM · Equity-Landscape, Data-Engineering
ntsako moved T309276: Grants input metric from Backlog to In Development on the Equity-Landscape board.
May 27 2022, 2:43 PM · Equity-Landscape, Data-Engineering
ntsako moved T309277: Programs input metric from Backlog to In Development on the Equity-Landscape board.
May 27 2022, 2:43 PM · Equity-Landscape, Data-Engineering
ntsako moved T309279: Population input metrics from Backlog to In Development on the Equity-Landscape board.
May 27 2022, 2:43 PM · Equity-Landscape, Data-Engineering
ntsako moved T309282: World Bank Data from Backlog to In Development on the Equity-Landscape board.
May 27 2022, 2:43 PM · Equity-Landscape, Data-Engineering
ntsako moved T309283: Wiki DB Map from Backlog to In Development on the Equity-Landscape board.
May 27 2022, 2:43 PM · Equity-Landscape, Data-Engineering
ntsako changed the status of T309279: Population input metrics from Open to In Progress.
May 27 2022, 2:31 PM · Equity-Landscape, Data-Engineering
ntsako changed the status of T309279: Population input metrics, a subtask of T306625: Extract + Transformation Raw Data into Input Metrics, from Open to In Progress.
May 27 2022, 2:31 PM · Equity-Landscape, Data-Engineering
ntsako added a comment to T309277: Programs input metric.

Data loaded onto:

May 27 2022, 2:30 PM · Equity-Landscape, Data-Engineering

May 26 2022

ntsako added a comment to T309277: Programs input metric.

CSV loaded onto:

SELECT *
   FROM programs_data
May 26 2022, 3:21 PM · Equity-Landscape, Data-Engineering
ntsako changed the status of T309277: Programs input metric, a subtask of T306625: Extract + Transformation Raw Data into Input Metrics, from Open to In Progress.
May 26 2022, 11:15 AM · Equity-Landscape, Data-Engineering
ntsako changed the status of T309277: Programs input metric from Open to In Progress.
May 26 2022, 11:15 AM · Equity-Landscape, Data-Engineering
ntsako added a comment to T309276: Grants input metric.

Loaded input metrics on:

May 26 2022, 10:38 AM · Equity-Landscape, Data-Engineering
ntsako changed the status of T309283: Wiki DB Map, a subtask of T306625: Extract + Transformation Raw Data into Input Metrics, from Open to In Progress.
May 26 2022, 9:29 AM · Equity-Landscape, Data-Engineering
ntsako changed the status of T309283: Wiki DB Map from Open to In Progress.
May 26 2022, 9:29 AM · Equity-Landscape, Data-Engineering
ntsako changed the status of T309282: World Bank Data from Open to In Progress.
May 26 2022, 9:29 AM · Equity-Landscape, Data-Engineering
ntsako changed the status of T309282: World Bank Data, a subtask of T306625: Extract + Transformation Raw Data into Input Metrics, from Open to In Progress.
May 26 2022, 9:29 AM · Equity-Landscape, Data-Engineering
ntsako added a comment to T309282: World Bank Data.

Data loaded on

SELECT *
   FROM ntsako.world_bank_data
May 26 2022, 9:29 AM · Equity-Landscape, Data-Engineering
ntsako added a comment to T309283: Wiki DB Map.

Data loaded on

SELECT *
  FROM ntsako.wiki_db_map
May 26 2022, 9:28 AM · Equity-Landscape, Data-Engineering
ntsako created T309283: Wiki DB Map.
May 26 2022, 9:27 AM · Equity-Landscape, Data-Engineering
ntsako created T309282: World Bank Data.
May 26 2022, 9:26 AM · Equity-Landscape, Data-Engineering
ntsako changed the status of T309276: Grants input metric from Open to In Progress.
May 26 2022, 9:22 AM · Equity-Landscape, Data-Engineering
ntsako changed the status of T309276: Grants input metric, a subtask of T306625: Extract + Transformation Raw Data into Input Metrics, from Open to In Progress.
May 26 2022, 9:22 AM · Equity-Landscape, Data-Engineering
ntsako added a comment to T309276: Grants input metric.

Loaded updated grants CSV on:

May 26 2022, 9:22 AM · Equity-Landscape, Data-Engineering
ntsako changed the status of T309274: Editorship Input Metrics from Open to In Progress.
May 26 2022, 9:05 AM · Equity-Landscape, Data-Engineering
ntsako changed the status of T309274: Editorship Input Metrics, a subtask of T306625: Extract + Transformation Raw Data into Input Metrics, from Open to In Progress.
May 26 2022, 9:05 AM · Equity-Landscape, Data-Engineering
ntsako added a comment to T309274: Editorship Input Metrics.

Editorship input metrics created and stored in:

May 26 2022, 9:05 AM · Equity-Landscape, Data-Engineering
ntsako changed the status of T309273: Readership input metrics from Open to In Progress.
May 26 2022, 9:04 AM · Data-Engineering, Equity-Landscape
ntsako changed the status of T309273: Readership input metrics, a subtask of T306625: Extract + Transformation Raw Data into Input Metrics, from Open to In Progress.
May 26 2022, 9:04 AM · Equity-Landscape, Data-Engineering
ntsako added a comment to T309273: Readership input metrics.
SELECT * 
  FROM ntsako.georeadership_metrics 
 WHERE year=2021
May 26 2022, 9:04 AM · Data-Engineering, Equity-Landscape
ntsako added a comment to T309273: Readership input metrics.

Readership metric created for sample on ntsako.georeadership_metrics

May 26 2022, 9:04 AM · Data-Engineering, Equity-Landscape
ntsako created T309279: Population input metrics.
May 26 2022, 9:02 AM · Equity-Landscape, Data-Engineering
ntsako created T309278: Overall Engagement input metric.
May 26 2022, 9:01 AM · Equity-Landscape, Data-Engineering
ntsako updated the task description for T309277: Programs input metric.
May 26 2022, 9:00 AM · Equity-Landscape, Data-Engineering
ntsako created T309277: Programs input metric.
May 26 2022, 8:59 AM · Equity-Landscape, Data-Engineering
ntsako created T309276: Grants input metric.
May 26 2022, 8:58 AM · Equity-Landscape, Data-Engineering
ntsako created T309275: Affiliates input metric.
May 26 2022, 8:57 AM · Data-Engineering, Equity-Landscape
ntsako created T309274: Editorship Input Metrics.
May 26 2022, 8:57 AM · Equity-Landscape, Data-Engineering
ntsako created T309273: Readership input metrics.
May 26 2022, 8:56 AM · Data-Engineering, Equity-Landscape

May 24 2022

ntsako added a comment to T306620: Grants Metrics Transformation.

Raw grants data loaded under

ntsako.grants
May 24 2022, 5:22 PM · Equity-Landscape, Data-Engineering

May 17 2022

ntsako closed T308453: New HDFS user "gdi" for Equity Landscape as Resolved.
May 17 2022, 1:09 PM · Data-Engineering, Equity-Landscape
ntsako added a comment to T308453: New HDFS user "gdi" for Equity Landscape.

Created the database on Hive.

May 17 2022, 1:09 PM · Data-Engineering, Equity-Landscape

May 16 2022

ntsako updated the task description for T308453: New HDFS user "gdi" for Equity Landscape.
May 16 2022, 5:35 PM · Data-Engineering, Equity-Landscape
ntsako renamed T308453: New HDFS user "gdi" for Equity Landscape from New databases user "gdi" for Equity Landscape to New HDFS user "gdi" for Equity Landscape.
May 16 2022, 4:01 PM · Data-Engineering, Equity-Landscape
ntsako created T308453: New HDFS user "gdi" for Equity Landscape.
May 16 2022, 3:52 PM · Data-Engineering, Equity-Landscape

Apr 6 2022

ntsako claimed T305475: Milestone: Ingest and Transform Input Data.
Apr 6 2022, 2:48 PM · Equity-Landscape, Data-Engineering
ntsako claimed T305474: Milestone: Transformation Definitions Complete:.
Apr 6 2022, 2:48 PM · Equity-Landscape, Data-Engineering
ntsako claimed T305473: Milestone: Input Data Models Complete..
Apr 6 2022, 2:47 PM · Equity-Landscape, Data-Engineering

Mar 29 2022

ntsako closed T304539: Hosting of GDI use case specific source-code as Resolved.
Mar 29 2022, 4:03 PM · Data Pipelines, Data-Engineering-Kanban, Data-Engineering

Mar 28 2022

ntsako added a comment to T304539: Hosting of GDI use case specific source-code.

https://gitlab.wikimedia.org/repos/data-engineering/gdi-jobs created

Mar 28 2022, 3:04 PM · Data Pipelines, Data-Engineering-Kanban, Data-Engineering
ntsako moved T304539: Hosting of GDI use case specific source-code from Next Up to Done on the Data-Engineering-Kanban board.
Mar 28 2022, 3:03 PM · Data Pipelines, Data-Engineering-Kanban, Data-Engineering
ntsako claimed T304539: Hosting of GDI use case specific source-code.
Mar 28 2022, 3:03 PM · Data Pipelines, Data-Engineering-Kanban, Data-Engineering

Mar 23 2022

ntsako added a project to T304539: Hosting of GDI use case specific source-code: Data Pipelines.
Mar 23 2022, 4:40 PM · Data Pipelines, Data-Engineering-Kanban, Data-Engineering
ntsako created T304539: Hosting of GDI use case specific source-code.
Mar 23 2022, 4:40 PM · Data Pipelines, Data-Engineering-Kanban, Data-Engineering
ntsako added a watcher for Equity-Landscape: ntsako.
Mar 23 2022, 2:24 PM

Mar 17 2022

ntsako closed T300282: Low Risk Oozie Migration: Mediawiki Geoeditors Monthly, a subtask of T299074: Migrate Oozie jobs to Airflow, as Resolved.
Mar 17 2022, 4:07 PM · Epic, Patch-For-Review, Data-Engineering, Data Pipelines
ntsako closed T300282: Low Risk Oozie Migration: Mediawiki Geoeditors Monthly as Resolved.
Mar 17 2022, 4:07 PM · Epic, Data-Engineering-Kanban, Data-Engineering, Data Pipelines
ntsako closed T303405: [Airflow] Create success_file operator as Resolved.
Mar 17 2022, 4:07 PM · Data-Engineering-Kanban, Data Pipelines, Data-Engineering

Mar 15 2022

ntsako moved T303405: [Airflow] Create success_file operator from Ready to Deploy to Done on the Data-Engineering-Kanban board.
Mar 15 2022, 3:03 PM · Data-Engineering-Kanban, Data Pipelines, Data-Engineering
ntsako moved T303405: [Airflow] Create success_file operator from In Code Review to Ready to Deploy on the Data-Engineering-Kanban board.
Mar 15 2022, 3:03 PM · Data-Engineering-Kanban, Data Pipelines, Data-Engineering