
tchin (Thomas)
Software Engineer

User Details

User Since
Jun 21 2021, 2:34 PM (172 w, 18 h)
Availability
Available
LDAP User
TChin
MediaWiki User
TChin (WMF)

Recent Activity

Today

tchin updated the task description for T366612: Publish Data Engineering maintained NodeJS packages to GitLab and use them in depender code.
Tue, Oct 8, 5:05 AM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th)
tchin added a comment to T366612: Publish Data Engineering maintained NodeJS packages to GitLab and use them in depender code.

Oh jeez there's also @wikimedia/url-get that I just found

Tue, Oct 8, 4:26 AM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th)
tchin added a comment to T366612: Publish Data Engineering maintained NodeJS packages to GitLab and use them in depender code.

node-rdkafka-prometheus is now on gitlab. KafkaSSE looks like it's going to be a bit more difficult since the tests require a full kafka setup and I can't seem to even run it locally... although I guess I can skip getting CI to work for it for now

Tue, Oct 8, 4:22 AM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th)
tchin updated the task description for T366612: Publish Data Engineering maintained NodeJS packages to GitLab and use them in depender code.
Tue, Oct 8, 4:21 AM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th)
tchin added a comment to T366612: Publish Data Engineering maintained NodeJS packages to GitLab and use them in depender code.

Also added node-rdkafka-prometheus to the list

Tue, Oct 8, 3:35 AM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th)
tchin updated the task description for T366612: Publish Data Engineering maintained NodeJS packages to GitLab and use them in depender code.
Tue, Oct 8, 3:32 AM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th)

Yesterday

tchin added a comment to T366612: Publish Data Engineering maintained NodeJS packages to GitLab and use them in depender code.

Seems like we missed KafkaSSE during the GitLab migration

Mon, Oct 7, 12:13 AM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th)
tchin updated the task description for T366612: Publish Data Engineering maintained NodeJS packages to GitLab and use them in depender code.
Mon, Oct 7, 12:12 AM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th)
tchin closed T366537: Create gitlab ci npm publish pipeline and job in workflow_utils gitlab_ci_templates as Resolved.
Mon, Oct 7, 12:09 AM · Data-Engineering (Q1 2024 July 1st - September 30th)
tchin closed T366537: Create gitlab ci npm publish pipeline and job in workflow_utils gitlab_ci_templates, a subtask of T366611: Migrate Data Engineering NodeJS library repos to GitLab, as Resolved.
Mon, Oct 7, 12:09 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review, Event-Platform

Sat, Oct 5

tchin updated the task description for T366612: Publish Data Engineering maintained NodeJS packages to GitLab and use them in depender code.
Sat, Oct 5, 11:50 PM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th)

Wed, Oct 2

tchin added a comment to T374118: Datahub - ingest Hive discovery database.

I did a manual ingestion and was able to see the tables on datahub if I access it directly through a url
https://datahub.wikimedia.org/dataset/urn:li:dataset:(urn:li:dataPlatform:hive,discovery.cirrus_index,PROD)/Schema?is_lineage_mode=false&schemaFilter=

Wed, Oct 2, 12:27 AM · Discovery-Search (Current work), Data-Engineering

Tue, Oct 1

tchin added a comment to T374118: Datahub - ingest Hive discovery database.

We might need to simply ingest all the tables

I can probably take a look at why the table match isn’t working. The next thing we could try is providing a custom transform function

Tue, Oct 1, 3:12 AM · Discovery-Search (Current work), Data-Engineering

Mon, Sep 16

tchin moved T342911: Data Quality Issue: Wikitext History Job fail / rerun in Airflow from Blocked/Paused to In progress on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Mon, Sep 16, 2:19 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Movement-Metrics, Movement-Insights

Thu, Sep 12

tchin added a comment to T306896: Spike: Integrate Spark with DataHub with lineage.

I don't know how/where Spark's appName is autogenerated, but for DAGs to use Spark lineage we should require them to also define a static appName, or else there will be a new pipeline + task(s) for every DAG run
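One way to enforce the static-appName requirement is a small check over a DAG's Spark configuration. The sketch below is only an illustration: `require_static_app_name` is a hypothetical helper, and the assumption that per-run names come from Jinja-style templating is mine, not something established in the comment above. `spark.app.name` is the standard Spark property for the application name.

```python
import re

# Matches Jinja-style templating such as "{{ ds }}" or "{{ run_id }}".
# Assumption: per-run variation would be introduced through templating.
TEMPLATE_PATTERN = re.compile(r"\{\{.*?\}\}")


def require_static_app_name(spark_conf):
    """Return the app name, or raise ValueError if it is missing or templated.

    Hypothetical helper: spark_conf is a plain dict of Spark properties.
    """
    app_name = spark_conf.get("spark.app.name")
    if not app_name:
        raise ValueError("spark.app.name must be set explicitly")
    if TEMPLATE_PATTERN.search(app_name):
        raise ValueError(f"spark.app.name must not vary per run: {app_name!r}")
    return app_name


# Example with a made-up job name.
print(require_static_app_name({"spark.app.name": "interlanguage_daily"}))
```

A check like this could run in DAG unit tests so that a missing or run-specific name fails before lineage is ever emitted.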

Thu, Sep 12, 12:21 PM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Data-Catalog, Data Pipelines

Sun, Sep 8

tchin added a comment to T361769: Migrate and re-deploy eventstreams using service-utils.
  1. Eventstreams is currently deployed in the beta cluster successfully with service-utils
  2. It turns out that KafkaSSE uses bunyan, so logging is a bit weird since now it uses 2 formats
  3. Beta logstash does not seem to capture logging from stdout, so it only shows the logs from KafkaSSE. However, the eventstreams logs do exist when looking inside the docker container
Sun, Sep 8, 6:50 PM · Data-Engineering (Q1 2024 July 1st - September 30th)

Aug 29 2024

tchin moved T366562: [Event Platform] - Add schema CI test that array ensures properties with object types also enumerate object properties from If we have time to Done on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Aug 29 2024, 8:42 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform

Aug 24 2024

tchin added a comment to T365633: Toolforge Aptfile not producing working copy of `ffmpeg`.

@tchin can you open a new task with the code/packages that you are seeing issues with?

Sure here it is: T373251

Aug 24 2024, 8:18 PM · Toolforge (Toolforge iteration 14)
tchin updated the task description for T373251: Toolforge job fails to find library installed via aptfile.
Aug 24 2024, 8:16 PM · Toolforge
tchin created T373251: Toolforge job fails to find library installed via aptfile.
Aug 24 2024, 8:15 PM · Toolforge

Aug 21 2024

brouberol awarded T372899: Ingest a test hive database into datahub a Yellow Medal token.
Aug 21 2024, 2:33 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Data-Catalog, Data Pipelines
tchin added a comment to T372899: Ingest a test hive database into datahub.

I created the database sandbox and sandbox_iceberg and also created an interlanguage_navigation table in sandbox:

Aug 21 2024, 4:43 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Data-Catalog, Data Pipelines

Aug 20 2024

tchin added a comment to T372899: Ingest a test hive database into datahub.

What should the databases be called and where should they live? Should we have separate databases for hive and iceberg tables?
/wmf/data/wmf_test and /wmf/data/wmf_test_iceberg?

Aug 20 2024, 5:31 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Data-Catalog, Data Pipelines
tchin added a comment to T306896: Spike: Integrate Spark with DataHub with lineage.

Yeah I think we should prioritize that.

Aug 20 2024, 2:43 PM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Data-Catalog, Data Pipelines
tchin added a comment to T306896: Spike: Integrate Spark with DataHub with lineage.

It seems like right now, unless we upgrade to at least Spark 3.4 and Iceberg 1.4, we will not be able to use Datahub's spark lineage connector on iceberg tables

Aug 20 2024, 3:56 AM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Data-Catalog, Data Pipelines
tchin added a comment to T306896: Spike: Integrate Spark with DataHub with lineage.

I ran a job using our regular prod configs just without iceberg tables. It ran successfully and outputted this:

Aug 20 2024, 3:52 AM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Data-Catalog, Data Pipelines

Aug 19 2024

tchin added a comment to T306896: Spike: Integrate Spark with DataHub with lineage.

Update: Tried using spark 3.3.2 with this:

Aug 19 2024, 10:33 AM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Data-Catalog, Data Pipelines

Aug 15 2024

tchin added a comment to T306896: Spike: Integrate Spark with DataHub with lineage.

I'm using the newer acryl-spark-lineage which works for datahub 0.13.3 https://datahubproject.io/docs/metadata-integration/java/acryl-spark-lineage

Aug 15 2024, 1:20 PM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Data-Catalog, Data Pipelines
tchin added a comment to T306896: Spike: Integrate Spark with DataHub with lineage.

I can see that Iceberg for Spark 3.1 does not in fact have an icebergCatalog method but for > Spark 3.3 it does. Going to see if I can use the Spark 3.3 configs from the airflow dags repo

Aug 15 2024, 1:01 PM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Data-Catalog, Data Pipelines
tchin added a comment to T306896: Spike: Integrate Spark with DataHub with lineage.

I ran a simple spark sql job on a statbox with:

sudo -u analytics-privatedata spark3-sql --jars ./acryl-spark-lineage-0.2.16.jar --conf "spark.extraListeners=datahub.spark.DatahubSparkListener" --master local[12] --driver-memory 8G --conf "spark.datahub.emitter=file" --conf "spark.datahub.file.filename=./il_lineage" -f il_test.hql
Aug 15 2024, 12:33 PM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Data-Catalog, Data Pipelines

Aug 14 2024

tchin added a comment to T365633: Toolforge Aptfile not producing working copy of `ffmpeg`.

Hmm I guess my problem is different and just lies in the non-standard way packages are installed.

heroku@38b1190a0d63:/workspace$ find /layers/fagiani_apt/apt | grep libGLESv2
/layers/fagiani_apt/apt/usr/lib/x86_64-linux-gnu/libGLESv2.so.2.1.0
/layers/fagiani_apt/apt/usr/lib/x86_64-linux-gnu/libGLESv2.so.2
heroku@38b1190a0d63:/workspace$ npm run scrape
browserType.launch: 
╔══════════════════════════════════════════════════════╗
║ Host system is missing dependencies to run browsers. ║
║ Missing libraries:                                   ║
║     libGLESv2.so.2                                   ║
╚══════════════════════════════════════════════════════╝
Aug 14 2024, 6:50 PM · Toolforge (Toolforge iteration 14)

Aug 12 2024

tchin updated the task description for T306896: Spike: Integrate Spark with DataHub with lineage.
Aug 12 2024, 6:45 PM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Data-Catalog, Data Pipelines

Aug 1 2024

tchin moved T306896: Spike: Integrate Spark with DataHub with lineage from Next Up to In progress on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Aug 1 2024, 4:18 PM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Data-Catalog, Data Pipelines
tchin moved T367403: Validate CI integration so that Ci can release Maven artifacts on user's demand from In progress to Blocked/Paused on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Aug 1 2024, 4:14 PM · Discovery-Search (Current work), Release-Engineering-Team (Radar), Data-Engineering (Q1 2024 July 1st - September 30th), Java-Scala-Standardization, Data-Platform-SRE
tchin moved T360922: [Status Store] [SPIKE] Investigate and document approach for Iceberg Sensors from In Review to Blocked/Paused on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Aug 1 2024, 4:12 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Spike
tchin added a comment to T365633: Toolforge Aptfile not producing working copy of `ffmpeg`.

I think I'm experiencing a similar error. Suddenly started getting this on a scraping job I have:

2024-08-01T02:42:58+00:00 [test-scrape-pd5kq] ║ Host system is missing dependencies to run browsers. ║
2024-08-01T02:42:58+00:00 [test-scrape-pd5kq] ║ Missing libraries:                                   ║
2024-08-01T02:42:58+00:00 [test-scrape-pd5kq] ║     libOSSlib.so                                     ║
2024-08-01T02:42:58+00:00 [test-scrape-pd5kq] ╚══════════════════════════════════════════════════════╝
Aug 1 2024, 5:38 AM · Toolforge (Toolforge iteration 14)

Jul 29 2024

tchin edited projects for T371120: service-utils helper for trace header propagation, added: Data-Engineering (Q1 2024 July 1st - September 30th); removed Data-Engineering.
Jul 29 2024, 3:06 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Observability-Tracing

Jul 26 2024

Ottomata awarded T371120: service-utils helper for trace header propagation a Like token.
Jul 26 2024, 6:25 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Observability-Tracing

Jul 25 2024

tchin added a parent task for T362774: Application Security Review Request : service-runner replacement: @tchin/service-utils: T360924: Replace service runner with a simplified library to better support metrics and debugging: service-utils.
Jul 25 2024, 4:36 PM · SecTeam-Processed, secscrum, Security, Application Security Reviews
tchin added a subtask for T360924: Replace service runner with a simplified library to better support metrics and debugging: service-utils: T362774: Application Security Review Request : service-runner replacement: @tchin/service-utils.
Jul 25 2024, 4:36 PM · Data-Engineering (Q1 2024 July 1st - September 30th)
tchin claimed T366562: [Event Platform] - Add schema CI test that array ensures properties with object types also enumerate object properties.
Jul 25 2024, 1:29 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform
tchin claimed T366487: Event Platform schemas should not support type changes to structs as array element or map value types.
Jul 25 2024, 1:28 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform

Jul 22 2024

tchin added a comment to T367134: [Refine Refactoring] Changes to EventStreamConfig needed for scheduling Refine via airflow.

Diffing the output of deeply merging stream defaults, all of the changes are either adding job_name to a disabled hadoop ingestion config or adding analytics_hive_ingestion defaults to streams that have enabled hadoop ingestion
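A rough sketch of this kind of check: deep-merge defaults into a stream config, then list which paths the merge added or changed. The function names, the `consumers` key layout, and the sample values are all hypothetical and are not taken from EventStreamConfig itself.

```python
from copy import deepcopy


def deep_merge(defaults, overrides):
    """Return a new dict with overrides recursively merged on top of defaults."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def diff_paths(before, after, prefix=""):
    """List dotted paths whose values were added or changed in `after`."""
    changed = []
    for key, value in after.items():
        path = f"{prefix}{key}"
        if key not in before:
            changed.append(path)
        elif isinstance(value, dict) and isinstance(before[key], dict):
            changed.extend(diff_paths(before[key], value, prefix=path + "."))
        elif before[key] != value:
            changed.append(path)
    return changed


# Hypothetical defaults and a hypothetical stream setting.
stream_defaults = {
    "consumers": {
        "analytics_hive_ingestion": {"enabled": False, "job_name": "example_job"}
    }
}
stream = {"consumers": {"analytics_hive_ingestion": {"enabled": True}}}

merged = deep_merge(stream_defaults, stream)
added = diff_paths(stream, merged)
print(added)  # ['consumers.analytics_hive_ingestion.job_name']
```

The stream's own `enabled` value wins over the default, so only the inherited `job_name` shows up in the diff.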

Jul 22 2024, 5:13 PM · MW-1.43-notes (1.43.0-wmf.15; 2024-07-23), Data-Engineering (Q1 2024 July 1st - September 30th)

Jul 18 2024

tchin renamed T361769: Migrate and re-deploy eventstreams using service-utils from Migrate and re-deploy eventstreams using new service runner to Migrate and re-deploy eventstreams using service-utils.
Jul 18 2024, 4:39 PM · Data-Engineering (Q1 2024 July 1st - September 30th)

Jul 6 2024

tchin updated subscribers of T362774: Application Security Review Request : service-runner replacement: @tchin/service-utils.
Jul 6 2024, 3:17 AM · SecTeam-Processed, secscrum, Security, Application Security Reviews

Jun 14 2024

tchin added a comment to T309229: Enforce Cassandra client encryption (AQS cluster).

I don't think so. The image suggestion work on Flink never progressed past the original ticket.

Jun 14 2024, 11:20 PM · Data-Engineering-Radar, Cassandra

Jun 11 2024

tchin added a comment to T365512: Archive the service-scaffold-node and service-scaffold-golang libraries.

Getting rid of service-scaffold-node also means we should get rid of servicelib-node, since it was created for service-scaffold-node. This is why service-scaffold-node depends on packages that don't exist: the project was never finished.

Jun 11 2024, 4:46 PM · Language-Team, service-template-node, MW-Interfaces-Team, Wikimedia-GitHub, Diffusion-Repository-Administrators, Projects-Cleanup

Jun 3 2024

tchin added a comment to T344730: Migrate Data Engineering Pipelinelib repos to GitLab.

Added jsonschema-tools to list, as it is similar to node-rdkafka-factory and eventgate.

Jun 3 2024, 2:59 PM · Patch-For-Review, Data-Engineering (Q4 2024 April 1st - June 30th), GitLab (Pipeline Services Migration🐤), Event-Platform

May 29 2024

tchin added a comment to T344730: Migrate Data Engineering Pipelinelib repos to GitLab.

Don't forget that any CI that has a production deployment pipeline needs the repo to be added to trusted runners and also have their tags protected (Slack thread on protecting tags)

May 29 2024, 4:17 PM · Patch-For-Review, Data-Engineering (Q4 2024 April 1st - June 30th), GitLab (Pipeline Services Migration🐤), Event-Platform
tchin added a comment to T344730: Migrate Data Engineering Pipelinelib repos to GitLab.

Yes, that's actually what I do for service-utils

Very cool!

@tchin should we make that a reusable gitlab_ci template job in workflow_utils?

May 29 2024, 4:14 PM · Patch-For-Review, Data-Engineering (Q4 2024 April 1st - June 30th), GitLab (Pipeline Services Migration🐤), Event-Platform
tchin updated subscribers of T365829: Fix DPE alerts dashboard to work with Google Groups.

This might be harder than I thought. Creating a dummy google account to act as the receiver seems off the table. All of Google's APIs require OAuth or some manual way for the user to sign in. There is no way to make a pure bot account, and also no good way to automate login without being slapped by a ban.

May 29 2024, 3:27 AM · Data-Engineering

May 28 2024

tchin added a comment to T344730: Migrate Data Engineering Pipelinelib repos to GitLab.

Can we publish and depend on npm packages from gitlab, like we do for python wheels?

May 28 2024, 2:59 PM · Patch-For-Review, Data-Engineering (Q4 2024 April 1st - June 30th), GitLab (Pipeline Services Migration🐤), Event-Platform

May 21 2024

tchin added a comment to T365512: Archive the service-scaffold-node and service-scaffold-golang libraries.

Sounds good to me. service-scaffold-node was started to turn service-template-node into a group of libraries and is basically superseded by my effort to replace service-runner (T360924) which is mostly completed

May 21 2024, 8:16 PM · Language-Team, service-template-node, MW-Interfaces-Team, Wikimedia-GitHub, Diffusion-Repository-Administrators, Projects-Cleanup

May 15 2024

tchin added a comment to T344730: Migrate Data Engineering Pipelinelib repos to GitLab.

Would be nice to get a confirmation for archiving node-rdkafka-statsd since it'll progress T349118

May 15 2024, 1:54 PM · Patch-For-Review, Data-Engineering (Q4 2024 April 1st - June 30th), GitLab (Pipeline Services Migration🐤), Event-Platform

May 13 2024

tchin added a comment to T347498: eventgate: eventstreams: services should use common logging schema.

@tchin, if we are able to get off of old service-runner, would your new framework take care of this?

May 13 2024, 7:08 PM · Data-Engineering, EventStreams, Event-Platform

Apr 17 2024

tchin added a comment to T362774: Application Security Review Request : service-runner replacement: @tchin/service-utils.

Has this project been discussed across the WMF/Community?

It would be great if there were an RFC process, but there have at least been discussions about what to do with service-runner, and this project is on the radar of all of Data Platform Engineering and some people on the MW engineering team and the language team. It was also posted on Slack in #engineering-all to give people a heads-up, just in case another team was working on something similar. If there's one thing I'm sure about, it's that the consensus is that we need a replacement, whether or not this is it.

Apr 17 2024, 3:33 PM · SecTeam-Processed, secscrum, Security, Application Security Reviews
tchin updated the task description for T362774: Application Security Review Request : service-runner replacement: @tchin/service-utils.
Apr 17 2024, 1:12 PM · SecTeam-Processed, secscrum, Security, Application Security Reviews
tchin renamed T362774: Application Security Review Request : service-runner replacement: @tchin/service-utils from Application Security Review Request : service-runner replacement to Application Security Review Request : service-runner replacement: @tchin/service-utils.
Apr 17 2024, 1:10 PM · SecTeam-Processed, secscrum, Security, Application Security Reviews
tchin created T362774: Application Security Review Request : service-runner replacement: @tchin/service-utils.
Apr 17 2024, 1:09 PM · SecTeam-Processed, secscrum, Security, Application Security Reviews

Apr 5 2024

tchin added a comment to T357468: [Dataset Config Store] Setup initial CI checks.

Config store repo does CI checks for jsonschema correctness and config values against its jsonschema. The Datasets Config service repo has dockerized CI using Kokkuri and Blubber.

Apr 5 2024, 4:21 PM · Data-Engineering

Mar 26 2024

tchin claimed T357434: [Dataset Config Store] Deploy poc to dse-k8s.
Mar 26 2024, 10:13 AM · Data-Engineering (Q4 2024 April 1st - June 30th)

Feb 27 2024

tchin added a comment to T350180: Upgrade prom-client in NodeJS service-runner and enable collectDefaultMetrics.

If it's to a point where we even need to use a new name, might as well break everything. I'd love to join in on the fun

Feb 27 2024, 2:09 PM · Data-Engineering, observability, ChangeProp, Event-Platform, service-runner

Feb 11 2024

tchin moved T357005: eventstreams regularly uses more than 95% of its memory limit from Next Up to Radar (External Teams) on the Data-Engineering (Sprint 8) board.
Feb 11 2024, 3:04 AM · Data-Engineering, Event-Platform, EventStreams, serviceops, Prod-Kubernetes, Kubernetes
tchin edited projects for T357005: eventstreams regularly uses more than 95% of its memory limit, added: Data-Engineering (Sprint 8); removed Data-Engineering.
Feb 11 2024, 3:03 AM · Data-Engineering, Event-Platform, EventStreams, serviceops, Prod-Kubernetes, Kubernetes
tchin added a comment to T357005: eventstreams regularly uses more than 95% of its memory limit.

Looking at the logs, this seems to coincide with the redaction patch to eventstreams, but looking at the code I'm having a hard time finding where a memory leak could've happened... It's even more confusing that only 1 or 2 pods are hitting the limit

Feb 11 2024, 3:01 AM · Data-Engineering, Event-Platform, EventStreams, serviceops, Prod-Kubernetes, Kubernetes

Jan 30 2024

tchin moved T352669: [Iceberg Migration] Migrate aqs hourly tables to Iceberg from Blocked/Paused to Ready to Deploy on the Data-Engineering (Sprint 8) board.
Jan 30 2024, 2:17 PM · Data-Engineering (Sprint 8)

Jan 22 2024

tchin added a comment to T352671: [Iceberg Migration] Migrate interlanguage tables to Iceberg.

Using lz4 compression works, but checking it with parquet-tools doesn't. I see something like compression: UNKNOWN (space_saved: -25%). Seems like a known issue.

Jan 22 2024, 1:51 PM · Data-Engineering (Sprint 7), Patch-For-Review

Jan 5 2024

tchin added a comment to T352671: [Iceberg Migration] Migrate interlanguage tables to Iceberg.

INSERT OVERWRITE with PARTITION also doesn't work anymore because Iceberg uses hidden partitioning, so I had to enable Spark's dynamic overwrite
https://iceberg.apache.org/docs/latest/spark-writes/#insert-overwrite

Jan 5 2024, 6:32 PM · Data-Engineering (Sprint 7), Patch-For-Review
tchin added a comment to T352671: [Iceberg Migration] Migrate interlanguage tables to Iceberg.

TIL when setting the compression codec to snappy, Iceberg doesn't end the files in hdfs with .snappy.parquet. I had to check if the format was correct using parquet-tools.

Jan 5 2024, 6:23 PM · Data-Engineering (Sprint 7), Patch-For-Review
tchin moved T352671: [Iceberg Migration] Migrate interlanguage tables to Iceberg from Next Up to In progress on the Data-Engineering (Sprint 6) board.
Jan 5 2024, 5:58 PM · Data-Engineering (Sprint 7), Patch-For-Review
tchin claimed T352671: [Iceberg Migration] Migrate interlanguage tables to Iceberg.
Jan 5 2024, 5:58 PM · Data-Engineering (Sprint 7), Patch-For-Review

Dec 19 2023

tchin added a comment to T352669: [Iceberg Migration] Migrate aqs hourly tables to Iceberg.

Tested to see if the COALESCE hints still work in Iceberg by creating 2 tables and filling them with/without the hint. It still seems to work.

Dec 19 2023, 7:30 AM · Data-Engineering (Sprint 8)

Dec 18 2023

tchin awarded T336739: Post Oozie -> Airflow migration refactorings a Barnstar token.
Dec 18 2023, 3:12 PM · Patch-For-Review, Data-Engineering, Epic, Data Pipelines
tchin moved T352669: [Iceberg Migration] Migrate aqs hourly tables to Iceberg from Next Up to In progress on the Data-Engineering (Sprint 6) board.
Dec 18 2023, 2:53 PM · Data-Engineering (Sprint 8)

Dec 16 2023

tchin added a comment to T352669: [Iceberg Migration] Migrate aqs hourly tables to Iceberg.

Tested on a stat machine with

CREATE EXTERNAL TABLE IF NOT EXISTS `aqs_hourly`(  
    `cache_status`      string     COMMENT 'Cache status',  
    `http_status`       string     COMMENT 'HTTP status of response',  
    `http_method`       string     COMMENT 'HTTP method of request',  
    `response_size`     bigint     COMMENT 'Response size',  
    `uri_host`          string     COMMENT 'Host of request',  
    `uri_path`          string     COMMENT 'Path of request',  
    `request_count`     bigint     COMMENT 'Number of requests',  
    `hour`              timestamp  COMMENT 'The aggregated hour. Covers from minute 00 to 59'  
)  
USING ICEBERG
PARTITIONED BY (days(hour))
;

And

spark3-sql --master yarn --executor-memory 8G --executor-cores 4 --driver-memory 2G --conf spark.dynamicAllocation.maxExecutors=64 \
-f aqs_hourly_iceberg.hql  \
-d source_table=wmf.webrequest \
-d webrequest_source=text \
-d destination_table=tchin.aqs_hourly \
-d coalesce_partitions=1 \
-d year=2023 \
-d month=12 \
-d day=3 \
-d hour=0
Dec 16 2023, 4:23 AM · Data-Engineering (Sprint 8)

Dec 14 2023

tchin changed the status of T352669: [Iceberg Migration] Migrate aqs hourly tables to Iceberg, a subtask of T333013: [Iceberg Migration] Apache Iceberg Migration, from Open to In Progress.
Dec 14 2023, 6:16 AM · Data-Engineering, Epic
tchin changed the status of T352669: [Iceberg Migration] Migrate aqs hourly tables to Iceberg from Open to In Progress.
Dec 14 2023, 6:16 AM · Data-Engineering (Sprint 8)

Dec 11 2023

tchin awarded T311866: Migrate Database::select usages to SelectQueryBuilder (in WMF-deployed extensions) a Barnstar token.
Dec 11 2023, 3:10 PM · MW-1.41-notes (1.41.0-wmf.25; 2023-09-05), MW-1.40-notes (1.40.0-wmf.26; 2023-03-06), MW-1.39-notes (1.39.0-wmf.26; 2022-08-22), Data-Persistence (work done), Platform Engineering
tchin claimed T352669: [Iceberg Migration] Migrate aqs hourly tables to Iceberg.
Dec 11 2023, 2:45 PM · Data-Engineering (Sprint 8)

Dec 2 2023

tchin awarded T347347: Make "Quick" MW install a thing a Love token.
Dec 2 2023, 11:21 PM · MW-1.42-notes (1.42.0-wmf.12; 2024-01-02), User-zeljkofilipin, MediaWiki-Platform-Team, MediaWiki-Documentation

Nov 14 2023

tchin added a comment to T351092: [harbor,docs] Improve Harbor quota handling and docs.

I think the per-image quota should probably be increased. I tested building a few projects locally and a project with NodeJS and 0 dependencies results in a built image that's 805.58 MB. One with only VueJS as a dependency bumps it up to 858.13 MB. I'm probably not going to be the last one who needs more than 200 MB of working space :/

Nov 14 2023, 4:40 AM · Toolforge (Toolforge iteration 15), Documentation

Nov 13 2023

tchin added a comment to T351092: [harbor,docs] Improve Harbor quota handling and docs.

Example error:

step-export: 2023-11-13T05:41:56.835942824Z ERROR: failed to export: failed to write image to the following tags: [tools-harbor.wmcloud.org/tool-dpe-alerts-dashboard/tool-dpe-alerts-dashboard:latest: PATCH https://tools-harbor.wmcloud.org/v2/tool-dpe-alerts-dashboard/tool-dpe-alerts-dashboard/blobs/uploads/b62dd944-4fad-4ee8-b900-8409f7860d6c?_state=REDACTED: unexpected status code 413 Request Entity Too Large: <html>
step-export: 2023-11-13T05:41:56.835973012Z <head><title>413 Request Entity Too Large</title></head>
step-export: 2023-11-13T05:41:56.835976984Z <body>
step-export: 2023-11-13T05:41:56.835979969Z <center><h1>413 Request Entity Too Large</h1></center>
step-export: 2023-11-13T05:41:56.835983468Z <hr><center>nginx/1.18.0</center>
step-export: 2023-11-13T05:41:56.836002364Z </body>
step-export: 2023-11-13T05:41:56.836005027Z </html>
step-export: 2023-11-13T05:41:56.836008032Z ]
step-export: 
step-results: 2023-11-13T05:41:57.433667715Z 2023/11/13 05:41:57 Skipping step because a previous step failed
Nov 13 2023, 2:57 PM · Toolforge (Toolforge iteration 15), Documentation

Oct 26 2023

tchin added a comment to T347706: [Data Quality] [SPIKE] Document Current Logging, Monitoring and Data Quality Checks for Unique Devices.

Current version of the writeup is here

Oct 26 2023, 3:53 PM · Data Engineering and Event Platform Team (Sprint 4)

Oct 11 2023

tchin added a comment to T345389: [SPIKE] Should we introduce static typing to Event Platform nodejs codebases?.

If we do introduce something, we should use JSDoc3 and follow what's happening on this ticket T138401

Oct 11 2023, 2:28 PM · Event-Platform, Data-Engineering

Oct 3 2023

tchin moved T347706: [Data Quality] [SPIKE] Document Current Logging, Monitoring and Data Quality Checks for Unique Devices from Next Up to In progress on the Data Engineering and Event Platform Team (Sprint 3) board.
Oct 3 2023, 5:39 PM · Data Engineering and Event Platform Team (Sprint 4)

Sep 29 2023

tchin added a comment to T347676: Partition reassignment on kafka-jumbo negatively impacting mw-page-content-change-enrich.

DeliveryGuarantee.AT_LEAST_ONCE: The sink will wait for all outstanding records in the Kafka buffers to be acknowledged by the Kafka producer on a checkpoint. No messages will be lost in case of any issue with the Kafka brokers but messages may be duplicated when Flink restarts because Flink reprocesses old input records.

https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/connectors/datastream/kafka/#fault-tolerance
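The duplication described in the quoted passage can be illustrated with a toy simulation. This is not Flink code: `run_with_at_least_once` is a hypothetical function that models only a periodic checkpoint, one crash, and a replay from the last completed checkpoint.

```python
def run_with_at_least_once(records, checkpoint_every, crash_after):
    """Simulate a sink that replays from the last checkpoint after one crash.

    records: input records in order.
    checkpoint_every: a checkpoint completes after this many records.
    crash_after: number of records written before the simulated failure.
    Returns the records as they end up in the sink.
    """
    sink = []
    last_checkpoint = 0  # index of the first record not covered by a checkpoint
    for index, record in enumerate(records):
        if index == crash_after:
            break  # simulated failure before this record is written
        sink.append(record)
        if (index + 1) % checkpoint_every == 0:
            last_checkpoint = index + 1
    else:
        return sink  # the loop finished without a crash
    # Restart: reprocess every record since the last completed checkpoint.
    sink.extend(records[last_checkpoint:])
    return sink


result = run_with_at_least_once(["a", "b", "c", "d", "e"], checkpoint_every=2, crash_after=3)
print(result)  # ['a', 'b', 'c', 'c', 'd', 'e']
```

No record is lost, but "c" was written after the last checkpoint and is written again on replay, which is the at-least-once trade-off the quote describes.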

Sep 29 2023, 12:11 PM · Event-Platform, Data-Engineering, Data Engineering and Event Platform Team, Data-Platform-SRE
tchin merged T347615: mw-page-content-change-enrich not checkpointing into T347676: Partition reassignment on kafka-jumbo negatively impacting mw-page-content-change-enrich.
Sep 29 2023, 11:57 AM · Event-Platform, Data-Engineering, Data Engineering and Event Platform Team, Data-Platform-SRE
tchin merged task T347615: mw-page-content-change-enrich not checkpointing into T347676: Partition reassignment on kafka-jumbo negatively impacting mw-page-content-change-enrich.
Sep 29 2023, 11:57 AM · Data Engineering and Event Platform Team (Sprint 2), Data-Engineering, Event-Platform

Sep 28 2023

tchin added a comment to T347615: mw-page-content-change-enrich not checkpointing.

Unaligned checkpoints didn't work. Maybe it's because of data being moved around to new brokers and Kafka is too overloaded.

Sep 28 2023, 6:04 PM · Data Engineering and Event Platform Team (Sprint 2), Data-Engineering, Event-Platform
tchin updated subscribers of T347615: mw-page-content-change-enrich not checkpointing.
Sep 28 2023, 6:00 PM · Data Engineering and Event Platform Team (Sprint 2), Data-Engineering, Event-Platform
tchin moved T347615: mw-page-content-change-enrich not checkpointing from Data Eng Backlog to Sprint 2 on the Data Engineering and Event Platform Team board.
Sep 28 2023, 5:59 PM · Data Engineering and Event Platform Team (Sprint 2), Data-Engineering, Event-Platform
tchin moved T347615: mw-page-content-change-enrich not checkpointing from Next Up to In progress on the Data Engineering and Event Platform Team (Sprint 2) board.
Sep 28 2023, 5:59 PM · Data Engineering and Event Platform Team (Sprint 2), Data-Engineering, Event-Platform
tchin renamed T347615: mw-page-content-change-enrich not checkpointing from mw-page-content-change-enrich not checkpoint to mw-page-content-change-enrich not checkpointing.
Sep 28 2023, 5:27 PM · Data Engineering and Event Platform Team (Sprint 2), Data-Engineering, Event-Platform
tchin created T347615: mw-page-content-change-enrich not checkpointing.
Sep 28 2023, 5:26 PM · Data Engineering and Event Platform Team (Sprint 2), Data-Engineering, Event-Platform
tchin added a comment to T347521: Troubleshoot mw-page-content-change-enrich and flink-operator.

@bking Gabriele is currently on sick leave but yes let's try incrementing the helm chart version

Sep 28 2023, 1:29 PM · Data-Platform-SRE

Sep 19 2023

tchin placed T287405: Refactor ILocalizedException to be DI-friendly. up for grabs.
Sep 19 2023, 6:59 AM · MW-1.43-notes (1.43.0-wmf.20; 2024-08-27), MediaWiki-General, MW-1.41-notes (1.41.0-wmf.30; 2023-10-10), Patch-For-Review, User-thiemowmde, WMDE-TechWish-Maintenance, Move-Files-To-Commons, MW-1.37-notes (1.37.0-wmf.23; 2021-09-13), Dependency injection, User-DannyS712, Platform Team Workboards (MW Expedition)
tchin placed T291009: LoadExtensionSchemaUpdates hook needs to have access to Config up for grabs.
Sep 19 2023, 6:58 AM · MediaWiki-Core-Hooks, Platform Team Workboards (MW Expedition)

Aug 31 2023

tchin added a comment to T344511: Enum with an entry of `null` should fail jsonschema-tools validation.

Associated GitHub PR: https://github.com/wikimedia/jsonschema-tools/pull/48
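As an illustration of the rule in the task title only, the sketch below walks a schema and reports every `enum` that contains a null entry. It is a hypothetical Python helper, not the jsonschema-tools implementation (which is a NodeJS package) and not the code in the linked PR.

```python
def find_null_enum_entries(schema, path="$"):
    """Return the paths of every `enum` keyword whose list contains null (None)."""
    found = []
    if isinstance(schema, dict):
        enum = schema.get("enum")
        if isinstance(enum, list) and None in enum:
            found.append(f"{path}.enum")
        # Recurse into every value so nested properties and items are covered.
        for key, value in schema.items():
            found.extend(find_null_enum_entries(value, f"{path}.{key}"))
    elif isinstance(schema, list):
        for index, item in enumerate(schema):
            found.extend(find_null_enum_entries(item, f"{path}[{index}]"))
    return found


# Made-up schema for demonstration.
example_schema = {
    "type": "object",
    "properties": {
        "status": {"type": "string", "enum": ["ok", None]},
        "kind": {"enum": ["a", "b"]},
    },
}
violations = find_null_enum_entries(example_schema)
print(violations)  # ['$.properties.status.enum']
```

A validation step could fail whenever the returned list is non-empty.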

Aug 31 2023, 5:21 PM · Data Engineering and Event Platform Team (Sprint 2), Event-Platform, Data-Engineering, Patch-For-Review

Aug 29 2023

tchin moved T344511: Enum with an entry of `null` should fail jsonschema-tools validation from Next Up to In progress on the Data Engineering and Event Platform Team (Sprint 1) board.
Aug 29 2023, 5:34 PM · Data Engineering and Event Platform Team (Sprint 2), Event-Platform, Data-Engineering, Patch-For-Review
tchin edited projects for T344511: Enum with an entry of `null` should fail jsonschema-tools validation, added: Data Engineering and Event Platform Team (Sprint 1); removed Data Engineering and Event Platform Team.
Aug 29 2023, 5:34 PM · Data Engineering and Event Platform Team (Sprint 2), Event-Platform, Data-Engineering, Patch-For-Review