
tchin (Thomas)
Software Engineer

User Details

User Since: Jun 21 2021, 2:34 PM (101 w, 1 d)
Availability: Available
LDAP User: TChin
MediaWiki User: TChin (WMF)

Recent Activity

Thu, May 25

tchin moved T337395: Remove user is_registered field from mediawiki/page/change schema from Next Up to In progress on the Event-Platform Value Stream (Sprint 14 A) board.
Thu, May 25, 1:11 PM · Event-Platform Value Stream (Sprint 14 A), Patch-For-Review, Data-Engineering
tchin edited projects for T337395: Remove user is_registered field from mediawiki/page/change schema, added: Event-Platform Value Stream (Sprint 14 A); removed Event-Platform Value Stream.
Thu, May 25, 1:09 PM · Event-Platform Value Stream (Sprint 14 A), Patch-For-Review, Data-Engineering

Wed, May 24

tchin added a subtask for T328013: Improve mediawiki-event-enrichment test suite: T337400: Get coverage artifacts from Kokkuri.
Wed, May 24, 2:30 PM · Patch-For-Review, Event-Platform Value Stream (Sprint 14 A), Data-Engineering-Planning
tchin added a parent task for T337400: Get coverage artifacts from Kokkuri: T328013: Improve mediawiki-event-enrichment test suite.
Wed, May 24, 2:30 PM · Data-Engineering, Event-Platform Value Stream
tchin created T337400: Get coverage artifacts from Kokkuri.
Wed, May 24, 2:29 PM · Data-Engineering, Event-Platform Value Stream
tchin claimed T337395: Remove user is_registered field from mediawiki/page/change schema.
Wed, May 24, 1:39 PM · Event-Platform Value Stream (Sprint 14 A), Patch-For-Review, Data-Engineering

Mon, May 8

tchin moved T328013: Improve mediawiki-event-enrichment test suite from Next Up to In progress on the Event-Platform Value Stream (Sprint 12) board.
Mon, May 8, 1:00 PM · Patch-For-Review, Event-Platform Value Stream (Sprint 14 A), Data-Engineering-Planning
tchin claimed T328013: Improve mediawiki-event-enrichment test suite.
Mon, May 8, 1:00 PM · Patch-For-Review, Event-Platform Value Stream (Sprint 14 A), Data-Engineering-Planning

Sun, May 7

tchin added a comment to T328013: Improve mediawiki-event-enrichment test suite.

Oof, was looking at how to potentially mock the http session and response object, but turns out mocks don't work when pickled/multiprocessed. I guess the only option is to spin up a web server during testing and hit that instead
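
A minimal sketch of that web-server approach, using only the Python standard library. StubHandler, start_stub_server and fetch_json are hypothetical names, not code from the mediawiki-event-enrichment test suite; the stub simply echoes the request path back as JSON.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class StubHandler(BaseHTTPRequestHandler):
    """Hypothetical stub that answers every GET with a small JSON body."""

    def do_GET(self):
        body = json.dumps({"path": self.path, "ok": True}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep test output quiet.
        pass


def start_stub_server():
    """Start the stub on a free local port in a background thread."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), StubHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_json(host, port, path):
    """Issue a real HTTP GET against the stub and decode the JSON reply."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        return json.loads(conn.getresponse().read().decode("utf-8"))
    finally:
        conn.close()
```

Because the stub listens on a real local port, code running in another process can reach it without any mock object having to be pickled.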

Sun, May 7, 9:08 PM · Patch-For-Review, Event-Platform Value Stream (Sprint 14 A), Data-Engineering-Planning

Fri, May 5

tchin added a comment to T335802: eventutilities-python manager should set up python logging with ECS format.

Do we know what's turning them into ecs format in the first place?

Fri, May 5, 6:53 AM · Event-Platform Value Stream (Sprint 14 A), Data-Engineering

Thu, May 4

tchin moved T335802: eventutilities-python manager should set up python logging with ECS format from Next Up to In progress on the Event-Platform Value Stream (Sprint 12) board.
Thu, May 4, 1:28 PM · Event-Platform Value Stream (Sprint 14 A), Data-Engineering
tchin moved T335802: eventutilities-python manager should set up python logging with ECS format from Backlog to Sprint 12 on the Event-Platform Value Stream board.
Thu, May 4, 1:18 PM · Event-Platform Value Stream (Sprint 14 A), Data-Engineering
tchin claimed T335802: eventutilities-python manager should set up python logging with ECS format.
Thu, May 4, 1:13 PM · Event-Platform Value Stream (Sprint 14 A), Data-Engineering

Tue, May 2

tchin moved T324980: Event Driven Enrichment Pipelines repositories should be generated from a template from In progress to In Review on the Event-Platform Value Stream (Sprint 12) board.
Tue, May 2, 1:03 PM · Event-Platform Value Stream (Sprint 12), Data-Engineering-Planning

Apr 19 2023

tchin moved T327251: Q4 eventutilities-python should bundle java deps. from In Review to Done on the Event-Platform Value Stream (Sprint 11) board.
Apr 19 2023, 12:45 PM · Event-Platform Value Stream (Sprint 11), Data-Engineering-Planning

Apr 18 2023

tchin moved T327251: Q4 eventutilities-python should bundle java deps. from In Progress to In Review on the Event-Platform Value Stream (Sprint 11) board.
Apr 18 2023, 7:02 AM · Event-Platform Value Stream (Sprint 11), Data-Engineering-Planning

Apr 17 2023

tchin added a comment to T327251: Q4 eventutilities-python should bundle java deps..

Don't know how to connect gitlab merge requests to phab but here's the link for posterity's sake:
Bundle Java jars when building wheel

Apr 17 2023, 1:09 PM · Event-Platform Value Stream (Sprint 11), Data-Engineering-Planning

Apr 10 2023

tchin added a comment to T327251: Q4 eventutilities-python should bundle java deps..

A less opaque place for inspiration is that the py4j library bundles the jar with its python wheel as well. They leave adding it to the classpath to the user though. I actually don't see how we'd include the jars in the classpath without injecting them at runtime. Does something in pyflink automatically find it?
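
One way the runtime injection could look, as a sketch only: it assumes the wheel ships its jars in a hypothetical lib directory inside the installed package, and bundled_jar_uris is an illustrative name rather than anything from eventutilities-python or py4j.

```python
from pathlib import Path


def bundled_jar_uris(package_dir):
    """Return file:// URIs for every jar under <package_dir>/lib.

    The lib folder name is an assumption made for this example.
    """
    lib_dir = Path(package_dir) / "lib"
    # resolve() makes each path absolute, which as_uri() requires.
    return sorted(jar.resolve().as_uri() for jar in lib_dir.glob("*.jar"))
```

The resulting URIs could then be handed to pyflink at startup, for example through StreamExecutionEnvironment.add_jars, which is the injecting-at-runtime step rather than automatic discovery.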

Apr 10 2023, 7:59 PM · Event-Platform Value Stream (Sprint 11), Data-Engineering-Planning

Apr 3 2023

tchin claimed T333795: Event Catalog: Standardize Options Handling.
Apr 3 2023, 3:20 PM · Event-Platform Value Stream (Sprint 14 A), Data-Engineering-Planning
tchin updated subscribers of T333795: Event Catalog: Standardize Options Handling.
Apr 3 2023, 6:45 AM · Event-Platform Value Stream (Sprint 14 A), Data-Engineering-Planning
tchin created T333795: Event Catalog: Standardize Options Handling.
Apr 3 2023, 6:44 AM · Event-Platform Value Stream (Sprint 14 A), Data-Engineering-Planning

Mar 29 2023

tchin claimed T331542: EventStreamCatalog should not remove user specified options in CREATE TABLE statements.
Mar 29 2023, 1:05 PM · Data-Engineering, Event-Platform Value Stream

Mar 22 2023

tchin moved T330441: Flink EventStreamCatalog should add watermark from In Review to Done on the Event-Platform Value Stream (Sprint 10) board.
Mar 22 2023, 2:07 PM · Event-Platform Value Stream (Sprint 10), Data-Engineering-Planning

Mar 16 2023

tchin moved T330703: Flink EventStreamCatalog should not prevent creation of VIEWs from In Review to Done on the Event-Platform Value Stream (Sprint 10) board.
Mar 16 2023, 9:21 PM · Event-Platform Value Stream (Sprint 10), Data-Engineering-Planning
tchin moved T330703: Flink EventStreamCatalog should not prevent creation of VIEWs from Backlog to Sprint 10 on the Event-Platform Value Stream board.
Mar 16 2023, 2:05 PM · Event-Platform Value Stream (Sprint 10), Data-Engineering-Planning
tchin moved T330703: Flink EventStreamCatalog should not prevent creation of VIEWs from Next Up to In Progress on the Event-Platform Value Stream (Sprint 10) board.
Mar 16 2023, 2:05 PM · Event-Platform Value Stream (Sprint 10), Data-Engineering-Planning

Mar 15 2023

tchin added a comment to T330441: Flink EventStreamCatalog should add watermark.

I don't think I fully understand what the options are for. You can set watermarks for tables in the catalog by doing something like

CREATE TABLE with_watermark (
	event_time AS meta['dt'],
	WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
) LIKE some_table;

or with kafka metadata

CREATE TABLE with_kafka_timestamp (
	event_time TIMESTAMP(3) METADATA FROM 'timestamp' VIRTUAL,
	WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
) LIKE some_table;
Mar 15 2023, 6:44 AM · Event-Platform Value Stream (Sprint 10), Data-Engineering-Planning
tchin moved T330769: EventStreamCatalog removes 'topic' table option if connector = upsert-kafka from In Review to Done on the Event-Platform Value Stream (Sprint 10) board.
Mar 15 2023, 6:04 AM · Event-Platform Value Stream (Sprint 10), Data-Engineering-Planning

Mar 14 2023

tchin moved T330441: Flink EventStreamCatalog should add watermark from Next Up to In Progress on the Event-Platform Value Stream (Sprint 10) board.
Mar 14 2023, 2:09 PM · Event-Platform Value Stream (Sprint 10), Data-Engineering-Planning
tchin claimed T330441: Flink EventStreamCatalog should add watermark.
Mar 14 2023, 2:15 AM · Event-Platform Value Stream (Sprint 10), Data-Engineering-Planning
tchin moved T330769: EventStreamCatalog removes 'topic' table option if connector = upsert-kafka from Next Up to In Review on the Event-Platform Value Stream (Sprint 10) board.
Mar 14 2023, 2:14 AM · Event-Platform Value Stream (Sprint 10), Data-Engineering-Planning
tchin claimed T330769: EventStreamCatalog removes 'topic' table option if connector = upsert-kafka.
Mar 14 2023, 1:34 AM · Event-Platform Value Stream (Sprint 10), Data-Engineering-Planning

Mar 7 2023

tchin claimed T330703: Flink EventStreamCatalog should not prevent creation of VIEWs.
Mar 7 2023, 1:57 PM · Event-Platform Value Stream (Sprint 10), Data-Engineering-Planning

Feb 13 2023

tchin added a comment to T329524: Refactor Image Suggestions Feedback > Cassandra Flink Job and Deploy to DSE k8s.

Looking back at the code, it seems like the only thing that needs to be moved (moved is a generous term) is the cassandra sink. Should probably just think about implementing a cassandra sink builder like the kafka builder in event utilities. The code to make the source is also now covered by event utilities eventDataStreamFactory.kafkaSourceBuilder so that can be tossed completely. The rest of the stuff is specific to the pipeline.

Feb 13 2023, 2:55 PM · Data-Engineering-Planning, Event-Platform Value Stream

Feb 6 2023

tchin added a comment to T322022: Flink SQL queries should access Kafka topics from a Catalog.

This is so hard to describe through text I just made a miro board to try to logic out all the behavior. Take a look at it if you want and add comments as you see fit

Feb 6 2023, 8:59 AM · Event-Platform Value Stream (Sprint 09), Data-Engineering-Planning

Jan 31 2023

tchin added a comment to T322022: Flink SQL queries should access Kafka topics from a Catalog.

After testing more, it seems better to explicitly define options event-stream-name and event-stream-prefix instead of trying to derive the stream name and prefix from the table name. We need the unprefixed stream name to look up the schema, and the prefix is only needed if the user is trying to insert into kafka.

Jan 31 2023, 8:26 PM · Event-Platform Value Stream (Sprint 09), Data-Engineering-Planning

Jan 29 2023

tchin created T328232: Support topics without a schema in Flink Catalog.
Jan 29 2023, 6:19 PM · Data-Engineering-Planning, Event-Platform Value Stream
tchin updated the task description for T328211: Support NULL values in RowData in eventutilities.
Jan 29 2023, 7:58 AM · Data-Engineering-Planning, Event-Platform Value Stream
tchin created T328211: Support NULL values in RowData in eventutilities.
Jan 29 2023, 7:55 AM · Data-Engineering-Planning, Event-Platform Value Stream

Jan 21 2023

tchin added a comment to T322022: Flink SQL queries should access Kafka topics from a Catalog.

Some updates after some discussion

Jan 21 2023, 11:52 PM · Event-Platform Value Stream (Sprint 09), Data-Engineering-Planning

Dec 20 2022

tchin added a comment to T322022: Flink SQL queries should access Kafka topics from a Catalog.

Perhaps the end goal would have a user experience like:

Dec 20 2022, 6:44 PM · Event-Platform Value Stream (Sprint 09), Data-Engineering-Planning

Dec 14 2022

tchin added a comment to T324114: Flink + Event Platform integration for writing into streams via Table API.

After experimenting a bit, I was able to get the catalog to override the schema option of specific tables using

ALTER TABLE `mediawiki.api-request` SET ('schema'='0.0.1');
Dec 14 2022, 6:47 AM · Event-Platform Value Stream (Sprint 09), Data-Engineering-Planning

Dec 13 2022

tchin added a comment to T324114: Flink + Event Platform integration for writing into streams via Table API.

Are these so bad? Could we default to the latest version for both sources and sinks, but allow SQL hints to override?

They're mostly just wordy. If we don't really care, then it's actually not bad at all.

INSERT INTO `eventgate-main.test.event` /*+ OPTIONS('schema'='1.0.0') */ (`test`, `test_map`) VALUES ('test_from_catalog', MAP['test_key', 'test_val']);
Dec 13 2022, 9:02 PM · Event-Platform Value Stream (Sprint 09), Data-Engineering-Planning

Dec 12 2022

tchin added a comment to T324114: Flink + Event Platform integration for writing into streams via Table API.

Trying to create different functionalities when sourcing/sinking is like trying to fit a square peg into a round hole, due to the fact that you can only really specify global options for all tables created by the catalog.

Dec 12 2022, 11:54 PM · Event-Platform Value Stream (Sprint 09), Data-Engineering-Planning

Dec 6 2022

tchin claimed T324114: Flink + Event Platform integration for writing into streams via Table API.
Dec 6 2022, 2:09 PM · Event-Platform Value Stream (Sprint 09), Data-Engineering-Planning
tchin added a parent task for T324114: Flink + Event Platform integration for writing into streams via Table API: T322022: Flink SQL queries should access Kafka topics from a Catalog.
Dec 6 2022, 4:04 AM · Event-Platform Value Stream (Sprint 09), Data-Engineering-Planning
tchin added a subtask for T322022: Flink SQL queries should access Kafka topics from a Catalog: T324114: Flink + Event Platform integration for writing into streams via Table API.
Dec 6 2022, 4:04 AM · Event-Platform Value Stream (Sprint 09), Data-Engineering-Planning

Nov 30 2022

tchin added a comment to T322022: Flink SQL queries should access Kafka topics from a Catalog.

The catalog is now able to sink to a specific prefixed topic by overriding the ResolvedCatalogTable before passing it to the Kafka connector

Nov 30 2022, 2:11 PM · Event-Platform Value Stream (Sprint 09), Data-Engineering-Planning

Nov 29 2022

tchin added a comment to T322022: Flink SQL queries should access Kafka topics from a Catalog.

Here's the working code so far, sans the stuff I talk about below

Nov 29 2022, 8:31 AM · Event-Platform Value Stream (Sprint 09), Data-Engineering-Planning

Nov 23 2022

tchin added a comment to T322022: Flink SQL queries should access Kafka topics from a Catalog.

I was able to implement a flink catalog that acts as an options passthrough to the built-in kafka connector and uses eventutilities to dynamically create the tables with schemas and with sensible defaults. So something like

CREATE CATALOG wmfeventcatalog WITH (
	'type' = 'wmfeventcatalog',
	'properties.group.id' = 'catalog-test'
);

Will let you use a table like eventgate-main.test.event as if you made it like

CREATE TABLE `eventgate-main.test.event` (
	$schema STRING,
	meta ROW<...>,
	test STRING,
	test_map MAP<STRING, STRING>
) WITH (
	'connector' = 'kafka',
	'format' = 'json',
	'topic' = 'eqiad.eventgate-main.test.event;codfw.eventgate-main.test.event',
	'properties.group.id' = 'catalog-test',
	'properties.bootstrap.servers' = 'kafka-jumbo1001.eqiad.wmnet:9092',
	'scan.startup.mode' = 'latest-offset',
	'json.timestamp-format.standard' = 'ISO-8601'
);
Nov 23 2022, 8:11 AM · Event-Platform Value Stream (Sprint 09), Data-Engineering-Planning

Nov 15 2022

tchin claimed T322022: Flink SQL queries should access Kafka topics from a Catalog.
Nov 15 2022, 2:07 PM · Event-Platform Value Stream (Sprint 09), Data-Engineering-Planning

Nov 9 2022

tchin added a comment to T320968: Easy Flink Python UDF + SQL enrichment.

(When it comes to this task of making an example reading/writing with Flink SQL and a UDF, with Andrew's example and also a more simplified example in the example repo, this can be marked as done, although there are still good conversations here.)

Nov 9 2022, 3:58 PM · Event-Platform Value Stream (Sprint 04), Spike, Data-Engineering-Planning
tchin added a comment to T320968: Easy Flink Python UDF + SQL enrichment.

I think the custom type inference in Java/Scala is really powerful, but if someone is already at a point where they're writing UDFs in Java then they probably already have working knowledge of DataTypes. When it comes to UX for people who only want to write Python, it might be worth just reimplementing JsonSchemaFlinkConverter, although I don't know how difficult that would be. Having something like

@flink_udf(output_schema="fragment/mediawiki/state/entity/revision_slots")

looks very sleek. (although it feels a bit wrong to have 2 codebases that do the same thing)
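
A rough sketch of what the shape of such a decorator might be. flink_udf does not exist in pyflink; here it only records the schema name on the function, and the step that would turn that schema into a DataType (the JsonSchemaFlinkConverter equivalent) is deliberately left out. The enrich function is a placeholder.

```python
import functools


def flink_udf(output_schema):
    """Hypothetical decorator: tag a function with the schema its result follows."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)

        # A real implementation would resolve this name to a Flink DataType
        # and register the function as a UDF with that result type.
        wrapper.output_schema = output_schema
        return wrapper

    return decorator


@flink_udf(output_schema="fragment/mediawiki/state/entity/revision_slots")
def enrich(page_id):
    # Placeholder body for illustration only.
    return {"page_id": page_id}
```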

Nov 9 2022, 2:56 PM · Event-Platform Value Stream (Sprint 04), Spike, Data-Engineering-Planning

Nov 8 2022

tchin added a comment to T320968: Easy Flink Python UDF + SQL enrichment.

So I was using Kafka Client 3.2.3, but I noticed you were using 2.4.1. Switched to that and it solves the cluster authorization issue. Gonna have to note that somewhere

Nov 8 2022, 4:47 AM · Event-Platform Value Stream (Sprint 04), Spike, Data-Engineering-Planning

Nov 7 2022

tchin added a comment to T320968: Easy Flink Python UDF + SQL enrichment.

The exact error I get is org.apache.kafka.common.errors.ClusterAuthorizationException: Cluster authorization failed when trying to produce to a topic. Tried the test topic and the platform-wiki-image-links topic from Gabriele's first event POC. Consuming from the topic works, so I wonder if there's different default permissions for producing/consuming.

Nov 7 2022, 2:00 PM · Event-Platform Value Stream (Sprint 04), Spike, Data-Engineering-Planning

Nov 4 2022

tchin added a comment to T320968: Easy Flink Python UDF + SQL enrichment.

Huh we've been doing everything on Yarn so far so I guess I overlooked this, but if I wanted to produce to Kafka or Hadoop using the SQL Cli I would need to generate a Kerberos keytab. Yarn is the only deployment method where you can forego the keytab and use the ticket cache. Looking at wikitech, it seems like generating a keytab is a non-trivial task? Or at least it's something I don't have permission to do. I feel like this is starting to get into the 'figuring out deployment' issue. @Ottomata

Nov 4 2022, 1:55 PM · Event-Platform Value Stream (Sprint 04), Spike, Data-Engineering-Planning

Nov 1 2022

tchin added a comment to T320968: Easy Flink Python UDF + SQL enrichment.

The UDFs appear to be executing inside of a process.

Nov 1 2022, 1:12 PM · Event-Platform Value Stream (Sprint 04), Spike, Data-Engineering-Planning

Oct 31 2022

tchin added a comment to T320968: Easy Flink Python UDF + SQL enrichment.

A problem with virtualenvs is that they don't include the python executable.

Can you elaborate on that? I thought the executable is venv/bin/python3

Oct 31 2022, 9:36 PM · Event-Platform Value Stream (Sprint 04), Spike, Data-Engineering-Planning

Oct 29 2022

tchin added a project to T320968: Easy Flink Python UDF + SQL enrichment: Spike.
Oct 29 2022, 9:22 PM · Event-Platform Value Stream (Sprint 04), Spike, Data-Engineering-Planning
tchin added a comment to T320968: Easy Flink Python UDF + SQL enrichment.

I wrote out the steps on how to package a python virtual environment so that people can use external dependencies in the UDFs

Oct 29 2022, 9:21 PM · Event-Platform Value Stream (Sprint 04), Spike, Data-Engineering-Planning

Oct 26 2022

tchin added a comment to T320968: Easy Flink Python UDF + SQL enrichment.

To be fair, the actual idea is easy enough to implement for simple mappings

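from pyflink.table import DataTypes
from pyflink.table.types import DataType

def python_to_flink_datatype(val: type) -> DataType:
    # Map a few simple Python types to their pyflink DataTypes.
    if val is str:
        return DataTypes.STRING()
    elif val is int:
        return DataTypes.INT()
    elif val is bool:
        return DataTypes.BOOLEAN()
    # Anything else has no obvious mapping yet.
    raise TypeError(f"No Flink DataType mapping for {val!r}")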
Oct 26 2022, 11:51 AM · Event-Platform Value Stream (Sprint 04), Spike, Data-Engineering-Planning
tchin added a comment to T320968: Easy Flink Python UDF + SQL enrichment.

I definitely feel like the biggest issue here is how we'd map from python types to pyflink DataTypes. Would a python int turn into a DataTypes.INT or perhaps a DataTypes.BIGINT? It's hard to say how well abstracting away the types would go considering the DataTypes are supposed to represent the columns that are being sunk to.
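
To make the INT versus BIGINT ambiguity concrete, here is a small pure-Python sketch. It relies only on the fact that Flink's INT is a 4-byte signed integer and BIGINT an 8-byte one; narrowest_integer_type is a hypothetical helper and returns type names as strings rather than pyflink objects.

```python
# Signed ranges of Flink SQL's 4-byte INT and 8-byte BIGINT.
INT_RANGE = (-2**31, 2**31 - 1)
BIGINT_RANGE = (-2**63, 2**63 - 1)


def narrowest_integer_type(value):
    """Name the narrowest Flink integer type (INT or BIGINT) that can hold value."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"expected an int, got {type(value).__name__}")
    if INT_RANGE[0] <= value <= INT_RANGE[1]:
        return "INT"
    if BIGINT_RANGE[0] <= value <= BIGINT_RANGE[1]:
        return "BIGINT"
    raise OverflowError(f"{value} does not fit in BIGINT")
```

The answer depends on the value, not on the Python type, which is why a mapping keyed on int alone has to pick one column type up front.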

Oct 26 2022, 11:41 AM · Event-Platform Value Stream (Sprint 04), Spike, Data-Engineering-Planning

Oct 24 2022

tchin updated the task description for T320968: Easy Flink Python UDF + SQL enrichment.
Oct 24 2022, 12:58 PM · Event-Platform Value Stream (Sprint 04), Spike, Data-Engineering-Planning

Oct 14 2022

tchin added a comment to T318859: [SPIKE] Build simple stateless service using PyFlink.

Here's the repo with example datastream and table equivalent. It reads from mediawiki.page-create and then uses its page_id to fetch the list of images on the page from the action api. I'm also working on a summary writeup of what I've experienced
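
For reference, the Action API lookup described here can be sketched as below. The helper names are hypothetical and this is not the code from the repo; it builds a prop=images query for a page ID and pulls the file titles out of a response in the default JSON format, where pages are keyed by page ID.

```python
from urllib.parse import urlencode


def images_query_url(api_url, page_id):
    """Build an Action API URL that lists the images used on one page."""
    params = {
        "action": "query",
        "prop": "images",
        "pageids": page_id,
        "imlimit": "max",
        "format": "json",
    }
    return f"{api_url}?{urlencode(params)}"


def image_titles(response, page_id):
    """Extract image titles for page_id from a decoded API response."""
    pages = response.get("query", {}).get("pages", {})
    page = pages.get(str(page_id), {})
    # Pages without images have no "images" key at all.
    return [image["title"] for image in page.get("images", [])]
```

Keeping URL construction and response parsing separate from the HTTP call makes both easy to test without network access; the api_url would be something like https://en.wikipedia.org/w/api.php, used here purely as an example.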

Oct 14 2022, 9:31 PM · Data-Engineering-Planning, Event-Platform Value Stream (Sprint 03), Spike

Oct 4 2022

tchin moved T318859: [SPIKE] Build simple stateless service using PyFlink from Next Up to In Progress on the Event-Platform Value Stream (Sprint 02) board.
Oct 4 2022, 5:10 AM · Data-Engineering-Planning, Event-Platform Value Stream (Sprint 03), Spike

Sep 20 2022

tchin added a member for Event Streams Planning: tchin.
Sep 20 2022, 7:20 PM

Sep 8 2022

tchin added a comment to T313202: Improve EventGate's error message when the client's HTTP Content-Type is not the one expected.

Opened a pull request on GitHub @Ottomata

Sep 8 2022, 7:30 AM · Patch-For-Review, Data-Engineering-Planning, Event-Platform Value Stream (Sprint 01)

Sep 1 2022

tchin moved T314389: [SPIKE] Decide on technical solution for page state stream backfill process from In Progress to Done on the Event-Platform Value Stream (Sprint 00) board.
Sep 1 2022, 5:10 AM · Data-Engineering, Event-Platform Value Stream (Sprint 00), Spike
tchin updated the task description for T314389: [SPIKE] Decide on technical solution for page state stream backfill process.
Sep 1 2022, 5:09 AM · Data-Engineering, Event-Platform Value Stream (Sprint 00), Spike

Aug 29 2022

tchin added a comment to T314389: [SPIKE] Decide on technical solution for page state stream backfill process.

Ok so to summarize:

Aug 29 2022, 8:47 PM · Data-Engineering, Event-Platform Value Stream (Sprint 00), Spike

Aug 17 2022

tchin added a comment to T314389: [SPIKE] Decide on technical solution for page state stream backfill process.

Do you have a feel for how mature the iceberg connector is?

Aug 17 2022, 9:10 PM · Data-Engineering, Event-Platform Value Stream (Sprint 00), Spike

Aug 16 2022

tchin added a comment to T314389: [SPIKE] Decide on technical solution for page state stream backfill process.
  • Are we backfilling the page state change stream, the one with content, or both?
  • Do we want the full history, a compacted one with only the most recent revision, or both?
  • The two obvious options for backfill are Spark and Flink
    • Upside of Flink is that we can potentially reuse code written for the stream
    • Upside of Spark is that it's more established and there are more people in the foundation who know how to use and support it
    • Also, depending on what the page state schema will contain, we might have to join the wikitext history table (avro) with the mediawiki history table (parquet)
      • Flink 1.15 does not have full parquet support (can't read complex data types)
      • Flink + iceberg is a thing, but I haven't tried it yet. It does seem to have a library for parquet support
        • Tangent: I've heard iceberg mentioned before, but how does it factor into the rest of the foundation's tech stack? A superficial search on Wikitech brought up nothing
Aug 16 2022, 9:41 PM · Data-Engineering, Event-Platform Value Stream (Sprint 00), Spike

Jul 22 2022

tchin created T313628: Define how to authenticate with Cassandra and test Flink POC.
Jul 22 2022, 10:50 PM · Event-Platform Value Stream (Sprint 00), Data-Engineering, Spike

Jun 30 2022

tchin added a comment to T311070: [Shared Event Platform][NEEDS GROOMING] We should standardize Flink app config for yarn (development) deployments.

I made it work with this:

Jun 30 2022, 5:28 PM · Data-Engineering-Planning, Event-Platform Value Stream
tchin added a comment to T311070: [Shared Event Platform][NEEDS GROOMING] We should standardize Flink app config for yarn (development) deployments.

ParameterTool will let you have external config, but it is not a Configuration object, so you'd need to convert it by doing something like:

val env = StreamExecutionEnvironment.getExecutionEnvironment
val parameters = ParameterTool.fromPropertiesFile("config.properties")
val config = Configuration.fromMap(parameters.toMap)
Jun 30 2022, 7:49 AM · Data-Engineering-Planning, Event-Platform Value Stream

Jun 13 2022

tchin added a comment to T293808: Design Image Suggestion Schema.

Is the user column under the feedback table supposed to be text? The feedback event schema currently outputs a user_id instead so I'm wondering if it's supposed to be transformed into a username or if the Cassandra table needs to be updated

Jun 13 2022, 3:27 PM · Generated Data Platform

Mar 31 2022

tchin created T305193: Access for new Data Platform Dev: Thomas Chin.
Mar 31 2022, 8:00 PM · SRE, SRE-Access-Requests

Mar 4 2022

tchin added a comment to T300935: Investigate what's required to allow a user to fork or transfer a project to a group.

I can sort of get around this issue if I go to the project group page (in my case repos/api-platform) > New project > Import project > import by URL or by GitLab exports. That seems like the only time the project group shows up as an option in the dropdown for me. It doesn't show up if I use any other import options, or if I get to the import page through the '+' button on the website header instead of the 'New project' button on the project group page.

Mar 4 2022, 12:17 PM · Release-Engineering-Team, User-brennen, GitLab (Auth & Access)

Feb 22 2022

tchin added a comment to T295053: Populate API Portal: API catalog from service catalog.

Gitlab group request ticket -> T301164

Feb 22 2022, 2:20 PM · API Platform (API Portal Roadmap)

Feb 16 2022

tchin committed rMSSN82d1a7ae9d51: Add config.yaml (authored by tchin).
Feb 16 2022, 4:12 PM

Feb 7 2022

tchin created T301164: Create new GitLab project group: API Platform.
Feb 7 2022, 5:51 PM · Release-Engineering-Team (Done by Feb 23 🧟), GitLab (Project Migration)
tchin added a member for API Platform: tchin.
Feb 7 2022, 5:09 PM

Jan 25 2022

tchin closed T291288: Replace Title::newFromIDs and TitleFactory::newFromIDs with PageQueryBuilder as Resolved.
Jan 25 2022, 6:19 PM · MW-1.38-notes (1.38.0-wmf.12; 2021-12-06), Patch-For-Review, MediaWiki-General, Platform Team Workboards (MW Expedition)

Jan 12 2022

tchin added a comment to T294739: remove configuration access via global variables in core.

Forgot to link the patch that replaces most of the easily replaceable ones. T297797 lists places where it wasn't easily removable.

Jan 12 2022, 4:42 PM · MW-1.39-notes (1.39.0-wmf.18; 2022-06-27), MW-1.38-notes (1.38.0-wmf.24; 2022-02-28), Patch-For-Review, MediaWiki-SettingsBuilder

Jan 7 2022

tchin added a comment to T297797: Fix unit tests that test methods that use globals.

I basically just converted every global $wgGlobal into MediaWikiServices::getInstance()->getMainConfig()->get( 'Global' ) to see what blows up and then backtracked from there.

Jan 7 2022, 3:36 PM · Platform Team Workboards (MW Expedition), MediaWiki-SettingsBuilder, MediaWiki-Core-Tests

Dec 20 2021

tchin updated the task description for T297797: Fix unit tests that test methods that use globals.
Dec 20 2021, 8:33 PM · Platform Team Workboards (MW Expedition), MediaWiki-SettingsBuilder, MediaWiki-Core-Tests
tchin updated the task description for T297797: Fix unit tests that test methods that use globals.
Dec 20 2021, 8:31 PM · Platform Team Workboards (MW Expedition), MediaWiki-SettingsBuilder, MediaWiki-Core-Tests
tchin updated the task description for T297797: Fix unit tests that test methods that use globals.
Dec 20 2021, 8:03 PM · Platform Team Workboards (MW Expedition), MediaWiki-SettingsBuilder, MediaWiki-Core-Tests
tchin updated the task description for T297797: Fix unit tests that test methods that use globals.
Dec 20 2021, 7:49 PM · Platform Team Workboards (MW Expedition), MediaWiki-SettingsBuilder, MediaWiki-Core-Tests
tchin updated the task description for T297797: Fix unit tests that test methods that use globals.
Dec 20 2021, 6:07 PM · Platform Team Workboards (MW Expedition), MediaWiki-SettingsBuilder, MediaWiki-Core-Tests
tchin updated the task description for T297797: Fix unit tests that test methods that use globals.
Dec 20 2021, 1:57 PM · Platform Team Workboards (MW Expedition), MediaWiki-SettingsBuilder, MediaWiki-Core-Tests

Dec 15 2021

tchin created T297797: Fix unit tests that test methods that use globals.
Dec 15 2021, 3:20 PM · Platform Team Workboards (MW Expedition), MediaWiki-SettingsBuilder, MediaWiki-Core-Tests
tchin moved T292683: WikiPage::doUpdateRestrictions should become a page command from Doing to Waiting for Review on the Platform Team Workboards (MW Expedition) board.
Dec 15 2021, 3:00 PM · Patch-For-Review, Platform Team Workboards (MW Expedition), MediaWiki-General

Nov 17 2021

tchin updated the task description for T291398: Turn usage of AJAX interface to API Modules (1).
Nov 17 2021, 3:47 PM · MW-1.38-notes (1.38.0-wmf.12; 2021-12-06), Patch-For-Review, Platform Engineering Code Jam, Platform Team Workboards (MW Expedition), Technical-Debt, Readers-Web-Backlog (Tracking), Collection

Nov 12 2021

tchin moved T292683: WikiPage::doUpdateRestrictions should become a page command from Unsorted pile to Doing on the Platform Team Workboards (MW Expedition) board.
Nov 12 2021, 8:31 PM · Patch-For-Review, Platform Team Workboards (MW Expedition), MediaWiki-General

Oct 19 2021

tchin updated the task description for T291288: Replace Title::newFromIDs and TitleFactory::newFromIDs with PageQueryBuilder.
Oct 19 2021, 5:12 PM · MW-1.38-notes (1.38.0-wmf.12; 2021-12-06), Patch-For-Review, MediaWiki-General, Platform Team Workboards (MW Expedition)

Oct 13 2021

tchin updated the task description for T291288: Replace Title::newFromIDs and TitleFactory::newFromIDs with PageQueryBuilder.
Oct 13 2021, 3:18 PM · MW-1.38-notes (1.38.0-wmf.12; 2021-12-06), Patch-For-Review, MediaWiki-General, Platform Team Workboards (MW Expedition)

Oct 6 2021

tchin changed the status of T291288: Replace Title::newFromIDs and TitleFactory::newFromIDs with PageQueryBuilder from Open to In Progress.
Oct 6 2021, 5:14 PM · MW-1.38-notes (1.38.0-wmf.12; 2021-12-06), Patch-For-Review, MediaWiki-General, Platform Team Workboards (MW Expedition)

Oct 1 2021

tchin updated the task description for T291288: Replace Title::newFromIDs and TitleFactory::newFromIDs with PageQueryBuilder.
Oct 1 2021, 8:16 PM · MW-1.38-notes (1.38.0-wmf.12; 2021-12-06), Patch-For-Review, MediaWiki-General, Platform Team Workboards (MW Expedition)
tchin moved T291288: Replace Title::newFromIDs and TitleFactory::newFromIDs with PageQueryBuilder from Unsorted pile to Doing on the Platform Team Workboards (MW Expedition) board.
Oct 1 2021, 8:04 PM · MW-1.38-notes (1.38.0-wmf.12; 2021-12-06), Patch-For-Review, MediaWiki-General, Platform Team Workboards (MW Expedition)