See T368753 for details.
Description
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Resolved | | xcollazo | T358877 Dumps 2.0 Phase II: Production intermediate table milestone |
| Resolved | | xcollazo | T358373 [Dumps 2] Reconciliation mechanism to detect and fetch missing/mismatched revisions |
| Resolved | | xcollazo | T368753 Implement production mechanism that emits (wiki_db, revision_id) pairs for missing or inaccurate rows |
| Resolved | | xcollazo | T368756 Airflow job to orchestrate the dumps reconciliation emission mechanism |
Event Timeline
xcollazo updated https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/774
Emit mismatch rows
xcollazo merged https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/774
Detect inconsistent rows from wmf_dumps.wikimedia_raw
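For readers unfamiliar with the reconciliation idea, the detection step can be pictured roughly as follows. This is only a minimal sketch, not the code from the merge request: it assumes each row can be reduced to a checksum keyed by (wiki_db, revision_id), and the function name `find_inconsistent_rows` and the `reason` labels are hypothetical.

```python
from typing import Dict, Iterator, Tuple

# Hypothetical key type: (wiki_db, revision_id), as named in T368753.
Key = Tuple[str, int]


def find_inconsistent_rows(
    source: Dict[Key, str],
    target: Dict[Key, str],
) -> Iterator[Tuple[str, int, str]]:
    """Yield (wiki_db, revision_id, reason) for rows that are missing from
    `target` or whose checksum differs from the one in `source`.

    Both arguments map a key to a checksum; using a checksum for the
    comparison is an assumption made for this sketch.
    """
    for key in sorted(source):
        expected = source[key]
        actual = target.get(key)
        if actual is None:
            yield (*key, "missing")
        elif actual != expected:
            yield (*key, "mismatch")


# Toy data: one matching row, one mismatched row, one missing row.
source_rows = {("enwiki", 1): "a", ("enwiki", 2): "b", ("eswiki", 7): "c"}
target_rows = {("enwiki", 1): "a", ("enwiki", 2): "x"}
inconsistent = list(find_inconsistent_rows(source_rows, target_rows))
```

In production the comparison presumably runs as a distributed job over the tables rather than over in-memory dictionaries; the sketch only shows the shape of the output pairs.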
Mentioned in SAL (#wikimedia-analytics) [2024-07-31T19:25:25Z] <xcollazo> Ran DDL for wmf_dumps.wikitext_inconsistent_rows_rc1 https://gitlab.wikimedia.org/repos/data-engineering/dumps/mediawiki-content-dump/-/blob/91a343657d8053858a6c086004c549b9aa3245c0/hql/create-wmf_dumps_wikitext_inconsistent_rows.hql T368756
Mentioned in SAL (#wikimedia-operations) [2024-07-31T21:16:01Z] <xcollazo@deploy1003> Started deploy [airflow-dags/analytics@82674dc]: deploy hot airflow analytics dag hot fix T368756
Mentioned in SAL (#wikimedia-operations) [2024-07-31T21:17:07Z] <xcollazo@deploy1003> Finished deploy [airflow-dags/analytics@82674dc]: deploy hot airflow analytics dag hot fix T368756 (duration: 01m 05s)
xcollazo updated https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/783
Hot fix: Bump conda artifact for mediawiki-content-dump.
xcollazo opened https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/785
Add a dblist include/exclude mechanism for dumps_reconcile_wikitext_raw_daily.
xcollazo merged https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/785
Add a dblist include/exclude mechanism for dumps_reconcile_wikitext_raw_daily.
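As an illustration of what an include/exclude mechanism over a dblist could look like, here is a minimal sketch. It is not the code from the merge request: the function name `select_wikis` and the rule that an empty include list means "all wikis" are assumptions made for this example.

```python
from typing import Iterable, List, Optional


def select_wikis(
    all_wikis: Iterable[str],
    include: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
) -> List[str]:
    """Return the wikis to process, preserving the order of `all_wikis`.

    Assumed semantics (hypothetical): if `include` is empty or None, every
    wiki is a candidate; `exclude` is always applied last.
    """
    include_set = set(include) if include else None
    exclude_set = set(exclude or ())
    return [
        wiki
        for wiki in all_wikis
        if (include_set is None or wiki in include_set)
        and wiki not in exclude_set
    ]


dblist = ["enwiki", "eswiki", "frwiki", "wikidatawiki"]
only_small = select_wikis(dblist, exclude=["enwiki", "wikidatawiki"])
only_listed = select_wikis(dblist, include=["frwiki", "enwiki"])
```

Applying the exclude list last means a wiki named in both lists is skipped, which seems like the safer default when pausing a problematic wiki.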
Mentioned in SAL (#wikimedia-operations) [2024-08-02T16:00:00Z] <xcollazo@deploy1003> Started deploy [airflow-dags/analytics@d573c40]: Deploy latest DAGs for analytics Airflow instance. T368756
Mentioned in SAL (#wikimedia-operations) [2024-08-02T16:01:02Z] <xcollazo@deploy1003> Finished deploy [airflow-dags/analytics@d573c40]: Deploy latest DAGs for analytics Airflow instance. T368756 (duration: 01m 02s)
Quoting my own message from Slack:
Folks, I’m going to be OOO next week. I noticed that there are transient issues with dumps_reconcile_wikitext_raw_daily, so I am going to leave it paused for the time being. Dynamic Task Mapping’s Airflow UI is super confusing BTW…
I have now returned from OOO, and have deleted old runs with mismatch operators, and reset the DAG with a start date of 2024-08-11. Will now test whether we see these issues again.
Mentioned in SAL (#wikimedia-analytics) [2024-08-12T17:06:08Z] <xcollazo> Ran " ALTER TABLE wmf_dumps.wikitext_inconsistent_rows_rc1 SET TBLPROPERTIES ( 'commit.retry.num-retries' = '10' ); ". T368756.
This ALTER should resolve most of the sporadic failures caused by Iceberg exhausting its commit retries. Iceberg retries a commit 4 times by default; this bumps the limit to 10. I will reflect this change in code shortly.
The 2024-08-11 run took 01:21:16 and finished successfully, with what seems like only one sporadic failure before we applied the ALTER above.
For completeness, this is what I ran in production:
ssh an-launcher1002.eqiad.wmnet
sudo -u analytics bash
kerberos-run-command analytics spark3-sql
ALTER TABLE wmf_dumps.wikitext_inconsistent_rows_rc1 SET TBLPROPERTIES ( 'commit.retry.num-retries' = '10' );
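To make the effect of raising commit.retry.num-retries more concrete, here is a rough sketch of an optimistic-commit retry loop with exponential backoff. It is not Iceberg's implementation: `CommitConflict`, `commit_with_retries` and `make_flaky_commit` are hypothetical names, the wait bounds are illustrative defaults, and treating `num_retries` as the number of retries after the first attempt is an assumption of this sketch.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class CommitConflict(Exception):
    """Hypothetical error: another writer committed to the table first."""


def commit_with_retries(
    commit: Callable[[], T],
    num_retries: int = 4,
    min_wait_ms: int = 100,
    max_wait_ms: int = 60_000,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `commit`, retrying on CommitConflict up to `num_retries` times.

    The wait doubles after each failed attempt, capped at `max_wait_ms`.
    Once the retries are exhausted the last conflict is re-raised.
    """
    attempt = 0
    while True:
        try:
            return commit()
        except CommitConflict:
            if attempt >= num_retries:
                raise
            wait_ms = min(min_wait_ms * 2 ** attempt, max_wait_ms)
            sleep(wait_ms / 1000.0)
            attempt += 1


def make_flaky_commit(failures: int) -> Callable[[], str]:
    """Build a commit function that conflicts `failures` times, then succeeds."""
    remaining = [failures]

    def commit() -> str:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise CommitConflict("concurrent update")
        return "committed"

    return commit


# Five conflicts in a row: succeeds with 10 retries, recording the waits
# instead of actually sleeping.
waits: List[float] = []
outcome = commit_with_retries(make_flaky_commit(5), num_retries=10, sleep=waits.append)
```

With many dynamically mapped tasks writing to the same table concurrently, conflicts become more likely, so a larger retry budget gives each writer more chances before the task fails.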
xcollazo updated https://gitlab.wikimedia.org/repos/data-engineering/dumps/mediawiki-content-dump/-/merge_requests/31
Bump commit.retry.num-retries for wikitext_inconsistent_rows.
xcollazo merged https://gitlab.wikimedia.org/repos/data-engineering/dumps/mediawiki-content-dump/-/merge_requests/31
Bump commit.retry.num-retries for wikitext_inconsistent_rows.
xcollazo opened https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/798
Change start date for dumps_reconcile_wikitext_raw_daily
amastilovic merged https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/798
Change start date for dumps_reconcile_wikitext_raw_daily