
Modify pipelines to leverage Spark 3.3 Shuffler
Closed, Resolved | Public | 2 Estimated Story Points

Description

In T344910, we deployed additional Spark shufflers to our cluster so that we can support the Spark 3.3 and Spark 3.4 release lines.

We currently use Spark 3.3 on the Dumps 2.0 pipelines.

In this task we should update the jobs so that they leverage the Spark 3.3 shuffler.

  • All Dumps 2.0 pipelines run with the Spark 3.3 shuffler.
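
For illustration, pointing a job at a specific shuffler usually means setting the external shuffle service options in the job's Spark configuration. The following is a minimal sketch, not the actual change made to the pipelines: the helper names, the service name `spark_shuffle_3_3`, and the port `7338` are hypothetical placeholders, and the real values depend on how the shufflers were registered in T344910. The configuration keys are standard Spark settings; `spark.shuffle.service.name` applies to YARN and is available from Spark 3.2.

```python
from typing import Dict, List


def shuffler_conf(service_name: str, service_port: int) -> Dict[str, str]:
    """Build Spark settings that select a named external shuffle service.

    Both arguments are cluster-specific; the values used below are
    hypothetical and must match what the cluster's node managers expose.
    """
    return {
        # Have executors fetch shuffle data through the external service.
        "spark.shuffle.service.enabled": "true",
        # Choose which shuffle service instance to talk to (YARN, Spark 3.2+).
        "spark.shuffle.service.name": service_name,
        # Port on which that shuffle service instance listens.
        "spark.shuffle.service.port": str(service_port),
    }


def to_spark_submit_args(conf: Dict[str, str]) -> List[str]:
    """Flatten a conf dict into `--conf key=value` pairs for spark-submit."""
    args: List[str] = []
    for key, value in conf.items():
        args.extend(["--conf", f"{key}={value}"])
    return args


# Hypothetical service name and port, for illustration only.
conf = shuffler_conf("spark_shuffle_3_3", 7338)
submit_args = to_spark_submit_args(conf)
```

A dict like `conf` could be merged into whatever Spark configuration each Dumps 2.0 job already passes from Airflow, so the shuffler selection lives in one place.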

Event Timeline

xcollazo set the point value for this task to 2. (Dec 6 2023, 4:54 PM)

Mentioned in SAL (#wikimedia-analytics) [2023-12-07T21:45:30Z] <xcollazo> Deployed latest changes to Airflow Analytics instance to pickup T352890

Deployed to production, working as expected.