In T338065 ([Iceberg Migration] Implement mechanism for automatic Iceberg table maintenance), we developed a mechanism that performs typical maintenance on Iceberg tables.
However, we did not implement support for the rewrite_data_files() Spark procedure. We left it out to control the scope of that task, and because there is currently no need for it. Once we migrate wmf.event_sanitized we will definitely need it, as this dataset is currently the biggest offender in terms of number of files in HDFS.
In this task, we should extend this mechanism to support rewrite_data_files().
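As a starting point for discussion, here is a minimal sketch of how the mechanism could build the Spark SQL statement that invokes the procedure. It is not based on the existing maintenance code: the helper names, the default catalog name `spark_catalog`, and the example table and option values are all assumptions for illustration. The `CALL <catalog>.system.rewrite_data_files(table => ..., options => map(...))` form and the `min-input-files` and `target-file-size-bytes` options come from the Iceberg Spark procedures documentation.

```python
from typing import Mapping, Optional, Union


def _sql_literal(value: str) -> str:
    """Wrap a value in single quotes for use as a SQL string literal.

    To keep this sketch simple, values containing quotes or backslashes
    are rejected rather than escaped.
    """
    if "'" in value or "\\" in value:
        raise ValueError(f"unsupported character in SQL literal: {value!r}")
    return f"'{value}'"


def build_rewrite_data_files_call(
    table: str,
    catalog: str = "spark_catalog",  # assumption: name of the Iceberg catalog
    options: Optional[Mapping[str, Union[str, int]]] = None,
) -> str:
    """Build a CALL statement for Iceberg's rewrite_data_files procedure.

    Hypothetical helper: only the SQL string is produced here, so that
    the maintenance mechanism can log it or run it via spark.sql().
    """
    args = [f"table => {_sql_literal(table)}"]
    if options:
        # Iceberg expects the options as a map of string keys to string values.
        pairs = ", ".join(
            f"{_sql_literal(key)}, {_sql_literal(str(val))}"
            for key, val in options.items()
        )
        args.append(f"options => map({pairs})")
    return f"CALL {catalog}.system.rewrite_data_files({', '.join(args)})"


# Example with a placeholder table name and illustrative option values.
statement = build_rewrite_data_files_call(
    "some_db.some_table",
    options={"min-input-files": 5, "target-file-size-bytes": 268435456},
)
print(statement)
```

In a Spark session with the Iceberg catalog and SQL extensions configured, the resulting string could then be passed to `spark.sql(statement)`. Which options we expose, and their defaults, is something to decide as part of this task.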