Summary
MWHistoryDeltaWriter (Spark 3.5 job) crashes on executors with java.lang.NoClassDefFoundError deep inside the Iceberg Arrow vectorized Parquet reader:
java.lang.NoClassDefFoundError
    at org.apache.iceberg.shaded.io.netty.util.internal.shaded.org.jctools.queues.MessagePassingQueueUtil ...
    at org.apache.iceberg.arrow.vectorized.VectorizedArrowReader.read
    at org.apache.iceberg.spark.data.vectorized.ColumnarBatchReader.read
Presumed root cause
The cluster now runs two Spark/Iceberg stacks side by side:
- Spark 3.1.2 + Iceberg 1.2.1 (existing)
- Spark 3.5.8 + Iceberg 1.6.1 (new)
Both Iceberg runtime JARs end up on the executor classpath when submitting a Spark 3.5 job. Iceberg 1.2.1 and 1.6.1 each bundle their own versions of Arrow and Netty, relocated under the same org.apache.iceberg.shaded package prefix, so classes from the two JARs can shadow each other. When the JVM loads Arrow/Netty classes from the 1.2.1 JAR first, the 1.6.1 vectorized reader presumably cannot find the shaded JCTools classes it was built against and crashes.
refinery-job-35/pom.xml correctly declares iceberg-spark-runtime-3.5_2.12:1.6.1 as provided (not shaded in), so the fix is on the cluster configuration side, not in the job.
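To check whether both runtime JARs really are visible to an executor, the classpath string (for example java.class.path copied from the Spark UI Environment tab) can be scanned for Iceberg runtime JARs. The following is a minimal sketch, not part of the job: the function names and the example paths are hypothetical, and it assumes the JARs keep their standard iceberg-spark-runtime-<spark>_<scala>-<version>.jar file names.

```python
import re
from typing import Dict, List

# Matches file names such as iceberg-spark-runtime-3.5_2.12-1.6.1.jar and
# captures the Spark line, the Scala version and the Iceberg version.
_ICEBERG_RUNTIME = re.compile(
    r"iceberg-spark-runtime-(?P<spark>\d+\.\d+)_(?P<scala>\d+\.\d+)"
    r"-(?P<iceberg>[\w.\-]+?)\.jar$"
)


def iceberg_runtime_jars(classpath: str, sep: str = ":") -> List[Dict[str, str]]:
    """Return one dict per Iceberg runtime JAR named in a classpath string.

    Wildcard entries such as /some/dir/* are not expanded, so a JAR pulled in
    through a wildcard will not be reported here.
    """
    found = []
    for entry in classpath.split(sep):
        entry = entry.strip()
        file_name = entry.rsplit("/", 1)[-1]
        match = _ICEBERG_RUNTIME.search(file_name)
        if match:
            found.append({"path": entry, **match.groupdict()})
    return found


def has_conflict(classpath: str, sep: str = ":") -> bool:
    """True if more than one distinct Iceberg version appears on the classpath."""
    versions = {jar["iceberg"] for jar in iceberg_runtime_jars(classpath, sep)}
    return len(versions) > 1


if __name__ == "__main__":
    # Hypothetical classpath; replace with the value seen on a real executor.
    example = (
        "/srv/old/iceberg-spark-runtime-3.1_2.12-1.2.1.jar:"
        "/srv/new/iceberg-spark-runtime-3.5_2.12-1.6.1.jar"
    )
    for jar in iceberg_runtime_jars(example):
        print(jar["iceberg"], jar["path"])
    print("conflict:", has_conflict(example))
```

If has_conflict returns True for the classpath of a Spark 3.5 executor, that supports the presumed root cause; if it returns False, the 1.2.1 JAR may be arriving some other way (for instance through a wildcard entry) or the cause lies elsewhere.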
Current workaround
Passing --conf spark.sql.iceberg.vectorization.enabled=false disables the Arrow vectorized reader entirely, which avoids the conflict. With this setting the job runs successfully.
Investigation / fix
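Ensure the Iceberg 1.2.1 JAR is not present on the executor classpath when submitting Spark 3.5 jobs. This can be done either by isolating the two Spark stacks' classpaths or by setting spark.executor.userClassPathFirst=true. The latter is an experimental Spark setting that gives user-added JARs precedence over Spark's own classpath, so it only helps if the 1.6.1 runtime JAR is shipped as a user JAR (e.g. via --jars). Confirm the fix by re-running the job with the spark.sql.iceberg.vectorization.enabled=false workaround removed.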
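As part of the investigation it may help to see which JARs on an executor host actually contain the class named in the stack trace. The sketch below is an illustration only: jars_containing is a hypothetical helper and the JAR paths in the usage example are placeholders. It treats each JAR as a zip archive and looks for the corresponding .class entry.

```python
import zipfile
from pathlib import Path
from typing import Iterable, List


def jars_containing(class_name: str, jar_paths: Iterable[str]) -> List[str]:
    """Return the JARs (in the order given) that contain class_name.

    class_name is a fully qualified Java class name; entries that are missing
    or are not valid zip archives are skipped silently.
    """
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in jar_paths:
        path = Path(jar)
        if not path.is_file() or not zipfile.is_zipfile(path):
            continue
        with zipfile.ZipFile(path) as archive:
            if entry in archive.namelist():
                hits.append(str(jar))
    return hits


if __name__ == "__main__":
    # Class taken from the stack trace; the JAR paths below are placeholders.
    target = (
        "org.apache.iceberg.shaded.io.netty.util.internal.shaded."
        "org.jctools.queues.MessagePassingQueueUtil"
    )
    candidates = [
        "/srv/old/iceberg-spark-runtime-3.1_2.12-1.2.1.jar",
        "/srv/new/iceberg-spark-runtime-3.5_2.12-1.6.1.jar",
    ]
    print(jars_containing(target, candidates))
```

If only the 1.6.1 JAR contains the class while Arrow/Netty classes exist in both, that would be consistent with the mixed-loading explanation above. If both contain it, the failure may instead come from a static initializer, which is worth ruling out before changing the cluster configuration.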