mediawiki_history_snapshot_config_dag fails since the last change about the AQS config table
Open, Needs Triage | Public | BUG REPORT

Description

mediawiki_history_snapshot_config_dag is failing while trying to insert the tuple:

24/04/02 10:19:58 ERROR AppendDataExec: Data source write support CassandraBulkWrite(org.apache.spark.sql.SparkSession@3fae357f,com.datastax.spark.connector.cql.CassandraConnector@30d3f583,TableDef(aqs,config,ArrayBuffer(ColumnDef(param,PartitionKeyColumn,VarCharType)),ArrayBuffer(),Stream(ColumnDef(value,RegularColumn,VarCharType)),Stream(),false,false,Map()),WriteConf(RowsInBatch(1024),1000,Partition,LOCAL_QUORUM,false,false,5,None,TTLOption(DefaultValue),TimestampOption(DefaultValue),true,None),StructType(StructField(param,StringType,false), StructField(value,StringType,false)),org.apache.spark.SparkConf@48064a5a) aborted.
24/04/02 10:19:58 ERROR SparkSQLDriver: Failed in [
INSERT INTO ${aqs_config_table}
SELECT
    '${property_name}' AS param,
    '${property_value}' AS value
]

To see the full log:
sudo -u analytics yarn logs -appOwner analytics -applicationId application_1707226456123_385652

It seems this is the first time this DAG has been executed since the change to the aqs config table to store the recently created mediawiki_history_snapshot.

Details

Title: Make driver and executor memory configurable
Reference: repos/data-engineering/airflow-dags!672
Author: milimetric
Source Branch: debug-mediawiki-history-snapshot-config
Dest Branch: main

Event Timeline

@VirginiaPoundstone We think this bug is not urgent because it's related to a change we haven't deployed yet (the one about automating mediawiki_history_snapshot for the edit and editor analytics services).