With some regularity, like once a week, Refine uses Event Utilities to download a schema via Resource Loader and ends up with an error that looks like the schema was chopped off while being retrieved.
The latest example has this error:
The following 1 of 2 dataset partitions for output table `event`.`mediawiki_structured_task_article_image_suggestion_interaction` failed refinement:
org.wikimedia.analytics.refinery.job.refine.RefineTargetException: Failed refinement of hdfs://analytics-hadoop/wmf/data/raw/event/codfw.mediawiki.structured_task.article.image_suggestion_interaction/year=2022/month=05/day=31/hour=21 -> `event`.`mediawiki_structured_task_article_image_suggestion_interaction` /wmf/data/event/mediawiki_structured_task_article_image_suggestion_interaction/datacenter=codfw/year=2022/month=5/day=31/hour=21. Original exception: while scanning a simple key
in 'reader', line 214, column 5:
mini
^
could not find expected ':'
in 'reader', line 214, column 9:
mini
^I'm looking through the full logs (from an-launcher1002, sudo -u analytics kerberos-run-command analytics yarn logs -applicationId application_1651744501826_136341 | less). But looking around it looks like it got the schema just fine, and there are even mentions of successful refinement *before* the failure messages. So I'm confused. Here are some of the relevant log lines in order:
22/06/01 00:22:25 INFO DataFrameToHive: Writing DataFrame to /wmf/data/event/mediawiki_structured_task_article_image_suggestion_interaction/datacenter=eqiad/year=2022/month=5/day=31/hour=21 with schema:
root
|-- _schema: string (nullable = true)
|-- action: string (nullable = true)
|-- action_data: string (nullable = true)
|-- active_interface: string (nullable = true)
|-- dt: string (nullable = true)
|-- homepage_pageview_token: string (nullable = true)
|-- http: struct (nullable = true)
| |-- client_ip: string (nullable = true)
| |-- has_cookies: boolean (nullable = true)
| |-- method: string (nullable = true)
| |-- protocol: string (nullable = true)
| |-- request_headers: map (nullable = true)
| | |-- key: string
| | |-- value: string (valueContainsNull = true)
| |-- response_headers: map (nullable = true)
| | |-- key: string
| | |-- value: string (valueContainsNull = true)
| |-- status_code: long (nullable = true)
|-- is_mobile: boolean (nullable = true)
|-- meta: struct (nullable = true)
| |-- domain: string (nullable = true)
| |-- dt: string (nullable = true)
| |-- id: string (nullable = true)
| |-- request_id: string (nullable = true)
| |-- stream: string (nullable = true)
| |-- uri: string (nullable = true)
|-- newcomer_task_token: string (nullable = true)
|-- page_id: long (nullable = true)
|-- page_title: string (nullable = true)
|-- geocoded_data: map (nullable = true)
| |-- key: string
| |-- value: string (valueContainsNull = true)
|-- user_agent_map: map (nullable = true)
| |-- key: string
| |-- value: string (valueContainsNull = true)
|-- is_wmf_domain: boolean (nullable = true)
|-- normalized_host: struct (nullable = true)
| |-- project_class: string (nullable = true)
| |-- project: string (nullable = true)
| |-- qualifiers: array (nullable = true)
| | |-- element: string (containsNull = true)
| |-- tld: string (nullable = true)
| |-- project_family: string (nullable = true)
|-- datacenter: string (nullable = true)
|-- year: long (nullable = true)
|-- month: long (nullable = true)
|-- day: long (nullable = true)
|-- hour: long (nullable = true)
22/06/01 00:22:08 ERROR Refine: Failed refinement of dataset hdfs://analytics-hadoop/wmf/data/raw/event/codfw.mediawiki.structured_task.article.image_suggestion_interaction/year=2022/month=05/day=31/hour=21 -> `event`.`mediawiki_structured_task_article_image_suggestion_interaction` /wmf/data/event/mediawiki_structured_task_article_image_suggestion_interaction/datacenter=codfw/year=2022/month=5/day=31/hour=21.
while scanning a simple key
in 'reader', line 214, column 5:
mini
^
could not find expected ':'
in 'reader', line 214, column 9:
mini
^
22/06/01 00:22:41 INFO Refine: Finished refinement of dataset hdfs://analytics-hadoop/wmf/data/raw/event/eqiad.mediawiki.structured_task.article.image_suggestion_interaction/year=2022/month=05/day=31/hour=21 -> `event`.`mediawiki_structured_task_article_image_suggestion_interaction` /wmf/data/event/mediawiki_structured_task_article_image_suggestion_interaction/datacenter=eqiad/year=2022/month=5/day=31/hour=21. (# refined records: 57)And later on, maybe this is just a summary of above?
22/06/01 00:36:50 ERROR Refine: The following 1 of 2 dataset partitions for output table `event`.`mediawiki_structured_task_article_image_suggestion_interaction` failed refinement:
org.wikimedia.analytics.refinery.job.refine.RefineTargetException: Failed refinement of hdfs://analytics-hadoop/wmf/data/raw/event/codfw.mediawiki.structured_task.article.image_suggestion_interaction/year=2022/month=05/day=31/hour=21 -> `event`.`mediawiki_structured_task_article_image_suggestion_interaction` /wmf/data/event/mediawiki_structured_task_article_image_suggestion_interaction/datacenter=codfw/year=2022/month=5/day=31/hour=21. Original exception: while scanning a simple key
in 'reader', line 214, column 5:
mini
^
could not find expected ':'
in 'reader', line 214, column 9:
mini
^
22/06/01 00:36:50 INFO Refine: Successfully refined 1 of 2 dataset partitions into table `event`.`mediawiki_structured_task_article_image_suggestion_interaction` (total # refined records: 56)