What happens?:
- While the snapshots DAG is running, a significant share of messages fail to decode with avro unmarshal errors (see the logs under "Other information" below).
What should have happened instead?:
- All messages should be decoded successfully, and the export jobs should finish with zero errors.
Other information (browser name/version, screenshots, etc.):
Snapshots DAG logs:
[2023-11-21, 00:25:41 UTC] {snapshots.py:126} INFO - Finished an export job for project: zhwikivoyage in namespace: 10 with total: 53 and errors: 18
[2023-11-21, 00:34:02 UTC] {snapshots.py:126} INFO - Finished an export job for project: zhwiki in namespace: 10 with total: 21356 and errors: 4465
[2023-11-21, 04:57:16 UTC] {snapshots.py:126} INFO - Finished an export job for project: ckbwiki in namespace: 0 with total: 51991 and errors: 6246
Example service logs:
2023/11/21 00:25:40 export.go:225: avro unmarshal error for id: zhwikivoyage_namespace_10 with offset: 257 with error: Namespace: avro: decode union type: unknown union type
2023/11/21 00:32:59 export.go:225: avro unmarshal error for id: zhwiki_namespace_10 with offset: 191803 with error: URL: avro: ReadSTRING: invalid string length
2023/11/21 04:57:11 export.go:225: avro unmarshal error for id: ckbwiki_namespace_0 with offset: 88906 with error: WatchersCount: avro: ReadInt: int overflow
2023/11/21 04:57:11 export.go:225: avro unmarshal error for id: ckbwiki_namespace_0 with offset: 88929 with error: URL: avro: ReadSTRING: invalid string length
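To put the failure rate in perspective, here is a minimal sketch that parses the "Finished an export job" lines from the Snapshots DAG logs above and computes the share of failed messages per project and namespace. The regular expression, the `error_rates` helper, and the `LOGS` list are hypothetical names introduced for illustration, not part of `snapshots.py`, and the sketch assumes that `errors` counts a subset of `total`.

```python
import re

# Matches the summary line emitted at snapshots.py:126 (format taken from the logs above).
LINE_RE = re.compile(
    r"project: (?P<project>\S+) in namespace: (?P<namespace>\d+) "
    r"with total: (?P<total>\d+) and errors: (?P<errors>\d+)"
)


def error_rates(log_lines):
    """Return {(project, namespace): errors / total} for each summary line.

    Assumption: `errors` is a subset of `total`, so the ratio is a failure rate.
    """
    rates = {}
    for line in log_lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not an export summary line
        total = int(match["total"])
        errors = int(match["errors"])
        if total == 0:
            continue  # avoid division by zero for empty exports
        rates[(match["project"], int(match["namespace"]))] = errors / total
    return rates


# The three summary lines copied from the DAG logs in this report.
LOGS = [
    "[2023-11-21, 00:25:41 UTC] {snapshots.py:126} INFO - Finished an export job for project: zhwikivoyage in namespace: 10 with total: 53 and errors: 18",
    "[2023-11-21, 00:34:02 UTC] {snapshots.py:126} INFO - Finished an export job for project: zhwiki in namespace: 10 with total: 21356 and errors: 4465",
    "[2023-11-21, 04:57:16 UTC] {snapshots.py:126} INFO - Finished an export job for project: ckbwiki in namespace: 0 with total: 51991 and errors: 6246",
]

if __name__ == "__main__":
    for (project, namespace), rate in error_rates(LOGS).items():
        print(f"{project} ns={namespace}: {rate:.1%} of messages failed")
```

Under that assumption, the three jobs above fail on roughly 34%, 21%, and 12% of their messages respectively, so this is not a handful of isolated bad records.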