After we started posting job events into #eventbus and Kafka, we noticed that some of them were rejected by Kafka with a `MESSAGE_SIZE_TOO_LARGE` error. The limit was increased from 1 MB to 4 MB, but even after the increase some events are still occasionally rejected. Digging deeper, I found that some of the jobs are enormous.
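For reference, the rejection simply means the serialized JSON exceeds the broker's per-message cap. A minimal sketch of the kind of pre-flight size check a producer could do (the 4 MB figure is the current cap mentioned above; the function name and the shape of `event` are assumptions, not actual EventBus code):
```
import json

# Current per-message cap on the Kafka side (4 MB), as mentioned above.
MAX_MESSAGE_BYTES = 4 * 1024 * 1024

def is_too_large(event: dict) -> bool:
    """Return True if the serialized job event would exceed the Kafka cap.

    `event` is assumed to be a job event dict like the example below.
    """
    serialized = json.dumps(event).encode("utf-8")
    return len(serialized) > MAX_MESSAGE_BYTES
```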
Here's an example that was originally 17 MB in size; I've shortened it:
```
{
    "meta": {
        "domain": "commons.wikimedia.org",
        "uri": "https://commons.wikimedia.org/wiki/Special:Badtitle/JobSpecification",
        "topic": "mediawiki.job.wikibase-InjectRCRecords",
        "request_id": "faa2e213-fc18-4bd0-9ef1-b716360260a6",
        "schema_uri": "mediawiki/job/1",
        "dt": "2017-09-07T21:59:38+00:00",
        "id": "d80f03e5-9417-11e7-9459-141877615224"
    },
    "page_title": "Special:Badtitle/JobSpecification",
    "database": "commonswiki",
    "params": {
        "pages": {
            "17303097": [6, "'A_Missionary_Preaching_to_the_Natives,_under_a_Skreen_of_platted_Cocoa-nut_leaves_at_Kairua'_by_William_Ellis.jpg"],
            "14883442": [6, "PL_J\\u00f3zef_Ignacy_Kraszewski-Lubonie_tom_II_077.jpeg"],
            "2653965": [6, "Meyers_b9_s0043.jpg"],
            (..a few million further page entries...)
        },
        "change": {
            "info": "...",
            "user_id": "194202",
            "object_id": "Q36180",
            "time": "20170907215252",
            "revision_id": "553663638",
            "type": "wikibase-item~update",
            "id": 550757876
        }
    },
    "type": "wikibase-InjectRCRecords",
    "page_namespace": -1
}
```
There's another example that is **44 MB in size** when serialized. Kafka is capable of handling that, but it doesn't deal well with very large messages, so we can't increase the cap indefinitely. Maybe there's something we could do on the Wikidata side to reduce the size of these jobs?
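Purely as an illustration of what "reducing the size" could look like (this is a sketch, not actual Wikibase code, and the batch size is an arbitrary assumption), one option would be to split the huge `params.pages` map into several smaller jobs instead of one giant one:
```
MAX_MESSAGE_BYTES = 4 * 1024 * 1024  # current Kafka cap mentioned above
PAGES_PER_JOB = 5000                 # assumed batch size, purely illustrative

def split_job(job: dict, batch_size: int = PAGES_PER_JOB):
    """Yield copies of `job` whose params.pages hold at most batch_size entries.

    `job` is assumed to look like the wikibase-InjectRCRecords example above;
    everything except params.pages is copied unchanged into each chunk.
    """
    pages = list(job["params"]["pages"].items())
    for start in range(0, len(pages), batch_size):
        chunk = dict(job)                    # shallow copy of the job
        chunk["params"] = dict(job["params"])
        chunk["params"]["pages"] = dict(pages[start:start + batch_size])
        yield chunk
```
Each resulting job should then serialize well under the cap, at the cost of producing more (but much smaller) messages.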