In 2015, we increased the limit from 1000 to 2000 (T112002).
I think it's time we look ahead at potentially increasing this to 5000 sometime soon. This is not yet a priority given our current schemas are (mostly) consistently within the limits, but a few use cases are slowly emerging that might benefit from a higher limit.
This can sit in the backlog as a place to collect thoughts until we need it.
Relevant bits for implementation
- The storing, validating and processing of events in Kafka and in EventLogging's Python code has no known limitations.
- The Nginx TLS-proxy limits the URI to 16K (large_client_header_buffers).
- The ingestion point in Varnish has a limit of 2048 bytes (vsl_reclen).
- The client side has a self-imposed limit of 2000 characters, and logs to StatsD when an event exceeds that limit (ext.eventLogging).
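To make the layered limits above easier to reason about, here is a rough sketch that builds a beacon-style URL from an event and reports which layers it would exceed. The `/beacon/event?` prefix, the compact JSON encoding, and the function names are assumptions for illustration, not taken from the EventLogging code. Comparing the whole URL length against `vsl_reclen` and `large_client_header_buffers` is also only an approximation of how those settings are applied.

```python
import json
from urllib.parse import quote

# Limits as listed above. The keys are just labels for this sketch.
LAYER_LIMITS = {
    "client (ext.eventLogging)": 2000,                  # characters, self-imposed
    "Varnish (vsl_reclen)": 2048,                       # bytes
    "Nginx (large_client_header_buffers)": 16 * 1024,   # bytes
}

# Assumed beacon prefix, for illustration only.
BEACON_PREFIX = "/beacon/event?"


def beacon_url(event: dict) -> str:
    """Serialize an event as compact JSON and percent-encode it into a URL."""
    payload = json.dumps(event, separators=(",", ":"))
    return BEACON_PREFIX + quote(payload, safe="")


def layers_exceeded(url: str, limits: dict = LAYER_LIMITS) -> list:
    """Return the labels of all layers whose limit the URL length exceeds."""
    # A percent-encoded URL is plain ASCII, so characters and bytes coincide.
    size = len(url.encode("utf-8"))
    return [name for name, limit in limits.items() if size > limit]


if __name__ == "__main__":
    event = {"schema": "Example", "event": {"text": "x" * 3000}}
    url = beacon_url(event)
    print(len(url), layers_exceeded(url))
```

With the current settings, an event of roughly 3000 characters would be flagged by the client-side and Varnish entries but not by Nginx, which matches the list above: those two are the limits that would need raising for a 5000 target.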
Potential concerns
Browsers are believed to support beacons with a URL of 5000 bytes without issue. This means an increase from 2K to 4K shouldn't cause any problems (e.g. the beacon being seen internally as dispatched, but ultimately not being sent, or being corrupted, without our client knowing).
In 2016, the Performance Team researched this in the context of load.php URLs ($wgResourceLoaderMaxQueryLength), which we have raised from 2000 to 5000 in WMF production. The bottleneck there was IE 9, which supported only up to 5000 characters for the query string. Before that, the bottleneck was IE 8 (limited to 2000 chars), for which JavaScript support was dropped (MediaWiki actively disables JS in IE 8).
Given that IE 9 was the bottleneck at the time, and we have since discontinued JavaScript support for IE 9 and IE 10, we may be able to go beyond 5000, at least from the browser's perspective.
Most web servers and proxies are believed to support URLs of 5000 bytes or longer without issue. At WMF at least, we know load.php URLs with 5000 characters work fine (Nginx, Varnish, Apache, HHVM fcgi).
Lastly, EventLogging itself currently receives beacons via varnishkafka from Varnish-SHM, which has a configurable limit (vsl_reclen) that we currently set to 2048 bytes. This would need to be raised.