Documentation improvements for Eventstreams
Closed, Resolved · Public · 2 Estimated Story Points

Description

While working on the parent task, Faidon raised some questions related to ES documentation:

Description of parameters in https://stream.wikimedia.org/?doc#/Streams/get_v2_stream_recentchange (and all the others) seems broken:

root-injects.jsx:95 TypeError: Cannot read property 'get' of undefined
    at t.value (parameter-row.jsx:175)
    at t.render (root-injects.jsx:93)
    at s._renderValidatedComponentWithoutOwnerOrContext (ReactCompositeComponent.js:796)
id: [{"topic":"eqiad.mediawiki.revision-create","partition":0,"timestamp":1573182905001},{"offset":57092325,"partition":0,"topic":"codfw.mediawiki.revision-create"}]

Why for eqiad we have a timestamp and for codfw an offset? This should be documented somewhere.

Event Timeline

Description of parameters in https://stream.wikimedia.org/?doc#/Streams/get_v2_stream_recentchange (and all the others) seems broken

I believe some of this will be fixed in some recent EventStreams work I did for the k8s migration.

Not 100% sure this fixes it, but it def improves things. Will make sure this is fixed. BTW, the OpenAPI spec HTML interface doesn't really work that well with EventStreams, due to its 'unending HTTP' nature. Often, the OpenAPI spec interface lets you execute an example request and see its response. This does not work with EventStreams, since the request will never close. But anyway...

Why for eqiad we have a timestamp and for codfw an offset? This should be documented somewhere.

Indeed! This took me a while to remember too. By default, timestamps provided to KafkaSSE as a starting point will be translated to offsets when returned as the SSE id. However, in T199433: Redesign EventStreams for better multi-dc support we added an option, useTimestampForId, which always uses timestamps in returned id fields instead of offsets. Offsets are not the same in Kafka main-codfw and main-eqiad, so if a consumer uses an offset from one for the other, they will not get the messages they expect. Anyway, agreed that this is not documented. Added https://wikitech.wikimedia.org/w/index.php?title=Event_Platform%2FEventStreams&type=revision&diff=1847545&oldid=1841517

Also https://wikitech.wikimedia.org/w/index.php?title=Event_Platform%2FEventStreams&type=revision&diff=1847546&oldid=1847545

This is also documented in the EventStreams README
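
To make the timestamp/offset distinction concrete, here is a minimal Python sketch that inspects an SSE id like the one quoted in the task description and reports, per topic-partition, whether the entry carries a timestamp or an offset. The function names are hypothetical and are not part of EventStreams or KafkaSSE; the only assumption about the format is what the id above shows, a JSON array of objects with topic, partition, and either timestamp or offset.

```python
import json
from typing import Dict, Tuple

# The id quoted in the task description.
EXAMPLE_ID = (
    '[{"topic":"eqiad.mediawiki.revision-create","partition":0,'
    '"timestamp":1573182905001},'
    '{"offset":57092325,"partition":0,'
    '"topic":"codfw.mediawiki.revision-create"}]'
)


def classify_id_entries(sse_id: str) -> Dict[Tuple[str, int], str]:
    """Map each (topic, partition) in an SSE id to 'timestamp', 'offset' or 'unknown'."""
    kinds = {}
    for entry in json.loads(sse_id):
        key = (entry["topic"], entry["partition"])
        if "timestamp" in entry:
            kinds[key] = "timestamp"
        elif "offset" in entry:
            kinds[key] = "offset"
        else:
            kinds[key] = "unknown"
    return kinds


def is_portable_across_dcs(sse_id: str) -> bool:
    """True only if every entry is timestamp-based.

    Offsets differ between the two Kafka clusters, so an id containing
    any offset should not be replayed against the other datacenter.
    """
    return all(kind == "timestamp" for kind in classify_id_entries(sse_id).values())


if __name__ == "__main__":
    for key, kind in classify_id_entries(EXAMPLE_ID).items():
        print(key, kind)
    print("portable:", is_portable_across_dcs(EXAMPLE_ID))
```

For the id in the description, the eqiad entry is classified as timestamp-based and the codfw entry as offset-based, so the id as a whole is not safe to reuse against the other datacenter.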

BTW, I don't think the IRC recentchanges stuff needs to consider historical consumption. The current IRC service doesn't support that now. I think we can always start consuming from latest offset (-1).

Description of parameters in https://stream.wikimedia.org/?doc#/Streams/get_v2_stream_recentchange (and all the others) seems broken

I believe some of this will be fixed in some recent EventStreams work I did for the k8s migration.

Not 100% sure this fixes it, but it def improves things. Will make sure this is fixed. BTW, the OpenAPI spec HTML interface doesn't really work that well with EventStreams, due to its 'unending HTTP' nature. Often, the OpenAPI spec interface lets you execute an example request and see its response. This does not work with EventStreams, since the request will never close. But anyway...

Just to understand, does this need a deployment of ES or something else?

Why for eqiad we have a timestamp and for codfw an offset? This should be documented somewhere.

Indeed! This took me a while to remember too. By default, timestamps provided to KafkaSSE as a starting point will be translated to offsets when returned as the SSE id. However, in T199433: Redesign EventStreams for better multi-dc support we added an option, useTimestampForId, which always uses timestamps in returned id fields instead of offsets. Offsets are not the same in Kafka main-codfw and main-eqiad, so if a consumer uses an offset from one for the other, they will not get the messages they expect. Anyway, agreed that this is not documented. Added https://wikitech.wikimedia.org/w/index.php?title=Event_Platform%2FEventStreams&type=revision&diff=1847545&oldid=1841517

Also https://wikitech.wikimedia.org/w/index.php?title=Event_Platform%2FEventStreams&type=revision&diff=1847546&oldid=1847545

This is also documented in the EventStreams README

Documentation looks awesome, thanks!

Just to understand, does this need a deployment of ES or something else?

Ya, just haven't done that (need to rebuild scap deploy repo, etc.). I'm not 100% sure the problem is fixed, but I did re-work the way the OpenAPI spec is generated from config.

BTW, I don't think the IRC recentchanges stuff needs to consider historical consumption. The current IRC service doesn't support that now. I think we can always start consuming from latest offset (-1).

Agreed! The problem is that the SSE connection breaks every now and then (for whatever reason). The rate of events is high enough that during the time it takes to reestablish the connection and resume streaming, several messages are lost. In that context, it's useful to resume from the last message that was consumed and continue streaming in realtime from there (if that's possible!). So this is not so much "historical consumption" as it is "lossless consumption".
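
A rough sketch of what "lossless consumption" could look like on the client side, in Python: remember the id of the last fully received event, and send it back in the standard SSE Last-Event-ID header when reconnecting. The class and method names here are hypothetical, the parser handles only the data and id fields, and no network code is included; it only illustrates the bookkeeping.

```python
from typing import Dict, Iterable, Iterator, List, Optional


class ResumableSSEParser:
    """Parses SSE lines and remembers the id of the last complete event."""

    def __init__(self) -> None:
        self.last_event_id: Optional[str] = None

    def events(self, lines: Iterable[str]) -> Iterator[str]:
        """Yield the data payload of each complete event in `lines`."""
        data_lines: List[str] = []
        pending_id: Optional[str] = None
        for raw in lines:
            line = raw.rstrip("\r\n")
            if line == "":
                # A blank line terminates an event: only now is its id
                # considered consumed, so a half-received event is replayed.
                if pending_id is not None:
                    self.last_event_id = pending_id
                if data_lines:
                    yield "\n".join(data_lines)
                data_lines, pending_id = [], None
            elif line.startswith(":"):
                continue  # SSE comment / keep-alive line
            else:
                field, _, value = line.partition(":")
                if value.startswith(" "):
                    value = value[1:]
                if field == "data":
                    data_lines.append(value)
                elif field == "id":
                    pending_id = value

    def reconnect_headers(self) -> Dict[str, str]:
        """Headers for the next connection attempt."""
        headers = {"Accept": "text/event-stream"}
        if self.last_event_id is not None:
            headers["Last-Event-ID"] = self.last_event_id
        return headers
```

A consumer would feed the response's lines into events(), and after a dropped connection open a new request with reconnect_headers(). An event whose terminating blank line never arrived does not advance last_event_id, so it is requested again rather than lost.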

The problem is that the SSE connection breaks every now and then

It does? I've seen some bug reports now and then about this, but they've always been about the client-side Python libraries having bugs. Do you know if this is a server-side problem?

But ya I get your point. Hm, not sure how that could be done in the fake-IRCd model though, unless it maintains some kind of high-water-mark state about where each connected client is at, or creates a new Kafka consumer for each connected client, like EventStreams does?
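
For illustration only, the "high water mark" option might be sketched in Python as an in-memory table keyed by client. Everything here is hypothetical (class name, methods, the choice to keep state in memory), and it stores timestamps rather than offsets on the assumption that a position should stay valid in either datacenter.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class HighWaterMarks:
    """Tracks, per client, the newest event timestamp delivered per topic-partition."""

    def __init__(self) -> None:
        # client_id -> {(topic, partition): newest timestamp in ms}
        self._marks: Dict[str, Dict[Tuple[str, int], int]] = defaultdict(dict)

    def record(self, client_id: str, topic: str, partition: int, timestamp_ms: int) -> None:
        """Note that an event was delivered; the mark never moves backwards."""
        key = (topic, partition)
        current = self._marks[client_id].get(key)
        if current is None or timestamp_ms > current:
            self._marks[client_id][key] = timestamp_ms

    def resume_positions(self, client_id: str) -> List[dict]:
        """Positions in the same shape as a timestamp-based SSE id."""
        marks = self._marks.get(client_id, {})
        return [
            {"topic": topic, "partition": partition, "timestamp": ts}
            for (topic, partition), ts in sorted(marks.items())
        ]

    def forget(self, client_id: str) -> None:
        """Drop state for a client that has gone away for good."""
        self._marks.pop(client_id, None)
```

The trade-off raised in the comment still applies: this keeps state in the bridge process for every connected client, whereas a per-client Kafka consumer pushes that bookkeeping onto the consumer itself.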

The problem is that the SSE connection breaks every now and then

It does, but hopefully it won't now! https://phabricator.wikimedia.org/T238658#5982447

Nuria set the point value for this task to 2.