Replace Redis queue with custom http solution
Closed, Resolved · Public · Feature

Assigned To

Authored By

	bd808
	Apr 1 2024, 8:26 PM

Description

As hinted in T360860#9656485, I have been thinking for a while about replacing Redis in the wikibugs2 stack with a custom service. After a bit of proof-of-concept work, I think I have the outline of a reasonable solution. It piggybacks on the implementation for T360860: Reimagine channel configuration (re)loading to avoid need for git pull, using the wikibugs2 web component as the storage and distribution service for the work queue.

PUT /api/event - Push a new event into the RPC queue watched by wikibugs2 irc. Callable by any wikibugs2 component.
GET /api/eventstream - Establish a Server-Sent Events (SSE) session that will receive server push notifications of newly enqueued events. Callable by wikibugs2 irc.
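
To make the two endpoints concrete, here is a minimal sketch of the in-memory fan-out they imply. It is not the code from the merge request: the names EventBroker and format_sse are hypothetical, and the real service may store and frame events differently. push() stands in for what PUT /api/event would do, and subscribe() for what a GET /api/eventstream session would hold open.

```python
import json
import queue
import threading


class EventBroker:
    """Hypothetical in-memory fan-out: each pushed event is copied to every subscriber."""

    def __init__(self):
        self._lock = threading.Lock()
        self._subscribers = []

    def subscribe(self):
        """Register a listener (one per SSE session) and return its private queue."""
        q = queue.Queue()
        with self._lock:
            self._subscribers.append(q)
        return q

    def unsubscribe(self, q):
        """Drop a listener, for example when its SSE connection closes."""
        with self._lock:
            if q in self._subscribers:
                self._subscribers.remove(q)

    def push(self, event):
        """Deliver an event to all current subscribers and return how many received it."""
        with self._lock:
            targets = list(self._subscribers)
        for q in targets:
            q.put(event)
        return len(targets)


def format_sse(event_type, data):
    """Serialize one event as an SSE frame: an event line, a data line, a blank line."""
    payload = json.dumps(data, separators=(",", ":"))
    return f"event: {event_type}\ndata: {payload}\n\n"
```

In this sketch an event pushed while no subscriber is attached is simply dropped, and everything is lost if the process restarts. That is one way the data loss discussed in this description could show up.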

The initial events supported by this system will be:

irc send event - A pre-formatted IRC message & list of channels to send it to. This can be used by any component that has something to say on IRC, but most typically will be used by wikibugs2 gerrit to notify channels of code review events.
phorge event - Data collected from Phorge about a Maniphest task transaction. These will be produced by wikibugs2 phorge.
ping event - An event injected into the event stream by wikibugs2 web periodically to help ensure that the push path from the server to its attached client is functioning.
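
As a rough illustration of how a consumer such as wikibugs2 irc might tell these three event kinds apart, here is a small sketch. The dictionary layout, the "type" values, and all function names are assumptions made for this example; the actual payload format is not described in this task.

```python
def make_irc_event(message, channels):
    """Hypothetical payload for an irc send event: message text plus target channels."""
    return {"type": "irc", "message": message, "channels": list(channels)}


def make_phorge_event(transaction):
    """Hypothetical payload for a phorge event wrapping task transaction data."""
    return {"type": "phorge", "transaction": dict(transaction)}


def make_ping_event():
    """Hypothetical payload for the periodic ping event."""
    return {"type": "ping"}


def dispatch(event, handlers):
    """Route an event to the handler registered for its type.

    Raises ValueError for unknown types so that unsupported events are not
    silently ignored.
    """
    kind = event.get("type")
    handler = handlers.get(kind)
    if handler is None:
        raise ValueError(f"unknown event type: {kind!r}")
    return handler(event)
```

A consumer would build the handlers mapping once, with one callable per event type, and call dispatch() for each event read from the stream. A ping handler can be a no-op that only records when the last ping arrived.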

This design makes the wikibugs2 web webservice newly stateful: on a crash or restart it may lose events that a client has already sent to it. It is currently believed that this potential data loss will not be more disruptive than the current stateful wikibugs2 gerrit and wikibugs2 phorge data collectors have proven to be. The working hypothesis is that the overall stability of the system will improve once the currently untrustworthy Redis queue is removed. That queue is backed by stateful disk storage, but every store and retrieve has to cross the Kubernetes<->Cloud VPS network boundary.

Details

	Title	Reference	Author	Source Branch	Dest Branch
	Replace Redis queue with custom http solution	toolforge-repos/wikibugs2!28	bd808	work/bd808/sse	main


Related Objects

		Status	Subtype	Assigned	Task
		Open		None	T360596 Figure out a plan to move forward with regarding Redis License changes
		Resolved	Feature	bd808	T361518 Replace Redis queue with custom http solution

Event Timeline

bd808 created this task. Apr 1 2024, 8:26 PM

Restricted Application added a subscriber: Aklapper. Apr 1 2024, 8:26 PM

bd808 changed the task status from Open to In Progress. Apr 1 2024, 8:26 PM

bd808 claimed this task.

bd808 triaged this task as Medium priority.

Restricted Application added a project: User-bd808. Apr 1 2024, 8:26 PM

I have code running in my local environment for the whole stack without Redis anywhere! It needs a bit more polish before I push to gitlab and start testing it at scale in the wikibugs-testing deployment, but that should happen very soon.

bd808 opened https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/28

Replace Redis queue with custom http solution

bd808 added a parent task: T360596: Figure out a plan to move forward with regarding Redis License changes. Apr 5 2024, 11:47 PM

Mentioned in SAL (#wikimedia-cloud) [2024-04-05T23:52:57Z] <wmbot~bd808@tools-sgebastion-10> Build new container based on MR!28 and restarted web, irc, gerrit, and phorge tasks (T361518)

Maintenance_bot removed a project: Patch-For-Review. Apr 6 2024, 12:30 AM

bd808 mentioned this in T288381: Connect WikiBugs IRC bot to Wikimedia GitLab. Apr 6 2024, 7:41 PM

bd808 merged https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/28

Replace Redis queue with custom http solution

Mentioned in SAL (#wikimedia-cloud) [2024-04-08T16:29:17Z] <wmbot~bd808@tools-bastion-12> Built new image from git hash 0c4ecb64. (T361518)

Mentioned in SAL (#wikimedia-cloud) [2024-04-08T16:36:55Z] <wikibugs> Restarted web, irc, gerrit, and phorge tasks to pick up new image. (T361518)

bd808 closed this task as Resolved. Apr 9 2024, 3:12 AM

bd808 mentioned this in T362288: [gitlab-webhooks] Provide a server-sent events API for rebroadcast of GitLab webhook data. Apr 10 2024, 9:55 PM
