Hi!
We would like to use an S3 backend for our Flink-based search update pipeline. This is needed for persisting checkpoints, i.e. the state of the stateful operators inside the application, which in turn allows the application to seamlessly pick up where a predecessor left off (when it is stopped for whatever reason).
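For context, here is a minimal sketch of how the pipeline would point its checkpoints at such a bucket, assuming the standard Flink 1.x DataStream checkpointing API; the bucket path and the checkpoint interval are placeholders, and the actual endpoint and credentials would come from the account requested here:

```java
import org.apache.flink.streaming.api.environment.CheckpointConfig.ExternalizedCheckpointCleanup;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointStorageSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint periodically; the 30s interval is illustrative, not the pipeline's real setting.
        env.enableCheckpointing(30_000);

        // Point checkpoint storage at the requested container (hypothetical path).
        env.getCheckpointConfig().setCheckpointStorage(
                "s3://cirrussearch-update-pipeline-eqiad/checkpoints");

        // Retain checkpoints on cancellation so a successor job can resume from them.
        env.getCheckpointConfig().setExternalizedCheckpointCleanup(
                ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

        // ... pipeline definition and env.execute(...) would follow here.
    }
}
```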
Analogous to T330693 (S3 for the Flink-based enrichment application), we need an account to store Flink checkpoints and savepoints. This account would have access to three containers:
- cirrussearch-update-pipeline-eqiad
- cirrussearch-update-pipeline-codfw
- cirrussearch-update-pipeline-staging
The storage needs for each container (excluding staging) would be 21G.
The storage needs for the staging container should be minimal, as it will only be used for staging deploys, which will probably cover a single test wiki. In total the account therefore needs a storage quota of roughly twice the per-container value above (2 × 21G for eqiad and codfw, plus a small margin for staging).
This space is primarily used for operating Flink; losing this state would not be catastrophic and could be resolved by restarting the job from earlier Kafka offsets (~5 min).
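To illustrate that recovery path, a sketch of rewinding the consumer a few minutes, assuming the standard Flink KafkaSource API; the broker, topic, and group id are placeholders, not the pipeline's actual configuration:

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RewindRecoverySketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // If checkpointed state were lost, start reading ~5 minutes in the past
        // instead of restoring from a checkpoint.
        long fiveMinutesAgo = System.currentTimeMillis() - 5 * 60 * 1000;

        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("kafka.example.org:9092")   // placeholder broker
                .setTopics("cirrussearch.update-events")          // placeholder topic
                .setGroupId("cirrussearch-update-pipeline")       // placeholder group id
                .setStartingOffsets(OffsetsInitializer.timestamp(fiveMinutesAgo))
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "update events")
                .print();  // stand-in for the real update pipeline

        env.execute("rewind-recovery-sketch");
    }
}
```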
For the detailed numbers, see the checkpoint storage estimation.
Thanks for looking and please let us know if you need more info.