The preparation job should discover what index to write to
Closed, Resolved · Public · 3 Estimated Story Points

Assigned To

	dcausse

Authored By

	dcausse
	Nov 21 2022, 3:59 PM

Description

CirrusSearch partitions its documents across multiple indices. This information is stored in the mediawiki-config and can be requested using the cirrus-config-dump API.

The preparation job should provide a component that reads this API endpoint and decorates the update document with the name of the index the ingestion job will have to write to.

The information is rarely modified, so it can be cached for a long time.
The API request should only access the mediawiki-config and should therefore be quick, so it might be acceptable to skip the AsyncIO operator and issue a blocking request here.
The schema defined at https://gerrit.wikimedia.org/r/c/schemas/event/primary/+/856507 might be adapted to include a new field to store this information.
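
The approach described above (a blocking lookup backed by a long-lived cache, used to decorate each update document) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code that was merged: the class `CachedIndexResolver`, the `decorate` helper, the `wiki_id` and `namespace_id` field names, and the shape of the map returned by the fetch function are all hypothetical. Only the `index_name` field name comes from the patches listed on this task. The actual API request is left behind an injected function so the sketch makes no claim about the response format.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical shape of the parsed API answer: namespace id -> index name.
NamespaceIndexMap = Dict[int, str]


class CachedIndexResolver:
    """Resolve (wiki, namespace) to an index name, caching one map per wiki."""

    def __init__(
        self,
        fetch_map: Callable[[str], NamespaceIndexMap],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_map = fetch_map  # blocking call, e.g. an HTTP request
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, NamespaceIndexMap]] = {}

    def index_for(self, wiki_id: str, namespace_id: int) -> str:
        now = self._clock()
        entry = self._cache.get(wiki_id)
        if entry is None or now - entry[0] >= self._ttl:
            # Cache miss or expired entry: refresh with a blocking fetch.
            entry = (now, self._fetch_map(wiki_id))
            self._cache[wiki_id] = entry
        try:
            return entry[1][namespace_id]
        except KeyError:
            raise LookupError(
                f"no index known for {wiki_id} namespace {namespace_id}"
            ) from None


def decorate(update: dict, resolver: CachedIndexResolver) -> dict:
    """Return a copy of the update document with index_name filled in."""
    index_name = resolver.index_for(update["wiki_id"], update["namespace_id"])
    return {**update, "index_name": index_name}


if __name__ == "__main__":
    # Stub standing in for the real API call; the index names are made up.
    def stub_fetch(wiki_id: str) -> NamespaceIndexMap:
        return {0: f"{wiki_id}_content", 1: f"{wiki_id}_general"}

    resolver = CachedIndexResolver(stub_fetch, ttl_seconds=6 * 3600)
    print(decorate({"wiki_id": "testwiki", "namespace_id": 0, "page_id": 42}, resolver))
```

Because the configuration changes rarely, a time-to-live of several hours keeps the blocking fetch off the hot path for almost every event. Injecting the fetch function and the clock also makes the cache behaviour easy to unit test without network access.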

Details

Subject	Repo	Branch	Lines +/-
Add index_name in the metadata of the cirrus build doc API	mediawiki/extensions/CirrusSearch	master	+8 -0
Propagate the index_name from UpdateRowEncoder	search/cirrus-streaming-updater	master	+9 -0
Add CirrusNamespaceIndexMap	search/cirrus-streaming-updater	master	+544 -60
Add CirrusSearchConcreteReplicaGroup to the config-dump API	mediawiki/extensions/CirrusSearch	master	+34 -1


Related Objects

		Status	Subtype	Assigned	Task
		Open		None	T317045 [Epic] Re-architect the Search Update Pipeline
		Resolved		dcausse	T323508 The preparation job should discover what index to write to

Event Timeline

dcausse created this task. · Nov 21 2022, 3:59 PM

Restricted Application added a project: Discovery-Search. · Nov 21 2022, 3:59 PM

Gehel triaged this task as High priority. · Nov 21 2022, 4:25 PM

Gehel moved this task from needs triage to Current work on the Discovery-Search board.

Gehel edited projects, added Discovery-Search (Current work); removed Discovery-Search.

TJones updated the task description. · Nov 21 2022, 4:56 PM

Gehel set the point value for this task to 3. · Nov 21 2022, 4:58 PM

Gehel moved this task from Incoming to Ready for Dev -- SWE on the Discovery-Search (Current work) board.

dcausse claimed this task. · Nov 23 2022, 3:49 PM

dcausse moved this task from Ready for Dev -- SWE to In Progress on the Discovery-Search (Current work) board.

Change 860607 had a related patch set uploaded (by DCausse; author: DCausse):

[mediawiki/extensions/CirrusSearch@master] Add CirrusSearchConcreteReplicaGroup to the config-dump API

https://gerrit.wikimedia.org/r/860607

gerritbot added a project: Patch-For-Review. · Nov 24 2022, 3:33 PM

Change 860927 had a related patch set uploaded (by DCausse; author: DCausse):

[search/cirrus-streaming-updater@master] [WIP] Add CirrusNamespaceIndexMap

https://gerrit.wikimedia.org/r/860927

Change 860607 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Add CirrusSearchConcreteReplicaGroup to the config-dump API

https://gerrit.wikimedia.org/r/860607

ReleaseTaggerBot added a project: MW-1.40-notes (1.40.0-wmf.12; 2022-11-28). · Nov 28 2022, 7:01 PM

Change 867542 had a related patch set uploaded (by DCausse; author: DCausse):

[mediawiki/extensions/CirrusSearch@master] Add index_name in the metadata of the cirrus build doc API

https://gerrit.wikimedia.org/r/867542

Change 868039 had a related patch set uploaded (by DCausse; author: DCausse):

[search/cirrus-streaming-updater@master] Propagate the index_name from UpdateRowEncoder

https://gerrit.wikimedia.org/r/868039

dcausse moved this task from In Progress to Needs review on the Discovery-Search (Current work) board. · Dec 14 2022, 11:04 AM

Change 860927 merged by jenkins-bot:

[search/cirrus-streaming-updater@master] Add CirrusNamespaceIndexMap

https://gerrit.wikimedia.org/r/860927

Change 868039 merged by jenkins-bot:

[search/cirrus-streaming-updater@master] Propagate the index_name from UpdateRowEncoder

https://gerrit.wikimedia.org/r/868039

Change 867542 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Add index_name in the metadata of the cirrus build doc API

https://gerrit.wikimedia.org/r/867542

ReleaseTaggerBot edited projects, added MW-1.40-notes (1.40.0-wmf.18; 2023-01-09); removed MW-1.40-notes (1.40.0-wmf.12; 2022-11-28). · Jan 5 2023, 5:00 PM

dcausse moved this task from Needs review to To Be Deployed on the Discovery-Search (Current work) board. · Jan 9 2023, 4:04 PM

dcausse moved this task from To Be Deployed to Needs Reporting on the Discovery-Search (Current work) board. · Jan 30 2023, 4:15 PM

Maintenance_bot removed a project: Patch-For-Review. · Jan 30 2023, 4:31 PM

Gehel closed this task as Resolved. · Feb 10 2023, 4:15 PM
