CirrusSearch partitions its documents across multiple indices. This information is stored in mediawiki-config and can be requested using the cirrus-config-dump API.
The preparation job should provide a component that reads this API endpoint and decorates the update document with the name of the index the ingestion job will have to write to.
The information is rarely modified, so it can be cached for a long time.
The API request should only read mediawiki-config and should therefore be fast, so it might be acceptable to skip the AsyncIO operator and issue a blocking request here.
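A minimal sketch of such a component, in Python for illustration only. It assumes the config can be reduced to a per-wiki mapping from namespace id to index name. The field names `wiki_id`, `namespace_id` and `index_name`, the `fetch_index_map` callable, and the one-hour TTL are all hypothetical and not taken from the actual schema or API response. The fetch is a plain blocking call and its result is cached per wiki.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical shape of the cached config: namespace id -> index name.
IndexMap = Dict[int, str]


class IndexNameResolver:
    """Decorate update documents with their target index name.

    The index map is fetched with a blocking call and cached per wiki
    for `ttl_seconds`, since the underlying config rarely changes.
    """

    def __init__(
        self,
        fetch_index_map: Callable[[str], IndexMap],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_index_map  # assumed to wrap the config API request
        self._ttl = ttl_seconds
        self._clock = clock
        # wiki id -> (time the entry was fetched, index map)
        self._cache: Dict[str, Tuple[float, IndexMap]] = {}

    def _index_map(self, wiki: str) -> IndexMap:
        now = self._clock()
        cached = self._cache.get(wiki)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        index_map = self._fetch(wiki)  # blocking request, only on a cache miss
        self._cache[wiki] = (now, index_map)
        return index_map

    def decorate(self, update: dict) -> dict:
        """Return a copy of `update` with a hypothetical `index_name` field."""
        index_map = self._index_map(update["wiki_id"])
        decorated = dict(update)
        decorated["index_name"] = index_map[update["namespace_id"]]
        return decorated
```

Passing the fetch function in as a parameter keeps the HTTP details out of the resolver and makes the cache behaviour easy to test with a stub.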
The schema defined at https://gerrit.wikimedia.org/r/c/schemas/event/primary/+/856507 might need to be extended with a new field to store this information.