The enterprise structured content snapshots will be the source of the semantic search MVP.
We want to refresh the indices weekly, so we need to download these dumps at the same rate.
We are going to join these dumps with the main search index dumps, which are produced on Sundays.
Related: T403298
Related: import_enterprise_dumps.py
Preliminary work: https://gitlab.wikimedia.org/repos/search-platform/discolytics/-/commit/c5925b4e6825fc9c2bf400e08d8c44fd55e3ab26
AC:
- structured content snapshots are available in hdfs
- updated weekly on Sundays
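
One way to check the weekly criterion is to compare a snapshot's date against the most recent Sunday. The following is a minimal sketch under that assumption; `latest_sunday` and `snapshot_is_fresh` are hypothetical helper names, not part of discolytics or import_enterprise_dumps.py, and how the snapshot date is obtained from hdfs is left out.

```python
from datetime import date, timedelta


def latest_sunday(today: date) -> date:
    """Return the most recent Sunday on or before `today`."""
    # date.weekday() numbers Monday as 0 and Sunday as 6, so shifting by one
    # and wrapping gives the number of days elapsed since the last Sunday.
    days_since_sunday = (today.weekday() + 1) % 7
    return today - timedelta(days=days_since_sunday)


def snapshot_is_fresh(snapshot_date: date, today: date) -> bool:
    """Hypothetical check: the snapshot is at least as recent as the last Sunday."""
    return snapshot_date >= latest_sunday(today)
```

For example, `latest_sunday(date(2025, 1, 8))` returns `date(2025, 1, 5)`, so a snapshot dated 2024-12-29 would be reported as stale on that day.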