Keep the **6** most recent snapshots of the datasets stored in the following HDFS directories:
- `/user/analytics-platform-eng/structured-data/section_topics`
- `/user/analytics-platform-eng/structured-data/section-alignment-suggestions/article_images`
- `/user/analytics-platform-eng/structured-data/section-alignment-suggestions/suggestions`
- `/user/analytics-platform-eng/structured-data/seal/alignments`
- `/user/analytics-platform-eng/structured-data/seal/embeddings`
- `/user/analytics-platform-eng/structured-data/seal/features`
- `/user/analytics-platform-eng/structured-data/seal/models`
- `/user/analytics-platform-eng/structured-data/seal/sections`
Only the `YYYY-MM-DD` sub-directories of these directories should be deleted. They contain datasets stored as Parquet files, except for `seal/models/YYYY-MM-DD`, which contains pickle and CSV files.
==Exceptions==
The following paths shouldn’t be deleted until {T339129} and {T325316} are resolved:
* `/user/analytics-platform-eng/structured-data/section_topics/2022-10_ptwiki_bad`
* `/user/analytics-platform-eng/structured-data/section_topics/20230301_target_wikis_tables`
* `/user/analytics-platform-eng/structured-data/section-alignment-suggestions/aligned_sections_subset_9.0_2022-02.parquet`
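
The retention rule and the exceptions above could be sketched as follows. This is only an illustration of the selection logic, not an existing script: the names `snapshots_to_delete`, `is_snapshot`, `KEEP` and `EXCEPTIONS` are hypothetical, and it assumes that "the 6 most recent" is decided by the date in the sub-directory name. It works on a plain list of child names and does not talk to HDFS, so the actual listing and deletion steps are left out.

```python
import re
from datetime import datetime

KEEP = 6  # number of most recent snapshots to retain (from the task description)
SNAPSHOT_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

# Paths listed under "Exceptions"; they are never selected for deletion.
EXCEPTIONS = {
    "/user/analytics-platform-eng/structured-data/section_topics/2022-10_ptwiki_bad",
    "/user/analytics-platform-eng/structured-data/section_topics/20230301_target_wikis_tables",
    "/user/analytics-platform-eng/structured-data/section-alignment-suggestions/"
    "aligned_sections_subset_9.0_2022-02.parquet",
}


def is_snapshot(name):
    """Return True if `name` is a valid zero-padded YYYY-MM-DD date."""
    if not SNAPSHOT_RE.match(name):
        return False
    try:
        datetime.strptime(name, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def snapshots_to_delete(base_dir, child_names, keep=KEEP):
    """Return full paths of dated snapshots older than the `keep` newest ones.

    Assumption: zero-padded ISO dates sort chronologically as strings.
    """
    snapshots = sorted(n for n in child_names if is_snapshot(n))
    # Guard keep == 0, because snapshots[:-0] would be an empty list.
    old = snapshots[:-keep] if keep > 0 else snapshots
    paths = [f"{base_dir.rstrip('/')}/{n}" for n in old]
    return [p for p in paths if p not in EXCEPTIONS]
```

None of the three excepted paths has a `YYYY-MM-DD` name, so the date pattern already skips them; the explicit `EXCEPTIONS` set is just a safeguard. A directory with 6 or fewer dated snapshots yields an empty list.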