The HDFS path /wmf/data/discovery/transfer_to_es is populated by the transfer_to_es job, which can run hourly and may therefore create many folders and files. The oldest snapshot is 20200105.
It might make sense to set up an automated cleanup process for this dataset. The retention period is yet to be defined (60 days?).
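As a rough illustration of the selection step such a cleanup would need, here is a minimal sketch. It assumes the snapshot folders are named with a YYYYMMDD date, optionally prefixed with `date=`; the actual layout under transfer_to_es has not been checked here. The function name `expired_snapshots` and the 60-day default are hypothetical and only mirror the tentative retention mentioned above.

```python
import re
from datetime import date, datetime, timedelta

# Assumption: snapshot folders carry a YYYYMMDD date, optionally as "date=YYYYMMDD".
SNAPSHOT_RE = re.compile(r"^(?:date=)?(\d{8})$")


def expired_snapshots(folder_names, today, retention_days=60):
    """Return the folder names whose snapshot date is older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in folder_names:
        match = SNAPSHOT_RE.match(name)
        if match is None:
            # Leave anything that does not look like a snapshot folder alone.
            continue
        try:
            snapshot_date = datetime.strptime(match.group(1), "%Y%m%d").date()
        except ValueError:
            # Eight digits that are not a valid calendar date: skip rather than guess.
            continue
        if snapshot_date < cutoff:
            expired.append(name)
    return expired


if __name__ == "__main__":
    # Hypothetical listing, for illustration only.
    folders = ["date=20200105", "date=20200301", "_SUCCESS"]
    print(expired_snapshots(folders, today=date(2020, 3, 10)))
```

The folders returned could then be removed with `hdfs dfs -rm -r`. Keeping the selection logic separate from the deletion makes it easy to dry-run the job and review what would be dropped before anything is deleted.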
AC:
- /wmf/data/discovery/transfer_to_es is cleaned up regularly