We recently found that it would be useful for the dump scripts to have some sort of progress indicator, e.g. to make it easier to see in the logs of a running dump job whether the dumps are being generated at a reasonable speed and how close the job is to completion.
They currently only report how many entities were processed once a batch has been completed, which means logging messages like "Processed 30490 entities." thousands of times per job. That says little about overall progress.
Note: one approach considered for calculating the number of completed batches vs. all batches is to compute the total as the number of shards times the number of batches per shard, and to determine the number of completed batches from the entries in the shared log file.
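The calculation in the note could look roughly like the Python sketch below. It is only an illustration, not the dump scripts' actual code. The function names are hypothetical, and it assumes that each completed batch adds exactly one line of the form "Processed N entities." to the shared log file, which may not match the real log format.

```python
import re

# Assumption: every completed batch appends one line such as
# "Processed 30490 entities." to the shared log file.
BATCH_DONE = re.compile(r"Processed \d+ entities\.")


def count_completed_batches(log_lines):
    """Count the log lines that mark a completed batch."""
    return sum(1 for line in log_lines if BATCH_DONE.search(line))


def progress_percent(completed, shards, batches_per_shard):
    """Return completed batches as a percentage of all batches, capped at 100."""
    total = shards * batches_per_shard
    if total <= 0:
        raise ValueError("total number of batches must be positive")
    return min(100.0, 100.0 * completed / total)


def format_progress(completed, shards, batches_per_shard):
    """Build a single log message that reports overall progress."""
    total = shards * batches_per_shard
    pct = progress_percent(completed, shards, batches_per_shard)
    return f"Completed {completed} of {total} batches ({pct:.1f}%)."
```

Emitting the result of `format_progress` after each batch would put the percentage into the job output that Airflow captures. Counting from the shared log file rather than from a per-process counter means the figure covers all shards, as long as every shard writes to that file.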
A/C:
- Airflow logs show percent of batches done after each batch