Design and code a lightweight UI for the scraper. Show parallel cursors moving through 0-100% of the source files as a stack of progress bars. This view rolls up concurrent workers, e.g. by using the checkpoint counter to show overall progress, but is partitioned by wiki.
Print a count of rows processed and the percentage of total lines.
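The roll-up described above could look roughly like the following Python sketch. It is an illustration only, not the scraper's code and not the owl API: the checkpoint tuple shape `(wiki, worker_id, rows_done)`, the function names, and the `lines_total` mapping are all assumptions made for this example. When the total for a wiki is unknown, the sketch falls back to printing only the row count.

```python
from collections import defaultdict


def roll_up(checkpoints):
    """Sum per-worker row counts into one total per wiki.

    `checkpoints` is a hypothetical list of (wiki, worker_id, rows_done) tuples.
    """
    totals = defaultdict(int)
    for wiki, _worker_id, rows_done in checkpoints:
        totals[wiki] += rows_done
    return dict(totals)


def render_bar(wiki, rows_done, lines_total=None, width=20):
    """Render one text progress bar, or a bare count if the total is unknown."""
    if not lines_total:
        return f"{wiki:<8} {rows_done} rows (total unknown)"
    fraction = min(rows_done / lines_total, 1.0)
    filled = int(fraction * width)
    bar = "#" * filled + "-" * (width - filled)
    return f"{wiki:<8} [{bar}] {fraction:6.1%} ({rows_done} rows)"


def render_stack(checkpoints, lines_total_by_wiki, width=20):
    """Render one bar per wiki, stacked as lines of text."""
    done = roll_up(checkpoints)
    wikis = sorted(set(done) | set(lines_total_by_wiki))
    return "\n".join(
        render_bar(w, done.get(w, 0), lines_total_by_wiki.get(w), width)
        for w in wikis
    )


if __name__ == "__main__":
    # Two workers on dewiki, one on enwiki; the enwiki total is not known.
    sample = [("dewiki", 1, 50), ("dewiki", 2, 25), ("enwiki", 1, 10)]
    print(render_stack(sample, {"dewiki": 100}))
```

Summing per wiki keeps the number of bars fixed no matter how many workers run concurrently, which is one way to read "rolls up concurrent workers".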
Screencast demonstrating the owl library (sample code, asciinema):
An efficient implementation depends on a dump stat that is currently missing, "number of pages in Main namespace"; see subtask T332858: Enterprise HTML dump stats should include file size and article count.
Code to review:
https://gitlab.com/wmde/technical-wishes/scrape-wiki-html-dump/-/merge_requests/51