Description
While working with wmfdata-python I just found the function df_to_remarkup. I had already written my own version of this function without realizing there was one in wmfdata/utils.py. Having looked at the one in the package, I think we could format the output better by using pandas.DataFrame.to_markdown directly and making a minor edit to its output.
Here's a basic example:
wmf.utils.df_to_remarkup(df_rest_api_http_status)
... results in:
| http_status | total_requests | percent_of_total
| ----- | ----- | -----
| 200 | 119941 | 96.5
| 404 | 2507 | 2.0
| 304 | 1231 | 1.0
| 308 | 307 | 0.2
| 429 | 227 | 0.2
| 400 | 126 | 0.1
What I'd written before finding df_to_remarkup is:
print(df_rest_api_http_status.to_markdown(index=False, tablefmt="pipe").replace(":", "-"))
... that results in:
| http_status | total_requests | percent_of_total |
|---------------|------------------|--------------------|
| 200 | 119941 | 96.5 |
| 404 | 2507 | 2 |
| 304 | 1231 | 1 |
| 308 | 307 | 0.2 |
| 429 | 227 | 0.2 |
| 400 | 126 | 0.1 |
The .replace(":", "-") call rewrites the alignment row that to_markdown emits below the header, |--------------:|-----------------:|-------------------:|, into plain hyphens. Without it, Remarkup does not recognize the colons: it renders that line as an extra row of hyphens, bars and colons under the header, and it does not give the header a different background color. One thing I like about df_to_remarkup is that it keeps 2.0 and 1.0 as written, so I would include a change that lets whole-number floats keep their decimal place. pandas is already imported in wmfdata/utils.py, so we could use the code above plus a minor edit to preserve floats. The resulting code would also be more concise.
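One caveat with calling .replace(":", "-") on the whole string is that it would also rewrite colons inside cell values, such as times or URLs. A minimal sketch of a narrower fix is below, assuming the alignment row is always the second line of the pipe-format output. The helper name pipe_table_to_remarkup and the sample table (including its first_seen column) are made up for illustration and are not part of wmfdata.

```python
def pipe_table_to_remarkup(markdown_table: str) -> str:
    """Replace alignment colons in the separator row of a pipe table.

    Hypothetical helper, not part of wmfdata. Assumes the separator row
    is the second line, as in pandas' tablefmt="pipe" output.
    """
    lines = markdown_table.splitlines()
    if len(lines) >= 2:
        # Only touch the separator row so colons in cell values survive.
        lines[1] = lines[1].replace(":", "-")
    return "\n".join(lines)


# Made-up sample in the shape of to_markdown(index=False, tablefmt="pipe").
sample = "\n".join(
    [
        "|   http_status | first_seen   |",
        "|--------------:|:-------------|",
        "|           200 | 09:15        |",
        "|           404 | 10:42        |",
    ]
)

converted = pipe_table_to_remarkup(sample)
print(converted)
```

With pandas and tabulate installed, this could be applied as pipe_table_to_remarkup(df.to_markdown(index=False, tablefmt="pipe")). For the decimal places, to_markdown forwards extra keyword arguments to tabulate, so passing something like floatfmt=".1f" may be enough to render 2.0 rather than 2, though it applies the same precision to every float column.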
Contribution
I'd work on this myself if it's deemed to be something that we'd want to implement. I've also spotted a couple of things in the documentation that I could send along in a separate PR :) Obviously this isn't a major change, since the output gets copy-pasted and renders the same either way, but a nicely formatted markup table is always nicer to look at than a poorly formatted one in my opinion 😇
Let me know if there's something else I should do for future wmfdata-python issues!