When using the bytes option in Wikimetrics, we often need to manually convert bytes into characters. We use a rough estimate of 1 byte per character for Latin-script languages, or 2-3 bytes per character for some other languages. We know that these multipliers are not necessarily accurate, because the amount of wiki markup changes the bytes per character in a language: markup and spaces are mostly ASCII, at 1 byte each, which pulls the average down. For example, in Hebrew, the bytes per character is probably something like 1.8, rather than 2.
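To illustrate the effect, here is a minimal Python sketch. It assumes the byte counts are UTF-8 byte lengths of the wikitext; the function name bytes_per_char and the sample strings are hypothetical and not part of Wikimetrics.

```python
def bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per character (code point) in text."""
    if not text:
        raise ValueError("text must be non-empty")
    # Assumption: sizes are measured as UTF-8 byte lengths.
    return len(text.encode("utf-8")) / len(text)


# Hypothetical samples: a Hebrew word, alone and wrapped in link markup.
plain_hebrew = "שלום"        # 4 letters, 2 bytes each in UTF-8
linked_hebrew = "[[שלום]]"   # same word plus 4 ASCII brackets, 1 byte each

print(bytes_per_char(plain_hebrew))   # 2.0
print(bytes_per_char(linked_hebrew))  # 1.5
```

Running something like this over a sample of real wikitext from each language wiki would give an empirical multiplier instead of the rough 1 or 2-3 byte estimate.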
In summary:
(a) It would be great to know the bytes per character for each language wiki, including wiki markup
(b) It would be great to incorporate this conversion as an option in Wikimetrics
Thanks!