wb_terms, with 1.4B rows (averaging 140 bytes per row), is one of the biggest tables on wikidata.org and needs cleanup in several areas:
- Drop term_entity_id (and maybe term_entity_type) in favor of term_full_entity_id. We just switched to reading from term_full_entity_id everywhere.
- Normalize term_entity_type (item=0, property=1, etc.) if we are not dropping it. The mapping would be better kept in a settings file.
- Normalize term_type (label=0, etc.). The mapping would be better kept in a settings file.
- Normalize term_lang (en=0, en-gb=1, etc.), with the mapping kept in either another table or a settings file.
How we are going to normalize (i.e. where to store the mapping) and the migration plan still need to be determined.
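
To make the settings-file option concrete, here is a minimal Python sketch of what the mappings and a per-row normalization step could look like. It is an illustration only, not the planned implementation: the names `ENTITY_TYPE_IDS`, `TERM_TYPE_IDS`, `LANGUAGE_IDS` and `normalize_row` are hypothetical, and any id not listed in the bullets above (for example description=1, alias=2) is an assumption.

```python
# Hypothetical settings-style mappings. Only item=0, property=1, label=0,
# en=0 and en-gb=1 come from the proposal; the other ids are assumed.
ENTITY_TYPE_IDS = {"item": 0, "property": 1}
TERM_TYPE_IDS = {"label": 0, "description": 1, "alias": 2}
LANGUAGE_IDS = {"en": 0, "en-gb": 1}

# Column name -> mapping used to turn its string value into an integer id.
COLUMN_MAPPINGS = {
    "term_entity_type": ENTITY_TYPE_IDS,
    "term_type": TERM_TYPE_IDS,
    "term_lang": LANGUAGE_IDS,
}


def normalize_row(row):
    """Return a copy of `row` with the mapped columns replaced by integer ids.

    Columns missing from the row (e.g. a dropped term_entity_type) are skipped.
    An unmapped value raises KeyError, so gaps in the mapping surface early.
    """
    normalized = dict(row)
    for column, mapping in COLUMN_MAPPINGS.items():
        if column in normalized:
            normalized[column] = mapping[normalized[column]]
    return normalized


example = normalize_row({
    "term_full_entity_id": "Q42",
    "term_entity_type": "item",
    "term_type": "label",
    "term_lang": "en-gb",
})
```

If the language mapping ends up in a separate table instead, only `LANGUAGE_IDS` would need to be loaded from the database; the rest of the sketch would stay the same.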