In preparation for multilingual "depicts" statements on media as part of Structured Data on Commons, we need to assess and quantify problems with labels for Wikidata Q items being inconsistent or incomplete. This is especially problematic with animals and plants because of inconsistencies between taxon names and common names.
The problem in more detail
See this Google Presentation deck for a more detailed list of examples: https://docs.google.com/presentation/d/1dANu94y9AA16t6fN_i_hagqlsdI92qMmezhIb14F-bo/
What we need
a.) Data analysis (reports and/or research) to quantify the scale of the problem: how many concepts are affected, whether it is primarily a problem with taxa or also with other types of data, etc.
This is a hard problem, and perhaps somewhat abstract, so we'll defer to the expertise of the data experts on what is feasible.
b.) Assistance with developing potential solutions to the data problem.
Our need here is important but not time-sensitive at the moment, so having at least a first round of analysis by mid-August would suffice.
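As a possible starting point for the analysis in (a), here is a rough sketch of how label gaps could be counted. It assumes the items have already been loaded as dicts in the standard Wikidata entity JSON shape (`labels`, `claims`), and it treats "label is identical to the taxon name (P225)" as a proxy for "no common name available in that language". The function names, the sample items, and that proxy are illustrative assumptions, not an agreed methodology.

```python
from collections import Counter


def taxon_name(entity):
    """Return the first P225 (taxon name) string value of an entity, or None."""
    for claim in entity.get("claims", {}).get("P225", []):
        value = claim.get("mainsnak", {}).get("datavalue", {}).get("value")
        if isinstance(value, str):
            return value
    return None


def label_report(entities, languages):
    """Count, per language, missing labels and labels equal to the taxon name."""
    total = 0
    missing = Counter()
    taxon_as_label = Counter()
    for entity in entities:
        total += 1
        labels = entity.get("labels", {})
        name = taxon_name(entity)
        for lang in languages:
            label = labels.get(lang, {}).get("value")
            if label is None:
                missing[lang] += 1
            elif name is not None and label == name:
                # Assumption: a label identical to the taxon name suggests
                # that no common name has been entered for this language.
                taxon_as_label[lang] += 1
    return {
        "total": total,
        "missing": dict(missing),
        "taxon_as_label": dict(taxon_as_label),
    }


def _taxon_claim(name):
    """Build a minimal P225 claim in Wikidata JSON shape (sample data only)."""
    return {"P225": [{"mainsnak": {"datavalue": {"value": name, "type": "string"}}}]}


# Hypothetical sample items; the ids are placeholders, not real Q numbers.
sample = [
    {
        "id": "Q-example-1",
        "labels": {"en": {"value": "Panthera leo"}, "fr": {"value": "lion"}},
        "claims": _taxon_claim("Panthera leo"),
    },
    {
        "id": "Q-example-2",
        "labels": {"en": {"value": "house cat"}},
        "claims": _taxon_claim("Felis catus"),
    },
]

report = label_report(sample, ["en", "fr", "de"])
print(report)
```

Running the same counts over all items with a P225 statement versus a comparison set without one would give a first indication of whether the problem is concentrated in taxa.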