User story:
As a Product Manager of Wikidata,
I want to be able to identify what clients are using the REST API on Wikidata
in order to better understand the usage.
Problem:
Many users don't follow our User-Agent policy. The likely reason is that they keep the default user agent string of their HTTP library (e.g. "python-requests/2.28.1") instead of changing it to include the information required by the policy.
Solution:
We could check whether the supplied user agent follows the required format. If it does not, the request results in an error response, as suggested in the User-Agent policy.
Required format as per user-agent policy:
User-Agent: CoolBot/0.0 (https://example.org/coolbot/; coolbot@example.org) generic-library/0.0
The required generic format is <client name>/<version> (<contact information>) <library/framework name>/<version> [<library name>/<version> ...]. Parts that are not applicable can be omitted.
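A minimal sketch of what such a format check could look like, in Python. This is only an illustration under assumptions, not the actual implementation: the function name `follows_user_agent_policy`, both regular expressions, and the chosen minimum (a leading `<client name>/<version>` token plus a parenthesised contact containing a URL or an email address) are hypothetical, since the minimum requirements are still an open question. The function expects the header value only, without the `User-Agent: ` prefix.

```python
import re

# Hypothetical building block: one "<name>/<version>" token without spaces.
PRODUCT = r"[^\s/()]+/[^\s/()]+"

# Assumed shape: <client>/<version> (<contact>) [<library>/<version> ...]
USER_AGENT_RE = re.compile(
    rf"^(?P<client>{PRODUCT}) \((?P<contact>[^()]+)\)(?P<libraries>(?: {PRODUCT})*)$"
)

# Assumption: the contact information must contain a URL or an email address.
CONTACT_RE = re.compile(r"https?://\S+|[^\s@;]+@[^\s@;]+\.[^\s@;]+")


def follows_user_agent_policy(user_agent: str) -> bool:
    """Return True if the header value matches the assumed minimum format."""
    match = USER_AGENT_RE.match(user_agent.strip())
    if match is None:
        # Missing client token or missing parenthesised contact part.
        return False
    # The contact part must contain something that looks reachable.
    return CONTACT_RE.search(match.group("contact")) is not None


if __name__ == "__main__":
    examples = [
        "CoolBot/0.0 (https://example.org/coolbot/; coolbot@example.org) generic-library/0.0",
        "python-requests/2.28.1",
    ]
    for value in examples:
        print(value, "->", follows_user_agent_policy(value))
```

With these assumptions, the example string from the policy passes, while a default library string such as "python-requests/2.28.1" fails because it carries no contact information. A stricter or looser check only requires changing the two patterns.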
Notes:
- We have already prohibited empty user agent strings (see T318151). \o/
Acceptance criteria:
- If the user agent policy is not followed, the request results in the specified error response.
Open questions:
- As an alternative to rejecting requests, should we throttle them when the criteria are not met?
- What are the minimum requirements that need to be included?
Community communication: