
Require clients to follow our User-Agent policy
Closed, Resolved (Public)

Description

User story:
As a Product Manager of Wikidata,
I want to be able to identify what clients are using the REST API on Wikidata
in order to better understand the usage.

Problem:
Many users don't follow our User-Agent policy. The likely reason is that they keep their HTTP library's default user agent string (e.g. "python-requests/2.28.1") instead of setting one that includes all the information the policy requires.
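For illustration, a client can replace the library default with a descriptive string in a few lines. The sketch below uses only the Python standard library and sends nothing over the network. The `build_request` helper, the `USER_AGENT` constant and the `https://example.org/api` URL are placeholders introduced for this example, not part of any Wikimedia API.

```python
import urllib.request

# Hypothetical client identity, taken from the policy's example string.
# Replace with your own tool name, version and contact details.
USER_AGENT = (
    "CoolBot/0.0 (https://example.org/coolbot/; coolbot@example.org) "
    "generic-library/0.0"
)


def build_request(url: str, user_agent: str = USER_AGENT) -> urllib.request.Request:
    """Return a request object that carries a descriptive User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Placeholder URL; the request is only constructed here, never sent.
req = build_request("https://example.org/api")
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

Other HTTP libraries offer an equivalent way to set the header once per session or client.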

Solution:
We could check whether the user agent sent with a request follows the required format. If it does not, the request is rejected with an error response, as suggested in the User-Agent_policy.

Required format as per user-agent policy:

User-Agent: CoolBot/0.0 (https://example.org/coolbot/; coolbot@example.org) generic-library/0.0

The required generic format is <client name>/<version> (<contact information>) <library/framework name>/<version> [<library name>/<version> ...]. Parts that are not applicable can be omitted.
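A rough sketch of what such a format check might look like is given below. It is not the deployed implementation. Because the minimum requirements are still an open question in this task, the sketch assumes that a leading `<client name>/<version>` token and a parenthesized contact (URL or email address) are mandatory and that library tokens are optional. The names `check_user_agent`, `UA_PATTERN` and `CONTACT_HINT` are made up for this example.

```python
import re

# One "<name>/<version>" token, e.g. "CoolBot/0.0" (assumed character set).
PRODUCT = r"[A-Za-z0-9._\-]+/[A-Za-z0-9._\-]+"

# <client>/<version> (<contact>) <library>/<version> ...
UA_PATTERN = re.compile(
    rf"^(?P<client>{PRODUCT})"
    r"(?:\s+\((?P<contact>[^()]*)\))?"
    rf"(?P<libs>(?:\s+{PRODUCT})*)$"
)

# Assumption: contact information means a URL or an email address.
CONTACT_HINT = re.compile(r"https?://\S+|[^\s;@]+@[^\s;@]+\.[^\s;@]+")


def check_user_agent(ua: str) -> tuple[bool, str]:
    """Return (is_compliant, reason) for a User-Agent header value."""
    match = UA_PATTERN.match(ua.strip())
    if match is None:
        return False, "does not match the generic format"
    contact = match.group("contact")
    if not contact or not CONTACT_HINT.search(contact):
        return False, "missing contact information"
    return True, "ok"


print(check_user_agent(
    "CoolBot/0.0 (https://example.org/coolbot/; coolbot@example.org) "
    "generic-library/0.0"
))
print(check_user_agent("python-requests/2.28.1"))
```

Under these assumptions the policy's example string passes, while a bare library default such as "python-requests/2.28.1" is flagged for missing contact information. Browser user agents would also fail this particular check, so a real rule set would need to treat them separately.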

Notes:

  • We have already prohibited empty user agent strings (see T318151). \o/

Acceptance criteria:

  • If the user agent policy is not followed, the request results in the specified error response.

Open questions:

  • Alternatively, should we throttle requests that do not meet the criteria instead of rejecting them?
  • What are the minimum requirements that need to be included?

Community communication:

Event Timeline

Lydia_Pintscher claimed this task.

This is happening as part of the stronger enforcement across Wikimedia wikis now.