
[Create new error] Handle EntityContentTooBigException instead of responding with 500 Unexpected Error
Closed, Resolved · Public · 5 Estimated Story Points

Description

Where we can, we should give the user a clear understanding of what caused the error, i.e. we should add specific error cases where possible and sensible.

Currently, users receive a 500 Unexpected error when they send an edit that exceeds the entity size limit. The entity size limit is configurable per Wikibase.

Acceptance criteria:

  • HTTP status code: 400
  • Error response:
{
  "code": "resource-too-large",
  "message": "edit resulted in a resource that exceeds the size limit of {configured-limit}",
  "context": { "limit": {configured-limit-as-int} }
}
  • Notes:
    • If we can't easily use a human-readable size in the message, use "{configured-limit} bytes" instead
    • If we can't easily provide the context limit as an integer, use its string representation instead
    • Alternative codes considered: "result-too-large"
    • Alternative messages considered: "Edit result exceeds configured limit {configured-limit}", "edit resulted in a resource that is too large"
    • Express the configured limit in kB.
    • This should apply to PATCH requests as well, with the same error code and response.
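
For illustration, the response described in the acceptance criteria could be modelled as below. This is only a Python sketch of the payload shape, not the Wikibase implementation (which lives in the PHP extension). The helper names `build_resource_too_large_error` and `is_resource_too_large` are hypothetical, and the limit is assumed to be an integer number of kB, as suggested in the notes.

```python
def build_resource_too_large_error(limit_kb: int) -> dict:
    """Build the error body from the acceptance criteria (hypothetical helper).

    Assumption: the configured limit is available as an integer in kB.
    """
    return {
        'code': 'resource-too-large',
        'message': f'edit resulted in a resource that exceeds the size limit of {limit_kb} kB',
        'context': {'limit': limit_kb},
    }


def is_resource_too_large(status_code: int, body: dict) -> bool:
    """Client-side check: HTTP 400 combined with the new error code."""
    return status_code == 400 and body.get('code') == 'resource-too-large'


error = build_resource_too_large_error(2048)
print(error['message'])
```

A client can use the second helper to tell this case apart from other 400 responses, which all share the status code but differ in `code`.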

Event Timeline

Jakob_WMDE set the point value for this task to 5.

I'm cleaning up the tickets after our Monday meeting, and I was wondering why we thought 413 would not fit here?

Ifrahkhanyaree_WMDE renamed this task from Handle EntityContentTooBigException instead of responding with 500 Unexpected Error to [Create new error] Handle EntityContentTooBigException instead of responding with 500 Unexpected Error. Jun 25 2024, 12:27 PM

Task Breakdown Notes

Please do not forget to update the OpenAPI specification (add an example to the 400 error response).

Change #1070910 had a related patch set uploaded (by Muhammad Jaziraly; author: Muhammad Jaziraly):

[mediawiki/extensions/Wikibase@master] REST: Create resource-too-large exception

https://gerrit.wikimedia.org/r/1070910

Change #1070910 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] REST: Create resource-too-large exception

https://gerrit.wikimedia.org/r/1070910

@Ifrahkhanyaree_WMDE I was testing it via an e2e test by creating many statements (5000) as @Jakob_WMDE suggested.

In theory, I think it could be tested by:
1- Adding this header X-Wikibase-CI-MAX-ENTITY-SIZE to the request headers and making its value equal to 1.
2- Sending a big request (like an item with a couple of hundred statements).

I was heading in that direction at first, before Jakob suggested the better approach of covering it with a test in the codebase.
So I can't say for sure how complex this would be.
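
The two steps above could look roughly like the sketch below. It only builds the headers and the JSON body and does not send anything. The helper name `build_oversized_edit` and the property ID `P1` are placeholders, the statement shape mirrors the script posted further down in this task, and whether the header is honoured on a given wiki (and in which unit its value is interpreted) is an assumption that this thread does not confirm.

```python
from typing import Tuple


def build_oversized_edit(property_id: str, statement_count: int = 200) -> Tuple[dict, dict]:
    """Return (headers, json_body) for a PATCH that should exceed a tiny size limit."""
    headers = {
        # Step 1: the header named above with the value 1.
        # Assumption: the unit of this value is not confirmed in this task.
        'X-Wikibase-CI-MAX-ENTITY-SIZE': '1',
    }
    body = {
        # Step 2: a JSON Patch that appends a couple of hundred statements.
        'patch': [
            {
                'op': 'add',
                'path': f'/statements/{property_id}/-',
                'value': {
                    'property': {'id': property_id},
                    'value': {'type': 'value', 'content': f'filler value {i}'},
                },
            }
            for i in range(statement_count)
        ]
    }
    return headers, body


extra_headers, patch_body = build_oversized_edit('P1')  # 'P1' is a placeholder
print(len(patch_body['patch']), 'patch operations prepared')
```

The result would then be merged with the usual authentication headers and sent as a PATCH to the item, with the expectation of a 400 response whose `code` is `resource-too-large`.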

Would it be OK for @WMDE-leszek to test it programmatically, to keep the effort as low as possible?

Is it worth the effort to implement new things in the task just for testing purposes? I thought of adding some kind of a flag to a request, or creating a new testing endpoint just to check if it's working, but for our case, it doesn't seem okay to have such things shipped to production. What would be alternative solutions?

@Ifrahkhanyaree_WMDE I've created a very large item with lots of statements for you to test the resource-too-large case. Below is a screenshot of me trying to add another statement to the item. You should get the same result for any type of edit that would make the item bigger.

image.png (460×1 px, 64 KB)

Here is the code I used to create the large item:

import os
import random
import string
import sys
import time

import requests

REST_BASE = 'https://wikidata.beta.wmflabs.org/w/rest.php'
BASE_URL = f'{REST_BASE}/wikibase/v0'

PREDICATE_PROPERTY_ID = 'P249598'
QUALIFIER_PROPERTY_ID = 'P216060'
REFERENCE_PROPERTY_ID = 'P230880'

# The bearer token is read from the BEARER_TOKEN environment variable.
headers = {
    'User-Agent': 'WMDE WPP create large item script',
    'Authorization': f'Bearer {os.environ["BEARER_TOKEN"]}'
}


def create_unique_string(prefix: str = '') -> str:
    unique = ''.join(random.choices(string.ascii_letters, k=10))
    return f'{prefix} {unique}'.strip()


def create_random_property_value_pair(property_id: str) -> dict:
    return {
        'property': {
            'id': property_id,
        },
        'value': {
            'type': 'value',
            'content': create_unique_string('random string value'),
        }
    }


def create_random_statement() -> dict:
    # One statement with 5 qualifiers and 5 references of 5 parts each.
    return {
        **create_random_property_value_pair(PREDICATE_PROPERTY_ID),
        'qualifiers': [create_random_property_value_pair(QUALIFIER_PROPERTY_ID) for _ in range(5)],
        'references': [{'parts': [create_random_property_value_pair(REFERENCE_PROPERTY_ID) for _ in range(5)]} for _ in range(5)],
    }


# create an item with a single statement
# body = {
#     "item": {
#         "labels": {"en": create_unique_string('large item')},
#         'statements': {PREDICATE_PROPERTY_ID: [create_random_statement()]},
#     }
# }
# response = requests.request("POST", f"{BASE_URL}/entities/items", headers=headers, json=body)
#
# if response.status_code > 201:
#     print('Error creating item')
#     print('Status code:', response.status_code)
#     print('Headers:', response.headers)
#     print('Body:', response.text)
#     sys.exit(1)
#
# item_id = response.json()['id']
item_id = 'Q630399'

statements = []
batch_size = 50
# Keep adding statements in batches; halve the batch whenever the edit is
# rejected as too large, and stop once not even a single statement fits.
while batch_size > 0:
    start = time.time()
    body = {
        'patch': [
            {'op': 'add', 'path': f'/statements/{PREDICATE_PROPERTY_ID}/-', 'value': create_random_statement()}
            for _ in range(batch_size)
        ]
    }
    response = requests.request("PATCH", f"{BASE_URL}/entities/items/{item_id}", headers=headers, json=body)
    end = time.time()

    if response.status_code == 400 and response.json()['code'] == 'resource-too-large':
        batch_size = batch_size // 2
        continue

    # if response.status_code == 429:
    # TODO: handle rate limiting?

    if response.status_code != 200:
        print('Error patching item')
        print('Status code:', response.status_code)
        print('Headers:', response.headers)
        print('Body:', response.text)
        sys.exit(1)

    statements = response.json()['statements'][PREDICATE_PROPERTY_ID]
    print(f'Added {batch_size} statements to {item_id} in {end-start} seconds. Total statements: {len(statements)}')

print(f"created large item with {len(statements)} statements")
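
To show why the batch halving in the script ends up just below the size limit, here is a small offline simulation of the same loop. It makes no requests: the function name `fill_until_limit`, the fixed per-statement size, and the numbers used are made up for illustration and say nothing about the real limit on the beta wiki.

```python
from typing import Tuple


def fill_until_limit(limit: int, statement_size: int,
                     start_size: int = 0, batch_size: int = 50) -> Tuple[int, int]:
    """Simulate the script's loop and return (final_size, attempts).

    An attempt is rejected (like a resource-too-large response) when the
    resulting size would exceed the limit; the batch is then halved.
    """
    size = start_size
    attempts = 0
    while batch_size > 0:
        attempts += 1
        new_size = size + batch_size * statement_size
        if new_size > limit:
            batch_size //= 2  # rejected: retry with a smaller batch
            continue
        size = new_size  # accepted: the simulated item grows
    return size, attempts


final_size, attempts = fill_until_limit(limit=1000, statement_size=7)
print(f'final size {final_size} after {attempts} attempts')
```

Because the loop only stops after a batch of one statement has been rejected, the final size is always within one statement of the limit, which is what makes any further edit that grows the item fail.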