Description
I was playing with the extracts API today in preparation for the Multimedia team's work next quarter, and I spent about 30 minutes trying to debug TextExtracts and core because I had overlooked the fine print in the API documentation, which explains that TextExtracts returns only one result by default.
As far as I can tell, finding the extracts is a reasonably performant process, and it won't crash the servers to fetch results for a few more pages in the set. However, I'd be interested to know the rationale for the low default.
Details
| Subject | Repo | Branch | Lines +/- |
|---|---|---|---|
| Increase default API limit from 1 to 20 | mediawiki/extensions/TextExtracts | master | +50 -1 |
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Resolved | | Jdlrobson | T164010 [EPIC] Strengthen the APIs we provide in reading web maintained extensions |
| Resolved | | Jdlrobson | T153707 Increase exlimit to a number bigger than 1 |
Event Timeline
Why it was done: producing extracts requires parser output, so a full parse can happen if you request full-page extracts. One parse of a large page per request is more than enough. Requesting intro-only extracts with exintro doesn't have this problem.
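To make the difference concrete, here is a minimal sketch of how a client might build a multi-page, intro-only request. It only constructs the query string and sends nothing. The `exlimit` and `exintro` parameters are the ones discussed in this task; the endpoint URL and the helper names `extracts_params` and `build_extracts_url` are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration; substitute the wiki you are querying.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def extracts_params(titles, limit=None, intro_only=True):
    """Return query parameters for a prop=extracts request (hypothetical helper)."""
    params = {
        "action": "query",
        "prop": "extracts",
        "format": "json",
        "titles": "|".join(titles),  # the API takes pipe-separated titles
    }
    if limit is not None:
        # Without exlimit, the server-side default applies.
        params["exlimit"] = str(limit)
    if intro_only:
        # Intro-only extracts avoid the full-page case described above.
        params["exintro"] = "1"
    return params


def build_extracts_url(titles, limit=None, intro_only=True, endpoint=API_ENDPOINT):
    """Return the full request URL; no network call is made."""
    return endpoint + "?" + urlencode(extracts_params(titles, limit, intro_only))


if __name__ == "__main__":
    print(build_extracts_url(["Foo", "Bar"], limit=20))
```

Setting `exlimit` explicitly, as in this sketch, keeps a client independent of whatever default the server uses.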
@MarkTraceur what is the maximum number of extracts you need to retrieve? We're trying to define some hard limits.
@MaxSem That makes sense, so maybe we should close this task. Can't API users increase exlimit and overload the API, though?
A default of 1 doesn't seem very useful.
We talked about this and concluded that we can't stop users from raising exlimit anyway, so let's have a more sensible default for when a user requests more than one page.
Change 354482 had a related patch set uploaded (by Bmansurov; owner: Bmansurov):
[mediawiki/extensions/TextExtracts@master] Increase default API limit from 1 to 20
Change 354482 merged by jenkins-bot:
[mediawiki/extensions/TextExtracts@master] Increase default API limit from 1 to 20