Jul 7 2016

Antigng_ created T139600: Blocked accounts should be able to generate BotPasswords.
Jul 7 2016, 12:51 PM · MediaWiki-Core-AuthManager

Jun 15 2016

Antigng_ added a comment to T137707: Antigng-bot improper non-api http requests.

For the API part, I would like to add that the API infrastructure (application servers and databases) is deliberately separated from non-API traffic and is better prepared for mass requests than regular browser queries, so that the two cannot interfere with each other. It produces information in a clean JSON format that you can parse with any JSON decoder (or even a regex!), with little to no performance loss.
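As an illustration of the "any JSON decoder" point, here is a minimal sketch that pulls wikitext out of an `action=query&prop=revisions&rvprop=content&format=json` response. It assumes the legacy JSON layout, in which the revision text sits under the `"*"` key; the sample payload and the helper name `extract_wikitext` are made up for the example.

```python
import json

# Made-up sample in the legacy JSON layout (revision text under the "*" key).
SAMPLE = '''{"query": {"pages": {"123": {"pageid": 123, "ns": 0, "title": "January",
  "revisions": [{"contentformat": "text/x-wiki", "contentmodel": "wikitext",
                 "*": "Example wikitext"}]}}}}'''


def extract_wikitext(response_text):
    """Map each page title to the wikitext of its first returned revision."""
    data = json.loads(response_text)
    pages = data.get("query", {}).get("pages", {})
    result = {}
    for page in pages.values():
        revisions = page.get("revisions")
        if revisions:  # pages that do not exist carry no "revisions" key
            result[page["title"]] = revisions[0]["*"]
    return result


print(extract_wikitext(SAMPLE))
```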

If you think that the API is not performant (either the action API or the RESTBase one), please file a bug and we will look at it.

We can discuss how much API usage is appropriate, but not using the API for API-like requests is definitely not OK. From https://www.mediawiki.org/wiki/API:Etiquette:

There is no hard and fast limit on read requests, but we ask that you be considerate and try not to take a site down. Most sysadmins reserve the right to unceremoniously block you if you do endanger the stability of their site.

Jun 15 2016, 4:06 PM · SRE, Traffic, Toolforge, Cloud-Services, DBA
Antigng_ added a comment to T137707: Antigng-bot improper non-api http requests.

BTW, the API is definitely faster; one just needs to use it efficiently:

$ time curl 'https://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=json&titles=January|February|March|April|May|June|July|August|September' > /dev/null
real	0m0.717s
user	0m0.004s
sys	0m0.004s

$ time (curl 'https://en.wikipedia.org/w/index.php?action=raw&title=January' && curl 'https://en.wikipedia.org/w/index.php?action=raw&title=February' && curl 'https://en.wikipedia.org/w/index.php?action=raw&title=March'  && curl 'https://en.wikipedia.org/w/index.php?action=raw&title=April'  && curl 'https://en.wikipedia.org/w/index.php?action=raw&title=May'  && curl 'https://en.wikipedia.org/w/index.php?action=raw&title=June' && curl 'https://en.wikipedia.org/w/index.php?action=raw&title=July' && curl 'https://en.wikipedia.org/w/index.php?action=raw&title=September' ) > /dev/null

real	0m3.654s
user	0m0.024s
sys	0m0.008s
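The same batching idea can be sketched in a bot: join several titles with `|` and send one request per batch instead of one request per page. This is only a sketch; the helper name `batched_query_urls` is hypothetical, and the batch size of 50 is an assumed per-request cap that should be checked against the API documentation for your account type.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"
MAX_TITLES = 50  # assumed per-request cap on titles; verify for your account


def batched_query_urls(titles, batch_size=MAX_TITLES):
    """Build one action=query URL per batch of titles, joined with '|'."""
    urls = []
    for start in range(0, len(titles), batch_size):
        batch = titles[start:start + batch_size]
        params = {
            "action": "query",
            "prop": "revisions",
            "rvprop": "content",
            "format": "json",
            "titles": "|".join(batch),  # urlencode escapes '|' as %7C
        }
        urls.append(API_URL + "?" + urlencode(params))
    return urls


for url in batched_query_urls(["January", "February", "March"], batch_size=2):
    print(url)
```

With nine titles and the assumed cap of 50, this yields a single URL, which matches the one-request case timed above.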
Jun 15 2016, 4:02 PM · SRE, Traffic, Toolforge, Cloud-Services, DBA

Jun 14 2016

Antigng_ added a comment to T137707: Antigng-bot improper non-api http requests.

I don't think api.php?action=query&prop=revisions&rvprop=content can be as performant as index.php?action=raw, and the latter is the easiest way to get the source code of a page. I would appreciate it if there were a way to perform api.php?action=raw.

Jun 14 2016, 4:22 PM · SRE, Traffic, Toolforge, Cloud-Services, DBA
Antigng_ added a comment to T137707: Antigng-bot improper non-api http requests.

Also, there is no clearly stated request rate limit for the MediaWiki API, as there is for the REST API (T135240). If you want to set one, you should document it.

Jun 14 2016, 4:10 PM · SRE, Traffic, Toolforge, Cloud-Services, DBA
Antigng_ added a comment to T137707: Antigng-bot improper non-api http requests.

Most of my tasks don't generate such an "unacceptable amount of traffic". They usually send a few hundred to a thousand requests before exiting. But they still need a way to bypass the TLS redirect.

Jun 14 2016, 3:50 PM · SRE, Traffic, Toolforge, Cloud-Services, DBA
Antigng_ added a comment to T137707: Antigng-bot improper non-api http requests.

If you don't give me a good reason why cp1008.wikimedia.org:3128 / index.php?action=raw shouldn't be used, I will start some of my jobs that don't involve mass page content fetching, such as projectstat.

Jun 14 2016, 3:36 PM · SRE, Traffic, Toolforge, Cloud-Services, DBA
Antigng_ added a comment to T137707: Antigng-bot improper non-api http requests.

Labs replicas can't do that job, as the revision tables are removed from those databases. Dumps are not updated that often.

Jun 14 2016, 3:42 AM · SRE, Traffic, Toolforge, Cloud-Services, DBA

Jun 13 2016

Antigng_ added a comment to T137707: Antigng-bot improper non-api http requests.

My bot was using /w/index.php?action=raw to fetch the content of each page/redirect at zhwiki, and then doing some simple search/replace/template-addition work.

Jun 13 2016, 2:55 PM · SRE, Traffic, Toolforge, Cloud-Services, DBA

Jun 2 2016

Antigng_ added a comment to T135240: Enable rate limiting on pageview api.

I could reduce the concurrency by lowering the number of threads in the pool. (It is currently 50.) But what if another bot task running on the same node exceeds the rate limit?
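A minimal sketch of that approach: a smaller thread pool, plus a retry with exponential backoff so that a task still recovers when the limit is hit anyway (for instance because another task on the same node used up the budget). Everything here is an assumption for illustration: the `RateLimited` exception, the `fetch` callable that raises it, the pool size of 5, and the delays are hypothetical and not taken from the pageview API.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class RateLimited(Exception):
    """Raised by the (hypothetical) fetch callable when the server throttles us."""


def fetch_with_backoff(fetch, item, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(item), sleeping base_delay * 2**attempt after each throttled try."""
    for attempt in range(max_retries + 1):
        try:
            return fetch(item)
        except RateLimited:
            if attempt == max_retries:
                raise  # give up after the last allowed retry
            sleep(base_delay * 2 ** attempt)


def fetch_all(fetch, items, workers=5, **kwargs):
    """Run fetch over items with a small pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda item: fetch_with_backoff(fetch, item, **kwargs), items))
```

Backoff does not prevent a neighbouring task from exhausting a shared per-address limit, but it keeps each task from failing outright when that happens.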

Jun 2 2016, 2:16 PM · RESTBase-API, RESTBase, Services, User-mobrovac, Analytics-Kanban
Antigng_ added a comment to T135240: Enable rate limiting on pageview api.

The rate limiting is breaking my bot.

Jun 2 2016, 3:04 AM · RESTBase-API, RESTBase, Services, User-mobrovac, Analytics-Kanban

May 2 2016

Antigng_ added a comment to T134094: Http response content is sometimes broken when gzip is not set.

As reported by User:Kanashimi, some API query output is broken, too. For example, https://zh.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content|timestamp&titles=LGBT%E7%9B%B8%E5%85%B3%E7%94%B5%E5%BD%B1%E5%88%97%E8%A1%A8&rvlimit=1&format=json&utf8 returns a spurious "w6" at the end.

May 2 2016, 2:14 PM · TestMe, Chinese-Sites, WMF-General-or-Unknown

Apr 24 2016

Antigng_ added a comment to T123557: Database query error (internal_api_error_DBQueryError) while getting list=allrevisions.

I'm still seeing these problems:

Apr 24 2016, 1:31 AM · DBA, MediaWiki-Action-API
Antigng_ created T133474: Flow adds nowiki tags to comments containing unicode control characters.
Apr 24 2016, 1:14 AM · StructuredDiscussions