Research technical aspects of upload speed
Closed, Declined (Public)

Description

Realized during the test upload that it took about half an hour to upload the 10 test items (since they all have a lot of claims with references), which is not really an acceptable speed for a large-scale upload. Pywikibot throttle settings?

Event Timeline

Pywikibot's automatic throttle should have changed with the bot flag, which should make it a bit quicker. It is also possible to decrease the throttle a bit more.

The main issue though is that each change is a separate edit. This is largely due to limitations in pywikibot.

Even now that the account has a bot flag, it still took over a minute to upload an "item" with 5 claims and 2 labels to the Sandbox. Got a "sleeping for 8 seconds" message after every operation.

Will do some testing with:

maxlag = 0
maxthrottle = 10
minthrottle = 0
put_throttle = 5

Tested with

maxthrottle = 5
minthrottle = 0
put_throttle = 2

It took 42 seconds to upload an item with 11 claims and 8 references. The ships are particularly claim-rich though -- most other datasets contain simpler items, which will take less time.
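
As a rough sanity check on that number, here is a minimal sketch of the waiting time the throttle alone would impose. It assumes one edit per claim and one per reference (item creation and labels are ignored) and that put_throttle is the minimum gap in seconds between consecutive writes. The function name is made up for illustration.

```python
def min_throttle_wait(num_edits: int, put_throttle: float) -> float:
    """Lower bound on seconds spent waiting between consecutive writes.

    Assumes the throttle enforces a gap of at least `put_throttle`
    seconds between writes, so n writes have n - 1 gaps.
    """
    return max(0, num_edits - 1) * put_throttle


claims = 11
references = 8
edits = claims + references  # assumption: one edit per claim and per reference

wait = min_throttle_wait(edits, put_throttle=2)
print(f"{edits} edits -> at least {wait:.0f} s of throttle wait")
```

Under those assumptions most of the 42 seconds is throttle wait rather than request time, which fits the point that the number of separate edits is the main cost.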

Which maxlag value did you use in that test?

It was 0.

I'd try bumping that to 5, which I believe is the standard value; it shouldn't make things slower.

There is another task looking at refactoring the pywikibot.Claim code to allow for writing claim+qualifiers+sources in one edit.

That change will probably be the only thing that drastically improves the upload speed.
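
To illustrate what "one edit" would carry, here is a minimal sketch that bundles a claim with its qualifiers and sources into a single structure. This is not the refactored pywikibot.Claim code. The layout follows the Wikibase JSON statement format as I understand it; the helper names and the property IDs (P1, P2, P3) are placeholders, and only string values are handled.

```python
import json


def snak(prop, value):
    """Build a string-valued snak (assumed Wikibase JSON layout)."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {"value": value, "type": "string"},
    }


def statement(prop, value, qualifiers=(), references=()):
    """Bundle one claim with its qualifiers and sources into one dict."""
    stmt = {"mainsnak": snak(prop, value), "type": "statement", "rank": "normal"}
    if qualifiers:
        stmt["qualifiers"] = {}
        for q_prop, q_value in qualifiers:
            stmt["qualifiers"].setdefault(q_prop, []).append(snak(q_prop, q_value))
    if references:
        stmt["references"] = []
        for ref in references:  # each reference is a list of (property, value) pairs
            snaks = {}
            for r_prop, r_value in ref:
                snaks.setdefault(r_prop, []).append(snak(r_prop, r_value))
            stmt["references"].append({"snaks": snaks})
    return stmt


# Placeholder property IDs and values, for illustration only.
payload = {
    "claims": [
        statement(
            "P1",
            "example value",
            qualifiers=[("P2", "example qualifier")],
            references=[[("P3", "example source")]],
        )
    ]
}
print(json.dumps(payload, indent=2))
```

Sent as a single write, a payload like this would replace three separate edits (claim, qualifier, source) with one, so the throttle would apply once per claim instead of once per change.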