
Sqoop on multi-instance clouddb1021 is very slow for some tables
Closed, Declined · Public

Description

Getting data out of what was labsdb1012 (now multi-instance clouddb1021) is slower than before, and much slower in some cases.
Example: wikidatawiki.revision took ~1h30 last month, 4h50 this month.
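For a rough sense of scale, the two durations quoted above can be compared directly. This is only a back-of-the-envelope sketch in Python: the helper name `to_minutes` is made up for illustration, and the only inputs are the two durations from this description (the first of which is itself approximate).

```python
def to_minutes(hours: int, minutes: int) -> int:
    """Convert an hours/minutes duration to total minutes."""
    return hours * 60 + minutes


# Durations for wikidatawiki.revision, as quoted in the description.
last_month = to_minutes(1, 30)  # ~1h30 (approximate)
this_month = to_minutes(4, 50)  # 4h50

# Ratio of this month's duration to last month's.
slowdown = this_month / last_month
print(f"wikidatawiki.revision slowdown: {slowdown:.1f}x")
```

With these figures the table import took roughly three times as long as the month before.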

For the moment the import is running, so there is nothing we can do; let's discuss when the import is done.

Event Timeline

We can try to give wikidata (s8) more memory by taking some away from other sections, i.e. s5 and s6.
That table alone is 230GB, though, so it will never fit in the buffer pool.

JAllemandou added a subscriber: Milimetric.

Thanks for your suggestion @Marostegui.
The global drift is not big (this month took 4h more than the previous one, less than 10% increase overall).
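As a quick sanity check on those two numbers: if 4 extra hours is less than a 10% increase, the previous full run must have taken more than 40 hours. A minimal Python sketch of that arithmetic, where the variable names are illustrative and only the "4h" and "10%" figures come from this comment:

```python
extra_hours = 4.0     # "this month took 4h more than the previous one"
max_increase = 0.10   # "less than 10% increase overall"

# extra / previous_total < max_increase  =>  previous_total > extra / max_increase
min_previous_total = extra_hours / max_increase
print(f"Previous full run took more than {min_previous_total:.0f}h")
```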
As discussed with @Milimetric there would be multiple options to try to make the overall process faster, but we are not going to prioritize this for now.
Let's close and reopen if needed.
Many thanks.