
WDQS disk usage increase is correlated with reloading of categories
Closed, ResolvedPublic

Description

We are getting low on disk space for the WDQS servers. This is being addressed in T196485. In the meantime, while looking at graphs, we see a weekly increase in disk usage at the same time the reloadCategories.sh cron is scheduled. The increase seems to be between 5 and 20 GB each time. The latest categories .ttl.gz files are left on disk, but they are too small to explain this increase.
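For anyone wanting to reproduce this on a host, something like the following shows the same picture (paths are assumptions based on the usual WDQS layout, adjust as needed):

```
# Paths are assumptions; adjust to the actual WDQS data directory.
df -h /srv                                # overall usage on the data partition
ls -lh /srv/wdqs/wikidata.jnl             # the Blazegraph journal that grows
ls -lh /srv/wdqs/*categories*.ttl.gz      # the dump files left behind
```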

It looks to me like the old category namespaces are not being cleaned up.

Event Timeline

Restricted Application added a subscriber: Aklapper.

Looking at http://localhost:9999/bigdata/#namespaces it seems that the old category namespaces are being deleted. But maybe the disk space is not recovered on deletion?
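For completeness, the same list can be pulled from the REST API instead of the workbench; this is my understanding of the NanoSparqlServer endpoint:

```
# List all namespaces known to this Blazegraph instance.
# The endpoint returns an RDF description of each namespace.
curl -s 'http://localhost:9999/bigdata/namespace' -H 'Accept: application/rdf+xml'
```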

It looks like there is some configuration around the release of historical data. Setting com.bigdata.service.AbstractTransactionService.minReleaseAge=1 might allow us to reclaim the space.
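For reference, that would be a one-line change in RWStore.properties, something like:

```
# Release historical commit points after 1 ms so their space can be recycled.
com.bigdata.service.AbstractTransactionService.minReleaseAge=1
```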

Yep, this looks like what we should be doing.

Damn, we already set minReleaseAge=1 in RWStore.properties. We need to look for something else.

Change 448591 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] wdqs: disable categories reload

https://gerrit.wikimedia.org/r/448591

Change 448591 merged by Gehel:
[operations/puppet@production] wdqs: disable categories reload

https://gerrit.wikimedia.org/r/448591

Change 448597 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] wdqs: fix ensure of reload categories cron

https://gerrit.wikimedia.org/r/448597

Change 448597 merged by Gehel:
[operations/puppet@production] wdqs: fix ensure of reload categories cron

https://gerrit.wikimedia.org/r/448597

Generally, since the new categories are loaded before the old ones are deleted, the space bump is expected; Blazegraph also allocates disk space in big chunks, so the bump can be noticeable. What is also expected is that when the old categories namespace is removed, its space is freed up and then reused for newly incoming data. However, I am not sure how to verify that. There might be fragmentation or leak issues.
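To illustrate the load-then-delete pattern (a sketch of what reloadCategories.sh does, not the actual script; the namespace names, file path, and loading mechanism here are assumptions based on Blazegraph's REST API):

```
# Sketch only: the real reloadCategories.sh differs in details.
NEW=categories$(date +%Y%m%d)        # namespace for this week's dump
OLD=categories20180715               # previous week's namespace (illustrative)

# 1. Create the new namespace.
curl -s -X POST 'http://localhost:9999/bigdata/namespace' \
  -H 'Content-Type: text/plain' \
  --data "com.bigdata.rdf.sail.namespace=$NEW"

# 2. Load the new dump while the old namespace still holds its data;
#    this is the moment both copies exist, hence the space bump.
curl -s -X POST "http://localhost:9999/bigdata/namespace/$NEW/sparql" \
  -H 'Content-Type: application/sparql-update' \
  --data "LOAD <file:///srv/wdqs/categories.ttl>"

# 3. Drop the old namespace; Blazegraph should mark its allocators free
#    for reuse, but the journal file itself never shrinks.
curl -s -X DELETE "http://localhost:9999/bigdata/namespace/$OLD"
```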

We probably need to look into internal Blazegraph metrics and see what actual disk usage vs. number of triples vs. allocated space looks like.
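If I read the NanoSparqlServer docs right, those numbers should be exposed over HTTP, e.g.:

```
# Dump Blazegraph's internal performance counters (allocations, disk, etc.).
curl -s 'http://localhost:9999/bigdata/counters'
# High-level status page, including journal and namespace information.
curl -s 'http://localhost:9999/bigdata/status'
```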

Blazegraph also has a space-compacting tool, but it requires a database shutdown and I am not sure how long it would take to run. I can experiment with that.
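The tool I have in mind is the offline journal compactor; the invocation below is my best guess at the class and arguments, to be verified before trying it on a production journal:

```
# Must be run with Blazegraph stopped; writes a compacted copy of the journal.
java -cp blazegraph.jar com.bigdata.journal.CompactJournalUtility \
  /srv/wdqs/wikidata.jnl /srv/wdqs/wikidata-compacted.jnl
```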

Also, once T198356 is implemented, we won't need to reload the categories namespace (at least not as often), so this issue should be eliminated.

Smalyshev claimed this task.

This does not happen anymore since we're using dailies.