
Increase quota for wikiqlever
Closed, ResolvedPublic

Description

Project Name: wikiqlever
Type of quota increase requested: CPU / RAM / disk / instance count
Amount to increase: +26 CPUs / +68 GB RAM / +2 TB disk / +4 instances
Reason: A single 32 GB instance is not sufficient to handle the load, so we are switching to a k8s deployment model for the QLever backend. We would therefore like 2 additional machines (g4.cores8.ram32.disk20) similar to the current instance qlever1. In addition, we want to move the frontend (only for demo and testing) onto a separate node (g4.cores1.ram2.disk20). Finally, we would like 1 additional g4.cores1.ram2.disk20 node as a node manager.

This request is informed by the setup that @Hannah_Bast and her team are using for running their QLever endpoint for Wikidata: Ryzen 9 processor (16 cores = 32 threads), 128 GB RAM and at least 4 TB disk space (NVMe SSDs).

See also T413097

Event Timeline

Physikerwelt renamed this task from Increase quoata for wikiqlever to Increase quota for wikiqlever.Mon, Jan 19, 4:26 PM
Physikerwelt updated the task description. (Show Details)

+1 on my end

Mostly out of curiosity, would you mind expanding on the horizontal scaling story and what are the expected benefits of k8s here? My (limited) understanding of qlever is that it scales vertically, not horizontally, will some sort of sharding scheme be employed for example?

While one giant VM would be the usual approach for QLever, I got the impression from discussions with @Andrew (and previous experience) that it might cause problems during VM migrations. (At a university cluster we repeatedly had problems with our 128 GB instance when OpenStack was updated; in some situations this caused days of downtime and manual intervention by the admins.) Therefore, we want to use several smaller VMs that are all identical and each hold a full, up-to-date copy of all of Wikidata via the EventBus stream. When a request comes in, it is k8s's job to decide which of the identical QLever backends should answer it. (In other situations we have used this approach and achieved a very good balance by doing so.)
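For illustration, the routing described above maps onto a standard Kubernetes pattern: a Deployment of identical backend replicas behind a Service that spreads incoming requests across the ready pods. The sketch below uses hypothetical names, image tag, and port; it is not the project's actual manifest.

```yaml
# Hypothetical sketch: identical QLever backend replicas behind a Service.
# kube-proxy load-balances requests across the ready pods.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: qlever-backend
spec:
  replicas: 3                  # one per g4.cores8.ram32.disk20 node, say
  selector:
    matchLabels:
      app: qlever-backend
  template:
    metadata:
      labels:
        app: qlever-backend
    spec:
      containers:
        - name: qlever
          image: adfreiburg/qlever:latest   # assumed image name/tag
          ports:
            - containerPort: 7019           # assumed server port
---
apiVersion: v1
kind: Service
metadata:
  name: qlever
spec:
  selector:
    app: qlever-backend
  ports:
    - port: 80
      targetPort: 7019
```

A PodAntiAffinity rule (not shown) would additionally keep the replicas on distinct VMs, so a single VM migration takes down at most one backend.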

Understood, thank you @Physikerwelt for the explanation

+1 this is just fine if the project winds up succeeding and having users. Please try to fail fast and let us know if you wind up not using things.

This will be the backend for https://people.compute.dtu.dk/faan/scholia-page-view-statistics.html until WMF comes up with a replacement for an unsplit WDQS.

A single 32GB instance is not sufficient to handle the load

What data do you have to believe that the proposed quota increase is enough to handle the expected load?

Good question. @Daniel_Mietchen is in a better position to answer. He has set up https://scholia-qlever.toolforge.org, which points to the current wmcloud instance qlever1. All individual queries can be executed; however, running several queries simultaneously is sometimes slow. A non-WMF-hosted experiment, https://qlever.scholia.wiki/, that points to

Ryzen 9 processor (16 cores = 32 threads), 128 GB RAM and at least 4 TB disk space (NVMe SSDs).

runs more smoothly. From those experiments, our feeling is that a scalable k8s setup is a reasonable target infrastructure, and 3 nodes (of the biggest flavor) seem a reasonable start for running detailed performance tests. If we run into limits, we would file a new request.
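To put numbers behind "running several queries simultaneously is sometimes slow", a small concurrency probe can help during the planned performance tests. The sketch below is a generic illustration: the endpoint URL and query are placeholders, not the project's actual configuration.

```python
# Hypothetical concurrency probe: time N identical SPARQL requests
# issued in parallel against an endpoint. URL and query are placeholders.
import concurrent.futures
import time
import urllib.parse
import urllib.request

ENDPOINT = "https://qlever.example.org/api/wikidata"  # placeholder URL
QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"        # placeholder query


def run_concurrently(task, concurrency=8):
    """Run `task` `concurrency` times in parallel and collect the results."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(task) for _ in range(concurrency)]
        return [f.result() for f in futures]


def timed_query():
    """Send one SPARQL query and return its wall-clock latency in seconds."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": QUERY})
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=60) as resp:
        resp.read()
    return time.monotonic() - start
```

Comparing the latency spread of `run_concurrently(timed_query, concurrency=1)` against `concurrency=8` would show whether a single instance saturates under parallel load, and repeating the probe against the k8s Service would show how much the extra replicas help.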

Mentioned in SAL (#wikimedia-cloud-feed) [2026-01-26T16:56:44Z] <komla@cloudcumin1001> START - Cookbook wmcs.openstack.quota_increase by 26 cores, 2000 gigabytes, 4 instances, 69632 ram (T414983)

Mentioned in SAL (#wikimedia-cloud-feed) [2026-01-26T16:56:51Z] <komla@cloudcumin1001> END (PASS) - Cookbook wmcs.openstack.quota_increase (exit_code=0) by 26 cores, 2000 gigabytes, 4 instances, 69632 ram (T414983)

komla claimed this task.