
Create CQS puppet configs by applying query_service module
Closed, Resolved (Public)

Description

Create puppet configs for CQS. These will be applied to CQS once the servers are available and racked.

Event Timeline

@Igorkim78 can you document the config changes here?

The configuration changes for SDC data are as follows. The namespace 'sdc' is used here to store RDF data in the Blazegraph journal; it can be changed as needed. Reusing the Wikidata namespace (wdq) is not recommended: it could cause conflicts if the services are ever deployed on a shared server (should such a configuration be implemented), and queries could address the wrong namespace in the Blazegraph journal and return incorrect data.

  • Blazegraph journal config (RWStore.properties)

Replace the corresponding WDQS configuration (search for the com.bigdata.namespace.wdq prefix to find the parameters to replace) with:

# Bump up the branching factor for the lexicon indices on the default kb.
com.bigdata.namespace.sdc.lex.BLOBS.com.bigdata.btree.BTree.branchingFactor=400
com.bigdata.namespace.sdc.lex.ID2TERM.com.bigdata.btree.BTree.branchingFactor=599
com.bigdata.namespace.sdc.lex.TERM2ID.com.bigdata.btree.BTree.branchingFactor=300
# Bump up the branching factor for the statement indices on the default kb.
com.bigdata.namespace.sdc.spo.JUST.com.bigdata.btree.BTree.branchingFactor=1024
com.bigdata.namespace.sdc.spo.OSP.com.bigdata.btree.BTree.branchingFactor=866
com.bigdata.namespace.sdc.spo.POS.com.bigdata.btree.BTree.branchingFactor=954
com.bigdata.namespace.sdc.spo.SPO.com.bigdata.btree.BTree.branchingFactor=934

Note that the final configuration should be adjusted for the real production data, following the instructions in T232768.
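The prefix swap described above could be sketched as a small sed filter. This is only an illustration: the function name `rename_namespace` and the file names in the usage comment are assumptions, not part of the deployed tooling, and the branching factor values still need to be set by hand as listed above.

```shell
# Hypothetical helper: rewrite namespace-scoped property keys from one
# Blazegraph namespace to another. Reads properties on stdin, writes stdout.
rename_namespace() {
    old="$1"
    new="$2"
    # Only keys that start with the namespace prefix are touched; comments
    # and unrelated properties pass through unchanged. Dots in the pattern
    # are escaped so they match literally.
    sed "s/^com\.bigdata\.namespace\.${old}\./com.bigdata.namespace.${new}./"
}

# Example usage (file names are assumptions):
# rename_namespace wdq sdc < RWStore.properties > RWStore.sdc.properties
```

Anchoring the pattern at the start of the line keeps the substitution from touching a "wdq" that happens to appear inside a value or a comment.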

  • The data load and Updater scripts should be called with the proper namespace:

On data load:

./loadRestAPI.sh -n wdq -d `pwd`/data/split

replace with

./loadRestAPI.sh -n sdc -d `pwd`/data/split

On single file load:

./loadRestAPI.sh -n wdq -d `pwd`/data/split/wikidump-000000001.ttl.gz

replace with

./loadRestAPI.sh -n sdc -d `pwd`/data/split/wikidump-000000001.ttl.gz

On running the Updater:

./runUpdate.sh -n wdq

replace with

./runUpdate.sh -n sdc

For any calls to the Blazegraph REST API, instead of

http://localhost:9999/bigdata/namespace/wdq/sparql

use

http://localhost:9999/bigdata/namespace/sdc/sparql
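To avoid hardcoding the namespace in each call, the endpoint URL could be built from a variable. This is a minimal sketch: the helper name `sparql_endpoint` is an assumption, and the commented curl call is an untested smoke query that needs a running instance.

```shell
# Hypothetical helper: build the SPARQL endpoint URL for a namespace.
# The base URL defaults to the localhost:9999 address used in this task.
sparql_endpoint() {
    ns="$1"
    base="${2:-http://localhost:9999}"
    printf '%s/bigdata/namespace/%s/sparql\n' "$base" "$ns"
}

# Example smoke query against a running instance (not executed here):
# curl -G "$(sparql_endpoint sdc)" \
#      -H 'Accept: application/sparql-results+json' \
#      --data-urlencode 'query=SELECT * WHERE { ?s ?p ?o } LIMIT 1'
```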

The categories store might need similar changes, but whether separate categories are needed for production SDC data still has to be discussed.

Matt's initial work has gotten us most of the way there. I reviewed what's available now and booted a test instance to see whether it can fully set up a new instance from scratch (hint: no).

Background info

  • The current sdcquery instance applies role::wdqs::labs.
  • There is a tiny amount of instance-specific puppet config in horizon.
  • Most puppet config is in the main puppet repo at hieradata/cloud/eqiad1/wikidata-query/common.yaml. That config is for a Wikidata query service though, not the SDoC query service.

Issues to address

  • Logging is configured to ship to deployment-logstash2, but logstash-beta.wmflabs.org doesn't report any logs from the existing instance or the one I recently booted.
  • The instance is also configured with the beta cluster eventgate endpoint (for request logging); once an instance is running we can verify correct operation. Even if we get the events flowing, follow-up work will be required if we want to do anything with them.
  • Puppet currently includes the updater and categories; afaik we want those disabled for the new instance.
  • Brand new instances currently don't complete the puppet run, because it tries to clone the wdqs repo into a parent directory that doesn't exist.
  • Once the instance was up, it only responded for a few minutes, after which jetty reported Service Unavailable on /bigdata/. It is not yet clear what causes this.
  • The current sdcquery sets use_deployed_config=true, which means puppet doesn't control the blazegraph configuration; it's whatever happens to be on the machine.
  • Currently profile::query_service::blazegraph installs the primary blazegraph instance, while profile::query_service::categories installs the categories-specific blazegraph instance. It's unclear whether sdoc should have a new role and a new profile, or whether a new sdoc role should point at the current blazegraph profile. There is a significant amount of duplication between the blazegraph and categories query_service profiles, but puppet doesn't make it easy to combine them generically while keeping the differences understandable.

Probably more, but this is an initial look through. I'm going to put together a patch to address some of the above and get the instance starting from a cold boot, but there are some open questions above that could use any insight others might have.

Change 595041 had a related patch set uploaded (by EBernhardson; owner: EBernhardson):
[operations/puppet@production] Role for SDoC WDQS

https://gerrit.wikimedia.org/r/595041

Instructions for booting a new instance. Currently this requires pointing the instance at a puppetmaster in the wikidata-query project.

  1. Start a new instance in horizon. Use debian-stretch and m1.large.
  2. Set the puppetmaster to wdqspuppet.wikidata-query.eqiad.wmflabs
  3. Apply hiera first, then roles
  4. Run sudo puppet agent -tv to apply the change now (or wait and it will happen eventually).

hiera:

profile::query_service::blazegraph_heap_size: 6g
profile::query_service::blazegraph_use_deployed_config: false
profile::query_service::data_dir: /srv/wdqs-data
profile::query_service::forward_rsyslog_host: deployment-logstash03.deployment-prep.eqiad.wmflabs
profile::query_service::load_categories: none
profile::query_service::package_dir: /srv/wdqs-package
puppetmaster: wdqspuppet.wikidata-query.eqiad.wmflabs

roles:

role::labs::lvm::srv
role::wdqs::sdoc

Change 595041 merged by Gehel:
[operations/puppet@production] Role for SDoC WDQS

https://gerrit.wikimedia.org/r/595041

Change 602102 had a related patch set uploaded (by Gehel; owner: EBernhardson):
[operations/puppet@production] Role for SDoC WDQS

https://gerrit.wikimedia.org/r/602102

Change 602102 abandoned by Gehel:
Role for SDoC WDQS

Reason:
replaced by I11763c3ebbfa21e958a5933573eef627b134e573

https://gerrit.wikimedia.org/r/602102