User Details
- User Since
- Sep 5 2023, 11:23 AM (118 w, 3 d)
- Availability
- Available
- IRC Nick
- brouberol
- LDAP User
- Brouberol
- MediaWiki User
- BRouberol-WMF
Today
We've deployed OIDC login for growthbook.wikimedia.org, but it uses idp-test instead of idp at the moment, because cas.authn.oidc.id-token.include-id-token-claims is still set to false in idp.
Yesterday
brouberol@deploy2002:~$ airflow-devenv create --dags-folder ml --branch plop
Usage: airflow-devenv create [OPTIONS]
Try 'airflow-devenv create --help' for help.
Wed, Dec 10
Could we set a connection to a Spark Thrift Server?
I don't think we have a Spark Thrift Server running anywhere on the an-* hosts. Could we query Presto instead?
I've gotten as far as running the following query:
from airflow.decorators import task
from airflow.models.taskinstance import TaskInstance
from airflow.operators.empty import EmptyOperator
from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator
from airflow.utils.task_group import TaskGroup
Tue, Dec 9
Hm, I wonder if the choice of wmf.edit_hourly was intentional
Very much not so. I picked a random table in the Superset SQL Lab view!
https://growthbook-next.wikimedia.org is now OIDC-authenticated.
After studying the CAS documentation and codebase, @SLyngshede-WMF told me that he felt confident flipping the cas.authn.oidc.id-token.include-id-token-claims flag to true. We'll proceed on idp-test / growthbook-next first.
Reporting here after many (mostly failing) experiments to integrate GrowthBook with CAS. TL;DR: the only way we have managed to make it work is by flipping cas.authn.oidc.id-token.include-id-token-claims to true globally.
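To check whether the flag has the intended effect, one option is to decode the ID token returned by the IdP and look at which claims it carries. The sketch below is a minimal, standard-library-only illustration. It does not verify the signature, and the claim names (`sub`, `email`) and the example token are assumptions made up for the demonstration, not values taken from idp or idp-test.

```python
import base64
import json


def decode_id_token_claims(id_token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    parts = id_token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url-encoded with padding stripped; restore it.
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def missing_claims(id_token: str, required: set) -> set:
    """Return the required claim names that are absent from the ID token."""
    return set(required) - set(decode_id_token_claims(id_token))


def _b64url(data: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Hypothetical unsigned token that only carries a "sub" claim.
example_token = ".".join(
    [_b64url({"alg": "none"}), _b64url({"sub": "example-user"}), ""]
)

if __name__ == "__main__":
    print(missing_claims(example_token, {"sub", "email"}))
```

If a client reads user attributes from the ID token rather than from the userinfo endpoint (an assumption here), a non-empty result would be consistent with the login failures described above.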
Fri, Dec 5
Thu, Dec 4
One thing we need to figure out (cc @BTullis @mpopov) is which LDAP group(s) we should require a user to be a member of to get access to growthbook.
- wmf?
- nda?
- another specific group with specific people who can access, _à la_ https://ldap.toolforge.org/group/spiderpig-access ?
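Whichever option is picked, the check itself boils down to a set intersection between the user's groups and an allow-list. A minimal sketch, assuming "member of at least one allowed group" semantics (the thread does not say whether any-of or all-of is wanted), with a hypothetical `may_access` helper:

```python
def may_access(user_groups, allowed_groups):
    """Return True if the user belongs to at least one allowed group."""
    return not set(allowed_groups).isdisjoint(user_groups)


# Assumption: allow either of the two existing groups named above.
ALLOWED_GROUPS = {"wmf", "nda"}

if __name__ == "__main__":
    print(may_access({"wmf", "some-other-group"}, ALLOWED_GROUPS))
    print(may_access({"some-other-group"}, ALLOWED_GROUPS))
```

A dedicated group would only change the contents of `ALLOWED_GROUPS`, not the check.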
Tue, Dec 2
Anytime!
I've switched leadership for the codfw.mediawiki.job.htmlCacheUpdate topic.
One thing to note: we're already seeing a lot of traffic coming out of kafka-main2010, and the rebalancing will add new partitions to that broker (not necessarily as leader, but still). If the resulting per-broker traffic is too skewed and kafka-main2010 is sending out too much data compared to the other brokers, we can make it a replica instead of the leader of the largest topic in the cluster:
brouberol@kafka-main2008:~/T407185$ kafka topics --describe --topic codfw.mediawiki.job.htmlCacheUpdate
kafka-topics --zookeeper conf2004.codfw.wmnet,conf2005.codfw.wmnet,conf2006.codfw.wmnet/kafka/main-codfw --describe --topic codfw.mediawiki.job.htmlCacheUpdate
Topic:codfw.mediawiki.job.htmlCacheUpdate PartitionCount:1 ReplicationFactor:3 Configs:
Topic: codfw.mediawiki.job.htmlCacheUpdate Partition: 0 Leader: 2005 Replicas: 2005,2003,2001 Isr: 2001,2003,2005
brouberol@kafka-main2008:~/T407185$ kafka reassign-partitions kafka reassign-partitions --reassignment-json-file ./rebalancing.json --execute --throttle 30000000
kafka-reassign-partitions --zookeeper conf2004.codfw.wmnet,conf2005.codfw.wmnet,conf2006.codfw.wmnet/kafka/main-codfw kafka reassign-partitions --reassignment-json-file ./rebalancing.json --execute --throttle 30000000
Current partition replica assignment
Mon, Dec 1
I've generated a rebalancing plan on kafka-main2008 using:
brouberol@kafka-main2008:~/T407185$ topicmappr rebalance --topics '.*' --brokers -2 --out-file rebalancing --storage-threshold-gb 1000
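Before executing a plan like this, it can help to see how many partitions each broker would lead and host. The sketch below assumes the generated file follows the standard Kafka reassignment JSON layout (`version` plus a `partitions` list with `replicas`), and counts the first replica of each partition as its preferred leader. The sample plan is made up for the demonstration and is not the real output.

```python
from collections import Counter


def broker_load(plan):
    """Count preferred-leader and replica placements per broker."""
    leaders, replicas = Counter(), Counter()
    for partition in plan["partitions"]:
        # The first replica in the list is the preferred leader.
        leaders[partition["replicas"][0]] += 1
        replicas.update(partition["replicas"])
    return leaders, replicas


# Hypothetical plan, only to show the expected shape of the input.
sample_plan = {
    "version": 1,
    "partitions": [
        {"topic": "example.a", "partition": 0, "replicas": [2010, 2005, 2003]},
        {"topic": "example.a", "partition": 1, "replicas": [2005, 2010, 2001]},
        {"topic": "example.b", "partition": 0, "replicas": [2010, 2003, 2001]},
    ],
}

if __name__ == "__main__":
    leaders, replicas = broker_load(sample_plan)
    for broker in sorted(replicas):
        print(broker, "leads", leaders[broker], "hosts", replicas[broker])
```

With a real plan, the dict would come from `json.load()` on the generated file. Partition counts are only a rough proxy for traffic, since topics differ widely in volume.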
Fri, Nov 28
Thu, Nov 27
I'm going to experiment with something akin to
Wed, Nov 26
Oh, you're right!
https://growthbook-next.wikimedia.org is now up and running. It is connected to the analytics-test-presto cluster.
Naïve question, piggybacking on @Eevans's response: what about a DNS domain resolving to the node IPs? If we have a recent enough version, we can let the client perform the DNS resolution and try each of the resolved node IPs in turn, as per https://issues.apache.org/jira/browse/CASSANDRA-14361
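The idea of "resolve one name, then try each returned address" can be sketched with the standard library alone. This is not how any Cassandra driver implements it; it only illustrates the pattern. The function names are hypothetical, and the `connect` parameter exists so the logic can be exercised without a network.

```python
import socket


def resolve_all(hostname, port):
    """Return every distinct IP the name resolves to, in resolver order."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    ips = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        if sockaddr[0] not in ips:
            ips.append(sockaddr[0])
    return ips


def first_reachable(ips, port, connect=socket.create_connection, timeout=2.0):
    """Try each IP in order and return the first that accepts a connection."""
    for ip in ips:
        try:
            conn = connect((ip, port), timeout)
        except OSError:
            # Unreachable or refused: move on to the next resolved address.
            continue
        conn.close()
        return ip
    return None
```

Usage would look like `first_reachable(resolve_all(name, 9042), 9042)`, where `name` is whatever DNS record ends up pointing at the nodes and 9042 is the default CQL native transport port.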
Tue, Nov 25
Secrets have been provisioned. I'll just need to update the DB password once the DB itself is provisioned.
brouberol@krb1002:~$ sudo kadmin.local addprinc -randkey HTTP/growthbook-next.discovery.wmnet@WIKIMEDIA
brouberol@krb1002:~$ sudo kadmin.local addprinc -randkey growthbook/growthbook-next.discovery.wmnet@WIKIMEDIA
brouberol@krb1002:~$ sudo kadmin.local ktadd -norandkey -k growthbook-next-backend.keytab HTTP/growthbook-next.discovery.wmnet@WIKIMEDIA growthbook/growthbook-next.discovery.wmnet@WIKIMEDIA
Entry for principal HTTP/growthbook-next.discovery.wmnet@WIKIMEDIA with kvno 1, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:growthbook-next-backend.keytab.
Entry for principal growthbook/growthbook-next.discovery.wmnet@WIKIMEDIA with kvno 1, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:growthbook-next-backend.keytab.
brouberol@stat1008:~$ s3cmd --access_key=$access_key --secret_key=$secret_key --host=rgw.eqiad.dpe.anycast.wmnet --region=dpe --host-bucket=no mb s3://postgresql-growthbook-next.dse-k8s-eqiad
Bucket 's3://postgresql-growthbook-next.dse-k8s-eqiad/' created
brouberol@cephosd1001:~$ sudo radosgw-admin user create --uid=postgresql-growthbook-next --display-name="postgresql-growthbook-next"
{
    "user_id": "postgresql-growthbook-next",
    "display_name": "postgresql-growthbook-next",
    "email": "",
    "suspended": 0,
    "max_buckets": 1000,
    "subusers": [],
    "keys": [
        {
            "user": "postgresql-growthbook-next",
            "access_key": "[REDACTED]",
            "secret_key": "[REDACTED]"
        }
    ],
    "swift_keys": [],
    "caps": [],
    "op_mask": "read, write, delete",
    "default_placement": "",
    "default_storage_class": "",
    "placement_tags": [],
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "temp_url_keys": [],
    "type": "rgw",
    "mfa_ids": []
}



