RKemper (Ryan Kemper)
User

User Details

User Since
May 1 2020, 10:28 PM (215 w, 6 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
RKemper (WMF) [ Global Accounts ]

Recent Activity

Yesterday

RKemper moved T364077: Adapt the wdqs data-transfer cookbook to operate with federated subgraphs from Backlog to In Progress on the Data-Platform-SRE (2024.06.17 - 2024.07.07) board.
Thu, Jun 20, 6:49 PM · Patch-For-Review, Data-Platform-SRE (2024.06.17 - 2024.07.07), Discovery-Search (Current work), Wikidata
RKemper added a project to T364077: Adapt the wdqs data-transfer cookbook to operate with federated subgraphs: Data-Platform-SRE (2024.06.17 - 2024.07.07).
Thu, Jun 20, 6:49 PM · Patch-For-Review, Data-Platform-SRE (2024.06.17 - 2024.07.07), Discovery-Search (Current work), Wikidata
RKemper claimed T364077: Adapt the wdqs data-transfer cookbook to operate with federated subgraphs.
Thu, Jun 20, 6:48 PM · Patch-For-Review, Data-Platform-SRE (2024.06.17 - 2024.07.07), Discovery-Search (Current work), Wikidata
RKemper added a comment to T367442: hw troubleshooting: Multi-bit errors on DIMM_B1 for an-worker1085.eqiad.wmnet.

Host has been downtimed. Accidentally associated to wrong ticket: https://phabricator.wikimedia.org/T367825#9908323

Thu, Jun 20, 4:53 PM · SRE, ops-eqiad, DC-Ops
RKemper added a subtask for T364363: [Epic] Productionize federated wdqs graph-split endpoints: T364077: Adapt the wdqs data-transfer cookbook to operate with federated subgraphs.
Thu, Jun 20, 4:03 PM · Data-Platform-SRE, Discovery-Search, Epic, Wikidata-Query-Service, Wikidata
RKemper added a parent task for T364077: Adapt the wdqs data-transfer cookbook to operate with federated subgraphs: T364363: [Epic] Productionize federated wdqs graph-split endpoints.
Thu, Jun 20, 4:03 PM · Patch-For-Review, Data-Platform-SRE (2024.06.17 - 2024.07.07), Discovery-Search (Current work), Wikidata

Tue, Jun 18

RKemper added a comment to T367442: hw troubleshooting: Multi-bit errors on DIMM_B1 for an-worker1085.eqiad.wmnet.

Hey @RKemper would Thursday work for you? Around 12:00 EST?

Tue, Jun 18, 3:25 PM · SRE, ops-eqiad, DC-Ops

Mon, Jun 17

RKemper updated the task description for T367825: hw troubleshooting: Multi-bit errors on DIMM_A2 for an-worker1093.
Mon, Jun 17, 10:08 PM · SRE, ops-eqiad, DC-Ops
RKemper updated the task description for T367825: hw troubleshooting: Multi-bit errors on DIMM_A2 for an-worker1093.
Mon, Jun 17, 10:08 PM · SRE, ops-eqiad, DC-Ops
RKemper created T367825: hw troubleshooting: Multi-bit errors on DIMM_A2 for an-worker1093.
Mon, Jun 17, 10:05 PM · SRE, ops-eqiad, DC-Ops
RKemper added a comment to T367442: hw troubleshooting: Multi-bit errors on DIMM_B1 for an-worker1085.eqiad.wmnet.

@RKemper Is there a preference on when we could schedule this?

Mon, Jun 17, 9:59 PM · SRE, ops-eqiad, DC-Ops
RKemper updated the task description for T367592: hadoop rolling reboot cookbook: add start-datetime flag.
Mon, Jun 17, 9:44 PM · Patch-For-Review, Data-Platform-SRE (2024.06.17 - 2024.07.07)

Fri, Jun 14

RKemper created T367592: hadoop rolling reboot cookbook: add start-datetime flag.
Fri, Jun 14, 6:54 PM · Patch-For-Review, Data-Platform-SRE (2024.06.17 - 2024.07.07)

Thu, Jun 13

RKemper updated the task description for T367442: hw troubleshooting: Multi-bit errors on DIMM_B1 for an-worker1085.eqiad.wmnet.
Thu, Jun 13, 4:25 PM · SRE, ops-eqiad, DC-Ops
RKemper created T367442: hw troubleshooting: Multi-bit errors on DIMM_B1 for an-worker1085.eqiad.wmnet.
Thu, Jun 13, 4:25 PM · SRE, ops-eqiad, DC-Ops

Wed, Jun 12

RKemper moved T364861: Check SLO impact of Elastic cluster rolling restarts/mitigate if necessary from In Progress to Done on the Data-Platform-SRE (2024.05.27 - 2024.06.16) board.

Placeholder added: https://wikitech.wikimedia.org/wiki/SLO/Search#Service_Level_Indicators_(SLIs)

Wed, Jun 12, 3:26 PM · Data-Platform-SRE (2024.05.27 - 2024.06.16), Discovery-Search

Tue, Jun 4

RKemper added a comment to P64016 Testing wdqs.data-reload with HDFS.
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/spicerack/_menu.py", line 250, in _run
    raw_ret = runner.run()
  File "/srv/deployment/spicerack/cookbooks/sre/wdqs/data-reload.py", line 268, in run
    self.preparation_step.run()
  File "/srv/deployment/spicerack/cookbooks/sre/wdqs/data-reload.py", line 458, in run
    self._extract_from_hdfs(tmpdir)
  File "/srv/deployment/spicerack/cookbooks/sre/wdqs/data-reload.py", line 415, in _extract_from_hdfs
    size = self._get_dump_size_from_hdfs()
  File "/srv/deployment/spicerack/cookbooks/sre/wdqs/data-reload.py", line 408, in _get_dump_size_from_hdfs
    return int(re.sub(r"^(\d+)\s+.*$", next(lines), r"\1"))
ValueError: invalid literal for int() with base 10: '\\1'
Tue, Jun 4, 9:28 PM
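
The ValueError in the traceback above is consistent with swapped arguments: the signature is `re.sub(pattern, repl, string)`, so the call shown searches the literal `\1` and uses the HDFS output line as the replacement. A minimal sketch of the corrected call, assuming the line is shaped like `<size> <rest>`. The helper name and the sample line are hypothetical and not taken from the cookbook:

```python
import re


def parse_dump_size(line: str) -> int:
    """Return the leading integer of a line shaped like '<size>  <rest>'."""
    # re.sub(pattern, repl, string): the text to search goes last.
    return int(re.sub(r"^(\d+)\s+.*$", r"\1", line))


# Hypothetical sample line, for illustration only.
sample = "123456  /path/to/dump"

# Swapped order, as in the traceback: the pattern never matches the string
# r"\1", so re.sub returns it unchanged and int() would raise ValueError.
swapped = re.sub(r"^(\d+)\s+.*$", sample, r"\1")
```

With the arguments in the documented order, `parse_dump_size(sample)` yields the integer size, while `swapped` is still the two-character string `\1`.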

Tue, May 28

RKemper added a comment to T364861: Check SLO impact of Elastic cluster rolling restarts/mitigate if necessary.

Confirm or deny that we do plan on setting an availability SLO target around the metric "more than 0.1% of requests to Elastic from the MW app servers fail over a 5m period".

Tue, May 28, 8:27 PM · Data-Platform-SRE (2024.05.27 - 2024.06.16), Discovery-Search
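
As a rough illustration of the metric quoted above, the check for a single 5-minute window could look like the sketch below. The function name, the counts passed in, and the handling of an empty window are all assumptions; this is not how the SLO dashboards compute it.

```python
FAILURE_THRESHOLD = 0.001  # 0.1% expressed as a fraction


def window_is_bad(failed: int, total: int,
                  threshold: float = FAILURE_THRESHOLD) -> bool:
    """True when more than `threshold` of the window's requests failed."""
    if total == 0:
        # Assumption: a window with no requests counts as healthy.
        return False
    return failed / total > threshold
```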

Thu, May 23

RKemper updated the task description for T365735: Consider creating a separate WDQS server type for categories.
Thu, May 23, 4:32 PM · Wikidata, Data-Platform-SRE, Wikidata-Query-Service

May 21 2024

RKemper added a comment to T365400: Enable blocked commands in Zookeeper management interface.

Here's what it looked like before the puppet patch:

May 21 2024, 3:20 PM · Data-Platform-SRE (2024.05.06 - 2024.05.26)

May 8 2024

RKemper added a comment to T316876: wdqs: replace git-fat with git-lfs.

@dancy Should be fixed now. Here's a previously failing patch I ran a recheck on: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1028565/4#message-8d03a9f159d1ec58172e8606d84deaf43bbcbbb0

May 8 2024, 10:02 PM · Patch-For-Review, Data-Platform-SRE (2024.04.15 - 2024.05.05), git-lfs, Release-Engineering-Team (Priority Backlog 📥), Wikidata, Wikidata-Query-Service, Scap

May 7 2024

RKemper created T364368: Create separate pybal pools for wdqs graph split (main vs scholarly).
May 7 2024, 6:44 AM · Data-Platform-SRE (2024.06.17 - 2024.07.07), Patch-For-Review, Discovery-Search, Wikidata-Query-Service, Wikidata
RKemper created T364367: Create dedicated UIs for wdqs graph split endpoints.
May 7 2024, 6:39 AM · collaboration-services, Data-Platform-SRE (2024.06.17 - 2024.07.07), Patch-For-Review, Discovery-Search, Wikidata-Query-Service, Wikidata
RKemper created T364366: Implement puppet logic for wdqs graph split.
May 7 2024, 6:33 AM · Data-Platform-SRE (2024.06.17 - 2024.07.07), Discovery-Search, Wikidata-Query-Service, Wikidata
RKemper created T364364: Provision DNS and certificates for wdqs graph split domains.
May 7 2024, 6:28 AM · Data-Platform-SRE (2024.06.17 - 2024.07.07), Patch-For-Review, Discovery-Search, Wikidata-Query-Service, Wikidata
RKemper created T364363: [Epic] Productionize federated wdqs graph-split endpoints.
May 7 2024, 6:25 AM · Data-Platform-SRE, Discovery-Search, Epic, Wikidata-Query-Service, Wikidata
RKemper added a comment to T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors).

Kicked off the second run like so:

May 7 2024, 6:20 AM · Wikidata, Wikidata-Query-Service
RKemper added a comment to T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors).

we saw that this took about 3702 minutes, or about 2.57 hours

Typo you'll want to fix here and in the original: 2.57 days

May 7 2024, 6:03 AM · Wikidata, Wikidata-Query-Service
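
The unit correction in the comment above can be checked with a short conversion:

```python
minutes = 3702
hours = minutes / 60   # 61.7 hours
days = hours / 24      # roughly 2.57 days

print(f"{minutes} min = {hours:.1f} h = {days:.2f} d")
```

So 3702 minutes is about 61.7 hours, which is about 2.57 days rather than 2.57 hours.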

May 2 2024

RKemper updated subscribers of T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors).

I'm realizing I don't remember enough about how we load specific graph splits (scholarly vs main). But it's possible we won't need the above nfs patch if our previous process was to manually download the relevant dump file.

May 2 2024, 7:48 PM · Wikidata, Wikidata-Query-Service
RKemper added a comment to T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors).

Actually we'll use wdqs1021 since we're not confident NFS will work seamlessly between eqiad and codfw (we use NFS to procure the dumps that the data-reload is run from).

May 2 2024, 7:30 PM · Wikidata, Wikidata-Query-Service

May 1 2024

RKemper added a comment to T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors).

We'll use wdqs2023 to compare against wdqs1023 (same hardware). 1023 is scholarly-articles so that's what we'll want to load 2023 with.

May 1 2024, 4:49 PM · Wikidata, Wikidata-Query-Service

Apr 30 2024

RKemper moved T338009: Create dashboards for Search SLOs from In Progress to Needs Review on the Data-Platform-SRE (2024.04.15 - 2024.05.05) board.

Filled out the search preview SLI info in the documentation, and also updated the existing SLIs to make it more clear what exactly is being measured (page render, etc).

Apr 30 2024, 6:55 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Discovery-Search (Current work)
RKemper closed T361647: Remove elasticsearch-curator dependency from Spicerack/Elastic cookbooks as Resolved.
Apr 30 2024, 3:28 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Patch-For-Review, cloud-services-team (FY2023/2024-Q3-Q4), Infrastructure-Foundations, SRE-tools, Spicerack
RKemper closed T361647: Remove elasticsearch-curator dependency from Spicerack/Elastic cookbooks, a subtask of T345337: spicerack: tox fails to install PyYAML using python 3.11 on bookworm, as Resolved.
Apr 30 2024, 3:27 PM · cloud-services-team (FY2023/2024-Q3-Q4), Patch-For-Review, Infrastructure-Foundations, SRE-tools, Spicerack

Apr 27 2024

RKemper closed T346455: Allow federated queries with the NFDI4Culture Knowledge Graph as Resolved.

Here's an example curl command we can use to verify when the service is properly supporting application/sparql-results+xml:

Apr 27 2024, 4:48 AM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Wikidata, Wikidata-Query-Service

Apr 24 2024

RKemper closed T361268: Service implementation for elastic110[3-7] as Resolved.
Apr 24 2024, 9:38 PM · Data-Platform-SRE (2024.05.27 - 2024.06.16)
RKemper updated the task description for T361268: Service implementation for elastic110[3-7].
Apr 24 2024, 9:36 PM · Data-Platform-SRE (2024.05.27 - 2024.06.16)
RKemper updated the task description for T361268: Service implementation for elastic110[3-7].
Apr 24 2024, 7:07 PM · Data-Platform-SRE (2024.05.27 - 2024.06.16)
RKemper updated the task description for T361268: Service implementation for elastic110[3-7].
Apr 24 2024, 6:54 PM · Data-Platform-SRE (2024.05.27 - 2024.06.16)
RKemper closed T362983: Investigate/fix WDQS data-transfer cookbook as Resolved.
Apr 24 2024, 6:51 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05)
RKemper moved T362983: Investigate/fix WDQS data-transfer cookbook from Backlog to Done on the Data-Platform-SRE (2024.04.15 - 2024.05.05) board.

Patch was merged and data-transfer was rerun; this is done.

Apr 24 2024, 6:50 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05)

Apr 18 2024

RKemper closed T361525: Degraded RAID on elastic2088 as Resolved.

Looks good on our end, thanks!

Apr 18 2024, 6:50 PM · ops-codfw, Data-Platform-SRE (2024.04.15 - 2024.05.05)

Apr 17 2024

RKemper updated subscribers of T361525: Degraded RAID on elastic2088.

Was able to get a puppet run on elastic2088, but since that run a couple hours ago the host is ssh unreachable (it hangs indefinitely). Seeing some concerning stuff in the drac via getsel on elastic2088.mgmt.codfw.wmnet:

Apr 17 2024, 2:52 AM · ops-codfw, Data-Platform-SRE (2024.04.15 - 2024.05.05)

Apr 15 2024

RKemper moved T338009: Create dashboards for Search SLOs from In Progress to Needs Review on the Data-Platform-SRE (2024.04.15 - 2024.05.05) board.

Added further context to the SLI section of the documentation explaining what each query type actually means. I believe there are no more outstanding TODOs on this task.

Apr 15 2024, 6:18 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Discovery-Search (Current work)
RKemper added a comment to T339347: qlever dblp endpoint for wikidata federated query nomination.

@RKemper Is your point that the queries should return a result? Neither DBLP nor Wikidata have the predicate foaf:name, so it's clear that both SERVICE queries return an empty result. Here is an example for a query that gives a result:

PREFIX schema: <http://schema.org/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT DISTINCT ?editor ?editorName
WHERE {
  SERVICE <https://qlever.cs.uni-freiburg.de/api/wikidata> {
    wd:Q113544723 wdt:P179 ?editor.
    ?editor schema:name ?editorName.
  }
}
Apr 15 2024, 5:40 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Wikidata-Query-Service, Wikidata
RKemper added a comment to T362534: Make Elasticsearch rolling operation cookbook safer: default to one node per run.

I vote that we choose the default based on the cluster being operated on. So 3 for eqiad and codfw, 2 for cloudelastic and 1 for relforge.

Apr 15 2024, 5:25 PM · Data-Platform-SRE (2024.05.06 - 2024.05.26)
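
The per-cluster defaults proposed above could be expressed as a simple lookup. This is only a sketch of the idea, not the cookbook's code; the dictionary name, the function name, and the fallback for unknown clusters are assumptions.

```python
# Proposed defaults from the comment: nodes restarted per run, by cluster.
DEFAULT_NODES_PER_RUN = {
    "eqiad": 3,
    "codfw": 3,
    "cloudelastic": 2,
    "relforge": 1,
}


def default_nodes_per_run(cluster: str) -> int:
    """Return the proposed default for a cluster."""
    # Assumption: unknown clusters fall back to the most conservative value.
    return DEFAULT_NODES_PER_RUN.get(cluster, 1)
```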

Apr 8 2024

RKemper moved T362080: decommission wdqs1025 from Backlog to Done on the Data-Platform-SRE (2024.03.25 - 2024.04.14) board.

Created subtask for dc-ops' side of the decom. Resolving this parent ticket now.

Apr 8 2024, 11:03 PM · Data-Platform-SRE (2024.03.25 - 2024.04.14)
RKemper added a subtask for T362080: decommission wdqs1025: T362122: decommission wdqs1025.eqiad.wmnet.
Apr 8 2024, 11:01 PM · Data-Platform-SRE (2024.03.25 - 2024.04.14)
RKemper added a parent task for T362122: decommission wdqs1025.eqiad.wmnet: T362080: decommission wdqs1025.
Apr 8 2024, 11:01 PM · ops-eqiad, SRE, decommission-hardware
RKemper created T362122: decommission wdqs1025.eqiad.wmnet.
Apr 8 2024, 11:01 PM · ops-eqiad, SRE, decommission-hardware

Apr 4 2024

RKemper moved T361525: Degraded RAID on elastic2088 from Backlog to Blocked / Waiting on the Data-Platform-SRE (2024.03.25 - 2024.04.14) board.
Apr 4 2024, 6:59 PM · ops-codfw, Data-Platform-SRE (2024.04.15 - 2024.05.05)

Apr 2 2024

RKemper closed T357533: Allow federated queries with the Iconclass sparql endpoint as Resolved.
Apr 2 2024, 3:45 PM · Data-Platform-SRE, Wikidata, Wikidata-Query-Service
RKemper closed T358882: Decommission elastic2037-2054 as Resolved.

With dc-ops having closed out the decom subtask, this should be all done.

Apr 2 2024, 6:12 AM · Data-Platform-SRE (2024.03.25 - 2024.04.14)
RKemper closed T358882: Decommission elastic2037-2054, a subtask of T353878: Service implementation for elastic2087-2109, as Resolved.
Apr 2 2024, 6:11 AM · Data-Platform-SRE (2024.03.25 - 2024.04.14)
RKemper claimed T361268: Service implementation for elastic110[3-7].
Apr 2 2024, 6:11 AM · Data-Platform-SRE (2024.05.27 - 2024.06.16)

Mar 28 2024

RKemper added a subtask for T358882: Decommission elastic2037-2054: T361305: decommission elastic20[37-54].codfw.wmnet.
Mar 28 2024, 8:41 PM · Data-Platform-SRE (2024.03.25 - 2024.04.14)
RKemper added a parent task for T361305: decommission elastic20[37-54].codfw.wmnet: T358882: Decommission elastic2037-2054.
Mar 28 2024, 8:41 PM · SRE, ops-codfw, decommission-hardware
RKemper created T361305: decommission elastic20[37-54].codfw.wmnet.
Mar 28 2024, 8:41 PM · SRE, ops-codfw, decommission-hardware
RKemper assigned T361286: Fatal error detected on elastic2088 to Papaul.
Mar 28 2024, 8:02 PM · SRE, ops-codfw, Data-Platform-SRE
RKemper added a comment to T353878: Service implementation for elastic2087-2109.

elastic2088 is unreachable and reported as missing from PuppetDB by Netbox report. No host should be powered on with puppet disabled or not working for a longer period of time. Please either reimage it or shut it down now and reimage it at a later stage (before powering it on).

Mar 28 2024, 6:42 PM · Data-Platform-SRE (2024.03.25 - 2024.04.14)
RKemper updated the task description for T361268: Service implementation for elastic110[3-7].
Mar 28 2024, 5:56 PM · Data-Platform-SRE (2024.05.27 - 2024.06.16)
RKemper added a comment to T358046: decommission cloudelastic100[1-4].wikimedia.org.

@RKemper Thanks for bringing this up! I missed running the script for this device. It's been run and decommissioned.

Mar 28 2024, 5:54 PM · SRE, ops-eqiad, decommission-hardware
RKemper created T361268: Service implementation for elastic110[3-7].
Mar 28 2024, 5:53 PM · Data-Platform-SRE (2024.05.27 - 2024.06.16)
RKemper reopened T358046: decommission cloudelastic100[1-4].wikimedia.org as "In Progress".

@VRiley-WMF In netbox I see cloudelastic1003 still listed as decommissioning, whereas the other cloudelastic hosts are marked as Offline. Is it just the step to set netbox status to Offline that we're missing or are there other steps that still need to be run on cloudelastic1003 as well?

Mar 28 2024, 5:40 PM · SRE, ops-eqiad, decommission-hardware

Mar 21 2024

RKemper added a comment to T360697: Investigate/fix broken apifeatureusage index deletion.

Glancing at ryankemper@apifeatureusage1001:~$ sudo journalctl -u curator_actions_apifeatureusage_eqiad:

Mar 21 2024, 8:46 PM · Discovery-Search, Data-Platform-SRE

Mar 20 2024

RKemper added a comment to T353845: decommission wdqs100[6-8].

Had forgotten to properly assign dc-ops as well as tag for the DC. Straightened that out now, so this should be ready for dc-ops to do the decom.

Mar 20 2024, 8:08 PM · SRE, ops-eqiad, decommission-hardware
RKemper assigned T353845: decommission wdqs100[6-8] to Jclark-ctr.
Mar 20 2024, 8:07 PM · SRE, ops-eqiad, decommission-hardware

Mar 15 2024

RKemper updated the task description for T358841: Adapt gitlab pipelines for the new wmf-jvm-parent-pom.
Mar 15 2024, 6:53 PM · Data-Platform-SRE (2024.03.25 - 2024.04.14), Release-Engineering-Team, Discovery-Search, Data-Engineering, Metrics Platform Backlog, Java-Scala-Standardization

Mar 13 2024

RKemper added a comment to T347034: RESTBase /v1/related endpoint should call the MW action API with a GET not a POST.

With https://github.com/wikimedia/restbase/pull/1336 being merged, is this ticket now resolved?

Mar 13 2024, 6:46 PM · API Platform, RESTBase Sunsetting, Essential-Work, Wikifeeds, Sustainability (Incident Followup), Discovery-Search

Mar 4 2024

RKemper updated the task description for T338009: Create dashboards for Search SLOs.
Mar 4 2024, 7:37 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Discovery-Search (Current work)

Mar 1 2024

RKemper moved T356651: Rebuild and deploy textify plugin from Needs Review to Done on the Data-Platform-SRE (2024.03.04 - 2024.03.24) board.

This should be all done; @TJones can you confirm all is working as it should?

Mar 1 2024, 9:42 PM · Data-Platform-SRE (2024.02.12 - 2024.03.03), Discovery-Search (Current work)

Feb 29 2024

RKemper claimed T345337: spicerack: tox fails to install PyYAML using python 3.11 on bookworm.

@fnegri @brouberol Yeah, Brian and I will work on getting this tested and merged. Thanks for the heads up!

Feb 29 2024, 7:36 PM · cloud-services-team (FY2023/2024-Q3-Q4), Patch-For-Review, Infrastructure-Foundations, SRE-tools, Spicerack

Feb 21 2024

RKemper added a comment to T351488: Allow federated queries with the MiMoTextBase SPARQL endpoint.

@HinMar Okay, I think we've got the endpoints properly allowed. Queries appear to be working for me. Are you seeing the same?

Feb 21 2024, 10:45 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Wikidata, Wikidata-Query-Service
RKemper added a comment to T339347: qlever dblp endpoint for wikidata federated query nomination.

@Hannah_Bast Okay, we figured out what was preventing the allowed endpoints from being updated properly. https://w.wiki/6q2i doesn't get an error message anymore, although the query itself returns no results.

Feb 21 2024, 10:43 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Wikidata-Query-Service, Wikidata
RKemper moved T346455: Allow federated queries with the NFDI4Culture Knowledge Graph from In Progress to Blocked / Waiting on the Data-Platform-SRE (2024.02.12 - 2024.03.03) board.
Feb 21 2024, 10:41 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Wikidata, Wikidata-Query-Service
RKemper added a comment to T346455: Allow federated queries with the NFDI4Culture Knowledge Graph.

@Loz.ross Yes the change to allowed endpoints did not get properly deployed; we've fixed that now. However there's another issue now that we've gotten past the Service URI not being allowed:

Feb 21 2024, 10:41 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Wikidata, Wikidata-Query-Service

Feb 20 2024

RKemper moved T357780: Decommission cloudelastic1001-1004 from Backlog to Done on the Data-Platform-SRE (2024.02.12 - 2024.03.03) board.
Feb 20 2024, 8:34 PM · Data-Platform-SRE (2024.02.12 - 2024.03.03)
RKemper updated the task description for T358046: decommission cloudelastic100[1-4].wikimedia.org.
Feb 20 2024, 8:33 PM · SRE, ops-eqiad, decommission-hardware
RKemper updated the task description for T358046: decommission cloudelastic100[1-4].wikimedia.org.
Feb 20 2024, 8:25 PM · SRE, ops-eqiad, decommission-hardware
RKemper created T358046: decommission cloudelastic100[1-4].wikimedia.org.
Feb 20 2024, 8:24 PM · SRE, ops-eqiad, decommission-hardware
RKemper claimed T357533: Allow federated queries with the Iconclass sparql endpoint.
Feb 20 2024, 7:38 PM · Data-Platform-SRE, Wikidata, Wikidata-Query-Service

Feb 15 2024

RKemper added a comment to T356651: Rebuild and deploy textify plugin.

Finished the upload process; next up is rolling restart of cluster and merge of https://gitlab.wikimedia.org/repos/search-platform/cirrussearch-elasticsearch-image/-/merge_requests/7?commit_id=f1028a26dff38603bea67a8edda8337dab07bbfc

Feb 15 2024, 5:51 PM · Data-Platform-SRE (2024.02.12 - 2024.03.03), Discovery-Search (Current work)
RKemper edited P19522 Plugin Upload Process.
Feb 15 2024, 5:15 PM · Discovery-Search (Current work)

Feb 6 2024

RKemper added a comment to T338009: Create dashboards for Search SLOs.

Added new threshold markers at 95% for the 4 SLO graphs. We may want to revise the % SLO upwards, but let's stick with 95% for now until we get another quarter of data.

Feb 6 2024, 7:46 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Discovery-Search (Current work)

Feb 1 2024

RKemper added a comment to T351488: Allow federated queries with the MiMoTextBase SPARQL endpoint.

@RKemper : Thank you for your message. The project has ended, but we still kindly ask you to whitelist this endpoint. We at the Trier Center for Digital Humanities will continue to work with LOD beyond this one project and would be delighted to be able to run federated queries starting from Wikidata directed towards the MiMoTextBase. We have developed an approach in MiMoText that we now want to transfer and adapt for other domains in a new project (“LODinG” – Linked Open Data in the Humanities). We are still very interested in the 'wikiverse' and in gaining as much experience as possible in the area of 'federation', which we see as the absolute key to the LOD vision. We are also planning to provide a showcase and if the whitelisting could be done relatively soon, we would like to include this new 'federation direction'. Can you estimate how long it will take?

Feb 1 2024, 7:44 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Wikidata, Wikidata-Query-Service
RKemper added a comment to T339347: qlever dblp endpoint for wikidata federated query nomination.

Yes, https://qlever.cs.uni-freiburg.de/api/dblp is the URL for API calls, whereas https://qlever.cs.uni-freiburg.de/dblp (without the /api) is the URL of the QLever UI. Same for all the other endpoints.

For example, https://qlever.cs.uni-freiburg.de/api/dblp?query=SELECT+%2A+WHERE+%7B+%3Fs+%3Fp+%3Fo+%7D+LIMIT+10 gives you the results for SELECT * WHERE { ?s ?p ?o } LIMIT 10 as application/sparql-results+json.

Feb 1 2024, 7:41 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Wikidata-Query-Service, Wikidata
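
The example URL in the comment above is the SPARQL text form-encoded into a `query` parameter. A small sketch using the Python standard library reproduces it; the helper name is hypothetical, and no request is sent here.

```python
from urllib.parse import urlencode

# API URL as given in the comment (the variant with /api, not the UI URL).
API_BASE = "https://qlever.cs.uni-freiburg.de/api/dblp"


def build_query_url(sparql: str, base: str = API_BASE) -> str:
    """Append the SPARQL text as a form-encoded `query` parameter."""
    return f"{base}?{urlencode({'query': sparql})}"


url = build_query_url("SELECT * WHERE { ?s ?p ?o } LIMIT 10")
print(url)
```

`urlencode` turns spaces into `+` and percent-encodes `*`, `{`, `?` and `}`, which gives the same string as the URL quoted in the comment.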

Jan 31 2024

RKemper added a comment to T346455: Allow federated queries with the NFDI4Culture Knowledge Graph.

@Loz.ross Sorry for the delay, we've added the endpoint. Can you confirm it's working with an example query?

Jan 31 2024, 10:22 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Wikidata, Wikidata-Query-Service
RKemper added a comment to T339347: qlever dblp endpoint for wikidata federated query nomination.
Jan 31 2024, 10:22 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Wikidata-Query-Service, Wikidata
RKemper added a comment to T351488: Allow federated queries with the MiMoTextBase SPARQL endpoint.

@HinMar Sorry for missing this request - our bad! I see your earlier comment mentioned the project expiring by end of 2023. Is the project still ongoing and therefore we should still whitelist this new endpoint or should I instead close this ticket out?

Jan 31 2024, 10:19 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), Wikidata, Wikidata-Query-Service

Jan 30 2024

RKemper closed T355272: ProbeDown as Resolved.
Jan 30 2024, 4:54 PM · Data-Platform-SRE (2024.01.22 - 2024.02.11)

Jan 25 2024

RKemper moved T351354: Service implementation for cloudelastic1007-1010 from In Progress to Blocked / Waiting on the Data-Platform-SRE (2024.01.22 - 2024.02.11) board.

Old masters are no longer master-eligible. They're still participating in the actual cluster; we're holding off on the physical decom until T355617 is done

Jan 25 2024, 10:48 PM · Data-Platform-SRE (2024.02.12 - 2024.03.03)

Jan 24 2024

RKemper added a comment to T351354: Service implementation for cloudelastic1007-1010.

Forgot to add the Bug: label but https://gerrit.wikimedia.org/r/c/operations/puppet/+/992826 is part of this ticket as well

Jan 24 2024, 10:58 PM · Data-Platform-SRE (2024.02.12 - 2024.03.03)

Jan 23 2024

RKemper closed T355593: Re-generate webserver-misc-apps.discovery.wmnet cergen certificate, a subtask of T351650: Expose 3 new dedicated WDQS endpoints, as Resolved.
Jan 23 2024, 7:25 PM · Data-Platform-SRE (2024.01.22 - 2024.02.11), Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
RKemper closed T355593: Re-generate webserver-misc-apps.discovery.wmnet cergen certificate as Resolved.
Jan 23 2024, 7:25 PM · Data-Platform-SRE (2024.01.22 - 2024.02.11), collaboration-services
RKemper moved T350464: Expose SPARQL endpoints with full wikidata data set and with split graph to enable experimentation on federation with a split graph from Blocked/Waiting to Needs Reporting on the Discovery-Search (Current work) board.

This should be all done, with the new experimental services accessible at:

Jan 23 2024, 5:19 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
RKemper updated the task description for T350464: Expose SPARQL endpoints with full wikidata data set and with split graph to enable experimentation on federation with a split graph.
Jan 23 2024, 5:17 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
RKemper moved T354658: Create 3 microsites for wdqs full graph, main graph, & scholarly articles from In Progress to Done on the Data-Platform-SRE (2024.01.22 - 2024.02.11) board.

Experimental microsites are up and externally reachable:

Jan 23 2024, 5:15 PM · Data-Platform-SRE (2024.01.22 - 2024.02.11), Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
RKemper updated the task description for T354658: Create 3 microsites for wdqs full graph, main graph, & scholarly articles.
Jan 23 2024, 5:13 PM · Data-Platform-SRE (2024.01.22 - 2024.02.11), Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
RKemper moved T355593: Re-generate webserver-misc-apps.discovery.wmnet cergen certificate from Backlog to Done on the Data-Platform-SRE (2024.01.22 - 2024.02.11) board.

We've rolled this out following the steps in https://wikitech.wikimedia.org/wiki/Cergen#Update_a_certificate

Jan 23 2024, 5:12 PM · Data-Platform-SRE (2024.01.22 - 2024.02.11), collaboration-services
RKemper moved T351650: Expose 3 new dedicated WDQS endpoints from In Progress to Done on the Data-Platform-SRE (2024.01.22 - 2024.02.11) board.
Jan 23 2024, 5:11 PM · Data-Platform-SRE (2024.01.22 - 2024.02.11), Discovery-Search (Current work), Wikidata-Query-Service, Wikidata