User Details
- User Since
- Aug 21 2018, 6:05 PM (399 w, 14 h)
- Availability
- Available
- LDAP User
- Cwhite
- MediaWiki User
- CWhite (WMF) [ Global Accounts ]
Yesterday
Change is deployed and seems to pass the smoke test. Optimistically closing.
Wed, Apr 8
I overwrote that sector in the hope that the disk will reallocate it and that OpenSearch will correct any data discrepancies when it gets around to it.
I'm thinking disk read errors:
2026-04-07T04:56:16.083470+00:00 logging-hd2001 kernel: [1672676.995139] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:16.083496+00:00 logging-hd2001 kernel: [1672676.995151] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:16.083498+00:00 logging-hd2001 kernel: [1672676.995155] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:16.083499+00:00 logging-hd2001 kernel: [1672676.995164] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:16.083500+00:00 logging-hd2001 kernel: [1672676.995169] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:16.083502+00:00 logging-hd2001 kernel: [1672676.995175] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:16.083503+00:00 logging-hd2001 kernel: [1672676.995195] sd 0:0:1:0: [sdb] tag#6784 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=2s
2026-04-07T04:56:16.083504+00:00 logging-hd2001 kernel: [1672676.995204] sd 0:0:1:0: [sdb] tag#6784 Sense Key : Medium Error [current] [descriptor]
2026-04-07T04:56:16.083506+00:00 logging-hd2001 kernel: [1672676.995209] sd 0:0:1:0: [sdb] tag#6784 Add. Sense: Read retries exhausted
2026-04-07T04:56:16.083507+00:00 logging-hd2001 kernel: [1672676.995214] sd 0:0:1:0: [sdb] tag#6784 CDB: Read(16) 88 00 00 00 00 01 1e e7 30 00 00 00 04 00 00 00
2026-04-07T04:56:16.083508+00:00 logging-hd2001 kernel: [1672676.995218] critical medium error, dev sdb, sector 4813435780 op 0x0:(READ) flags 0x80700 phys_seg 5 prio class 2
2026-04-07T04:56:19.210392+00:00 logging-hd2001 kernel: [1672680.153496] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:19.210409+00:00 logging-hd2001 kernel: [1672680.153501] sd 0:0:1:0: [sdb] tag#6815 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=2s
2026-04-07T04:56:19.210410+00:00 logging-hd2001 kernel: [1672680.153504] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:19.210411+00:00 logging-hd2001 kernel: [1672680.153513] sd 0:0:1:0: [sdb] tag#6815 Sense Key : Medium Error [current] [descriptor]
2026-04-07T04:56:19.210413+00:00 logging-hd2001 kernel: [1672680.153518] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:19.210414+00:00 logging-hd2001 kernel: [1672680.153521] sd 0:0:1:0: [sdb] tag#6815 Add. Sense: Read retries exhausted
2026-04-07T04:56:19.210437+00:00 logging-hd2001 kernel: [1672680.153525] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:19.210438+00:00 logging-hd2001 kernel: [1672680.153528] sd 0:0:1:0: [sdb] tag#6815 CDB: Read(16) 88 00 00 00 00 01 1e e7 33 80 00 00 00 08 00 00
2026-04-07T04:56:19.210440+00:00 logging-hd2001 kernel: [1672680.153533] critical medium error, dev sdb, sector 4813435780 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
2026-04-07T04:56:22.318883+00:00 logging-hd2001 kernel: [1672683.261973] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:22.318906+00:00 logging-hd2001 kernel: [1672683.261977] sd 0:0:1:0: [sdb] tag#6817 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=3s
2026-04-07T04:56:22.318909+00:00 logging-hd2001 kernel: [1672683.261981] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:22.318911+00:00 logging-hd2001 kernel: [1672683.261991] sd 0:0:1:0: [sdb] tag#6817 Sense Key : Medium Error [current] [descriptor]
2026-04-07T04:56:22.318912+00:00 logging-hd2001 kernel: [1672683.261997] sd 0:0:1:0: [sdb] tag#6817 Add. Sense: Read retries exhausted
2026-04-07T04:56:22.318914+00:00 logging-hd2001 kernel: [1672683.262003] sd 0:0:1:0: [sdb] tag#6817 CDB: Read(16) 88 00 00 00 00 01 1e e7 33 80 00 00 00 08 00 00
2026-04-07T04:56:22.318915+00:00 logging-hd2001 kernel: [1672683.262006] critical medium error, dev sdb, sector 4813435780 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
2026-04-07T04:56:25.443753+00:00 logging-hd2001 kernel: [1672686.386824] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:25.443778+00:00 logging-hd2001 kernel: [1672686.386848] sd 0:0:1:0: [sdb] tag#6826 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=3s
2026-04-07T04:56:25.443781+00:00 logging-hd2001 kernel: [1672686.386861] sd 0:0:1:0: [sdb] tag#6826 Sense Key : Medium Error [current] [descriptor]
2026-04-07T04:56:25.443782+00:00 logging-hd2001 kernel: [1672686.386867] sd 0:0:1:0: [sdb] tag#6826 Add. Sense: Read retries exhausted
2026-04-07T04:56:25.443784+00:00 logging-hd2001 kernel: [1672686.386873] sd 0:0:1:0: [sdb] tag#6826 CDB: Read(16) 88 00 00 00 00 01 1e e7 33 80 00 00 00 08 00 00
2026-04-07T04:56:25.443785+00:00 logging-hd2001 kernel: [1672686.386877] critical medium error, dev sdb, sector 4813435780 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
2026-04-07T04:56:28.518821+00:00 logging-hd2001 kernel: [1672689.461882] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:28.518837+00:00 logging-hd2001 kernel: [1672689.461908] sd 0:0:1:0: [sdb] tag#6846 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=3s
2026-04-07T04:56:28.518839+00:00 logging-hd2001 kernel: [1672689.461920] sd 0:0:1:0: [sdb] tag#6846 Sense Key : Medium Error [current] [descriptor]
2026-04-07T04:56:28.518841+00:00 logging-hd2001 kernel: [1672689.461926] sd 0:0:1:0: [sdb] tag#6846 Add. Sense: Read retries exhausted
2026-04-07T04:56:28.518842+00:00 logging-hd2001 kernel: [1672689.461931] sd 0:0:1:0: [sdb] tag#6846 CDB: Read(16) 88 00 00 00 00 01 1e e7 33 80 00 00 00 08 00 00
2026-04-07T04:56:28.518844+00:00 logging-hd2001 kernel: [1672689.461935] critical medium error, dev sdb, sector 4813435780 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
2026-04-07T04:56:31.610758+00:00 logging-hd2001 kernel: [1672692.553802] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:31.610769+00:00 logging-hd2001 kernel: [1672692.553806] sd 0:0:1:0: [sdb] tag#6788 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=3s
2026-04-07T04:56:31.610772+00:00 logging-hd2001 kernel: [1672692.553816] sd 0:0:1:0: [sdb] tag#6788 Sense Key : Medium Error [current] [descriptor]
2026-04-07T04:56:31.610773+00:00 logging-hd2001 kernel: [1672692.553823] sd 0:0:1:0: [sdb] tag#6788 Add. Sense: Read retries exhausted
2026-04-07T04:56:31.610774+00:00 logging-hd2001 kernel: [1672692.553829] sd 0:0:1:0: [sdb] tag#6788 CDB: Read(16) 88 00 00 00 00 01 1e e7 33 80 00 00 00 08 00 00
2026-04-07T04:56:31.610776+00:00 logging-hd2001 kernel: [1672692.553832] critical medium error, dev sdb, sector 4813435780 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
2026-04-07T04:56:34.819055+00:00 logging-hd2001 kernel: [1672695.762075] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:34.819080+00:00 logging-hd2001 kernel: [1672695.762081] sd 0:0:1:0: [sdb] tag#6789 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=3s
2026-04-07T04:56:34.819084+00:00 logging-hd2001 kernel: [1672695.762095] sd 0:0:1:0: [sdb] tag#6789 Sense Key : Medium Error [current] [descriptor]
2026-04-07T04:56:34.819086+00:00 logging-hd2001 kernel: [1672695.762100] sd 0:0:1:0: [sdb] tag#6789 Add. Sense: Read retries exhausted
2026-04-07T04:56:34.819087+00:00 logging-hd2001 kernel: [1672695.762106] sd 0:0:1:0: [sdb] tag#6789 CDB: Read(16) 88 00 00 00 00 01 1e e7 33 80 00 00 00 08 00 00
2026-04-07T04:56:34.819088+00:00 logging-hd2001 kernel: [1672695.762110] critical medium error, dev sdb, sector 4813435780 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
2026-04-07T04:56:37.894085+00:00 logging-hd2001 kernel: [1672698.837089] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:37.894109+00:00 logging-hd2001 kernel: [1672698.837113] sd 0:0:1:0: [sdb] tag#6818 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=3s
2026-04-07T04:56:37.894112+00:00 logging-hd2001 kernel: [1672698.837126] sd 0:0:1:0: [sdb] tag#6818 Sense Key : Medium Error [current] [descriptor]
2026-04-07T04:56:37.894113+00:00 logging-hd2001 kernel: [1672698.837132] sd 0:0:1:0: [sdb] tag#6818 Add. Sense: Read retries exhausted
2026-04-07T04:56:37.894115+00:00 logging-hd2001 kernel: [1672698.837138] sd 0:0:1:0: [sdb] tag#6818 CDB: Read(16) 88 00 00 00 00 01 1e e7 33 80 00 00 00 08 00 00
2026-04-07T04:56:37.894116+00:00 logging-hd2001 kernel: [1672698.837141] critical medium error, dev sdb, sector 4813435780 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
2026-04-07T04:56:41.027614+00:00 logging-hd2001 kernel: [1672701.970600] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2026-04-07T04:56:41.027638+00:00 logging-hd2001 kernel: [1672701.970624] sd 0:0:1:0: [sdb] tag#6829 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=3s
2026-04-07T04:56:41.027640+00:00 logging-hd2001 kernel: [1672701.970636] sd 0:0:1:0: [sdb] tag#6829 Sense Key : Medium Error [current] [descriptor]
2026-04-07T04:56:41.027642+00:00 logging-hd2001 kernel: [1672701.970642] sd 0:0:1:0: [sdb] tag#6829 Add. Sense: Read retries exhausted
2026-04-07T04:56:41.027643+00:00 logging-hd2001 kernel: [1672701.970647] sd 0:0:1:0: [sdb] tag#6829 CDB: Read(16) 88 00 00 00 00 01 1e e7 33 80 00 00 00 08 00 00
2026-04-07T04:56:41.027645+00:00 logging-hd2001 kernel: [1672701.970651] critical medium error, dev sdb, sector 4813435780 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
2026-04-07T04:56:42.866662+00:00 logging-hd2001 opensearch[378008]: fatal error in thread [opensearch[logging-hd2001-production-elk7-codfw][generic][T#22]], exiting
2026-04-07T04:56:42.867561+00:00 logging-hd2001 opensearch[378008]: java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
2026-04-07T04:56:42.867655+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.store.BufferedChecksumIndexInput.readBytes(BufferedChecksumIndexInput.java:46)
2026-04-07T04:56:42.902773+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.store.DataInput.readBytes(DataInput.java:72)
2026-04-07T04:56:42.902903+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.store.ChecksumIndexInput.skipByReading(ChecksumIndexInput.java:79)
2026-04-07T04:56:42.937754+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.store.ChecksumIndexInput.seek(ChecksumIndexInput.java:64)
2026-04-07T04:56:42.937876+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.codecs.CodecUtil.checksumEntireFile(CodecUtil.java:618)
2026-04-07T04:56:42.937946+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.codecs.lucene90.Lucene90PostingsReader.checkIntegrity(Lucene90PostingsReader.java:2049)
2026-04-07T04:56:42.938023+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsReader.checkIntegrity(Lucene90BlockTreeTermsReader.java:330)
2026-04-07T04:56:42.938114+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.checkIntegrity(PerFieldPostingsFormat.java:370)
2026-04-07T04:56:42.938261+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.codecs.perfield.PerFieldMergeState$FilterFieldsProducer.checkIntegrity(PerFieldMergeState.java:296)
2026-04-07T04:56:42.938384+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:83)
2026-04-07T04:56:42.938459+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.merge(PerFieldPostingsFormat.java:205)
2026-04-07T04:56:42.938531+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:209)
2026-04-07T04:56:42.938623+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:298)
2026-04-07T04:56:42.938727+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:137)
2026-04-07T04:56:42.938841+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5140)
2026-04-07T04:56:42.946807+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4680)
2026-04-07T04:56:42.946917+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:6432)
2026-04-07T04:56:42.947003+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:639)
2026-04-07T04:56:42.947078+00:00 logging-hd2001 opensearch[378008]: #011at org.opensearch.index.engine.OpenSearchConcurrentMergeScheduler.doMerge(OpenSearchConcurrentMergeScheduler.java:120)
2026-04-07T04:56:43.004903+00:00 logging-hd2001 opensearch[378008]: #011at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:700)
2026-04-07T04:56:44.157299+00:00 logging-hd2001 systemd[1]: opensearch_2@production-elk7-codfw.service: Main process exited, code=exited, status=128/n/a
2026-04-07T04:56:44.174262+00:00 logging-hd2001 systemd[1]: opensearch_2@production-elk7-codfw.service: Failed with result 'exit-code'.
2026-04-07T04:56:44.186310+00:00 logging-hd2001 systemd[1]: opensearch_2@production-elk7-codfw.service: Consumed 6h 53min 12.273s CPU time.
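As a sanity check, the Read(16) CDB bytes printed by the kernel can be decoded to confirm the reported bad sector falls inside the failed request. A minimal sketch (CDB layout per the SCSI SBC spec: opcode 0x88, LBA in bytes 2-9, transfer length in bytes 10-13; the bytes are copied verbatim from the log above):

```python
def decode_read16(cdb_hex: str):
    """Return (lba, block_count) from a 16-byte SCSI Read(16) CDB hex string."""
    b = bytes.fromhex(cdb_hex.replace(" ", ""))
    assert b[0] == 0x88, "not a Read(16) opcode"
    lba = int.from_bytes(b[2:10], "big")     # bytes 2-9: logical block address
    count = int.from_bytes(b[10:14], "big")  # bytes 10-13: transfer length
    return lba, count

# CDB and failing sector taken from the kernel log above.
lba, count = decode_read16("88 00 00 00 00 01 1e e7 33 80 00 00 00 08 00 00")
bad_sector = 4813435780
print(lba, count, lba <= bad_sector < lba + count)  # prints: 4813435776 8 True
```

So the repeated 8-block read starts at LBA 4813435776 and the reported bad sector 4813435780 sits inside it, consistent with a single unreadable sector being retried.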
Fri, Apr 3
The istio-system namespace is logging ~980 events/sec. Many are just istio-ingressgateway for authority:page-analytics.discovery.wmnet (~833 events/sec).
Wed, Mar 25
I'd like to propose a rename to logging-kafka so that these hosts follow the other logging-* hosts, indicating their role in the larger cluster.
Mon, Mar 16
Mar 13 2026
Mar 12 2026
The last thing the old Rackspace host handles is the [301] redirect http[s]://wikimediastatus.net -> https://www.wikimediastatus.net
Mar 11 2026
The files I'd like to serve can be found in swift at https://ms-fe.svc.eqiad.wmnet/v1/AUTH_performance/arclamp-(logs|svgs)-(hourly|daily)
Some napkin math with the currently configured 3-year retention period yields 2,023,560 files and ~634GB of data. Files range from a few hundred bytes to a little over a hundred megabytes.
Mar 6 2026
Poolcounter logs are greatly reduced post-deploy.
$ zcat poolcounter.log-20260304*.gz | wc -l
844298029
$ zcat poolcounter.log-20260305*.gz | wc -l
73026
Mar 2 2026
Feb 24 2026
Feb 23 2026
As part of T376400: Redesign wikitech-static, the backup wikitech moved away from the legacy wikitech-static host. Part of that process was pointing wikitech-static.wikimedia.org away from the legacy host to its new home. After that, certbot on the legacy host could no longer complete certificate renewals for a name it no longer served. This renewal discontinuity interrupted the wikimediastatus.net and status.wikimedia.org renewals.
Jan 28 2026
Jan 26 2026
Jan 22 2026
I've added monitoring and alerting to the new cname record. Considering this done!
Jan 21 2026
Was bold and made this change. Will direct here for further discussion if there are other opinions.
Jan 15 2026
I'm +1 for this change. Usually, I'm looking at the network panels to see where utilization is relative to the link speed. It'd be nice to not have to manually calculate it.
Thank you!!
Jan 14 2026
For context, the outage was caused by saturated nics on the titan hosts.
For visibility, the outage today was a "grafana consumed all the memory" condition. https://grafana-next.wikimedia.org, which links to the read-only backup in the standby DC, remained available and was used to diagnose the primary grafana host.
Jan 12 2026
We (observability) asked about this in the all-SRE meeting and most preferred to 301 status.wm.o -> wikimediastatus.net on the grounds that there are many Wikimedia properties using third-party tools and hosting with different privacy policies. Another option presented was to host a small static page on miscweb.
Jan 6 2026
Puppet is running on icinga again after that last patch.
Dec 23 2025
Dec 22 2025
Dec 18 2025
I learned some things.
Dec 17 2025
Done! Let us know if something is amiss!
Dec 16 2025
I've reset the timezone back to UTC in the settings.
Dec 11 2025
Record is in place and config change to scap is deployed.
Will watch memory usage and filesystem usage in the coming weeks.
Dec 9 2025
List of affected hosts:
asw1-b3-magru asw1-b4-magru asw1-bw27-esams asw1-by27-esams cloudsw1-b1-codfw lsw1-a2-codfw lsw1-a3-codfw lsw1-a4-codfw lsw1-a5-codfw lsw1-a6-codfw lsw1-a7-codfw lsw1-a8-codfw lsw1-b2-codfw lsw1-b3-codfw lsw1-b4-codfw lsw1-b5-codfw lsw1-b6-codfw lsw1-b7-codfw lsw1-b8-codfw lsw1-c1-codfw lsw1-c2-codfw lsw1-c3-codfw lsw1-c4-codfw lsw1-c5-codfw lsw1-c6-codfw lsw1-c7-codfw lsw1-d1-codfw lsw1-d2-codfw lsw1-d3-codfw lsw1-d4-codfw lsw1-d5-codfw lsw1-d6-codfw lsw1-d7-codfw lsw1-d8-codfw lsw1-e1-eqiad lsw1-e2-eqiad lsw1-e3-eqiad lsw1-e5-eqiad lsw1-e6-eqiad lsw1-e7-eqiad lsw1-e8-eqiad lsw1-f1-eqiad lsw1-f2-eqiad lsw1-f3-eqiad lsw1-f5-eqiad lsw1-f6-eqiad lsw1-f7-eqiad lsw1-f8-eqiad ssw1-a1-codfw ssw1-a8-codfw ssw1-d1-codfw ssw1-d8-codfw ssw1-e1-eqiad ssw1-f1-eqiad
Dec 5 2025
Dec 4 2025
For awareness, there's also a regression affecting annotations in Grafana: https://github.com/grafana/grafana/issues/110265
Dec 3 2025
Done! Let us know if something is amiss!
For batch-job oriented applications, we run a Prometheus PushGateway instance which may be an option for this. A Prometheus metrics endpoint in a stateful service is the ideal, though. Is there an opportunity here with spiderpig being a daemon service and could serve as the source of this data? (Guessing myself: probably not, but it can't hurt to ask the experts. 🙂)
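Either way, whether pushed to the PushGateway or scraped from a running daemon, the data ends up as Prometheus text-exposition samples. A sketch of what one such sample looks like; the metric name and labels here are hypothetical, not actual spiderpig metrics:

```python
import time

def format_metric(name: str, labels: dict, value: float) -> str:
    """Render one sample in the Prometheus text exposition format."""
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{name}{{{label_str}}} {value}"

# Hypothetical gauge a batch job might push after a deploy finishes.
line = format_metric(
    "deploy_last_success_timestamp_seconds",  # illustrative name only
    {"host": "deploy1003", "repo": "mediawiki/core"},
    time.time(),
)
print(line)
```

The advantage of the stateful-service route is that the scrape reflects current truth; the PushGateway route only remembers the last value a job pushed.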
Dec 2 2025
The replacement for this annotation tool is to use the Public Logs datasource in Grafana which is backed by Loki. Please let us know if the Observability team can be of further assistance.
Nov 25 2025
component/opensearch27 provisioned and beta-logs is downgraded again.
Nov 21 2025
Nov 20 2025
I've removed the rsync job that I suspect was causing Loki to panic regularly.
Nov 14 2025
Thank you!
Nov 6 2025
In today's case, the alert criteria weren't met because the metrics went missing.
Nov 5 2025
The network ACLs and the scap config were updated to use the newer host: logging-logstash-04.
Nov 4 2025
Oct 28 2025
The maintenance overhead may well be feasible. It depends on how stable the data source is and how performant it can be made. Performance testing is needed.
That seems problematic even with IPoid. I guess this is a no-go then?
I wouldn't say it's no-go because of that. Isolating the stream is possible and so is a separate data enrichment step. Isolating the stream would give us a more concrete picture of the load this feature will induce on the Spur data provider (whatever form it takes).
Is there maybe a way to do such post-processing inside the OpenSearch index?
Post-processing indexed events is less preferable as it introduces tombstoning overhead. We'd like to avoid that. It is far better to inject the data into the event before it reaches OpenSearch for storage. Knowing the volume of the isolated stream would help us assess the impact.
I see @kostajh wrote about how to import the Spur data to OpenSearch, so maybe there's an intent to do that anyway and we'd only have to connect the two indexes somehow?
OpenSearch has no JOIN clause. Depending on the size of the data Spur gives us, I do see the possibility of the separate data enrichment pipeline pulling data backed by an OpenSearch index, though. Performance testing is needed.
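To make the shape of that concrete: the join happens in the enrichment pipeline, not in OpenSearch. A sketch with hypothetical field names, assuming the Spur data has been snapshotted into a local keyed lookup (none of this reflects an actual implementation):

```python
# Hypothetical pre-index enrichment: attach Spur attributes to each event
# before it is shipped to OpenSearch, since OpenSearch cannot JOIN indexes.

spur_snapshot = {
    # illustrative keyed lookup built from a periodic Spur export
    "198.51.100.7": {"client": "vpn", "risk": "high"},
}

def enrich(event: dict, lookup: dict) -> dict:
    """Copy Spur attributes into the event, keyed on client IP."""
    extra = lookup.get(event.get("client_ip"), {})
    return {**event, **{f"spur_{k}": v for k, v in extra.items()}}

event = {"client_ip": "198.51.100.7", "uri": "/w/index.php"}
print(enrich(event, spur_snapshot))
```

Whether the lookup table lives in memory, on disk, or behind an OpenSearch index is the part that needs performance testing.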
Oct 27 2025
Oct 22 2025
Oct 16 2025
Based on my understanding of what DMARC reports contain, I think this data is fine to store in Logstash. I read the recommendation to "keep email logs out of logstash" as being about preventing leaks of data and metadata about private communications.
Oct 15 2025
We (observability) provide the infrastructure, but this request looks like a MediaWiki config setting. I believe the linked patch would fulfill the ask, but someone else should make the decision on whether it is the right change.
Please correct me if I'm wrong, but is this link generated and displayed by requestctl?
Do you happen to have a trixie host available that we can try the existing package on?
Oct 14 2025
A few things come to mind immediately.
- We don't want to introduce an external (to us) dependency. Logstash has no internet access and it shouldn't have it. (I am guessing the workaround is to use IPoid?)
- We don't want to introduce additional latency to the overall pipeline. The desired stream ought to be isolated so that delays induced by the external data provider do not add latency for other tenants.
- Logstash handles thousands of events per second which translates into requests against the data source. Whatever handles the data should be able to absorb this extra load.
- The Logstash SLO may need to be tuned to handle the introduction of an external dependency.
