
tappof (Tiziano Fogli)
User

Projects (12)

User Details

User Since
Jul 23 2024, 9:16 AM (90 w, 4 d)
Availability
Available
IRC Nick
tappof
LDAP User
Tiziano Fogli
MediaWiki User
Tiziano Fogli [ Global Accounts ]

Recent Activity

Wed, Apr 15

tappof moved T422232: PrometheusZombieSeriesDetected from Inbox to Backlog on the Observability-Metrics board.
Wed, Apr 15, 1:53 PM · Observability-Metrics, observability
tappof added a project to T422232: PrometheusZombieSeriesDetected: Observability-Metrics.
Wed, Apr 15, 1:53 PM · Observability-Metrics, observability
tappof moved T422232: PrometheusZombieSeriesDetected from Inbox to Radar on the observability board.
Wed, Apr 15, 1:52 PM · Observability-Metrics, observability
tappof closed T386911: Increase thanos compact capacity for shorter cycle times as Resolved.

The multi-instance Thanos compactor has been deployed: Prometheus instances are assigned to compactor instances on the titan hosts via the prometheus::instances Hiera variable.

Wed, Apr 15, 1:47 PM · Patch-For-Review, Observability-Metrics

Fri, Apr 10

tappof created P90345 (An Untitled Masterwork).
Fri, Apr 10, 8:00 AM

Thu, Apr 9

tappof added a comment to T417742: ThanosCompactHalted: pre compaction overlap check - overlaps found while gathering blocks..
root@titan2001:/srv/rewrite# cat /tmp/tbd  | awk '{print $2}' | xargs -I % thanos tools bucket mark --id=% --marker=deletion-mark.json --details="manual deletion" --objstore.config-file=/etc/thanos-store@main/objstore.yaml
ts=2026-04-09T07:31:33.471510733Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-04-09T07:31:33.842489347Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01KNB5NY6PMVSNTAAZHX6VE5CQ
ts=2026-04-09T07:31:33.842528357Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01KNB5NY6PMVSNTAAZHX6VE5CQ
ts=2026-04-09T07:31:33.842585231Z caller=main.go:174 level=info msg=exiting
Thu, Apr 9, 7:36 AM · SRE Observability (FY2025/2026-Q4), Observability-Metrics

Wed, Apr 8

tappof added a comment to T422114: Prometheus doing requests via proxy that gets 400.

This is intentional (see modules/profile/manifests/installserver/proxy.pp:65). An alternative that would let the check expect a 200 OK would have been to allow Prometheus hosts, via an ACL, to query the :8080/squid-internal-mgr/info endpoint.

Wed, Apr 8, 10:17 AM · SRE Observability (FY2025/2026-Q4)

Wed, Apr 1

tappof created T422020: Yubikey-SSH-FIDO for Tiziano Fogli (tappof / BACKUP).
Wed, Apr 1, 12:47 PM · SRE, SRE-Access-Requests
tappof triaged T420699: PrometheusSeriesCreationRateAnomalyHigh as Low priority.
Wed, Apr 1, 9:41 AM · Observability-Metrics, observability

Tue, Mar 31

tappof added a comment to T420676: Allow Prometheus query beyond 375 days in Grafana/Thanos.

Yes, as volans reported, this is intentional. The current maximum range length is 365 days plus an additional 10 days, so that a 10-day window can be compared across one year. Unfortunately, the parameter cannot be tuned on a per-query basis.
That said, the suggestion to add a second query to the panel (or a second panel) using an offset query is a valid one.
If needed, I think we can add a few hours to the limit to reach a window of 1 year and 1 month.
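
The arithmetic behind the limit can be sketched as follows. This is a minimal illustration, not the actual Thanos or Grafana configuration: the names `MAX_RANGE` and `fits_limit` are hypothetical, and treating the limit as applying to the query window plus its offset is an assumption, since the comment does not describe how the limit is enforced.

```python
from datetime import timedelta

# 365 days plus the extra 10 days mentioned in the comment.
MAX_RANGE = timedelta(days=365) + timedelta(days=10)


def fits_limit(window, offset=timedelta(0), limit=MAX_RANGE):
    """Return True if window + offset stays within the limit.

    Assumption (not confirmed by the source): the limit is checked
    against the total lookback, i.e. the window plus the offset.
    """
    return window + offset <= limit


# A 10-day window compared with the same window one year earlier fits.
print(fits_limit(timedelta(days=10), offset=timedelta(days=365)))
# A 30-day window with the same one-year offset does not.
print(fits_limit(timedelta(days=30), offset=timedelta(days=365)))
```

Under that assumption, a year-over-year comparison fits only as long as the compared window is 10 days or shorter.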

Tue, Mar 31, 3:33 PM · Regression, Observability-Metrics, observability, Grafana
tappof created P90028 (An Untitled Masterwork).
Tue, Mar 31, 9:07 AM

Mon, Mar 30

tappof created P89965 prometheus ferm -> nft.
Mon, Mar 30, 7:36 AM

Fri, Mar 27

tappof added a comment to T420698: PrometheusSeriesCountAnomalyHigh.
topk(1000,
  count by (metric_name) (
    label_replace({__name__=~".+", job="k8s-pods"}, "metric_name", "$1", "__name__", "(.+)")
  )
  -
  (
    count by (metric_name) (
      label_replace({__name__=~".+", job="k8s-pods"} offset 10d, "metric_name", "$1", "__name__", "(.+)")
    )
    or
    (0 * count by (metric_name) (
      label_replace({__name__=~".+", job="k8s-pods"}, "metric_name", "$1", "__name__", "(.+)")
    ))
  )
)
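
The query above ranks metric names by how much their series count grew compared with 10 days earlier. The `or (0 * count ...)` branch supplies a zero for names that had no series 10 days ago, so brand-new metrics are not dropped by the subtraction. A minimal Python sketch of the same logic, using plain dictionaries in place of PromQL vectors (the function name and the sample metric names are hypothetical):

```python
def series_growth(current, previous, k=1000):
    """Top-k metric names by growth in series count.

    current / previous map metric name -> series count now and 10 days ago.
    Names missing from `previous` are compared against 0, mirroring the
    `or (0 * count ...)` fallback. Names present only in `previous` are
    dropped, as they would be by one-to-one vector matching.
    """
    growth = {name: n - previous.get(name, 0) for name, n in current.items()}
    ranked = sorted(growth.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k]


current = {"http_requests_total": 120, "new_metric": 40, "stable": 10}
previous = {"http_requests_total": 100, "stable": 10, "gone": 5}
print(series_growth(current, previous))
```

In this made-up sample, `new_metric` ranks first because its whole series count is new, while `gone` never appears in the result.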
Fri, Mar 27, 4:48 PM · observability
tappof created T421517: Alert in need of triage: AlertLintProblem (instance localhost:9123).
Fri, Mar 27, 4:23 PM · Patch-For-Review, Infrastructure-Foundations, sre-alert-triage
tappof added a comment to T420698: PrometheusSeriesCountAnomalyHigh.

The ext alert is likely related to the DC switchover.

Fri, Mar 27, 4:12 PM · observability
tappof moved T420699: PrometheusSeriesCreationRateAnomalyHigh from Inbox to Backlog on the Observability-Metrics board.
Fri, Mar 27, 2:12 PM · Observability-Metrics, observability
tappof added a project to T420699: PrometheusSeriesCreationRateAnomalyHigh: Observability-Metrics.
Fri, Mar 27, 2:11 PM · Observability-Metrics, observability
tappof added a comment to T420699: PrometheusSeriesCreationRateAnomalyHigh.

Related to the DC switchover. This will be resolved once the seasonality approach has enough data to correctly compute the standard pattern.

Fri, Mar 27, 2:10 PM · Observability-Metrics, observability

Thu, Mar 26

tappof added a comment to T416745: ThanosCompactHalted: add series - symbol table size exceeds.
root@titan2001:/srv/rewrite# journalctl -u thanos-compact | grep halt | tail -n 1 | sed -nr 's/^.*\[(.*)\].*$/\1/p' | tr -s ' ' '\n' | awk -F '/' '{print $NF}' | xargs -I % thanos tools bucket --objstore.config-file /etc/thanos-compact/objstore.yaml mark --id=% --marker=no-compact-mark.json --details="compactor halted due to size"
ts=2026-03-26T22:36:19.520898876Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-26T22:36:20.136700616Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ1B31QZ82XFTBRW8XM5B10F
ts=2026-03-26T22:36:20.136731522Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ1B31QZ82XFTBRW8XM5B10F
ts=2026-03-26T22:36:20.136763999Z caller=main.go:174 level=info msg=exiting
ts=2026-03-26T22:36:20.161640077Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-26T22:36:20.782733716Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KK5A391P07EHN0QRWMPY9MMG
ts=2026-03-26T22:36:20.782769235Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KK5A391P07EHN0QRWMPY9MMG
ts=2026-03-26T22:36:20.782808922Z caller=main.go:174 level=info msg=exiting
Thu, Mar 26, 10:36 PM · SRE Observability (FY2025/2026-Q3)
tappof added a comment to T416745: ThanosCompactHalted: add series - symbol table size exceeds.
root@titan2001:/srv/rewrite# journalctl -u thanos-compact | grep halt | tail -n 1 | sed -nr 's/^.*\[(.*)\].*$/\1/p' | tr -s ' ' '\n' | awk -F '/' '{print $NF}' | xargs -I % thanos tools bucket --objstore.config-file /etc/thanos-compact/objstore.yaml mark --id=% --marker=no-compact-mark.json --details="compactor halted due to size"
ts=2026-03-26T20:40:55.969210977Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-26T20:40:56.573109549Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KK4MX9QZMCR6JHKV25C8TCFN
ts=2026-03-26T20:40:56.573155708Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KK4MX9QZMCR6JHKV25C8TCFN
ts=2026-03-26T20:40:56.57319678Z caller=main.go:174 level=info msg=exiting
ts=2026-03-26T20:40:56.60021435Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-26T20:40:57.320865233Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KK4XGGQCB44P5D5XTCN8YXGY
ts=2026-03-26T20:40:57.32092733Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KK4XGGQCB44P5D5XTCN8YXGY
ts=2026-03-26T20:40:57.320999165Z caller=main.go:174 level=info msg=exiting
Thu, Mar 26, 8:41 PM · SRE Observability (FY2025/2026-Q3)
tappof added a comment to T416745: ThanosCompactHalted: add series - symbol table size exceeds.
root@titan2001:/srv/rewrite# journalctl -u thanos-compact | grep halt | tail -n 1 | sed -nr 's/^.*\[(.*)\].*$/\1/p' | tr -s ' ' '\n' | awk -F '/' '{print $NF}' | xargs -I % thanos tools bucket --objstore.config-file /etc/thanos-compact/objstore.yaml mark --id=% --marker=no-compact-mark.json --details="compactor halted due to size"
ts=2026-03-26T17:34:10.457880981Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-26T17:34:11.069489832Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ2W68ZNPN16MXB7D283WS31
ts=2026-03-26T17:34:11.069538396Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ2W68ZNPN16MXB7D283WS31
ts=2026-03-26T17:34:11.069588421Z caller=main.go:174 level=info msg=exiting
ts=2026-03-26T17:34:11.121816863Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-26T17:34:11.826060123Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KK0E1DGZWDEF0JJ1YF9T8SXB
ts=2026-03-26T17:34:11.826095643Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KK0E1DGZWDEF0JJ1YF9T8SXB
ts=2026-03-26T17:34:11.826132829Z caller=main.go:174 level=info msg=exiting
ts=2026-03-26T17:34:11.852982441Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-26T17:34:12.491859477Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KK4EHGC2VAASWWWS6KJ5ACHT
ts=2026-03-26T17:34:12.491908721Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KK4EHGC2VAASWWWS6KJ5ACHT
ts=2026-03-26T17:34:12.49194897Z caller=main.go:174 level=info msg=exiting
Thu, Mar 26, 5:34 PM · SRE Observability (FY2025/2026-Q3)

Wed, Mar 25

tappof added a comment to T417742: ThanosCompactHalted: pre compaction overlap check - overlaps found while gathering blocks..
ts=2026-03-25T20:32:04.70702737Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-25T20:32:05.423033143Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01KDG1XXFWVF3WK02GNR8QF44Z
ts=2026-03-25T20:32:05.423071023Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01KDG1XXFWVF3WK02GNR8QF44Z
ts=2026-03-25T20:32:05.423098889Z caller=main.go:174 level=info msg=exiting
ts=2026-03-25T20:32:05.450567006Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-25T20:32:06.036279314Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01KE1TX1X3A0S0WXS3MY0TX42C
ts=2026-03-25T20:32:06.03630991Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01KE1TX1X3A0S0WXS3MY0TX42C
ts=2026-03-25T20:32:06.03635356Z caller=main.go:174 level=info msg=exiting
Wed, Mar 25, 8:32 PM · SRE Observability (FY2025/2026-Q4), Observability-Metrics
tappof added a comment to T417742: ThanosCompactHalted: pre compaction overlap check - overlaps found while gathering blocks..
ts=2026-03-25T16:28:22.261487795Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-25T16:28:22.912255714Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01KDFXRVABR9ZYGX7B4KVNPZZ2
ts=2026-03-25T16:28:22.91228853Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01KDFXRVABR9ZYGX7B4KVNPZZ2
ts=2026-03-25T16:28:22.912330496Z caller=main.go:174 level=info msg=exiting
ts=2026-03-25T16:28:22.938868523Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-25T16:28:23.538771637Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01KE19B43Q05H5YKNVDXABNASC
ts=2026-03-25T16:28:23.53880178Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01KE19B43Q05H5YKNVDXABNASC
ts=2026-03-25T16:28:23.538842313Z caller=main.go:174 level=info msg=exiting
Wed, Mar 25, 4:49 PM · SRE Observability (FY2025/2026-Q4), Observability-Metrics
tappof updated the task description for T417742: ThanosCompactHalted: pre compaction overlap check - overlaps found while gathering blocks..
Wed, Mar 25, 4:29 PM · SRE Observability (FY2025/2026-Q4), Observability-Metrics
tappof added a comment to T417742: ThanosCompactHalted: pre compaction overlap check - overlaps found while gathering blocks..
root@titan2001:/srv/rewrite# thanos tools bucket mark --id=01KE19BDY37QPBB3EM4P6S9NCR --marker=deletion-mark.json --details="manual deletion" --objstore.config-file=/etc/thanos-store@main/objstore.yaml
ts=2026-03-25T11:39:53.211582749Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-25T11:39:53.84818465Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01KE19BDY37QPBB3EM4P6S9NCR
ts=2026-03-25T11:39:53.848216495Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01KE19BDY37QPBB3EM4P6S9NCR
ts=2026-03-25T11:39:53.848245804Z caller=main.go:174 level=info msg=exiting
root@titan2001:/srv/rewrite# thanos tools bucket mark --id=01KDFFD341TD2SVKMMHQD40ZWH --marker=deletion-mark.json --details="manual deletion" --objstore.config-file=/etc/thanos-store@main/objstore.yaml
ts=2026-03-25T11:40:06.965712946Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-25T11:40:07.630262639Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01KDFFD341TD2SVKMMHQD40ZWH
ts=2026-03-25T11:40:07.630313034Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01KDFFD341TD2SVKMMHQD40ZWH
ts=2026-03-25T11:40:07.630362639Z caller=main.go:174 level=info msg=exiting
Wed, Mar 25, 12:00 PM · SRE Observability (FY2025/2026-Q4), Observability-Metrics

Tue, Mar 24

tappof closed T419713: thanos swift capacity for FY 26/27 as Resolved.

I filed a dedicated task (T421078: Offload more queries to remote Prometheus instances to improve performance for fresh data queries) for offloading queries to remote instances with SSD disks. I think we can safely close this task.
Thank you all.

Tue, Mar 24, 2:10 PM · SRE-swift-storage, SRE, Observability-Metrics
tappof created T421078: Offload more queries to remote Prometheus instances to improve performance for fresh data queries.
Tue, Mar 24, 1:49 PM · Observability-Metrics
tappof added a comment to T417742: ThanosCompactHalted: pre compaction overlap check - overlaps found while gathering blocks..
root@titan2001:/srv/rewrite# cat /tmp/tbd  | awk '{print $2}' | xargs -I % thanos tools bucket mark --id=% --marker=deletion-mark.json --details="manual deletion" --objstore.config-file=/etc/thanos-store@main/objstore.yaml
ts=2026-03-24T11:09:50.087215191Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-24T11:09:50.575287996Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01K77Z18DCCAZ5DN2J7JC2BRSE
ts=2026-03-24T11:09:50.575334267Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01K77Z18DCCAZ5DN2J7JC2BRSE
ts=2026-03-24T11:09:50.575377974Z caller=main.go:174 level=info msg=exiting
ts=2026-03-24T11:09:50.683677179Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-24T11:09:51.156591982Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01K7FFWA1PBYJBEYHMERVBW9RT
ts=2026-03-24T11:09:51.156636349Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01K7FFWA1PBYJBEYHMERVBW9RT
ts=2026-03-24T11:09:51.156674301Z caller=main.go:174 level=info msg=exiting
ts=2026-03-24T11:09:51.183305557Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-24T11:09:51.586508715Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01K89S2TETQZ37B8QYTNDC5ZZ4
ts=2026-03-24T11:09:51.586560716Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01K89S2TETQZ37B8QYTNDC5ZZ4
ts=2026-03-24T11:09:51.586622419Z caller=main.go:174 level=info msg=exiting
ts=2026-03-24T11:09:51.667843693Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-24T11:09:52.137775195Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01K8EHQY4CAH37XEEBMG7ZKS5R
ts=2026-03-24T11:09:52.137810135Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01K8EHQY4CAH37XEEBMG7ZKS5R
ts=2026-03-24T11:09:52.137843255Z caller=main.go:174 level=info msg=exiting
ts=2026-03-24T11:09:52.164801174Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-24T11:09:52.575384744Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01K8R53J26TQ47S3CFJKMQ6800
ts=2026-03-24T11:09:52.575426048Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01K8R53J26TQ47S3CFJKMQ6800
ts=2026-03-24T11:09:52.575456606Z caller=main.go:174 level=info msg=exiting
ts=2026-03-24T11:09:52.601851061Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-24T11:09:53.006006113Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01K9DPRXRK001FEEFGKXNPV3KA
ts=2026-03-24T11:09:53.006041806Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01K9DPRXRK001FEEFGKXNPV3KA
ts=2026-03-24T11:09:53.006074996Z caller=main.go:174 level=info msg=exiting
ts=2026-03-24T11:09:53.031886293Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-24T11:09:53.429814669Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01K9J9G0JRM88D1M5V60AS0GZX
ts=2026-03-24T11:09:53.429854849Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01K9J9G0JRM88D1M5V60AS0GZX
ts=2026-03-24T11:09:53.429893244Z caller=main.go:174 level=info msg=exiting
ts=2026-03-24T11:09:53.48479205Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-24T11:09:53.937137224Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01K9W626EB339WQYPQWD88V8CM
ts=2026-03-24T11:09:53.937191545Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01K9W626EB339WQYPQWD88V8CM
ts=2026-03-24T11:09:53.93723884Z caller=main.go:174 level=info msg=exiting
ts=2026-03-24T11:09:54.037997176Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-24T11:09:54.965401694Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01KAKQQFWE0YPN8537MNDD09ST
ts=2026-03-24T11:09:54.965436617Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01KAKQQFWE0YPN8537MNDD09ST
ts=2026-03-24T11:09:54.9654747Z caller=main.go:174 level=info msg=exiting
ts=2026-03-24T11:09:54.991668467Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-24T11:09:55.451897055Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01KAQKTADP36AQ6SAC348HANZ7
ts=2026-03-24T11:09:55.451934896Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01KAQKTADP36AQ6SAC348HANZ7
ts=2026-03-24T11:09:55.451973589Z caller=main.go:174 level=info msg=exiting
ts=2026-03-24T11:09:55.478759572Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-24T11:09:55.925955245Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01KAQWFR86842R0KY8SN3XBS5E
ts=2026-03-24T11:09:55.926015571Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01KAQWFR86842R0KY8SN3XBS5E
ts=2026-03-24T11:09:55.926061811Z caller=main.go:174 level=info msg=exiting
ts=2026-03-24T11:09:55.953665231Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-24T11:09:56.330940465Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01KB50ZTP6V0MJK9TFWVYFCP7S
ts=2026-03-24T11:09:56.330973047Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01KB50ZTP6V0MJK9TFWVYFCP7S
ts=2026-03-24T11:09:56.331006568Z caller=main.go:174 level=info msg=exiting
ts=2026-03-24T11:09:56.357320892Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-24T11:09:56.781324227Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01KBPXN7DSNN21RJVCYR4ATATS
ts=2026-03-24T11:09:56.781359973Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01KBPXN7DSNN21RJVCYR4ATATS
ts=2026-03-24T11:09:56.781389577Z caller=main.go:174 level=info msg=exiting
ts=2026-03-24T11:09:56.811239593Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-24T11:09:57.226443197Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01KBT7NTTM34BT6Z6A9NNE8CVR
ts=2026-03-24T11:09:57.226480926Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01KBT7NTTM34BT6Z6A9NNE8CVR
ts=2026-03-24T11:09:57.226529087Z caller=main.go:174 level=info msg=exiting
ts=2026-03-24T11:09:57.253404352Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-24T11:09:58.229291113Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01KDCY1BF086W0HWT2GB6254S8
ts=2026-03-24T11:09:58.229338356Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01KDCY1BF086W0HWT2GB6254S8
ts=2026-03-24T11:09:58.229370056Z caller=main.go:174 level=info msg=exiting
ts=2026-03-24T11:09:58.257955016Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-24T11:09:58.678176449Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01KJSX5S689WR0P3MEV14ESJ5F
ts=2026-03-24T11:09:58.678227573Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01KJSX5S689WR0P3MEV14ESJ5F
ts=2026-03-24T11:09:58.678266377Z caller=main.go:174 level=info msg=exiting
Tue, Mar 24, 11:10 AM · SRE Observability (FY2025/2026-Q4), Observability-Metrics

Mon, Mar 23

tappof added a comment to T417742: ThanosCompactHalted: pre compaction overlap check - overlaps found while gathering blocks..
root@titan2001:/srv/thanos-compact# zgrep halt /var/log/syslog.3.gz
2026-03-20T12:42:52.008762+00:00 titan2001 thanos-compact[191194]: ts=2026-03-20T12:42:52.008474762Z caller=compact.go:559 level=error msg="critical error detected; halting" err="compaction: group 300000@5591632650960416942: pre compaction overlap check: overlaps found while gathering blocks. [mint: 1764806400000, maxt: 1766016000000, range: 336h0m0s, blocks: 2]: <ulid: 01KDCTV95JKV1RG3AKXNMF4J6Z, mint: 1764806400000, maxt: 1766016000000, range: 336h0m0s>, <ulid: 01KK2HC980TDNXPF6X8MVCSXFM, mint: 1764806400000, maxt: 1766016000000, range: 336h0m0s>"
2026-03-20T19:41:04.858930+00:00 titan2001 thanos-compact[323313]: ts=2026-03-20T19:41:04.858742541Z caller=compact.go:559 level=error msg="critical error detected; halting" err="compaction: 2 errors: group 300000@5591632650960416942: pre compaction overlap check: overlaps found while gathering blocks. [mint: 1764806400000, maxt: 1766016000000, range: 336h0m0s, blocks: 2]: <ulid: 01KDCTV95JKV1RG3AKXNMF4J6Z, mint: 1764806400000, maxt: 1766016000000, range: 336h0m0s>, <ulid: 01KK2HC980TDNXPF6X8MVCSXFM, mint: 1764806400000, maxt: 1766016000000, range: 336h0m0s>; group 300000@2015487672410861213: upload of 01KM637K1EH1FMVXZFG2VK7YAJ failed: failed to clean block after upload issue. Partial block in system. Err: upload index: upload file /srv/thanos-compact/compact/300000@2015487672410861213/01KM637K1EH1FMVXZFG2VK7YAJ/index as 01KM637K1EH1FMVXZFG2VK7YAJ/index: upload s3 object: Put \"https://thanos-swift.discovery.wmnet/thanos/01KM637K1EH1FMVXZFG2VK7YAJ/index?partNumber=586&uploadId=Yzg4MTY2MjMtZmI2NC00NTRiLWE5ZjUtMGY0ZTNjNjg2MDNj\": write tcp 10.192.32.160:51424->10.2.1.54:443: write: broken pipe: upload index: upload file /srv/thanos-compact/compact/300000@2015487672410861213/01KM637K1EH1FMVXZFG2VK7YAJ/index as 01KM637K1EH1FMVXZFG2VK7YAJ/index: upload s3 object: Put \"https://thanos-swift.discovery.wmnet/thanos/01KM637K1EH1FMVXZFG2VK7YAJ/index?partNumber=586&uploadId=Yzg4MTY2MjMtZmI2NC00NTRiLWE5ZjUtMGY0ZTNjNjg2MDNj\": write tcp 10.192.32.160:51424->10.2.1.54:443: write: broken pipe"
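
The `mint` and `maxt` values in the halt message are Unix timestamps in milliseconds. As a reading aid, the sketch below pulls the block ULIDs and time ranges out of such a message and converts them to UTC. The regular expression is written against the message format shown above only, and the helper names are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches "<ulid: ..., mint: ..., maxt: ..." as printed in the halt message.
BLOCK_RE = re.compile(
    r"<ulid: (?P<ulid>[0-9A-Z]+), mint: (?P<mint>\d+), maxt: (?P<maxt>\d+)"
)


def to_utc(ms):
    """Convert a millisecond Unix timestamp to an aware UTC datetime."""
    return datetime.fromtimestamp(int(ms) / 1000, tz=timezone.utc)


def parse_blocks(message):
    """Return (ulid, mint, maxt) tuples for every block in the message."""
    return [
        (m["ulid"], to_utc(m["mint"]), to_utc(m["maxt"]))
        for m in BLOCK_RE.finditer(message)
    ]


def overlaps(a, b):
    """True if the half-open time ranges of two parsed blocks intersect."""
    return a[1] < b[2] and b[1] < a[2]


# Excerpt copied from the first halt message above.
message = (
    "<ulid: 01KDCTV95JKV1RG3AKXNMF4J6Z, mint: 1764806400000, "
    "maxt: 1766016000000, range: 336h0m0s>, "
    "<ulid: 01KK2HC980TDNXPF6X8MVCSXFM, mint: 1764806400000, "
    "maxt: 1766016000000, range: 336h0m0s>"
)
blocks = parse_blocks(message)
for ulid, mint, maxt in blocks:
    print(ulid, mint.isoformat(), maxt.isoformat())
print(overlaps(blocks[0], blocks[1]))
```

Both blocks decode to the same 14-day range, which matches the `range: 336h0m0s` reported in the log.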
Mon, Mar 23, 10:18 AM · SRE Observability (FY2025/2026-Q4), Observability-Metrics

Fri, Mar 20

tappof added a comment to T419713: thanos swift capacity for FY 26/27.

Next Monday afternoon (European time) we will discuss internally the option you suggested, along with the available ways to implement our idea with Thanos, and will let you know. Thank you so much.

Fri, Mar 20, 4:45 PM · SRE-swift-storage, SRE, Observability-Metrics
tappof added a comment to T419713: thanos swift capacity for FY 26/27.

Blocks from January/February 2026 occupy roughly 50 TiB, as they haven’t been downsampled.

Fri, Mar 20, 3:40 PM · SRE-swift-storage, SRE, Observability-Metrics
tappof added a comment to T419713: thanos swift capacity for FY 26/27.

To incorporate @herron’s comment, we’re exploring a couple of ideas to keep fresh blocks (60-90 days) from Prometheus instances in an SSD-backed bucket. The existing HDD-backed bucket would remain in use alongside the hypothetical new one.

Fri, Mar 20, 11:09 AM · SRE-swift-storage, SRE, Observability-Metrics
tappof added a comment to T417742: ThanosCompactHalted: pre compaction overlap check - overlaps found while gathering blocks..
root@titan2001:/srv/rewrite/analyze# cat /tmp/tbd
| 01KJFRKS6R61D6CADD4KGABTYR | 2025-09-25T00:00:00Z | 2025-10-09T00:00:00Z | 336h0m0s       | -96h0m0s        | 238,522,658 | 37,748,034,658  | 468,900,192   | 6          | false       | prometheus=k8s,replica=d,site=codfw           | 5m0s       | compactor      |
| 01KK07QBXHAV0RR99NQZ4Y0XYD | 2025-10-01T00:00:00Z | 2025-10-03T00:00:00Z | 47h59m59.999s  | 192h0m0.001s    | 114,013,389 | 6,239,350,000   | 145,442,702   | 3          | false       | prometheus=k8s,replica=d,site=codfw           | 5m0s       | compactor      |
root@titan2001:/srv/rewrite/analyze# thanos tools bucket mark --id=01KK07QBXHAV0RR99NQZ4Y0XYD --marker=deletion-mark.json --details="manual deletion" --objstore.config-file=/etc/thanos-store@main/objstore.yaml
ts=2026-03-19T16:24:03.83798136Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-19T16:24:04.235801909Z caller=block.go:203 level=info msg="block has been marked for deletion" block=01KK07QBXHAV0RR99NQZ4Y0XYD
ts=2026-03-19T16:24:04.235834415Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=deletion-mark.json IDs=01KK07QBXHAV0RR99NQZ4Y0XYD
ts=2026-03-19T16:24:04.235878239Z caller=main.go:174 level=info msg=exiting
Fri, Mar 20, 8:15 AM · SRE Observability (FY2025/2026-Q4), Observability-Metrics

Mar 12 2026

tappof created T419861: Alert in need of triage: SmartNotHealthy (instance aqs1015:9100).
Mar 12 2026, 2:44 PM · Data-Persistence, sre-alert-triage
tappof created T419859: Alert in need of triage: PeeringBGPDown (instance cr3-eqsin:9804).
Mar 12 2026, 2:40 PM · netops, Infrastructure-Foundations, sre-alert-triage
tappof created T419858: Alert in need of triage: PeeringBGPDown (instance cr3-eqsin:9804).
Mar 12 2026, 2:40 PM · netops, Infrastructure-Foundations, sre-alert-triage
tappof created T419857: Alert in need of triage: PeeringBGPDown (instance cr1-esams:9804).
Mar 12 2026, 2:40 PM · netops, Infrastructure-Foundations, sre-alert-triage
tappof created T419856: Alert in need of triage: PeeringBGPDown (instance cr1-esams:9804).
Mar 12 2026, 2:39 PM · netops, Infrastructure-Foundations, sre-alert-triage
tappof created T419855: Alert in need of triage: PeeringBGPDown (instance cr3-eqsin:9804).
Mar 12 2026, 2:39 PM · Infrastructure-Foundations, netops, sre-alert-triage
tappof created T419854: Alert in need of triage: PeeringBGPDown (instance cr3-eqsin:9804).
Mar 12 2026, 2:39 PM · Infrastructure-Foundations, netops, sre-alert-triage
tappof updated the task description for T419647: Eqiad: lsw1-d2-eqiad BGP maintenance.
Mar 12 2026, 10:05 AM · netops, Infrastructure-Foundations, SRE

Mar 11 2026

tappof closed T419430: Migrate prometheus4002 to prometheus4003, a subtask of T418993: Migrating ulsfo to routed Ganeti, as Resolved.
Mar 11 2026, 5:04 PM · Patch-For-Review, collaboration-services, Ganeti, Infrastructure-Foundations, SRE
tappof closed T419430: Migrate prometheus4002 to prometheus4003 as Resolved.
Mar 11 2026, 5:04 PM · SRE Observability (FY2025/2026-Q3), Observability-Metrics
tappof added a comment to T392886: Revisit default Istio histogram buckets.

My suggestion would be to remove boundary edges with a ratio below 1% and aggregate those with similar ratios. Ideally, keeping only boundaries close to widely adopted quantiles [1 (maybe), 50, 75 (maybe), 90, 95, 99] would help reduce a lot of dead weight.
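
A rough sketch of that pruning rule, using a subset of the cumulative `istio_response_bytes_bucket` counts from the promtool output posted on the same task. The function name and both thresholds are hypothetical, and "aggregate those with similar ratios" is interpreted here, as an assumption, as dropping a boundary whose cumulative ratio lies within a small tolerance of the previously kept one.

```python
# Cumulative counts per `le` boundary, copied from the promtool output
# on this task (a subset, for brevity).
CUMULATIVE = {
    "+Inf": 7477388827,
    "3600000": 7477379700,
    "300000": 7424856278,
    "10000": 6688645269,
    "2500": 5532494934,
    "1000": 443906297,
    "500": 31369601,
    "250": 8474769,
    "0.5": 197156,
}


def prune_boundaries(cumulative, min_ratio=0.01, merge_tolerance=0.01):
    """Return the finite boundaries worth keeping, in ascending order.

    A boundary is dropped if its cumulative ratio is below `min_ratio`,
    or if it adds less than `merge_tolerance` over the last kept boundary.
    The +Inf bucket is implicit and always remains.
    """
    total = cumulative["+Inf"]
    finite = sorted(
        (float(le), count / total)
        for le, count in cumulative.items()
        if le != "+Inf"
    )
    kept = []
    for le, ratio in finite:
        if ratio < min_ratio:
            continue  # below 1%: carries almost no observations
        if kept and ratio - kept[-1][1] < merge_tolerance:
            continue  # too close to the previous kept boundary
        kept.append((le, ratio))
    return [le for le, _ in kept]


print(prune_boundaries(CUMULATIVE))
```

With these thresholds, the boundaries at 500 and below fall under the 1% cut, and 3600000 is folded into 300000 because it adds less than one percentage point.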

Mar 11 2026, 5:00 PM · ServiceOps new, SRE Observability (FY2025/2026-Q1), Patch-For-Review, Observability-Metrics
tappof updated the task description for T419430: Migrate prometheus4002 to prometheus4003.
Mar 11 2026, 12:34 PM · SRE Observability (FY2025/2026-Q3), Observability-Metrics
tappof updated the task description for T419430: Migrate prometheus4002 to prometheus4003.
Mar 11 2026, 11:25 AM · SRE Observability (FY2025/2026-Q3), Observability-Metrics
tappof updated the task description for T419430: Migrate prometheus4002 to prometheus4003.
Mar 11 2026, 11:13 AM · SRE Observability (FY2025/2026-Q3), Observability-Metrics
tappof updated the task description for T419430: Migrate prometheus4002 to prometheus4003.
Mar 11 2026, 11:13 AM · SRE Observability (FY2025/2026-Q3), Observability-Metrics
tappof updated the task description for T419430: Migrate prometheus4002 to prometheus4003.
Mar 11 2026, 10:51 AM · SRE Observability (FY2025/2026-Q3), Observability-Metrics
tappof updated the task description for T419430: Migrate prometheus4002 to prometheus4003.
Mar 11 2026, 10:42 AM · SRE Observability (FY2025/2026-Q3), Observability-Metrics

Mar 10 2026

tappof added a comment to T392886: Revisit default Istio histogram buckets.
tappof@prometheus2007:~$ promtool query instant http://127.0.0.1:9906/k8s "sum by (le) (istio_response_bytes_bucket)" | sed -nr 's/\{le="(.*)"\} => ([0-9]+) .*$/\1 \2/p' | sort -V -r -k 1 |  awk 'NR==1 {first=$2} {printf "%s %s %.10f\n", $1, $2, $2/first}' | column -t
+Inf     7477388827  1.0000000000
3600000  7477379700  0.9999987794
1800000  7477000001  0.9999479998
600000   7470386524  0.9990635363
300000   7424856278  0.9929744794
60000    7260135818  0.9709453375
30000    7097147065  0.9491477880
10000    6688645269  0.8945161772
5000     5840973881  0.7811515512
2500     5532494934  0.7398966487
1000     443906297   0.0593664857
500      31369601    0.0041952614
250      8474769     0.0011333862
100      203514      0.0000272173
50       197156      0.0000263670
25       197156      0.0000263670
10       197156      0.0000263670
5        197156      0.0000263670
1        197156      0.0000263670
0.5      197156      0.0000263670
Mar 10 2026, 4:37 PM · ServiceOps new, SRE Observability (FY2025/2026-Q1), Patch-For-Review, Observability-Metrics
tappof added a comment to T392886: Revisit default Istio histogram buckets.
Searched for `istio_response_bytes_bucket` and found 0 matching dashboards and 0 matching alerts.
Mar 10 2026, 3:48 PM · ServiceOps new, SRE Observability (FY2025/2026-Q1), Patch-For-Review, Observability-Metrics
tappof updated the task description for T419430: Migrate prometheus4002 to prometheus4003.
Mar 10 2026, 9:19 AM · SRE Observability (FY2025/2026-Q3), Observability-Metrics
tappof updated the task description for T419430: Migrate prometheus4002 to prometheus4003.
Mar 10 2026, 9:17 AM · SRE Observability (FY2025/2026-Q3), Observability-Metrics
tappof updated the task description for T419430: Migrate prometheus4002 to prometheus4003.
Mar 10 2026, 9:16 AM · SRE Observability (FY2025/2026-Q3), Observability-Metrics
tappof updated the task description for T419430: Migrate prometheus4002 to prometheus4003.
Mar 10 2026, 9:13 AM · SRE Observability (FY2025/2026-Q3), Observability-Metrics
tappof updated the task description for T419430: Migrate prometheus4002 to prometheus4003.
Mar 10 2026, 9:00 AM · SRE Observability (FY2025/2026-Q3), Observability-Metrics

Mar 9 2026

tappof created T419430: Migrate prometheus4002 to prometheus4003.
Mar 9 2026, 2:15 PM · SRE Observability (FY2025/2026-Q3), Observability-Metrics

Mar 4 2026

tappof closed T417900: Serve something helpful at metamonitoring.wikimedia.org as Resolved.

metamonitoring.wikimedia.org now redirects to the Wikitech meta-monitoring documentation.

Mar 4 2026, 2:33 PM · Observability-Alerting

Mar 3 2026

tappof added a comment to T410835: ErrorBudgetBurn.

A quick update on the gaps caused by the issue described in the previous comments: over the past week, a patch has been deployed that should prevent similar gaps in the future.

Mar 3 2026, 2:17 PM · Test Kitchen (Test Kitchen (Experiment Platform Sprint 22))
tappof added a comment to T349521: Prometheus/Pyrra: establish backfill process for recording rules.

The solution outlined in the diagram has been implemented. It is now possible to test the backfill process with the new configuration.

Mar 3 2026, 2:08 PM · SRE-SLO, Patch-For-Review, User-herron, Observability-Metrics
tappof closed T412924: Multi-instance thanos store gateway, a subtask of T349521: Prometheus/Pyrra: establish backfill process for recording rules, as Resolved.
Mar 3 2026, 2:06 PM · SRE-SLO, Patch-For-Review, User-herron, Observability-Metrics
tappof closed T412924: Multi-instance thanos store gateway, a subtask of T396862: Improve titan hosts stateless-ness, as Resolved.
Mar 3 2026, 2:06 PM · Observability-Metrics
tappof closed T412924: Multi-instance thanos store gateway, a subtask of T410835: ErrorBudgetBurn, as Resolved.
Mar 3 2026, 2:06 PM · Test Kitchen (Test Kitchen (Experiment Platform Sprint 22))
tappof closed T412924: Multi-instance thanos store gateway as Resolved.
Mar 3 2026, 2:06 PM · SRE Observability (FY2025/2026-Q3), Observability-Metrics
tappof closed T396862: Improve titan hosts stateless-ness as Resolved.
Mar 3 2026, 2:05 PM · Observability-Metrics
tappof added a comment to T396862: Improve titan hosts stateless-ness.

Option 4 has been implemented. Titan hosts can now be considered stateless, provided that at least 2.5 hours elapse between two reimaging operations: --tsdb.block-duration=2h (ruler) plus --consistency-delay=30m (store).
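The 2.5-hour figure is the sum of the two settings quoted above. As a rough sketch of that arithmetic: the function names and the idea of gating on the previous reimage timestamp are assumptions for illustration, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helpers; durations are in minutes.
min_reimage_pause_minutes() {
  # $1: ruler --tsdb.block-duration, $2: store --consistency-delay
  echo $(( $1 + $2 ))
}

# Succeeds when enough time has passed since the previous reimage.
# $1: epoch seconds of the previous reimage, $2: current epoch seconds
safe_to_reimage() {
  pause_s=$(( $(min_reimage_pause_minutes 120 30) * 60 ))
  [ $(( $2 - $1 )) -ge "$pause_s" ]
}

min_reimage_pause_minutes 120 30   # 2h + 30m, in minutes
```

A wrapper around the reimage procedure could call `safe_to_reimage "$last_reimage_epoch" "$(date +%s)"` and refuse to continue when it fails.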

Mar 3 2026, 1:57 PM · Observability-Metrics
tappof added a comment to T417742: ThanosCompactHalted: pre compaction overlap check - overlaps found while gathering blocks..

This should be the last broken compactor iteration related to the codfw blocks (replica c).

Mar 3 2026, 10:28 AM · SRE Observability (FY2025/2026-Q4), Observability-Metrics
tappof added a comment to T416745: ThanosCompactHalted: add series - symbol table size exceeds.
root@titan2001:~# journalctl -u thanos-compact | grep halt | tail -n 1 | sed -nr 's/^.*\[(.*)\].*$/\1/p' | tr -s ' ' '\n' | awk -F '/' '{print $NF}' | xargs -I % thanos tools bucket --objstore.config-file /etc/thanos-compact/objstore.yaml mark --id=% --marker=no-compact-mark.json --details="compactor halted due to size"
ts=2026-03-03T06:25:55.232537053Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-03T06:25:55.623288284Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ1CC2FQBMKKNRA7SPK0FDBC
ts=2026-03-03T06:25:55.623361839Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ1CC2FQBMKKNRA7SPK0FDBC
ts=2026-03-03T06:25:55.623409312Z caller=main.go:174 level=info msg=exiting
ts=2026-03-03T06:25:55.666497203Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-03T06:25:56.16805841Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ1HB2S5TBSBRGD53A1J7AM9
ts=2026-03-03T06:25:56.168098636Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ1HB2S5TBSBRGD53A1J7AM9
ts=2026-03-03T06:25:56.16812536Z caller=main.go:174 level=info msg=exiting
ts=2026-03-03T06:25:56.194348717Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-03T06:25:56.666284765Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ1MHDZXNGHCX0R5H8WC31TH
ts=2026-03-03T06:25:56.666323796Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ1MHDZXNGHCX0R5H8WC31TH
ts=2026-03-03T06:25:56.666360343Z caller=main.go:174 level=info msg=exiting
Mar 3 2026, 6:26 AM · SRE Observability (FY2025/2026-Q3)

Mar 2 2026

tappof added a comment to T416745: ThanosCompactHalted: add series - symbol table size exceeds.
root@titan2001:~# journalctl -u thanos-compact | grep halt | tail -n 1 | sed -nr 's/^.*\[(.*)\].*$/\1/p' | tr -s ' ' '\n' | awk -F '/' '{print $NF}' | xargs -I % thanos tools bucket --objstore.config-file /etc/thanos-compact/objstore.yaml mark --id=% --marker=no-compact-mark.json --details="compactor halted due to size"
ts=2026-03-02T21:33:01.323671649Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-02T21:33:01.774889687Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ4VER1ESSQ1T7Z8Y3BCG4RJ
ts=2026-03-02T21:33:01.774923408Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ4VER1ESSQ1T7Z8Y3BCG4RJ
ts=2026-03-02T21:33:01.774952795Z caller=main.go:174 level=info msg=exiting
ts=2026-03-02T21:33:01.80030355Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-02T21:33:02.222816651Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KHNXJ83J001YMTF54341C96Z
ts=2026-03-02T21:33:02.222859001Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KHNXJ83J001YMTF54341C96Z
ts=2026-03-02T21:33:02.222888483Z caller=main.go:174 level=info msg=exiting
ts=2026-03-02T21:33:02.251554427Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-02T21:33:02.697278054Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ0A99ZE94Y7KF1R4R0W0YBB
ts=2026-03-02T21:33:02.697313663Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ0A99ZE94Y7KF1R4R0W0YBB
ts=2026-03-02T21:33:02.697341398Z caller=main.go:174 level=info msg=exiting
Mar 2 2026, 9:33 PM · SRE Observability (FY2025/2026-Q3)
tappof added a comment to T416745: ThanosCompactHalted: add series - symbol table size exceeds.
root@titan2001:~# journalctl -u thanos-compact | grep halt | tail -n 1 | sed -nr 's/^.*\[(.*)\].*$/\1/p' | tr -s ' ' '\n' | awk -F '/' '{print $NF}' | xargs -I % thanos tools bucket --objstore.config-file /etc/thanos-compact/objstore.yaml mark --id=% --marker=no-compact-mark.json --details="compactor halted due to size"
ts=2026-03-02T17:31:09.949861362Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-02T17:31:10.417103438Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ4NCHEMZVJ1HM8WJ7E9YMG0
ts=2026-03-02T17:31:10.417142766Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ4NCHEMZVJ1HM8WJ7E9YMG0
ts=2026-03-02T17:31:10.417184751Z caller=main.go:174 level=info msg=exiting
ts=2026-03-02T17:31:10.44386891Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-02T17:31:10.870063142Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ4QMKJ6FWENPXBS1VS31S8D
ts=2026-03-02T17:31:10.870099103Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ4QMKJ6FWENPXBS1VS31S8D
ts=2026-03-02T17:31:10.870135237Z caller=main.go:174 level=info msg=exiting
Mar 2 2026, 5:31 PM · SRE Observability (FY2025/2026-Q3)
tappof updated subscribers of T418118: SystemdUnitFailed: grafana-ldap-users-sync.service on grafana1002:9100.

Thank you @bd808.
After a discussion with the infra-foundations team (@MoritzMuehlenhoff), we’re going to apply a patch that removes users with invalid metadata from the Grafana DB.

Mar 2 2026, 3:00 PM · SRE Observability (FY2025/2026-Q4)
tappof added a comment to T417742: ThanosCompactHalted: pre compaction overlap check - overlaps found while gathering blocks..

This should be the last broken compactor iteration related to the eqiad blocks.

Mar 2 2026, 11:55 AM · SRE Observability (FY2025/2026-Q4), Observability-Metrics
tappof added a comment to T416745: ThanosCompactHalted: add series - symbol table size exceeds.
root@titan2001:/srv/rewrite/analyze# journalctl -u thanos-compact | grep halt | tail -n 1 | sed -nr 's/^.*\[(.*)\].*$/\1/p' | tr -s ' ' '\n' | awk -F '/' '{print $NF}' | xargs -I % thanos tools bucket --objstore.config-file /etc/thanos-compact/objstore.yaml mark --id=% --marker=no-compact-mark.json --details="compactor halted due to size"
ts=2026-03-02T08:29:38.442184885Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-02T08:29:38.826132439Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ15PF4N8QYCYR1XC4S7BRTX
ts=2026-03-02T08:29:38.826166419Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ15PF4N8QYCYR1XC4S7BRTX
ts=2026-03-02T08:29:38.826194795Z caller=main.go:174 level=info msg=exiting
ts=2026-03-02T08:29:38.851857477Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-02T08:29:39.218055046Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ8G05G55J7SZWNFQ0Y5HGS9
ts=2026-03-02T08:29:39.218094806Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ8G05G55J7SZWNFQ0Y5HGS9
ts=2026-03-02T08:29:39.218139649Z caller=main.go:174 level=info msg=exiting
Mar 2 2026, 8:29 AM · SRE Observability (FY2025/2026-Q3)
tappof added a comment to T416745: ThanosCompactHalted: add series - symbol table size exceeds.
root@titan2001:/srv/rewrite/analyze# journalctl -u thanos-compact | grep halt | tail -n 1 | sed -nr 's/^.*\[(.*)\].*$/\1/p' | tr -s ' ' '\n' | awk -F '/' '{print $NF}' | xargs -I % thanos tools bucket --objstore.config-file /etc/thanos-compact/objstore.yaml mark --id=% --marker=no-compact-mark.json --details="compactor halted due to size"
ts=2026-03-02T06:42:36.168235556Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-02T06:42:36.545903125Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ4YDT4HZJXKJZ3EEGC437WF
ts=2026-03-02T06:42:36.545959867Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ4YDT4HZJXKJZ3EEGC437WF
ts=2026-03-02T06:42:36.545996126Z caller=main.go:174 level=info msg=exiting
ts=2026-03-02T06:42:36.574658567Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-02T06:42:37.102393159Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ51M4EWCX81YHABTHFDPESW
ts=2026-03-02T06:42:37.102466332Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ51M4EWCX81YHABTHFDPESW
ts=2026-03-02T06:42:37.102502168Z caller=main.go:174 level=info msg=exiting
ts=2026-03-02T06:42:37.157683698Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-02T06:42:37.532416659Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ548G41Z5R14Q6SCR5HPQGV
ts=2026-03-02T06:42:37.532451323Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ548G41Z5R14Q6SCR5HPQGV
ts=2026-03-02T06:42:37.532481587Z caller=main.go:174 level=info msg=exiting
ts=2026-03-02T06:42:37.563926691Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-02T06:42:37.9577184Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ55FZR96MJR9KTGMXX3CR43
ts=2026-03-02T06:42:37.957763384Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ55FZR96MJR9KTGMXX3CR43
ts=2026-03-02T06:42:37.957812304Z caller=main.go:174 level=info msg=exiting
ts=2026-03-02T06:42:37.991343149Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-02T06:42:38.434325513Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ1WMF8D1T91AX1XGGWPDBVZ
ts=2026-03-02T06:42:38.434362327Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ1WMF8D1T91AX1XGGWPDBVZ
ts=2026-03-02T06:42:38.434393701Z caller=main.go:174 level=info msg=exiting
ts=2026-03-02T06:42:38.460906072Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-02T06:42:38.886975344Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ205MMB62D0DTP0GEDN9X0B
ts=2026-03-02T06:42:38.88701105Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ205MMB62D0DTP0GEDN9X0B
ts=2026-03-02T06:42:38.887042683Z caller=main.go:174 level=info msg=exiting
Mar 2 2026, 6:43 AM · SRE Observability (FY2025/2026-Q3)

Mar 1 2026

tappof added a comment to T416745: ThanosCompactHalted: add series - symbol table size exceeds.
root@titan2001:/srv/rewrite/analyze# journalctl -u thanos-compact | grep halt | tail -n 1 | sed -nr 's/^.*\[(.*)\].*$/\1/p' | tr -s ' ' '\n' | awk -F '/' '{print $NF}' | xargs -I % thanos tools bucket --objstore.config-file /etc/thanos-compact/objstore.yaml mark --id=% --marker=no-compact-mark.json --details="compactor halted due to size"
ts=2026-03-01T20:47:05.905841876Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-01T20:47:06.300001961Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ2P5XHHN1VPFC7T0EJEC5HJ
ts=2026-03-01T20:47:06.300052731Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ2P5XHHN1VPFC7T0EJEC5HJ
ts=2026-03-01T20:47:06.300091735Z caller=main.go:174 level=info msg=exiting
ts=2026-03-01T20:47:06.330211279Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-01T20:47:06.724717256Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ780CRES6SNXJCZ0YD3ESR7
ts=2026-03-01T20:47:06.724796535Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ780CRES6SNXJCZ0YD3ESR7
ts=2026-03-01T20:47:06.724850561Z caller=main.go:174 level=info msg=exiting
Mar 1 2026, 8:47 PM · SRE Observability (FY2025/2026-Q3)
tappof added a comment to T416745: ThanosCompactHalted: add series - symbol table size exceeds.
root@titan2001:/srv/rewrite/analyze# journalctl -u thanos-compact | grep halt | tail -n 1 | sed -nr 's/^.*\[(.*)\].*$/\1/p' | tr -s ' ' '\n' | awk -F '/' '{print $NF}' | xargs -I % thanos tools bucket --objstore.config-file /etc/thanos-compact/objstore.yaml mark --id=% --marker=no-compact-mark.json --details="compactor halted due to size"
ts=2026-03-01T14:23:43.292254663Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-01T14:23:43.666672708Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ1EV4C6Z07KQXTGQQHAD3RN
ts=2026-03-01T14:23:43.666707023Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ1EV4C6Z07KQXTGQQHAD3RN
ts=2026-03-01T14:23:43.666749194Z caller=main.go:174 level=info msg=exiting
ts=2026-03-01T14:23:43.693997341Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-01T14:23:44.163264322Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ1GC67GXKVP0ANXVKDGT8NG
ts=2026-03-01T14:23:44.163301376Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ1GC67GXKVP0ANXVKDGT8NG
ts=2026-03-01T14:23:44.163341381Z caller=main.go:174 level=info msg=exiting
ts=2026-03-01T14:23:44.190780813Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-03-01T14:23:44.575853778Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ1KHC542JVZNMJGDZY8HK3E
ts=2026-03-01T14:23:44.575890438Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ1KHC542JVZNMJGDZY8HK3E
ts=2026-03-01T14:23:44.575916649Z caller=main.go:174 level=info msg=exiting
Mar 1 2026, 2:24 PM · SRE Observability (FY2025/2026-Q3)

Feb 27 2026

tappof added a comment to T416745: ThanosCompactHalted: add series - symbol table size exceeds.
Feb 27 18:29:03 titan2001 thanos-compact[2355412]: ts=2026-02-27T18:29:03.633560685Z caller=compact.go:559 level=error msg="critical error detected; halting" err="compaction: group 300000@11257394797428657513: compact blocks [/srv/thanos-compact/compact/300000@11257394797428657513/01KJ0TR72JG689HJEXA673THNG /srv/thanos-compact/compact/300000@11257394797428657513/01KJ0Y3224GPA2909VM6S7N5DC /srv/thanos-compact/compact/300000@11257394797428657513/01KJ10PAEMVAAN5N8Y89HH94AS]: 2 errors: add series: symbol table size exceeds 4294967295 bytes: 6895277120; symbol table size exceeds 4294967295 bytes: 6895277120"
root@titan2001:~# journalctl -u thanos-compact | grep halt | tail -n 1 | sed -nr 's/^.*\[(.*)\].*$/\1/p' | tr -s ' ' '\n' | awk -F '/' '{print $NF}' | xargs -I % thanos tools bucket --objstore.config-file /etc/thanos-compact/objstore.yaml mark --id=% --marker=no-compact-mark.json --details="compactor halted due to size"
ts=2026-02-27T20:21:01.940712842Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-27T20:21:02.641843029Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ0TR72JG689HJEXA673THNG
ts=2026-02-27T20:21:02.641886613Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ0TR72JG689HJEXA673THNG
ts=2026-02-27T20:21:02.641919232Z caller=main.go:174 level=info msg=exiting
ts=2026-02-27T20:21:02.669050969Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-27T20:21:03.124292224Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ0Y3224GPA2909VM6S7N5DC
ts=2026-02-27T20:21:03.124327738Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ0Y3224GPA2909VM6S7N5DC
ts=2026-02-27T20:21:03.124365468Z caller=main.go:174 level=info msg=exiting
ts=2026-02-27T20:21:03.15273297Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-27T20:21:03.552568751Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ10PAEMVAAN5N8Y89HH94AS
ts=2026-02-27T20:21:03.552609163Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ10PAEMVAAN5N8Y89HH94AS
ts=2026-02-27T20:21:03.552657589Z caller=main.go:174 level=info msg=exiting
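The one-liner above takes the bracketed path list from the most recent `halting` log line, reduces each path to its block ID, and passes each ID to `thanos tools bucket mark`. A minimal dry-run sketch of the extraction step alone follows. The function name `halted_block_ids` and the ULID shape check are additions for illustration, not part of the original command. It only prints IDs and marks nothing.

```shell
#!/bin/sh
# Hypothetical helper: prints the block IDs named in the last "halt" log line on stdin.
halted_block_ids() {
  grep halt | tail -n 1 |
    sed -nE 's/^.*\[(.*)\].*$/\1/p' |   # text between the last "[" and the last "]"
    tr -s ' ' '\n' |                     # one block path per line
    awk -F '/' '{print $NF}' |           # last path component = block ID
    LC_ALL=C grep -E '^[0-9A-HJKMNP-TV-Z]{26}$'   # added check: keep only ULID-shaped IDs
}

# Usage on the host (prints only):
#   journalctl -u thanos-compact | halted_block_ids
```

Reviewing the printed IDs before piping them into `xargs` gives a chance to spot an unexpected log format before any block is marked.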
Feb 27 2026, 8:21 PM · SRE Observability (FY2025/2026-Q3)
tappof added a comment to T416745: ThanosCompactHalted: add series - symbol table size exceeds.
root@titan2001:~# journalctl -u thanos-compact | grep halt | tail -n 1 | sed -nr 's/^.*\[(.*)\].*$/\1/p' | tr -s ' ' '\n' | awk -F '/' '{print $NF}' | xargs -I % thanos tools bucket --objstore.config-file /etc/thanos-compact/objstore.yaml mark --id=% --marker=no-compact-mark.json --details="compactor halted due to size"
ts=2026-02-27T17:12:16.619687414Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-27T17:12:17.065982417Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ0A98QFVAERZ2SW7PRK7GDM
ts=2026-02-27T17:12:17.066018886Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ0A98QFVAERZ2SW7PRK7GDM
ts=2026-02-27T17:12:17.066057302Z caller=main.go:174 level=info msg=exiting
ts=2026-02-27T17:12:17.093581749Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-27T17:12:17.54114441Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ0K2R2ZSSRKXYAQXPW2JH8F
ts=2026-02-27T17:12:17.541182669Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ0K2R2ZSSRKXYAQXPW2JH8F
ts=2026-02-27T17:12:17.541213451Z caller=main.go:174 level=info msg=exiting
ts=2026-02-27T17:12:17.569040513Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-27T17:12:18.031713342Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ0Q8GGTTE4WGMM1YDAY5KTX
ts=2026-02-27T17:12:18.031744313Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ0Q8GGTTE4WGMM1YDAY5KTX
ts=2026-02-27T17:12:18.031776338Z caller=main.go:174 level=info msg=exiting
ts=2026-02-27T17:12:18.060355116Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-27T17:12:18.433566644Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ4TTBYXCV65H4K0VSDXF6NS
ts=2026-02-27T17:12:18.433611829Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ4TTBYXCV65H4K0VSDXF6NS
ts=2026-02-27T17:12:18.433653121Z caller=main.go:174 level=info msg=exiting
Feb 27 2026, 5:12 PM · SRE Observability (FY2025/2026-Q3)
tappof added a comment to T417742: ThanosCompactHalted: pre compaction overlap check - overlaps found while gathering blocks..

This should be the last broken compactor iteration related to the codfw blocks.

Feb 27 2026, 10:38 AM · SRE Observability (FY2025/2026-Q4), Observability-Metrics

Feb 26 2026

tappof added a comment to T416745: ThanosCompactHalted: add series - symbol table size exceeds.
root@titan2001:~# journalctl -u thanos-compact | grep halt | tail -n 1 | sed -nr 's/^.*\[(.*)\].*$/\1/p' | tr -s ' ' '\n' | awk -F '/' '{print $NF}' | xargs -I % thanos tools bucket --objstore.config-file /etc/thanos-compact/objstore.yaml mark --id=% --marker=no-compact-mark.json --details="compactor halted due to size"
ts=2026-02-26T13:54:45.623877384Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-26T13:54:45.988765229Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ4K8DNY6HD7TR9QCYN1BY7N
ts=2026-02-26T13:54:45.988806532Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ4K8DNY6HD7TR9QCYN1BY7N
ts=2026-02-26T13:54:45.988843464Z caller=main.go:174 level=info msg=exiting
ts=2026-02-26T13:54:46.017570188Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-26T13:54:46.440448059Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KJ9R9905WSTRMRQV89M0C6HP
ts=2026-02-26T13:54:46.440486326Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KJ9R9905WSTRMRQV89M0C6HP
ts=2026-02-26T13:54:46.440514641Z caller=main.go:174 level=info msg=exiting
Feb 26 2026, 1:55 PM · SRE Observability (FY2025/2026-Q3)
tappof updated the task description for T417742: ThanosCompactHalted: pre compaction overlap check - overlaps found while gathering blocks..
Feb 26 2026, 10:17 AM · SRE Observability (FY2025/2026-Q4), Observability-Metrics
tappof updated the task description for T417742: ThanosCompactHalted: pre compaction overlap check - overlaps found while gathering blocks..
Feb 26 2026, 10:11 AM · SRE Observability (FY2025/2026-Q4), Observability-Metrics
tappof added a comment to T417742: ThanosCompactHalted: pre compaction overlap check - overlaps found while gathering blocks..
| 01KA6AS6271C8KG2M1ZA738FR0 | 2025-11-14T00:00:00Z | 2025-11-16T00:00:00Z | 47h59m59.996s  | 192h0m0.004s    | 151,622,694 | 5,751,403,580   | 184,242,897   | 3          | false       | prometheus=k8s,replica=d,site=eqiad           | 5m0s       | compactor      |
| 01KDCH6GFJS9W0JEJ5MTM5VB4P | 2025-12-04T00:00:00Z | 2025-12-18T00:00:00Z | 336h0m0s       | -96h0m0s        | 198,221,986 | 37,160,450,097  | 436,284,328   | 5          | false       | prometheus=k8s,replica=d,site=eqiad           | 5m0s       | compactor      |
| 01K8X0MMJEKVDJDJDFECBQWXAQ | 2025-10-29T00:00:00Z | 2025-10-31T00:00:00Z | 48h0m0s        | 192h0m0s        | 211,139,192 | 5,651,423,944   | 236,514,767   | 3          | false       | prometheus=k8s,replica=d,site=eqiad           | 5m0s       | compactor      |
| 01K976V9SNBFFH1T7TJE7SN0BT | 2025-11-02T00:00:00Z | 2025-11-04T00:00:00Z | 48h0m0s        | 192h0m0s        | 193,082,703 | 5,888,466,707   | 222,167,968   | 3          | false       | prometheus=k8s,replica=d,site=eqiad           | 5m0s       | compactor      |
| 01K9HK4CHAT3GYR9PE89QHBC55 | 2025-11-06T00:00:00Z | 2025-11-08T00:00:00Z | 48h0m0s        | 192h0m0s        | 204,305,672 | 5,627,819,019   | 229,637,393   | 3          | false       | prometheus=k8s,replica=d,site=eqiad           | 5m0s       | compactor      |
| 01KABCZJ0MNN1MXRFZJ24ZNFZY | 2025-11-16T00:00:00Z | 2025-11-18T00:00:00Z | 48h0m0s        | 192h0m0s        | 176,874,990 | 5,790,570,910   | 205,820,885   | 3          | false       | prometheus=k8s,replica=d,site=eqiad           | 5m0s       | compactor      |
| 01KAKDEM6AP69B927D2A44QYJJ | 2025-11-18T00:00:00Z | 2025-11-20T00:00:00Z | 48h0m0s        | 192h0m0s        | 201,603,910 | 5,622,290,795   | 225,214,175   | 3          | false       | prometheus=k8s,replica=d,site=eqiad           | 5m0s       | compactor      |
| 01K91X5390HMNQB9ZNGCFHDG7X | 2025-10-31T00:00:00Z | 2025-11-02T00:00:00Z | 48h0m0s        | 192h0m0s        | 159,030,050 | 5,861,418,083   | 190,112,659   | 3          | false       | prometheus=k8s,replica=d,site=eqiad           | 5m0s       | compactor      |
| 01KBA9JK1CJHY75EHT2DKZ22DH | 2025-11-28T00:00:00Z | 2025-11-30T00:00:00Z | 48h0m0s        | 192h0m0s        | 183,356,980 | 5,760,997,101   | 213,853,610   | 3          | false       | prometheus=k8s,replica=d,site=eqiad           | 5m0s       | compactor      |
| 01K9VVDSYAPE2BKCMRCNQVRHV6 | 2025-11-10T00:00:00Z | 2025-11-12T00:00:00Z | 48h0m0s        | 192h0m0s        | 219,728,127 | 5,713,467,752   | 245,755,289   | 3          | false       | prometheus=k8s,replica=d,site=eqiad           | 5m0s       | compactor      |
| 01KA184HVK9A9QD56FSN6RGX0V | 2025-11-12T00:00:00Z | 2025-11-14T00:00:00Z | 48h0m0s        | 192h0m0s        | 213,957,471 | 5,715,567,583   | 239,328,668   | 3          | false       | prometheus=k8s,replica=d,site=eqiad           | 5m0s       | compactor      |
| 01KATGGGE064ENGTE6QZMQM3MG | 2025-11-22T08:00:00Z | 2025-11-24T00:00:00Z | 40h0m0s        | 200h0m0s        | 161,489,041 | 4,853,231,487   | 186,523,829   | 3          | false       | prometheus=k8s,replica=d,site=eqiad           | 5m0s       | compactor      |
| 01KBFT8DXQ0C1AT4SG8VBKH92Z | 2025-11-30T00:00:00Z | 2025-12-02T00:00:00Z | 47h59m59.999s  | 192h0m0.001s    | 199,452,329 | 5,746,028,369   | 227,826,093   | 3          | false       | prometheus=k8s,replica=d,site=eqiad           | 5m0s       | compactor      |
| 01K9PPNKW5PGEWJA5TDAMFSPQG | 2025-11-08T00:00:00Z | 2025-11-10T00:00:00Z | 48h0m0s        | 192h0m0s        | 171,689,395 | 5,832,841,392   | 205,076,390   | 3          | false       | prometheus=k8s,replica=d,site=eqiad           | 5m0s       | compactor      |
| 01K7F4JXDH48EK51FQP0XRV049 | 2025-09-25T00:00:00Z | 2025-10-03T00:00:00Z | 191h59m59.999s | 48h0m0.001s     | 101,697,918 | 17,802,669,172  | 234,126,757   | 4          | false       | prometheus=k8s,replica=d,site=eqiad           | 5m0s       | compactor      |
| 01KDXXQZGRSA8K3SD0RHMQKX65 | 2025-12-18T00:00:00Z | 2026-01-01T00:00:00Z | 336h0m0s       | -96h0m0s        | 40,350,938  | 35,238,639,327  | 277,409,778   | 4          | false       | prometheus=k8s,replica=d,site=eqiad           | 5m0s       | compactor      |
Feb 26 2026, 8:45 AM · SRE Observability (FY2025/2026-Q4), Observability-Metrics

Feb 23 2026

tappof added a subtask for T390194: Add read-only users capability to logs-api.svc: T418158: LDAP based access to logs-api.svc.
Feb 23 2026, 4:58 PM · SRE Observability (FY2024/2025-Q4)
tappof added a parent task for T418158: LDAP based access to logs-api.svc: T390194: Add read-only users capability to logs-api.svc.
Feb 23 2026, 4:58 PM · SRE Observability
tappof created T418158: LDAP based access to logs-api.svc.
Feb 23 2026, 4:58 PM · SRE Observability
tappof edited projects for T418118: SystemdUnitFailed: grafana-ldap-users-sync.service on grafana1002:9100, added: SRE Observability; removed Observability-Alerting.
Feb 23 2026, 4:50 PM · SRE Observability (FY2025/2026-Q4)
tappof added a comment to T416501: Grant sbassett, aranyap, and alexsanford expanded logstash access.

Just sent @ASanford-WMF a link on Slack with his credentials.
You’ve just been added, so please wait a while for the Puppet agent to run.

Feb 23 2026, 3:25 PM · FY2025-26 WE 4.6 - Account Security, Observability-Logging, Wikimedia-Logstash, Security-Team
tappof edited projects for T418118: SystemdUnitFailed: grafana-ldap-users-sync.service on grafana1002:9100, added: Observability-Alerting; removed Security.
Feb 23 2026, 11:47 AM · SRE Observability (FY2025/2026-Q4)
tappof added a parent task for T418118: SystemdUnitFailed: grafana-ldap-users-sync.service on grafana1002:9100: Unknown Object (Task).
Feb 23 2026, 11:47 AM · SRE Observability (FY2025/2026-Q4)
tappof added a project to T418118: SystemdUnitFailed: grafana-ldap-users-sync.service on grafana1002:9100: Security.
Feb 23 2026, 11:46 AM · SRE Observability (FY2025/2026-Q4)
tappof created T418118: SystemdUnitFailed: grafana-ldap-users-sync.service on grafana1002:9100.
Feb 23 2026, 11:27 AM · SRE Observability (FY2025/2026-Q4)

Feb 20 2026

tappof added a comment to T417742: ThanosCompactHalted: pre compaction overlap check - overlaps found while gathering blocks..
| 01K6MV8BJ97HSTK0KXS3V5KM0G | 2025-10-01T00:00:00Z | 2025-10-03T00:00:00Z | 48h0m0s        | 192h0m0s        | 168,928,710 | 7,258,037,239   | 200,667,949   | 3          | false       | prometheus=k8s,replica=c,site=codfw           | 5m0s       | compactor      |
| 01K7KQJQ4X6NVZ1A3T2XY1EMHA | 2025-10-13T00:00:00Z | 2025-10-15T00:00:00Z | 48h0m0s        | 192h0m0s        | 121,223,839 | 7,605,485,057   | 152,452,711   | 3          | false       | prometheus=k8s,replica=c,site=codfw           | 5m0s       | compactor      |
Feb 20 2026, 9:11 PM · SRE Observability (FY2025/2026-Q4), Observability-Metrics
tappof added a comment to T416745: ThanosCompactHalted: add series - symbol table size exceeds.
root@titan2001:~# journalctl -u thanos-compact | grep halt | tail -n 1 | sed -nr 's/^.*\[(.*)\].*$/\1/p' | tr -s ' ' '\n' | awk -F '/' '{print $NF}' | xargs -I % thanos tools bucket --objstore.config-file /etc/thanos-compact/objstore.yaml mark --id=% --marker=no-compact-mark.json --details="compactor halted due to size"
ts=2026-02-20T16:51:13.201800186Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-20T16:51:13.581468524Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KHNDZTK3CAT0TSSKJRSXZ89G
ts=2026-02-20T16:51:13.581503495Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KHNDZTK3CAT0TSSKJRSXZ89G
ts=2026-02-20T16:51:13.581532903Z caller=main.go:174 level=info msg=exiting
ts=2026-02-20T16:51:13.607222033Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-20T16:51:14.139269904Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KHNEWE5GDC37RST2V56K78D2
ts=2026-02-20T16:51:14.139300091Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KHNEWE5GDC37RST2V56K78D2
ts=2026-02-20T16:51:14.139330405Z caller=main.go:174 level=info msg=exiting
ts=2026-02-20T16:51:14.165868253Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-20T16:51:14.661128939Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KHNR8R0EES7E62223BPN38AQ
ts=2026-02-20T16:51:14.661164041Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KHNR8R0EES7E62223BPN38AQ
ts=2026-02-20T16:51:14.661196397Z caller=main.go:174 level=info msg=exiting
ts=2026-02-20T16:51:14.689985372Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-20T16:51:15.128202048Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KHNV75VJRYR87QAJ2NH5037F
ts=2026-02-20T16:51:15.128243551Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KHNV75VJRYR87QAJ2NH5037F
ts=2026-02-20T16:51:15.128284904Z caller=main.go:174 level=info msg=exiting
Feb 20 2026, 4:51 PM · SRE Observability (FY2025/2026-Q3)
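The one-liner in the comment above chains several text filters before passing each block ID to `thanos tools bucket mark`. As a rough dry-run sketch, the extraction stages can be run on their own to check which IDs would be marked before anything is written to the bucket. The function name `extract_block_ids` and the sample log line are hypothetical; the real halt message from the compactor may be laid out differently, and `sed -r` assumes GNU sed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Thanos): reads journal lines on stdin and
# prints one block ID per line, using the same filters as the one-liner.
extract_block_ids() {
    # 1. keep lines mentioning "halt"   2. keep only the most recent one
    # 3. capture the text inside the last [...] pair on that line
    # 4. split on spaces, one path per line   5. keep the last path component
    grep halt \
        | tail -n 1 \
        | sed -nr 's/^.*\[(.*)\].*$/\1/p' \
        | tr -s ' ' '\n' \
        | awk -F '/' '{print $NF}'
}

# Made-up sample line for illustration only; paths and IDs are placeholders.
sample='level=error msg="critical error detected; halting" err="compact blocks [/srv/example/01EXAMPLEBLOCKAAAAAAAAAAAAA /srv/example/01EXAMPLEBLOCKBBBBBBBBBBBBB]: symbol table size exceeds"'

# Dry run: print the IDs instead of piping them to xargs.
printf '%s\n' "$sample" | extract_block_ids
```

On a host, the same function could be fed from `journalctl -u thanos-compact` and its output reviewed before appending the `xargs -I % thanos tools bucket ... mark` stage from the original command.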
tappof added a comment to T416745: ThanosCompactHalted: add series - symbol table size exceeds.
root@titan2001:~# journalctl -u thanos-compact | grep halt | tail -n 1 | sed -nr 's/^.*\[(.*)\].*$/\1/p' | tr -s ' ' '\n' | awk -F '/' '{print $NF}' | xargs -I % thanos tools bucket --objstore.config-file /etc/thanos-compact/objstore.yaml mark --id=% --marker=no-compact-mark.json --details="compactor halted due to size"
ts=2026-02-20T12:49:44.149724742Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-20T12:49:44.619755808Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01K7F5CBWE3ZHQCTTF73P3HWX6
ts=2026-02-20T12:49:44.619797048Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01K7F5CBWE3ZHQCTTF73P3HWX6
ts=2026-02-20T12:49:44.619823908Z caller=main.go:174 level=info msg=exiting
ts=2026-02-20T12:49:44.648215995Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-20T12:49:45.100389852Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KHN48FYBT9QHPS55WTV2S6TY
ts=2026-02-20T12:49:45.100425937Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KHN48FYBT9QHPS55WTV2S6TY
ts=2026-02-20T12:49:45.100454824Z caller=main.go:174 level=info msg=exiting
ts=2026-02-20T12:49:45.129712007Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-20T12:49:45.595616363Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KHN6V3XVKN3C4TC5QNEBGA6K
ts=2026-02-20T12:49:45.595652586Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KHN6V3XVKN3C4TC5QNEBGA6K
ts=2026-02-20T12:49:45.595688915Z caller=main.go:174 level=info msg=exiting
ts=2026-02-20T12:49:45.620688711Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-20T12:49:45.996734008Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KHNA8YSTEHMS71R3EGF2VWW9
ts=2026-02-20T12:49:45.996794513Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KHNA8YSTEHMS71R3EGF2VWW9
ts=2026-02-20T12:49:45.996839927Z caller=main.go:174 level=info msg=exiting
Feb 20 2026, 12:50 PM · SRE Observability (FY2025/2026-Q3)

Feb 19 2026

tappof added a comment to T417742: ThanosCompactHalted: pre compaction overlap check - overlaps found while gathering blocks..
| 01K7AMB8ATGA1VWRQDGKFSWJ8H | 2025-10-09T00:00:00Z | 2025-10-11T00:00:00Z | 48h0m0s        | 192h0m0s        | 208,643,557 | 5,327,000,979   | 234,145,856   | 3          | false       | prometheus=k8s,replica=c,site=eqiad           | 5m0s       | compactor      |
| 01K7EHKBMTGYGZFGM73A4ZSK6S | 2025-10-11T00:00:00Z | 2025-10-13T00:00:00Z | 48h0m0s        | 192h0m0s        | 191,557,596 | 5,510,522,469   | 222,575,557   | 3          | false       | prometheus=k8s,replica=c,site=eqiad           | 5m0s       | compactor      |
| 01K8B33W5AQSDBCVN6EHVDXAAQ | 2025-10-14T16:00:00Z | 2025-10-16T08:00:00Z | 40h0m0s        | 200h0m0s        | 194,021,410 | 4,446,615,486   | 210,078,804   | 4          | false       | prometheus=k8s,replica=c,site=eqiad           | 5m0s       | compactor      |
| 01K7XZJKNCVNJGF55VRQYNY66P | 2025-10-17T00:00:00Z | 2025-10-19T00:00:00Z | 48h0m0s        | 192h0m0s        | 194,899,690 | 5,561,134,464   | 225,705,814   | 3          | false       | prometheus=k8s,replica=c,site=eqiad           | 5m0s       | compactor      |
| 01K8JPSNXDXS52C5BCY5FKWP9V | 2025-10-25T00:00:00Z | 2025-10-27T00:00:00Z | 48h0m0s        | 192h0m0s        | 176,120,189 | 5,546,808,468   | 204,910,864   | 3          | false       | prometheus=k8s,replica=c,site=eqiad           | 5m0s       | compactor      |
| 01K6ZQ3J7SS9THAV7MXXKK4BZJ | 2025-10-03T00:00:00Z | 2025-10-05T00:00:00Z | 48h0m0s        | 192h0m0s        | 164,986,174 | 5,271,999,399   | 189,193,155   | 3          | false       | prometheus=k8s,replica=c,site=eqiad           | 5m0s       | compactor      |
| 01K709YT3YFGK8SWTS88VPWEQR | 2025-10-05T00:00:00Z | 2025-10-07T00:00:00Z | 48h0m0s        | 192h0m0s        | 192,851,444 | 5,436,891,797   | 218,463,283   | 3          | false       | prometheus=k8s,replica=c,site=eqiad           | 5m0s       | compactor      |
| 01K77MGF06S6Y8X2EVDVRXW9Z3 | 2025-10-07T00:00:00Z | 2025-10-09T00:00:00Z | 48h0m0s        | 192h0m0s        | 211,944,289 | 5,362,402,817   | 232,416,457   | 3          | false       | prometheus=k8s,replica=c,site=eqiad           | 5m0s       | compactor      |
Feb 19 2026, 9:29 PM · SRE Observability (FY2025/2026-Q4), Observability-Metrics
tappof claimed T417900: Serve something helpful at metamonitoring.wikimedia.org.
Feb 19 2026, 3:15 PM · Observability-Alerting
tappof added a comment to P88892 (An Untitled Masterwork).

not working:

@click.command(
    help="""
Copy all YAML files from a Sloth generate output directory to a flattened directory.
Feb 19 2026, 2:27 PM
tappof added a comment to T416745: ThanosCompactHalted: add series - symbol table size exceeds.
root@titan2001:/srv/rewrite# journalctl -u thanos-compact | grep halt | tail -n 1 | sed -nr 's/^.*\[(.*)\].*$/\1/p' | tr -s ' ' '\n' | awk -F '/' '{print $NF}' | xargs -I % thanos tools bucket --objstore.config-file /etc/thanos-compact/objstore.yaml mark --id=% --marker=no-compact-mark.json --details="compactor halted due to size"
ts=2026-02-19T11:51:58.579038981Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-19T11:51:58.959608381Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KHNVEAXF5JFCXP5GTCB2CHJ1
ts=2026-02-19T11:51:58.959650877Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KHNVEAXF5JFCXP5GTCB2CHJ1
ts=2026-02-19T11:51:58.959681114Z caller=main.go:174 level=info msg=exiting
ts=2026-02-19T11:51:58.986092607Z caller=factory.go:54 level=info msg="loading bucket configuration"
ts=2026-02-19T11:51:59.36153301Z caller=block.go:406 level=info msg="block has been marked for no compaction" block=01KHNY2NVN3HQQBQD3CVWNQP8A
ts=2026-02-19T11:51:59.361596817Z caller=tools_bucket.go:1134 level=info msg="marking done" marker=no-compact-mark.json IDs=01KHNY2NVN3HQQBQD3CVWNQP8A
ts=2026-02-19T11:51:59.361628278Z caller=main.go:174 level=info msg=exiting
Feb 19 2026, 11:52 AM · SRE Observability (FY2025/2026-Q3)