User Details
- User Since: Sep 8 2022, 6:10 AM
- Availability: Available
- LDAP User: Sg912
- MediaWiki User: SGupta-WMF
Mon, Apr 6
@Aklapper Thanks for the link; I've completed the required steps.
Feb 20 2026
Wikidata Revert Risk — Scale Test Results 19 Feb
• Total Requests: 161,081
• Successful: 160,743
• Failed: 338
• Actual RPS: 41.94
• Test Duration: 60 minutes
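A minimal sketch of how the headline figures above can be derived from the raw counts; the variable names are illustrative, not taken from the actual test harness:

```python
# Derive the headline figures from raw counts; the values are from the
# 19 Feb run above, the variable names are illustrative.
total_requests = 161_081
failed = 338

successful = total_requests - failed               # 160,743
success_rate = successful / total_requests * 100   # ~99.79%
rps = 41.94                                        # as reported by the harness
requests_per_hour = rps * 3600                     # ~150,984/hour

# Note: 161,081 requests at 41.94 RPS implies ~64 min of measured time,
# so the harness presumably divides by elapsed wall-clock time rather
# than the nominal 60-minute window.
print(f"success rate: {success_rate:.2f}%")
print(f"throughput:   {requests_per_hour:,.0f} requests/hour")
```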
Feb 18 2026
Feb 18 Tests
Throughput
Total Requests: 126,831
Success Rate: 99.84% (126,633 successful, 198 failed)
Actual RPS: 31.99
Requests/Hour: 115,152
Feb 12 2026
Feb 6 2026
Feb 5 2026
@kevinbazira @FNavas-foundation Wikidata Revert Risk Scale Test - February 5, 2026
Test Summary:
Total Requests: 120,443
Successful: 113,230 (94.01%)
Failed: 7,213 (5.99%)
Duration: 65.4 minutes (3,923 seconds)
Actual RPS: 30.70
Requests/Hour: 110,508
Target Achievement: 73.67% of 150K/hour goal
Latency - Successful Calls (n=113,230):
Min: 0.17s
Median: 0.50s
Mean: 1.10s
P90: 2.58s
P95: 4.69s
P99: 8.47s
Distribution:
<500ms: 49.56%
500ms-1s: 27.32%
1s-2s: 10.45%
2s-5s: 8.13%
>5s: 4.54%
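For context, a minimal sketch of how the percentiles and bucket distribution above can be computed from per-request latencies, assuming the harness collects them into a list (collection not shown here):

```python
import numpy as np

# latencies_s: per-request latencies in seconds for successful calls,
# assumed to be collected by the test harness.
def summarize(latencies_s):
    arr = np.asarray(latencies_s)
    for p in (50, 90, 95, 99):
        print(f"P{p}: {np.percentile(arr, p):.2f}s")
    # Bucket the distribution the same way as the report above.
    edges = [0, 0.5, 1, 2, 5, float("inf")]
    labels = ["<500ms", "500ms-1s", "1s-2s", "2s-5s", ">5s"]
    counts, _ = np.histogram(arr, bins=edges)
    for label, count in zip(labels, counts):
        print(f"{label}: {count / arr.size * 100:.2f}%")
```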
Feb 4 2026
Feb 3 2026
Feb 2 2026
Jan 28 2026
@kevinbazira Thank you for the suggestion! However, I wanted to clarify a few important points about WME's infrastructure:
WME uses the external endpoint for all LiftWing requests by design. WME is not part of WMF infrastructure - it operates as a separate service outside of the WMF network. This means we don't have access to internal WMF endpoints like https://inference.svc.eqiad.wmnet:30443/.
The external endpoint (https://api.wikimedia.org/service/lw/inference/v1/models/) is the appropriate path for WME's architecture, and we're aware of the additional latency this introduces compared to internal-only services.
Regarding API limits, WME has its own rate limits documented in the link you referenced earlier. These differ from the general public API limits since WME has dedicated capacity allocations.
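For illustration, a minimal sketch of a call against the external endpoint. The model name and payload follow the public revert-risk documentation example; the Wikidata-specific model under test may use a different name and schema, and the token is a placeholder:

```python
import requests

# External LiftWing endpoint used by WME (internal *.wmnet hosts are
# not reachable from outside the WMF network).
URL = ("https://api.wikimedia.org/service/lw/inference/v1/models/"
       "revertrisk-language-agnostic:predict")

resp = requests.post(
    URL,
    headers={
        "Authorization": "Bearer <access_token>",   # placeholder
        "User-Agent": "wme-scale-test (contact@example.org)",
    },
    json={"rev_id": 12345, "lang": "en"},  # payload per the public docs
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```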
The performance issues we're seeing (timeouts, `unhashable type: 'dict'` errors) appear to be related to service-side processing or the model itself rather than network latency, since:
- Some requests succeed while others fail with server-side exceptions
- The timeout errors suggest the service is taking too long to respond, not just network delay
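A small sketch of how these two failure modes can be separated on the client side, assuming a requests-based harness; the helper and counters are illustrative:

```python
import requests

# Classify failures to separate client-side timeouts from server-side
# errors; this helper is illustrative, not the actual harness code.
def classify(url, payload, headers, counts):
    try:
        resp = requests.post(url, json=payload, headers=headers, timeout=30)
    except requests.Timeout:
        counts["timeout"] += 1        # service too slow to respond
        return None
    if resp.status_code >= 500:
        counts["server_error"] += 1   # server-side exception surfaced by the service
        return None
    counts["ok"] += 1
    return resp.json()
```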
@FNavas-foundation @HShaikh
Key takeaways from the latest test
- Throughput improved significantly, reaching ~136K requests/hour (≈90% of the 150K/hour target), up from ~68–78K/hour in previous runs.
- Reliability increased, with the success rate improving to 93.2% (up from ~85–88%), though the error rate remains non-negligible.
- Warm-up latency improved substantially, with first-200 P90 dropping from ~5.7s to ~0.69s.
- Steady-state latency remains high, with overall P90 at ~1.8s and only ~34% of requests completing under 500ms.
- The inference service returned `TypeError: unhashable type: 'dict'` and `context deadline exceeded` errors, contributing to request failures and increased tail latency.
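A minimal sketch of the warm-up vs. steady-state split referenced above, assuming per-request latencies are recorded in arrival order (the 200-request cutoff mirrors the first-200 P90 figure):

```python
import numpy as np

# Split latencies into warm-up (first 200 requests) and steady state,
# assuming they are recorded in arrival order.
def warmup_vs_steady(latencies_s, warmup_n=200):
    arr = np.asarray(latencies_s)
    warm, steady = arr[:warmup_n], arr[warmup_n:]
    print(f"warm-up P90:      {np.percentile(warm, 90):.2f}s")
    print(f"steady-state P90: {np.percentile(steady, 90):.2f}s")
```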
Run 3 results -
Jan 27 2026
Jan 23 2026
Jan 21 2026
Jan 15 2026
Jan 13 2026
Jan 12 2026
Jan 7 2026
Jan 6 2026
Dec 18 2025
@kevinbazira We’re planning to run the scale test in the next few days.
@HShaikh
Dec 15 2025
Dec 11 2025
Dec 10 2025
Dec 4 2025
Dec 3 2025
Checked the eventstream listener and updater; agreed the metrics are available.
Dec 1 2025
Nov 19 2025
Nov 18 2025
Eventstream-listener doesn’t seem to expose any Wikidata-specific metrics. The on-demand service exposes some, but several key metrics still appear to be missing.
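For illustration, a minimal sketch of exposing Wikidata-specific counters with prometheus_client; the metric names are hypothetical, not ones the services currently expose:

```python
from prometheus_client import Counter, start_http_server

# Hypothetical Wikidata-specific metrics; names are illustrative only.
REVISIONS_SCORED = Counter(
    "wikidata_revisions_scored_total",
    "Wikidata revisions scored by the revert-risk service",
)
SCORE_FAILURES = Counter(
    "wikidata_score_failures_total",
    "Wikidata revisions that failed to score",
)

start_http_server(9090)  # expose /metrics for Prometheus scraping
```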
Nov 17 2025
Nov 12 2025
Nov 11 2025
WIP: metrics and filters; planning to raise an MR tomorrow.
Nov 10 2025
Nov 6 2025
@REsquito-WMF Confirming that we do not need the Kafka readiness check here, as suggested by @RThomas-WMF.
MR is up for the same.
Raised an MR for the healthchecks.
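For context, a rough sketch of what a Kafka readiness check could look like if it were kept; the library choice, broker address, and port are assumptions, not taken from the MR:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from kafka import KafkaAdminClient  # kafka-python; illustrative choice

def kafka_ready(bootstrap="localhost:9092"):
    # Readiness here means: the broker is reachable and topics can be listed.
    try:
        admin = KafkaAdminClient(bootstrap_servers=bootstrap)
        admin.list_topics()
        admin.close()
        return True
    except Exception:
        return False

class Ready(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200 if kafka_ready() else 503)
        self.end_headers()

HTTPServer(("", 8080), Ready).serve_forever()
```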
Nov 4 2025
Oct 29 2025
MR is up; waiting for reviews on this.
