Sep 30 2022
Sep 17 2022
Sep 16 2022
Sep 9 2022
Sep 8 2022
Aug 17 2022
Hi @cmassaro, we are thinking of building a benchmarking tool to monitor performance stats. Roughly speaking, do you know of any key parts of the function-* services that could use monitoring?
Aug 15 2022
Sure. I'll see where this fits in.
Aug 11 2022
After talking with @cmassaro we decided to re-open this task and undo the mutex+async. Reusing this task for the undo!
Aug 10 2022
Sorry for commenting on such an old ticket, but I was reading the AJV docs and they seem to indicate the race condition would not happen in our scenario: https://ajv.js.org/faq.html#would-errors-get-overwritten-in-case-of-concurrent-validations. As long as we read the errors in the same execution block as the validate call, which I believe we do (or can), they won't be overwritten. Here is a discussion thread about this: https://github.com/ajv-validator/ajv/issues/242. Here is the part of the code where the mutex is used: https://github.com/wikimedia/mediawiki-services-function-schemata/blob/master/javascript/src/schema.js#L472-L473.
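A minimal sketch of the point above, for illustration only. It does not use Ajv itself: `makeValidator` is a hypothetical stand-in that, like a compiled Ajv validator, stores the most recent errors on the function object, and the `type` key is an arbitrary example. It contrasts copying the errors in the same synchronous block as the call with reading them after an `await`.

```javascript
// Hypothetical stand-in for a compiled validator (not the Ajv API itself):
// the most recent errors are stored on the function object and are
// overwritten by every call.
function makeValidator(requiredKey) {
  function validate(data) {
    const ok = Object.prototype.hasOwnProperty.call(data, requiredKey);
    validate.errors = ok ? null : [{ message: `missing ${requiredKey}` }];
    return ok;
  }
  validate.errors = null;
  return validate;
}

const validate = makeValidator('type');

// Safe: the errors are copied in the same synchronous block as the call,
// so no other validation can run in between.
function validateSync(data) {
  const valid = validate(data);
  return { valid, errors: valid ? [] : [...validate.errors] };
}

// Unsafe: another validation may run while this one is suspended at the
// await, overwriting validate.errors before it is read.
async function validateThenYield(data) {
  const valid = validate(data);
  await Promise.resolve();
  return { valid, errors: validate.errors };
}

async function demo() {
  const [bad, good] = await Promise.all([
    validateThenYield({}),
    validateThenYield({ type: 'example' }),
  ]);
  return { bad, good, safe: validateSync({}) };
}
```

In `demo`, the first call is invalid, but its errors are replaced by the second call's `null` before they are read. `validateSync` needs no mutex because JavaScript runs each synchronous block to completion.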
@fgiunchedi thanks for the clarification! What would be our recommended course of action, e.g. perhaps to proxy the probes through another way?
Aug 9 2022
Perhaps I am lacking context, but why do we need to pass the logger object around? Could we use a separate logger for each class/script?
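One way the per-class alternative could look, as a rough sketch only. `createLogger` and the `Evaluator` class are hypothetical names invented for illustration, not code from the services. Each class asks a small factory for its own named logger instead of receiving one from its caller.

```javascript
// Hypothetical factory: returns a logger that tags every line with `name`.
// `sink` defaults to console.log and can be swapped out in tests.
function createLogger(name, sink = console.log) {
  const log = (level) => (message) => sink(`[${level}] ${name}: ${message}`);
  return { info: log('info'), error: log('error') };
}

// Hypothetical class that owns its logger rather than having one passed in.
class Evaluator {
  constructor(sink) {
    // When sink is undefined, createLogger falls back to console.log.
    this.logger = createLogger('Evaluator', sink);
  }

  run() {
    this.logger.info('started');
  }
}
```

The trade-off is that a logger passed in from outside can carry request-scoped context, which a per-class logger would have to receive some other way.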
Update: we are running into some issues with the Prometheus blackbox checks: the probes are timing out (at 10 seconds), even though the request should take ~3 seconds to resolve.
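To compare the actual request time against the probe deadline outside of Prometheus, a local check along these lines might help. This is a sketch under assumptions: `fakeRequest` is a hypothetical stand-in for the real HTTP call, and `withTimeout` and `probe` are illustrative helpers, not part of the blackbox exporter.

```javascript
// Reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// Hypothetical stand-in for a request to the service; resolves after delayMs.
function fakeRequest(delayMs) {
  return new Promise((resolve) => setTimeout(() => resolve('ok'), delayMs));
}

// Run one probe and report whether it beat the deadline and how long it took.
async function probe(delayMs, timeoutMs) {
  const start = Date.now();
  try {
    const body = await withTimeout(fakeRequest(delayMs), timeoutMs);
    return { success: true, body, elapsedMs: Date.now() - start };
  } catch (err) {
    return { success: false, error: err.message, elapsedMs: Date.now() - start };
  }
}
```

Calling `probe(3000, 10000)` models the expected case (a ~3 second request against a 10 second deadline). Replacing `fakeRequest` with the real request would show whether the slowness is in the service or somewhere along the probe path.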
Aug 1 2022
Jul 21 2022
Jul 20 2022
I'll finish the last part of this task: Use in function-evaluator.
Jul 10 2022
Hi @Teleosteen, here is an example of where this error would be used: evaluate.js. If you search for this bug ID, you will find two scenarios where this error type is needed: when a function definition specifies a programming language that does not exist, or one that has no built-in executor. I hope this answers your questions. Cheers!
Jul 6 2022
@taavi when you say "beta prometheus isn't hooked to any alerting system", does that mean it's impossible to set up alerting on a beta cluster host, or just that it has not been done before?
Oops, I created a separate task: https://phabricator.wikimedia.org/T311457. Lemme know if I should merge the two.
Jul 5 2022
Hi @fgiunchedi, thank you for the helpful input! The goal for health monitoring in prod should be similar to that in Beta: we want to make sure the services are up and returning correct responses for basic requests. Since the prod instance will be on Kubernetes, the infrastructure we need to achieve the same goal will be different (or so we were told). This is why we are ready to make Beta monitoring a separate effort, since we likely cannot reuse the same setup. I hope that answers your questions!
Jun 27 2022
Jun 13 2022
We'd also like to invite @JMeybohm and @akosiaris to review the wikifunctions observability doc. We'd really appreciate your input and feedback :) Thanks!
Jun 10 2022
As part of the observability goal, we are looking to implement periodic health checks to monitor the uptime of the function orchestrator and evaluator.
May 17 2022
May 5 2022
May 2 2022
Apr 20 2022
Not sure if I got the priority right (I guessed "low"). I just wanted to change the "needs triage" since I accepted the task. Feel free to correct the priority if you see differently.
Apr 15 2022
Hi Andre,