cmassaro (Cory Massaro)
User

User Details

User Since
Jan 5 2021, 4:31 PM (256 w, 4 d)
Availability
Available
LDAP User
Cory Massaro
MediaWiki User
CMassaro (WMF) [ Global Accounts ]

Recent Activity

Yesterday

cmassaro moved T409379: create and add a test that counts calls to validation from Ready to Needs Sign-off on the Abstract Wikipedia team (26Q2 (Oct–Dec)) board.
Sat, Dec 6, 10:09 PM · OKR-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-orchestrator
cmassaro claimed T409379: create and add a test that counts calls to validation.
Sat, Dec 6, 10:08 PM · OKR-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-orchestrator

Tue, Dec 2

cmassaro added a comment to T411481: Count all running subtasks of the evaluator when monitoring for process issues, not just those that contain "wasm".

Is this a sub-task of T406848?

Tue, Dec 2, 3:39 PM · Abstract Wikipedia team (26Q2 (Oct–Dec)), Essential-Work, function-evaluator
cmassaro added a subtask for T406848: wide-scale Python failure: T411481: Count all running subtasks of the evaluator when monitoring for process issues, not just those that contain "wasm".
Tue, Dec 2, 3:38 PM · Patch-For-Review, Essential-Work, function-evaluator, Abstract Wikipedia team
cmassaro added a parent task for T411481: Count all running subtasks of the evaluator when monitoring for process issues, not just those that contain "wasm": T406848: wide-scale Python failure.
Tue, Dec 2, 3:38 PM · Abstract Wikipedia team (26Q2 (Oct–Dec)), Essential-Work, function-evaluator
cmassaro added a comment to T396537: Wikifunctions UI: unescape function is not working.

Yes, agreed. This is very much not the right way to implement this. Python has the html library and the implementation should use it!
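
As a minimal sketch of what using the standard library could look like: `html.unescape` decodes named, decimal, and hexadecimal entities in one call. The wrapper name `unescape_entities` is hypothetical and is not the actual Z10938 implementation.

```python
import html


def unescape_entities(text: str) -> str:
    """Decode named, decimal and hexadecimal HTML entities in text."""
    # html.unescape is part of the Python standard library (Python 3.4+).
    return html.unescape(text)


if __name__ == "__main__":
    # One named, one decimal and one hexadecimal entity.
    print(unescape_entities("&amp; &#1234; &#x1EF3;"))
```

This avoids maintaining a hand-written conversion table and a custom regular expression.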

Tue, Dec 2, 3:00 PM · WikiLambda Front-end, WikiLambda, Abstract Wikipedia team
cmassaro created T411481: Count all running subtasks of the evaluator when monitoring for process issues, not just those that contain "wasm".
Tue, Dec 2, 12:07 PM · Abstract Wikipedia team (26Q2 (Oct–Dec)), Essential-Work, function-evaluator

Sun, Nov 30

cmassaro created T411333: Make function-evaluator tests much more resilient.
Sun, Nov 30, 11:20 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), Patch-For-Review, function-evaluator

Wed, Nov 26

cmassaro added a comment to T396537: Wikifunctions UI: unescape function is not working.

I tried to run locally in python: print(Z10938("&#x1EF3;"))
But that also output '&#x1EF3;' and not 'ỳ'.

My thought is that the last part of the code should be something like:

	def replace_entity(match):
		entity = match.group(0)
		# Named entities are looked up in the conversion table.
		if entity in conversion_table:
			return conversion_table[entity]["characters"]

		# Decimal numeric entities, e.g. &#1234;
		decimal_match = re.fullmatch(r'&#(\d+);', entity)
		if decimal_match:
			try:
				return chr(int(decimal_match.group(1)))
			except (ValueError, OverflowError):
				# Not a valid code point: leave the entity untouched.
				return entity

		# Hexadecimal numeric entities, e.g. &#x1EF3;
		hex_match = re.fullmatch(r'&#x([0-9A-Fa-f]+);', entity)
		if hex_match:
			try:
				return chr(int(hex_match.group(1), 16))
			except (ValueError, OverflowError):
				return entity

		# Unknown entity: return it unchanged.
		return entity

	# Use a regular expression to find named, decimal and hex HTML entities
	pattern = r'&(?:[A-Za-z0-9]+|#\d+|#x[0-9A-Fa-f]+);'
	result = re.sub(pattern, replace_entity, Z10938K1)
	return result

intmatch = re.search(r'&(\d+);', entity) was meant to catch decimal numeric entities (like &#1234;). However, because the outer regex r'&[A-Za-z0-9]+;' never matched strings containing #, the function was never called with numeric entities, so intmatch never had a chance to run. It also couldn’t handle hex forms (like &#x1EF3;). After updating the main regex to include #\d+ and #x..., and tightening intmatch to a fullmatch, those numeric entities now get decoded properly.
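
The explanation above can be checked with a small self-contained sketch. It assumes a tiny stand-in `CONVERSION_TABLE` (the real table is not shown in this thread) and a hypothetical `unescape` helper that takes the pattern as a parameter, so the old and new outer regexes can be compared side by side.

```python
import re

# Hypothetical stand-in for the real conversion table, which is not shown here.
CONVERSION_TABLE = {
    "&amp;": {"characters": "&"},
    "&lt;": {"characters": "<"},
}

# Outer pattern as described in the comment: it cannot match "#".
OLD_PATTERN = r'&[A-Za-z0-9]+;'
# Proposed outer pattern: also matches decimal and hex numeric entities.
NEW_PATTERN = r'&(?:[A-Za-z0-9]+|#\d+|#x[0-9A-Fa-f]+);'


def replace_entity(match):
    entity = match.group(0)
    if entity in CONVERSION_TABLE:
        return CONVERSION_TABLE[entity]["characters"]
    # One fullmatch covering both numeric forms: group 1 decimal, group 2 hex.
    numeric = re.fullmatch(r'&#(?:(\d+)|x([0-9A-Fa-f]+));', entity)
    if numeric:
        decimal, hexadecimal = numeric.groups()
        try:
            code = int(decimal) if decimal is not None else int(hexadecimal, 16)
            return chr(code)
        except (ValueError, OverflowError):
            # Not a valid code point: leave the entity untouched.
            return entity
    return entity


def unescape(text, pattern=NEW_PATTERN):
    return re.sub(pattern, replace_entity, text)


if __name__ == "__main__":
    for sample in ("&#1234;", "&#x1EF3;", "a &amp; b"):
        print(repr(unescape(sample, OLD_PATTERN)), repr(unescape(sample)))
```

With `OLD_PATTERN` the numeric entities never reach `replace_entity`, so they come back unchanged; with `NEW_PATTERN` they are decoded.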

Wed, Nov 26, 6:05 PM · WikiLambda Front-end, WikiLambda, Abstract Wikipedia team
cmassaro added a comment to T314953: Refine reporting of memory / CPU usage of orchestrator and evaluator.

I don't think we're likely to make changes in this area. The reporting of these numbers doesn't touch any of our observability or metrics; it's for display purposes and to give a rough estimate of resource consumption. We can't do this effectively without putting userland dev tools into the Docker image, which is maybe not the best idea. And handling concurrency/resource allocation will never be our job--that's up to k8s, which we have very little say over. I think we can close this one.

Wed, Nov 26, 3:29 PM · Abstract Wikipedia team
cmassaro updated subscribers of T311767: Add Tests for Failure Cases for APIFunctionCall.php.

Again, this is an extremely old task that probably has little relevance to the current reality. I see no mention of this task in any TODOs in the WikiLambda code.

Wed, Nov 26, 3:26 PM · Abstract Wikipedia Fix-It tasks, Abstract Wikipedia team, WikiLambda
cmassaro added a comment to T294289: Built-in Validator Functions should return validationStatus's Z5s if Schema Validation Fails.

I have no idea what this bug was about anymore, hahaha. It's probably subsumed by the error advocacy work. I think we can close it as stale.

Wed, Nov 26, 3:25 PM · Abstract Wikipedia team, Abstract Wikipedia Fix-It tasks, function-orchestrator

Mon, Nov 24

cmassaro added a comment to T407503: Verify that the Pyrra dashboard is measuring what we think it is, and what it should be measuring.

We've now deployed with the slightly reduced timeout. I hope to see the SLO number at 100% in the coming days, but let's see.

@Jdforrester-WMF @cmassaro one thing that I am wondering - what is the HTTP response code that Wikifunctions returns when a request hits the 10s timeout?

I believe it's a 504.

Okok and just to be sure, this will be displayed as HTTP 200 by mediawiki_WikiLambda_mw_to_orchestrator_api_call_seconds_bucket right?

Looks like the new timeout value is still too generous. Might need to bump it down to 9s. @Jdforrester-WMF , what do you think?

While checking https://grafana.wikimedia.org/goto/G0VG73mDR?orgId=1 I didn't see any evidence of this, what metric are you checking? Not opposed to the change to 9s, just wanted to know the rationale to better understand the process :)

Mon, Nov 24, 5:38 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec))

Sun, Nov 23

cmassaro closed T408823: Fail Early if Evaluator Can't Acquire a WASI Runner as Resolved.
Sun, Nov 23, 1:31 AM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata, function-evaluator, function-orchestrator
cmassaro closed T408823: Fail Early if Evaluator Can't Acquire a WASI Runner, a subtask of T406848: wide-scale Python failure, as Resolved.
Sun, Nov 23, 1:31 AM · Patch-For-Review, Essential-Work, function-evaluator, Abstract Wikipedia team
cmassaro closed T409071: When evaluator fails due to WASI unavailability, consider queuing the runner back to the executor pool, a subtask of T408823: Fail Early if Evaluator Can't Acquire a WASI Runner, as Resolved.
Sun, Nov 23, 1:31 AM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata, function-evaluator, function-orchestrator
cmassaro closed T409071: When evaluator fails due to WASI unavailability, consider queuing the runner back to the executor pool as Resolved.
Sun, Nov 23, 1:31 AM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator

Thu, Nov 20

cmassaro added a comment to T407503: Verify that the Pyrra dashboard is measuring what we think it is, and what it should be measuring.

Looks like the new timeout value is still too generous. Might need to bump it down to 9s. @Jdforrester-WMF , what do you think?

Thu, Nov 20, 7:55 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec))

Wed, Nov 19

cmassaro added a comment to T407503: Verify that the Pyrra dashboard is measuring what we think it is, and what it should be measuring.

We've now deployed with the slightly reduced timeout. I hope to see the SLO number at 100% in the coming days, but let's see.

Wed, Nov 19, 11:36 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec))
cmassaro created T410493: Unhandled fetchPromise error for ZIDs.
Wed, Nov 19, 11:07 AM · Essential-Work, function-orchestrator, Abstract Wikipedia team

Thu, Nov 13

cmassaro updated the task description for T410065: Store Enriched Objects in Memcached.
Thu, Nov 13, 5:12 PM · OKR-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-orchestrator
cmassaro created T410065: Store Enriched Objects in Memcached.
Thu, Nov 13, 5:09 PM · OKR-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-orchestrator

Wed, Nov 12

cmassaro moved T407791: Banish the ? Operator in Backend JS from In Engineering to Needs Sign-off on the Abstract Wikipedia team (26Q2 (Oct–Dec)) board.
Wed, Nov 12, 3:25 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), Abstract Wikipedia Fix-It tasks, function-orchestrator
cmassaro added a comment to T407791: Banish the ? Operator in Backend JS.

I declare this banished!

Wed, Nov 12, 3:24 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), Abstract Wikipedia Fix-It tasks, function-orchestrator
cmassaro added a comment to T407503: Verify that the Pyrra dashboard is measuring what we think it is, and what it should be measuring.

Exactly yes, it is here. We also have an extra envoy timeout set to 15s IIUC as extra fence, but the one that counts is the orchestrator's one.

Wed, Nov 12, 3:04 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec))

Tue, Nov 11

cmassaro added a comment to T407503: Verify that the Pyrra dashboard is measuring what we think it is, and what it should be measuring.

@cmassaro I think it is probably something in the docker image / WF service itself, I haven't found a k8s configuration that triggers the 10s timeout yet.

I reviewed T392886 for the Istio metrics, and changing the buckets below 10s (like adding more when you'll be ready to lower the threshold in the SLO etc..) is not super easy, so it is definitely quicker to do it in mediawiki_WikiLambda_mw_to_orchestrator_api_call_seconds_bucket if you need a special time granularity.

Tue, Nov 11, 8:03 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec))

Mon, Nov 10

cmassaro added a comment to T390226: Consider using list inputs to create variadic apply, filter, map, etc. functions.

I feel a little queasy about a fallback mechanism that implicitly changes what gets put into the scope. Consider a case like this: let's say we have a function call

Mon, Nov 10, 9:48 PM · function-schemata, function-orchestrator, Abstract Wikipedia team
cmassaro added a subtask for T409592: Re-enable Rust tests: T388981: In our Rust POC of the function-evaluator, switch from wasi-common to wasmtime-wasi and upgrade wasmtime to 21.0.2+ (or replace).
Mon, Nov 10, 6:33 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), Abstract Wikipedia Fix-It tasks, function-evaluator
cmassaro added a parent task for T388981: In our Rust POC of the function-evaluator, switch from wasi-common to wasmtime-wasi and upgrade wasmtime to 21.0.2+ (or replace): T409592: Re-enable Rust tests.
Mon, Nov 10, 6:33 PM · Abstract Wikipedia team (26Q2 (Oct–Dec)), Essential-Work, function-evaluator
cmassaro added a comment to T407503: Verify that the Pyrra dashboard is measuring what we think it is, and what it should be measuring.

After a chat with the AW team last week I tried to follow up again on https://gerrit.wikimedia.org/r/c/operations/puppet/+/1192609, and I may have got what James is trying to do. I'll write down my understanding:

  • The metric that we are using in Pyrra, mediawiki_WikiLambda_mw_to_orchestrator_api_call_seconds_bucket, is generated by MediaWiki and it takes into account the time taken by the Wikifunctions service on k8s plus the time that it takes for the response to get back to MediaWiki.
  • The Wikifunctions k8s service is set to take at most 10s to render a call/request.
  • The Back-end API combined latency-availability SLI states that we are counting the requests that take more than 10s to be rendered.

So at the moment the SLI metric that we are using may count requests that hit the 10s limit on the k8s backend as taking 10s+x, with x equal to some non-negligible number of ms that MediaWiki takes to get the reply back from k8s. If we could avoid these extra ms in the SLI metric we'd probably see a nice and green SLO. An option would be to use a different set of SLI metrics, namely the Istio ones (Istio is the gateway that sits between envoy/wf-pods and MediaWiki). For example, this is the dashboard and this is the p99 graph.

The main problem that I see though is respecting the following SLO:

The percentage of all requests that complete within the 10s threshold and receive a non-error response, defined as above, shall be at least the limit.

10 s limit enforced by k8s request logic.

If k8s enforces a maximum 10s timeout, IIUC this means that no request could possibly be logged as taking more than 10s, given how the service is configured. Hence the SLO should be a solid 100% green, but what are we measuring? My understanding is that we should keep the number of 10s timeouts as low as possible, which is what the current SLI is measuring. Am I missing anything?
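
To make the 10s+x effect concrete, here is a rough sketch with made-up numbers. The function name, the sample durations, and the 50 ms overhead are all hypothetical; it models only the latency half of the SLO and ignores the non-error-response condition.

```python
def good_fraction(backend_seconds, overhead_seconds, backend_timeout, threshold=10.0):
    """Fraction of requests whose MediaWiki-side duration is within threshold.

    The backend cuts every request off at backend_timeout; MediaWiki then
    observes that duration plus a fixed overhead (the "x" discussed above).
    """
    good = 0
    for backend in backend_seconds:
        measured = min(backend, backend_timeout) + overhead_seconds
        if measured <= threshold:
            good += 1
    return good / len(backend_seconds)


if __name__ == "__main__":
    # Hypothetical backend durations in seconds; the last one would time out.
    durations = [0.5, 2.0, 9.9, 12.0]
    overhead = 0.05  # assumed 50 ms spent between k8s and MediaWiki

    # Backend timeout equal to the SLI threshold: the timed-out request is
    # observed at 10.05s and counted as bad.
    print(good_fraction(durations, overhead, backend_timeout=10.0))
    # Backend timeout reduced below the threshold: nothing is observed
    # above 10s any more.
    print(good_fraction(durations, overhead, backend_timeout=9.0))
```

In this toy model, lowering the backend timeout below the threshold is what turns the latency SLI green, even though the same request still times out.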

Mon, Nov 10, 6:11 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec))

Sun, Nov 9

cmassaro added a comment to T390226: Consider using list inputs to create variadic apply, filter, map, etc. functions.

Ah, good to know! I figured it was possible to do in the composition language, but not necessarily easy or efficient. If there's already a recipe for this that works without too much trouble, good 😄 .

Sun, Nov 9, 1:51 PM · function-schemata, function-orchestrator, Abstract Wikipedia team

Sat, Nov 8

cmassaro moved T409071: When evaluator fails due to WASI unavailability, consider queuing the runner back to the executor pool from In Engineering to Ready to deploy on the Abstract Wikipedia team (26Q2 (Oct–Dec)) board.
Sat, Nov 8, 10:11 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator
cmassaro moved T408823: Fail Early if Evaluator Can't Acquire a WASI Runner from In Engineering to Ready to deploy on the Abstract Wikipedia team (26Q2 (Oct–Dec)) board.
Sat, Nov 8, 10:10 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata, function-evaluator, function-orchestrator
cmassaro moved T408825: Make the orchestrator react appropriately when the evaluator fails to acquire WASI runners from Needs Sign-off to Ready to deploy on the Abstract Wikipedia team (26Q2 (Oct–Dec)) board.
Sat, Nov 8, 10:10 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-orchestrator
cmassaro moved T408826: Make the evaluator fail early and respond with the appropriate Z500 and HTTP status code when WASI runners are unavailable from In Engineering to Ready to deploy on the Abstract Wikipedia team (26Q2 (Oct–Dec)) board.
Sat, Nov 8, 10:10 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator
cmassaro claimed T409592: Re-enable Rust tests.
Sat, Nov 8, 10:09 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), Abstract Wikipedia Fix-It tasks, function-evaluator
cmassaro added a comment to T390226: Consider using list inputs to create variadic apply, filter, map, etc. functions.

This should be very quick to implement. We'd need to

Sat, Nov 8, 10:09 PM · function-schemata, function-orchestrator, Abstract Wikipedia team

Fri, Nov 7

cmassaro added a subtask for T402957: [26Q2] Rust evaluator in production: T409592: Re-enable Rust tests.
Fri, Nov 7, 8:58 PM · Essential-Work, function-evaluator, Abstract Wikipedia team (26Q2 (Oct–Dec)), Epic
cmassaro added a parent task for T409592: Re-enable Rust tests: T402957: [26Q2] Rust evaluator in production.
Fri, Nov 7, 8:58 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), Abstract Wikipedia Fix-It tasks, function-evaluator
cmassaro created T409592: Re-enable Rust tests.
Fri, Nov 7, 8:57 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), Abstract Wikipedia Fix-It tasks, function-evaluator

Nov 5 2025

cmassaro added a comment to T408826: Make the evaluator fail early and respond with the appropriate Z500 and HTTP status code when WASI runners are unavailable.

Is this now fixed?

Nov 5 2025, 8:02 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator
cmassaro added a comment to T407939: Provide a new config flag to control rate-limiting of simultaneity.

Do we want to set this to a different value in prod?

Nov 5 2025, 7:59 PM · OKR-Work, function-orchestrator, Abstract Wikipedia team (26Q2 (Oct–Dec))
cmassaro added a comment to T404186: Error claims no connected implementation when it is.

Added specific phab task to handle this case in the orchestrator: https://phabricator.wikimedia.org/T407646

Nov 5 2025, 3:45 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec))
cmassaro closed T408824: Allocate new Z500 and HTTP error codes for failure to acquire a WASI runner as Resolved.
Nov 5 2025, 3:43 PM · MW-1.46-notes (1.46.0-wmf.2; 2025-11-12), function-evaluator, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata
cmassaro closed T408824: Allocate new Z500 and HTTP error codes for failure to acquire a WASI runner, a subtask of T408826: Make the evaluator fail early and respond with the appropriate Z500 and HTTP status code when WASI runners are unavailable, as Resolved.
Nov 5 2025, 3:43 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator
cmassaro closed T409194: Evaluator test suite gives false positives as Resolved.
Nov 5 2025, 3:43 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), Patch-For-Review, function-schemata, function-evaluator
cmassaro closed T407646: "no connected implementation" error message when implementations are removed from Z8K4 array for validation issues as Resolved.
Nov 5 2025, 3:42 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), Patch-For-Review, function-orchestrator
cmassaro moved T408824: Allocate new Z500 and HTTP error codes for failure to acquire a WASI runner from In Code review to Ready to deploy on the Abstract Wikipedia team (26Q2 (Oct–Dec)) board.
Nov 5 2025, 4:08 AM · MW-1.46-notes (1.46.0-wmf.2; 2025-11-12), function-evaluator, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata
cmassaro moved T409194: Evaluator test suite gives false positives from Incoming to Ready to deploy on the Abstract Wikipedia team (26Q2 (Oct–Dec)) board.
Nov 5 2025, 4:07 AM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), Patch-For-Review, function-schemata, function-evaluator
cmassaro edited projects for T409194: Evaluator test suite gives false positives, added: Abstract Wikipedia team (26Q2 (Oct–Dec)); removed Abstract Wikipedia team.
Nov 5 2025, 4:07 AM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), Patch-For-Review, function-schemata, function-evaluator

Nov 4 2025

cmassaro created T409202: Fix the "bad header" tests.
Nov 4 2025, 5:11 PM · Abstract Wikipedia Fix-It tasks, function-evaluator, Abstract Wikipedia team
cmassaro closed T406352: [26Q2] Do capacity planning as Resolved.
Nov 4 2025, 4:54 PM · Essential-Work, Epic, Abstract Wikipedia team (26Q2 (Oct–Dec))
cmassaro reopened T407646: "no connected implementation" error message when implementations are removed from Z8K4 array for validation issues as "Open".
Nov 4 2025, 4:54 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), Patch-For-Review, function-orchestrator
cmassaro closed T407646: "no connected implementation" error message when implementations are removed from Z8K4 array for validation issues as Resolved.
Nov 4 2025, 4:54 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), Patch-For-Review, function-orchestrator
cmassaro closed T406937: Document how to perform capacity planning for future iterations, a subtask of T406352: [26Q2] Do capacity planning, as Resolved.
Nov 4 2025, 4:31 PM · Essential-Work, Epic, Abstract Wikipedia team (26Q2 (Oct–Dec))
cmassaro closed T406937: Document how to perform capacity planning for future iterations as Resolved.
Nov 4 2025, 4:31 PM · OKR-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator, function-orchestrator
cmassaro closed T406936: Undertake capacity planning exercise once, on the current system, a subtask of T406352: [26Q2] Do capacity planning, as Resolved.
Nov 4 2025, 4:31 PM · Essential-Work, Epic, Abstract Wikipedia team (26Q2 (Oct–Dec))
cmassaro closed T406936: Undertake capacity planning exercise once, on the current system as Resolved.
Nov 4 2025, 4:31 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator, function-orchestrator
cmassaro claimed T409194: Evaluator test suite gives false positives.
Nov 4 2025, 3:52 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), Patch-For-Review, function-schemata, function-evaluator
cmassaro triaged T409194: Evaluator test suite gives false positives as High priority.

I have pre-triaged this as high because 1) I won't make the triage meeting tomorrow and 2) this reduces our confidence in the evaluator service as a whole.

Nov 4 2025, 3:52 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), Patch-For-Review, function-schemata, function-evaluator
cmassaro created T409194: Evaluator test suite gives false positives.
Nov 4 2025, 3:51 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), Patch-For-Review, function-schemata, function-evaluator

Nov 3 2025

cmassaro created T409118: Canonicalization incorrectly called on Z99K1.
Nov 3 2025, 10:05 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata, function-orchestrator
cmassaro moved T407939: Provide a new config flag to control rate-limiting of simultaneity from Ready to Ready to deploy on the Abstract Wikipedia team (26Q2 (Oct–Dec)) board.
Nov 3 2025, 8:19 PM · OKR-Work, function-orchestrator, Abstract Wikipedia team (26Q2 (Oct–Dec))
cmassaro claimed T407939: Provide a new config flag to control rate-limiting of simultaneity.
Nov 3 2025, 8:19 PM · OKR-Work, function-orchestrator, Abstract Wikipedia team (26Q2 (Oct–Dec))
cmassaro added a subtask for T406346: [26Q2] Orchestrator memory squashing: T409111: Use ORCHESTRATOR_CONFIG.maxSimultaneousExecutions in production.
Nov 3 2025, 8:18 PM · OKR-Work, Epic, function-orchestrator, Abstract Wikipedia team (26Q2 (Oct–Dec))
cmassaro added a parent task for T409111: Use ORCHESTRATOR_CONFIG.maxSimultaneousExecutions in production: T406346: [26Q2] Orchestrator memory squashing.
Nov 3 2025, 8:18 PM · OKR-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-orchestrator
cmassaro created T409111: Use ORCHESTRATOR_CONFIG.maxSimultaneousExecutions in production.
Nov 3 2025, 8:18 PM · OKR-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-orchestrator
cmassaro added a comment to T407939: Provide a new config flag to control rate-limiting of simultaneity.

We've instead added a separate variable, since these are notionally two different things.

Nov 3 2025, 8:17 PM · OKR-Work, function-orchestrator, Abstract Wikipedia team (26Q2 (Oct–Dec))
cmassaro removed a subtask for T408823: Fail Early if Evaluator Can't Acquire a WASI Runner: T408826: Make the evaluator fail early and respond with the appropriate Z500 and HTTP status code when WASI runners are unavailable.
Nov 3 2025, 1:57 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata, function-evaluator, function-orchestrator
cmassaro added a subtask for T409071: When evaluator fails due to WASI unavailability, consider queuing the runner back to the executor pool: T408826: Make the evaluator fail early and respond with the appropriate Z500 and HTTP status code when WASI runners are unavailable.
Nov 3 2025, 1:57 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator
cmassaro edited parent tasks for T408826: Make the evaluator fail early and respond with the appropriate Z500 and HTTP status code when WASI runners are unavailable, added: T409071: When evaluator fails due to WASI unavailability, consider queuing the runner back to the executor pool; removed: T408823: Fail Early if Evaluator Can't Acquire a WASI Runner.
Nov 3 2025, 1:57 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator
cmassaro added a subtask for T408823: Fail Early if Evaluator Can't Acquire a WASI Runner: T409071: When evaluator fails due to WASI unavailability, consider queuing the runner back to the executor pool.
Nov 3 2025, 1:56 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata, function-evaluator, function-orchestrator
cmassaro added a parent task for T409071: When evaluator fails due to WASI unavailability, consider queuing the runner back to the executor pool: T408823: Fail Early if Evaluator Can't Acquire a WASI Runner.
Nov 3 2025, 1:56 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator
cmassaro updated the task description for T409071: When evaluator fails due to WASI unavailability, consider queuing the runner back to the executor pool.
Nov 3 2025, 1:55 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator
cmassaro updated the task description for T409071: When evaluator fails due to WASI unavailability, consider queuing the runner back to the executor pool.
Nov 3 2025, 1:55 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator
cmassaro claimed T409071: When evaluator fails due to WASI unavailability, consider queuing the runner back to the executor pool.
Nov 3 2025, 1:54 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator
cmassaro created T409071: When evaluator fails due to WASI unavailability, consider queuing the runner back to the executor pool.
Nov 3 2025, 1:54 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator

Nov 1 2025

cmassaro added a subtask for T408823: Fail Early if Evaluator Can't Acquire a WASI Runner: T408977: Set FUNCTION_EVALUATOR_WASI_ACQUIRE_TIMEOUT in production config.
Nov 1 2025, 7:38 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata, function-evaluator, function-orchestrator
cmassaro added a parent task for T408977: Set FUNCTION_EVALUATOR_WASI_ACQUIRE_TIMEOUT in production config: T408823: Fail Early if Evaluator Can't Acquire a WASI Runner.
Nov 1 2025, 7:38 PM · Abstract Wikipedia team (26Q2 (Oct–Dec)), Essential-Work, function-evaluator
cmassaro created T408977: Set FUNCTION_EVALUATOR_WASI_ACQUIRE_TIMEOUT in production config.
Nov 1 2025, 7:38 PM · Abstract Wikipedia team (26Q2 (Oct–Dec)), Essential-Work, function-evaluator
cmassaro created T408976: Migrate most evaluator tests to executorClassTest fixture.
Nov 1 2025, 7:36 PM · Abstract Wikipedia Fix-It tasks, function-evaluator, Abstract Wikipedia team
cmassaro moved T343720: If we create a working evaluator service in Rust, we can measure its performance and stability characteristics and plan to productionize it in a subsequent quarter from Ready to In Engineering on the Abstract Wikipedia team (26Q2 (Oct–Dec)) board.
Nov 1 2025, 1:49 AM · Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata, Essential-Work, function-evaluator
cmassaro moved T408823: Fail Early if Evaluator Can't Acquire a WASI Runner from Incoming to In Engineering on the Abstract Wikipedia team (26Q2 (Oct–Dec)) board.
Nov 1 2025, 1:49 AM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata, function-evaluator, function-orchestrator
cmassaro claimed T408826: Make the evaluator fail early and respond with the appropriate Z500 and HTTP status code when WASI runners are unavailable.
Nov 1 2025, 1:48 AM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator
cmassaro edited projects for T408826: Make the evaluator fail early and respond with the appropriate Z500 and HTTP status code when WASI runners are unavailable, added: Abstract Wikipedia team (26Q2 (Oct–Dec)); removed Abstract Wikipedia team.
Nov 1 2025, 1:48 AM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator
cmassaro edited parent tasks for T408824: Allocate new Z500 and HTTP error codes for failure to acquire a WASI runner, added: T408826: Make the evaluator fail early and respond with the appropriate Z500 and HTTP status code when WASI runners are unavailable; removed: T408825: Make the orchestrator react appropriately when the evaluator fails to acquire WASI runners.
Nov 1 2025, 1:47 AM · MW-1.46-notes (1.46.0-wmf.2; 2025-11-12), function-evaluator, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata
cmassaro removed a subtask for T408825: Make the orchestrator react appropriately when the evaluator fails to acquire WASI runners: T408824: Allocate new Z500 and HTTP error codes for failure to acquire a WASI runner.
Nov 1 2025, 1:47 AM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-orchestrator
cmassaro added a subtask for T408826: Make the evaluator fail early and respond with the appropriate Z500 and HTTP status code when WASI runners are unavailable: T408824: Allocate new Z500 and HTTP error codes for failure to acquire a WASI runner.
Nov 1 2025, 1:47 AM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator
cmassaro moved T408825: Make the orchestrator react appropriately when the evaluator fails to acquire WASI runners from Incoming to Ready to deploy on the Abstract Wikipedia team (26Q2 (Oct–Dec)) board.
Nov 1 2025, 1:46 AM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-orchestrator
cmassaro edited projects for T408825: Make the orchestrator react appropriately when the evaluator fails to acquire WASI runners, added: Abstract Wikipedia team (26Q2 (Oct–Dec)); removed Abstract Wikipedia team.
Nov 1 2025, 1:46 AM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-orchestrator
cmassaro moved T408824: Allocate new Z500 and HTTP error codes for failure to acquire a WASI runner from Incoming to In Code review on the Abstract Wikipedia team (26Q2 (Oct–Dec)) board.
Nov 1 2025, 1:46 AM · MW-1.46-notes (1.46.0-wmf.2; 2025-11-12), function-evaluator, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata
cmassaro edited projects for T408823: Fail Early if Evaluator Can't Acquire a WASI Runner, added: Abstract Wikipedia team (26Q2 (Oct–Dec)); removed Abstract Wikipedia team.
Nov 1 2025, 1:45 AM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata, function-evaluator, function-orchestrator
cmassaro edited projects for T408824: Allocate new Z500 and HTTP error codes for failure to acquire a WASI runner, added: Abstract Wikipedia team (26Q2 (Oct–Dec)); removed Abstract Wikipedia team.
Nov 1 2025, 1:45 AM · MW-1.46-notes (1.46.0-wmf.2; 2025-11-12), function-evaluator, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata

Oct 31 2025

cmassaro created T408962: Consider calling killProcessTree in Executor.end.
Oct 31 2025, 11:08 PM · Patch-For-Review, Abstract Wikipedia Fix-It tasks, function-evaluator, Abstract Wikipedia team

Oct 30 2025

cmassaro claimed T408825: Make the orchestrator react appropriately when the evaluator fails to acquire WASI runners.
Oct 30 2025, 8:39 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-orchestrator
cmassaro claimed T408824: Allocate new Z500 and HTTP error codes for failure to acquire a WASI runner.
Oct 30 2025, 4:25 PM · MW-1.46-notes (1.46.0-wmf.2; 2025-11-12), function-evaluator, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata
cmassaro claimed T408823: Fail Early if Evaluator Can't Acquire a WASI Runner.
Oct 30 2025, 4:25 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata, function-evaluator, function-orchestrator
cmassaro added a subtask for T408826: Make the evaluator fail early and respond with the appropriate Z500 and HTTP status code when WASI runners are unavailable: T408825: Make the orchestrator react appropriately when the evaluator fails to acquire WASI runners.
Oct 30 2025, 4:25 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-evaluator
cmassaro added a parent task for T408825: Make the orchestrator react appropriately when the evaluator fails to acquire WASI runners: T408826: Make the evaluator fail early and respond with the appropriate Z500 and HTTP status code when WASI runners are unavailable.
Oct 30 2025, 4:25 PM · Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-orchestrator
cmassaro added a subtask for T408823: Fail Early if Evaluator Can't Acquire a WASI Runner: T408826: Make the evaluator fail early and respond with the appropriate Z500 and HTTP status code when WASI runners are unavailable.
Oct 30 2025, 4:24 PM · Patch-For-Review, Essential-Work, Abstract Wikipedia team (26Q2 (Oct–Dec)), function-schemata, function-evaluator, function-orchestrator