As a search dev, I want to stop supporting (and remove) ApiFeatureUsage, as it is an infrequently used service that requires disproportionate work to maintain, so that I can upgrade Elasticsearch to 7.10 more cleanly.
AC:
- ApiFeatureUsage is removed
Status | Assigned | Task
---|---|---
Resolved | None | T248925 Make MediaWiki release tarball compatible with PHP 8.0
Resolved | Jdforrester-WMF | T300463 Make PHP 8.0 voting on MW master
Resolved | None | T283275 Make MW master tests pass on PHP 8.0
Resolved | Reedy | T268861 CirrusSearch uses Elastica's Match class
Resolved | Reedy | T268863 Translate uses Elastica's Match class
Resolved | matthiasmullie | T268866 WikibaseMediaInfo uses Elastica's Match class
Invalid | None | T268864 WikibaseCirrusSearch uses Elastica's Match class
Resolved | Reedy | T268865 WikibaseLexemeCirrusSearch uses Elastica's Match class
Resolved | EBernhardson | T271777 Bump rufin/elastica (and related libraries) to versions that support PHP 8.0
Resolved | Gehel | T263142 [EPIC] Upgrade Elasticsearch to version 7.10
Declined | None | T301724 Sunset ApiFeatureUsage
Declined | None | T302638 Sunset ApiFeatureUsage (TDMP)
@MPhamWMF: Assuming this is about the ApiFeatureUsage extension at https://www.mediawiki.org/wiki/Extension:ApiFeatureUsage currently deployed on Wikimedia wikis. Is this about undeploying and sunsetting? (cf T294329)
https://www.mediawiki.org/wiki/Developers/Maintainers lists "Core Platform Team" as code stewards.
@Aklapper I'm not sure I understand the differentiation, if any, between undeploying and sunsetting (vs decommissioning vs killing)? My team spends a lot of time and effort maintaining this extension every time we need to update our software. It seemed to be barely used the last time we checked, and no other product or tech teams have responded to say that this extension is useful or valuable to them.
Oops, my bad. I think I misread something, and there is no difference between undeploying and sunsetting (too many terms to keep up with sometimes).
I was curious:
My question is: Is this worth it? Which of these is critical and would hurt us if we could no longer access it via Special:ApiFeatureUsage or the featureusage API? I mean, the data is still there, even without the Elastic store and the extension, as far as I understand it.
The special page lets normal API users find deprecation errors related to their User-Agent.
Yeah. T302638: Sunset ApiFeatureUsage (TDMP) was declined because it was "Withdrawn from Process"; in this case the Tech Decision Forum.
If it doesn't go through the right processes, how can it be expected to be approved?
The ticket was declined because @SWakiyama cleared my team to move forward with this work without pushing it through the rest of the Tech Decision process.
I was under the impression that this unblocked the work for the search team.
There was no comment left to that effect on either this or that ticket.
How would anyone else necessarily know this?
There's also various unanswered questions such as those from Cole in T302638#7811894.
Nor the fact of what we do about the current people who actually use this.
Sorry for all the confusion. This process is new to me, and I don't yet fully understand how it works and what communication practices/norms are.
There was no comment left to that effect on either this or that ticket.
How would anyone else necessarily know this?
I was in communication with Linh via email about the process and let him know that we were moving ahead with this work. I wasn't personally aware that this was being tracked in the Phab ticket in a certain way (I'm not on Phabricator that much for my day-to-day PM work, and don't really use it to track all my personal work). I incorrectly assumed that the ticket was closed in an appropriate manner.
There's also various unanswered questions such as those from Cole in T302638#7811894.
I do not have the technical answers to these implementation questions. Perhaps @Gehel or @dcausse know more than I do.
Nor the fact of what we do about the current people who actually use this.
Current people using this feature will be notified in advance that this extension will no longer be usable for the foreseeable future until it is able to be rebuilt/redeployed without the Elasticsearch dependency. Search is not providing any workarounds in the meantime, and will not own this feature.
This might not fully answer your question but I'll try :)
Based on the assumption that this feature is not vital to bot owners, but rather to engineers working on the WMF infra who want to help bot owners change their tools because of a performance issue or a parameter deprecation:
Would using the Hive table event.mediawiki_api_request be sufficient for this purpose?
SELECT performer.user_text AS user_text, count(*) AS cnt
FROM event.mediawiki_api_request
WHERE params["list"] = "allpages" AND year = 2022 AND month = 5 AND day = 19
GROUP BY performer.user_text
ORDER BY cnt DESC
LIMIT 10;
Obviously this does not allow users without access to Hive to run it, but it might help in some circumstances by creating a list of bots that need to be updated.
Stepping back, I wonder what this feature would look like if we had to design it today:
So perhaps I'd consider using the existing mediawiki_api_request data and having a job that transforms it and ingests it into a Cassandra DB, with a small serving layer on top.
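As a rough illustration of the transform step only (a sketch, not an existing job): the Python below counts requests per day, `list` module and user, which is the kind of pre-aggregated row a small serving layer could read. The event shape reuses the `performer.user_text` and `params["list"]` fields from the query above; the `day` string field, the `UsageRow` type and the `aggregate_usage` function are hypothetical names introduced for this example, and writing the rows to Cassandra is deliberately left out.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple


class UsageRow(NamedTuple):
    """One pre-aggregated row (hypothetical schema for illustration)."""
    day: str        # ISO date string, e.g. "2022-05-19" (assumed field)
    feature: str    # value of the "list" API parameter
    user_text: str  # performer.user_text from the request event
    hits: int       # number of matching requests


def aggregate_usage(events: Iterable[dict]) -> List[UsageRow]:
    """Count requests per (day, list module, user), busiest first."""
    counts: Counter = Counter()
    for event in events:
        feature = event.get("params", {}).get("list")
        if feature is None:
            continue  # not a list query; ignored in this sketch
        user = event.get("performer", {}).get("user_text", "")
        counts[(event["day"], feature, user)] += 1
    rows = [
        UsageRow(day, feature, user, hits)
        for (day, feature, user), hits in counts.items()
    ]
    # Sort by descending hit count, then by key for a stable order.
    return sorted(rows, key=lambda r: (-r.hits, r.day, r.feature, r.user_text))


if __name__ == "__main__":
    # Made-up sample events, only to show the call shape.
    sample = [
        {"day": "2022-05-19", "params": {"list": "allpages"},
         "performer": {"user_text": "ExampleBot"}},
        {"day": "2022-05-19", "params": {"list": "allpages"},
         "performer": {"user_text": "ExampleBot"}},
        {"day": "2022-05-19", "params": {"action": "parse"},
         "performer": {"user_text": "OtherBot"}},
    ]
    for row in aggregate_usage(sample):
        print(row)
```

Keeping the aggregation as a pure function over plain dicts means it could be tried out on a sample export before anyone commits to a storage backend.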
That said, the ultimate goal here is to simplify things, not to create new work :)
If we break the inter-dependency between the ELK cluster and the search cluster, we would already be in better shape, but this might not be enough.
Constraints on the search side arise during version upgrades and maintenance operations, and from the sensitivity of the data:
These are among the points we would like to stop worrying about.
As for social interaction, we already manage non-search indices on the cluster. We generally file a task addressed to the index owner prior to important maintenance operations, but we obviously can't accept too many different kinds of indices. As of today we have:
So I guess that if a team agreed to own this extension and its data, that would be a step in the right direction?