
[REQUEST] Analysis of available data on clients calling individual services through RESTbase
Closed, Declined; Public

Description

Name for main point of contact and contact preference
Virginia Poundstone
Slack @Vpoundstone

What teams or departments is this for?
Platform Product, Platform Engineering, API Platform, API Value Stream

What are your goals? How will you use this data or analysis?
To support T329419: Architect potential API Gateway Patterns in preparation for services migrated off RESTbase, we need to understand as much as possible about the current callers of the various services routed through RESTbase. As part of gathering requirements for T262315: <CORE TECHNOLOGY> API Migration & RESTbase Sunset, we need to separate the various use cases so that we can design an architecture that achieves separation of concerns to the best of our ability.

Product goal:
RESTbase deprecation is a use case that will help us define critical pieces of API infrastructure and prepare API Platform to support full production traffic across new and pre-existing services.

Data Analysis goal:
Define the major "buckets" or types of client callers so we can design an API architecture with their use cases in mind. This will make developers' engineering work easier, remove burdens created by technical debt, and reduce the risk of unintentionally creating more of it.

A secondary goal is to rank the services from most used to least used and to determine who uses each one.

What are the details of your request? Include relevant timelines or deadlines
Details: In the webrequest dataset, what is the total number of hits over the past three months on the /api/rest_v1 URI path (segmented by all available endpoints) per user_agent, user_agent_map, access_method, and ip?

When consulted, Dan Andreescu said: "I think just looking at requests with /api/rest_v1 in the path would be the only way I know how to tell them apart. There may be other clues based on how we call our own apis from the apps/extensions/etc but others may know more than me there."
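To make the requested breakdown concrete, here is a minimal Python sketch of the aggregation, not a query against the real dataset. It assumes each request is available as a dictionary with a `uri_path` key plus the dimension being counted, that RESTbase calls can be recognized by a `/api/rest_v1/` path prefix, and that the first two path segments after the prefix identify an endpoint. The helper names `endpoint_of` and `count_hits` are hypothetical. A map-valued field such as user_agent_map would need to be flattened to a single value before it could be used as a dimension here.

```python
from collections import Counter

# Assumption: RESTbase calls are identified by this path prefix.
REST_PREFIX = "/api/rest_v1/"


def endpoint_of(uri_path):
    """Return a coarse endpoint label, or None if this is not a rest_v1 call.

    Assumption: the first two path segments after the prefix
    (for example "page/summary") are enough to identify an endpoint.
    """
    if not uri_path.startswith(REST_PREFIX):
        return None
    segments = [s for s in uri_path[len(REST_PREFIX):].split("/") if s]
    if not segments:
        return None
    return "/".join(segments[:2])


def count_hits(rows, dimension):
    """Count hits per (endpoint, dimension value) over an iterable of row dicts."""
    counts = Counter()
    for row in rows:
        endpoint = endpoint_of(row["uri_path"])
        if endpoint is None:
            continue  # skip requests that do not go through rest_v1
        counts[(endpoint, row[dimension])] += 1
    return counts
```

Calling `count_hits(rows, "user_agent")` and then `count_hits(rows, "access_method")` on the same rows would give one table per dimension, and `Counter.most_common()` orders each table from most to least used.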

Timeline:
The goal is to have a routing solution in place for services migrated off RESTbase by the end of March, with an architectural decision made by mid-February (now).

Is this request urgent or time sensitive?
Sadly, yes. RESTbase is falling over, so we are in a catch-22 of spending our time holding it up while also trying to create something better. Engineers rely on RESTbase, so they continue to build new services on it, compounding the debt. We thought we would be able to get enough visibility in-house via our only SRE, but they have too much on their plate to assemble the research. I'm very sorry to make such a late request.

I'm fairly new to the WMF, and I'm still learning the ropes. =)
Thank you for your help.

Event Timeline

@mpopov this handles one third of T284579: [REQUEST] Data for external API Requests. Harden API Gateway. (Thank you for reminding me about it.) I will still need data for the MediaWiki REST API (rest.php) and the MediaWiki Action API (api.php), as well as additional info about RESTbase not included in this scope. I updated the description in T284579 to reflect this. I'll schedule a consultation hour to discuss the best approach for that larger ask.

nettrom_WMF lowered the priority of this task from High to Medium. May 16 2023, 4:18 PM
nettrom_WMF moved this task from Doing to Needs Investigation on the Product-Analytics (Kanban) board.

Reducing the priority of this for now and moving it to "Needs Investigation" while I follow up with @VirginiaPoundstone on what API Platform needs.

Closing this task as declined for now. If the need for additional data analysis comes up, feel free to file another request.