New Service Request Security API Gateway
Closed, Declined
Public
Assigned To

None

Authored By

	sbassett
	Sep 13 2021, 8:27 PM

Description

Name: Security API Gateway

Description: There is an increasing need for a centralized Wikimedia service capable of making available certain security-related APIs for various MediaWiki extensions, services, external applications and users. Due to certain sensitive elements (sensitive data, commercially-licensed data, etc.) this service would need to live within Wikimedia production, have some variety of general authn/z mechanism and be highly available. Initial API candidates would likely be feed options related to the protected tasks T265845 and T250227.

Timeline: Tentatively by the end of Q3 2022 (March 2022)

Point person(s): @sbassett, @Reedy, @Mstyles, @STran

Technologies: Likely service-runner and various nodejs glue code to manage ingestion/consumption and authn/z layers.

Request flow diagram: To be created, though it will likely be minimal as this would be more of a stand-alone API service with its own authn/z.

n.b. keeping Service-deployment-requests untagged for now as this effort is currently very early in the initial planning stage (more proof-of-concept, minimum-viable-product) and might require an RFC or similar technical discussion phase.

Related Objects

Status	Assigned	Task
Declined	None	T290917 New Service Request Security API Gateway
Resolved	Mstyles	T293416 Fork service-template-node and create new gerrit repository
Resolved	Mstyles	T293417 Determine best available auth mechanism for the initial Security API use-case
Resolved	None	T293418 Determine API endpoints for initial Security API use-case
Resolved	STran	T294782 Build data feed consumption tool
Resolved	Mstyles	T296346 Create Demo Environment for Security API
Duplicate	STran	T305713 Create a basic product API to MySQL/MariaDB ETL
Resolved	Mstyles	T305714 Complete, verify and test docker-compose environment
Declined	sbassett	T305715 Work on mocha/swagger tests to have features appropriately mocked or otherwise passing
Resolved	sbassett	T305716 Schedule placeholder for demo to Security/AHT leadership
Declined	None	T293419 Prepare service for beta cluster deployment
Open	None	T308789 Determine CI best practices for service which connects to MySQL
Resolved	sbassett	T309213 Create a mock vendor API endpoint at toolforge for the Security API service
Resolved	STran	T297757 Build Security API front-end extension
Resolved	sbassett	T301400 Update security-api service to use nodejs12-slim and nodejs12-devel images
Resolved	Mstyles	T301428 Security API Storage Needs
Resolved	• Marostegui	T305114 Set up MariaDB for iPoid
Resolved	Mstyles	T305723 Verify pipelinelib/blubber config for production deployment
Resolved	kostajh	T305724 Investigate database data invalidation questions and chunked/timed API to MySQL/MariaDB ETL
Resolved	Aklapper	T305728 Create project tag for Security API Service
Resolved	sbassett	T310564 Have the Security API Service's docker-compose use a custom network

Event Timeline

sbassett created this task.Sep 13 2021, 8:27 PM

Restricted Application added a subscriber: Aklapper.Sep 13 2021, 8:27 PM

sbassett mentioned this in T288844: Update MaxMind GeoIP2 license key and product IDs for application servers.Sep 13 2021, 8:31 PM

I suppose two initial questions/decisions would be:

Should we architect the (likely small but important) collection of security-related APIs/services as single, individual services or as part of a gateway, as the task suggests?
Is Wikimedia production even the right place for this? I think it could be, but I don't have a great answer on that. wmcs/toolforge can't work for a few security/privacy/TOU-related reasons. And there is now precedent for certain Important™ things to live elsewhere (WME/Okapi at AWS) but that presents a different set of challenges, while removing others.

JJMC89 subscribed.Sep 13 2021, 9:29 PM

Thanks for filing this :) Sidenote: the name api-gateway is already in use (naming things is hard!!), it would be nice if this one could be slightly different.

this would be more of a stand-alone API service with its own authn/z.

I am curious what the advantages of separate authn/z are versus fronting it with a MediaWiki extension that takes care of authn/z before proxying the request to this service.

In T290917#7349993, @sbassett wrote:

I suppose two initial questions/decisions would be:

Should we architect the (likely small but important) collection of security-related APIs/services as single, individual services or as part of a gateway, as the task suggests?

I think it depends on how you expect clients to use it: will each one likely be calling just one service, or will they all want multiple? Would clients benefit from the gateway aggregating multiple services in one response? Also worth considering, if we want to change things in the future: are all the clients under shared maintenance, so that we can expect maintainers to quickly adapt to new things, or will it require long deprecation periods?

My suggestion would be to build it as a gateway but each API/service is its own route, which is how Shellbox works. Then it's trivial to have one gateway deployment handle everything, or if we decide we want to split them into different deployments (e.g. want to dedicate more resources to one specific service), it'll be pretty straightforward to spin up a new deployment in k8s using the same codebase (e.g. shellbox-constraints, shellbox-syntaxhighlight, etc.)

Is Wikimedia production even the right place for this? I think it could be, but I don't have a great answer on that. wmcs/toolforge can't work for a few security/privacy/TOU-related reasons. And there is now precedent for certain Important™ things to live elsewhere (WME/Okapi at AWS) but that presents a different set of challenges, while removing others.

My understanding is that Wikimedia Enterprise is using AWS because SRE couldn't/didn't want to provide contractual level SLAs (https://meta.wikimedia.org/wiki/Wikimedia_Enterprise/FAQ#Why_are_you_using_externally-operated_cloud_infrastructure/AWS) - I don't think that's applicable here, so I don't see it as a precedent. Otherwise I think production k8s is a pretty good fit for this. Things you get "for free": logging to logstash, monitoring and alerting, metrics to grafana, security monitoring via debmonitor, and probably more.

sbassett moved this task from Backlog to In Progress on the user-sbassett board.Sep 15 2021, 8:24 PM

taavi subscribed.Sep 19 2021, 8:32 AM

@sbassett before I can give any opinion on your proposal and your questions, I'd need to better understand (maybe via one practical example?) what the request flow would be, and what the intent for this software is.

From what I read I would say it would indeed look like we could just deploy a second instance of api-gateway with a different configuration of routes. But maybe I'm missing something here.

So, can you describe one practical use-case for this in some detail? It will help me ensure I don't give you bad advice :)

Reedy moved this task from Incoming to Back Orders on the Security-Team board.Sep 20 2021, 3:37 PM

Tks4Fish subscribed.Sep 22 2021, 4:04 PM

In T290917#7350710, @Legoktm wrote:

Thanks for filing this :) Sidenote: the name api-gateway is already in use (naming things is hard!!), it would be nice if this one could be slightly different.

Is security-api-gateway too close? security-api? wikimedia-security-api? I guess I'd hate to get too far away from simple names that precisely describe what the thing is.

I am curious what the advantages of separate authn/z are versus fronting it with a MediaWiki extension that takes care of authn/z before proxying the request to this service.

Probably none for right now, at least for initial use-cases. Though in the future it might provide us with increased flexibility for allowing integrations with new apps, tools, external organizations, etc. I know there are also some ideas floating around in regards to decoupling auth from mediawiki and other apps/services, though that's obviously well beyond the scope of this work for now.

I think it depends on how you expect clients to use it: will each one likely be calling just one service, or will they all want multiple? Would clients benefit from the gateway aggregating multiple services in one response? Also worth considering, if we want to change things in the future: are all the clients under shared maintenance, so that we can expect maintainers to quickly adapt to new things, or will it require long deprecation periods?

My suggestion would be to build it as a gateway but each API/service is its own route, which is how Shellbox works. Then it's trivial to have one gateway deployment handle everything, or if we decide we want to split them into different deployments (e.g. want to dedicate more resources to one specific service), it'll be pretty straightforward to spin up a new deployment in k8s using the same codebase (e.g. shellbox-constraints, shellbox-syntaxhighlight, etc.)

The route(s)-per-service concept makes sense to me and would allow a lot of flexibility in spinning up or down various security-related services. So, sample routes following a pattern along the lines of: /service-name/version/endpoint1/whatever. Things like Shellbox certainly make sense as stand-alone services, but there are a handful of security-related services that could live under a single, monolithic api gateway IMO.
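To make the /service-name/version/endpoint1/whatever pattern a bit more concrete, here is a rough sketch of how such paths could be split and dispatched per service. Python is used only for illustration (the actual service would more likely be nodejs/service-runner), and the deny-list service name, the v1 version string and the handler registry are all hypothetical placeholders, not anything that exists today.

```python
from typing import Callable, Dict, NamedTuple, Tuple


class Route(NamedTuple):
    service: str               # e.g. "deny-list" (hypothetical service name)
    version: str               # e.g. "v1" (hypothetical version label)
    endpoint: Tuple[str, ...]  # remaining path segments


def parse_route(path: str) -> Route:
    """Split '/service-name/version/endpoint1/whatever' into its parts."""
    parts = [segment for segment in path.split("/") if segment]
    if len(parts) < 3:
        raise ValueError(f"expected /service/version/endpoint, got {path!r}")
    return Route(parts[0], parts[1], tuple(parts[2:]))


Handler = Callable[[Route], dict]

# Hypothetical registry mapping (service, version) to a handler function.
HANDLERS: Dict[Tuple[str, str], Handler] = {}


def register(service: str, version: str) -> Callable[[Handler], Handler]:
    """Decorator that adds a handler to the registry for one service/version."""
    def decorator(handler: Handler) -> Handler:
        HANDLERS[(service, version)] = handler
        return handler
    return decorator


@register("deny-list", "v1")
def deny_list_v1(route: Route) -> dict:
    # Placeholder handler: a real one would query the underlying data store.
    return {"service": route.service, "endpoint": "/".join(route.endpoint)}


def dispatch(path: str) -> dict:
    """Look up the handler for a path and call it."""
    route = parse_route(path)
    handler = HANDLERS.get((route.service, route.version))
    if handler is None:
        raise LookupError(f"no handler for {route.service}/{route.version}")
    return handler(route)
```

Adding or removing a security-related service would then just be a matter of registering or dropping a handler, and the same codebase could be deployed with only a subset of handlers enabled if a split deployment is ever needed.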

My understanding is that Wikimedia Enterprise is using AWS because SRE couldn't/didn't want to provide contractual level SLAs (https://meta.wikimedia.org/wiki/Wikimedia_Enterprise/FAQ#Why_are_you_using_externally-operated_cloud_infrastructure/AWS) - I don't think that's applicable here, so I don't see it as a precedent. Otherwise I think production k8s is a pretty good fit for this. Things you get "for free": logging to logstash, monitoring and alerting, metrics to grafana, security monitoring via debmonitor, and probably more.

Yes, a Wikimedia production service does probably make the most sense for something like this, given the dissimilar requirements for this and something like WME.

In T290917#7365362, @Joe wrote:

@sbassett before I can give any opinion on your proposal and your questions, I'd need to better understand (maybe via one practical example?) what the request flow would be, and what the intent for this software is.

From what I read I would say it would indeed look like we could just deploy a second instance of api-gateway with a different configuration of routes. But maybe I'm missing something here.

So, can you describe one practical use-case for this in some detail? It will help me ensure I don't give you bad advice :)

Hey @Joe -

Sure, the first use-case for the security api gateway would be a couple of simple routes for clients to access data from a commercially-licensed deny-list. Much of the background for this is discussed, at length, within T265845 (I just subbed you there). I would envision a couple of basic routes: a search route, which could take an IP address, CIDR block, etc. and provide results, and perhaps a diff route, which would provide a list of current IP addresses based upon the vendor's updated feed (daily or 5-minute intervals, depending upon the product). This data would be consumable by various Wikimedia bots, MW extensions and potentially other related security tooling down the road. I understand that there is a similar use-case for MaxMind data within Wikimedia production, but that approach appeared to be a bit more inflexible to me for how the community and various WMF teams might want to easily consume relevant data. If that is not the case, then that approach might obviate the need for a security api service like this, but I would want to ensure the flexibility of being able to quickly add or remove security-related services and provide convenient access to these services for various applications and tools.
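As a very rough illustration of what the search and diff routes might do behind the scenes, here is a minimal sketch using Python's standard ipaddress module. The feed contents are made-up documentation addresses, the function names are placeholders, and treating "diff" as added/removed entries between two feed snapshots is only one possible reading of that route, so please take this as an assumption-laden sketch rather than the planned implementation.

```python
import ipaddress
from typing import Dict, Iterable, List


def search(deny_list: Iterable[str], query: str) -> List[str]:
    """Return deny-list entries that overlap the queried IP address or CIDR block."""
    # strict=False lets a bare address or a host-bits-set CIDR be treated as a network.
    target = ipaddress.ip_network(query, strict=False)
    hits = []
    for entry in deny_list:
        network = ipaddress.ip_network(entry, strict=False)
        # overlaps() raises TypeError across IP versions, so compare versions first.
        if network.version == target.version and network.overlaps(target):
            hits.append(entry)
    return hits


def diff(previous: Iterable[str], current: Iterable[str]) -> Dict[str, List[str]]:
    """Compare two feed snapshots (assumed interpretation of the diff route)."""
    prev, cur = set(previous), set(current)
    return {"added": sorted(cur - prev), "removed": sorted(prev - cur)}


# Hypothetical feed snapshot using reserved documentation ranges.
feed = ["192.0.2.0/24", "198.51.100.7", "2001:db8::/32"]
matches = search(feed, "192.0.2.55")
changes = diff(feed, ["192.0.2.0/24", "203.0.113.0/24"])
```

A linear scan like this is only reasonable for small lists; a production service backed by MySQL/MariaDB would presumably push the lookup into indexed queries instead.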

• Elitre subscribed.Oct 5 2021, 3:11 PM

sbassett mentioned this in T289579: <Security Initiative> P2P Proxy API.Oct 14 2021, 3:29 PM

L235 subscribed.Oct 14 2021, 3:40 PM

MarioGom subscribed.Oct 14 2021, 5:02 PM

Blablubbs subscribed.Oct 14 2021, 5:08 PM

STei-WMF subscribed.Oct 18 2021, 10:38 AM

sbassett triaged this task as Medium priority.Oct 18 2021, 4:17 PM

sbassett updated the task description.

AntiCompositeNumber subscribed.Oct 22 2021, 7:23 PM

sbassett closed subtask T293416: Fork service-template-node and create new gerrit repository as Resolved.Nov 1 2021, 5:04 PM

sbassett mentioned this in T293418: Determine API endpoints for initial Security API use-case.

TheresNoTime subscribed.Nov 5 2021, 12:42 PM

@Reedy mentioned that we probably should create a Phab project for this work, at some point.

STran mentioned this in T297243: Add CI for mediawiki/extensions/SecurityApi.Dec 7 2021, 9:45 PM

Mstyles closed subtask T293418: Determine API endpoints for initial Security API use-case as Resolved.Dec 14 2021, 9:41 PM

Mstyles closed subtask T293417: Determine best available auth mechanism for the initial Security API use-case as Resolved.

MarcoAurelio subscribed.Jan 24 2022, 3:56 PM

sbassett changed the status of subtask T301400: Update security-api service to use nodejs12-slim and nodejs12-devel images from Open to In Progress.Feb 9 2022, 8:14 PM

sbassett closed subtask T301400: Update security-api service to use nodejs12-slim and nodejs12-devel images as Resolved.Feb 10 2022, 2:48 AM

sbassett closed subtask T301428: Security API Storage Needs as Resolved.Mar 14 2022, 3:52 PM

AntiCompositeNumber mentioned this in T303774: Investigate the practice of making thousands of global blocks per day on Meta-Wiki.Mar 14 2022, 11:08 PM

sbassett updated the task description.Mar 15 2022, 2:51 PM

sbassett added a subscriber: STran.

jcrespo mentioned this in T305114: Set up MariaDB for iPoid.Mar 31 2022, 6:09 PM

sbassett closed subtask T297757: Build Security API front-end extension as Resolved.Apr 6 2022, 7:20 PM

sbassett mentioned this in T305728: Create project tag for Security API Service.Apr 8 2022, 5:00 PM

sbassett added a subtask: T305728: Create project tag for Security API Service.Apr 12 2022, 2:49 PM

Aklapper closed subtask T305728: Create project tag for Security API Service as Resolved.Apr 12 2022, 4:55 PM

Aklapper added a project: iPoid-Service.

sbassett closed subtask T305723: Verify pipelinelib/blubber config for production deployment as Resolved.May 4 2022, 8:37 PM

sbassett changed the status of subtask T310564: Have the Security API Service's docker-compose use a custom network from Open to In Progress.Jun 14 2022, 2:54 AM

sbassett closed subtask T310564: Have the Security API Service's docker-compose use a custom network as Resolved.Jun 16 2022, 8:14 PM

sbassett moved this task from In Progress to Waiting on the user-sbassett board.Jun 21 2022, 2:34 PM

Mstyles closed subtask T296346: Create Demo Environment for Security API as Resolved.Jul 6 2022, 5:14 PM

STran mentioned this in T325147: New Service Request 'iPoid'.Dec 14 2022, 11:58 AM

@sbassett I'm untagging iPoid-Service given that we have T325147: New Service Request 'iPoid'.

In T290917#8894416, @kostajh wrote:

@sbassett I'm untagging iPoid-Service given that we have T325147: New Service Request 'iPoid'.

Sounds good. I'll decline this one in favor of that one. Not sure what sub-tasks here are still relevant, though I imagine those can be declined or added to the new task at some point.

sbassett edited projects, added SecTeam-Processed; removed user-sbassett, Security, Security-Team.Jun 1 2023, 3:45 PM

kostajh closed subtask T293419: Prepare service for beta cluster deployment as Declined.Jun 7 2023, 11:16 AM

kostajh mentioned this in T293419: Prepare service for beta cluster deployment.

• Marostegui closed subtask T305114: Set up MariaDB for iPoid as Resolved.Jun 14 2023, 9:55 AM

STran closed subtask T294782: Build data feed consumption tool as Resolved.Jun 16 2023, 8:25 AM

sbassett removed a subtask: T305727: Improve service and extension documentation and related configuration guidelines.Jun 27 2023, 3:05 PM

kostajh closed subtask T305724: Investigate database data invalidation questions and chunked/timed API to MySQL/MariaDB ETL as Resolved.Jul 5 2023, 11:04 AM

jijiki reopened subtask T305114: Set up MariaDB for iPoid as Open.Nov 8 2023, 10:57 AM

• Marostegui closed subtask T305114: Set up MariaDB for iPoid as Resolved.Nov 8 2023, 2:39 PM

Johannnes89 subscribed.Wed, Apr 17, 7:12 AM