Page MenuHomePhabricator

Security Readiness Review For Improved Property Suggester/Recommender for Wikidata
Closed, ResolvedPublic

Description

Project Information

Description of the tool/project:
The team at Vrije Universiteit Amsterdam has created a new backend for the Wikidata Property Suggester/Recommender, that is expected to provide more useful suggestions to Wikidata editors.
Suggestions here mean that the recommender system hints an editor what kinds of statements are likely to be useful/relevant for Wikidata item currently being edited.

Description of how the tool will be used at WMF:
WMDE would like to run A/B test on wikidata.org comparing on the course of around 4 weeks the recommendations and other functionality of the new system, and the current recommender system used on Wikidata. If the A/B test gives positive results, the new recommender will likely be adopted on Wikidata replacing the existing suggester.

The recommender backend runs as a standalone service, and Wikidata (via Property Suggester mediawiki extension) will request recommendation from the backend via HTTP request.

Dependencies

See project's go.mod file

Has this project been reviewed before?
Not quite reviewed but T285098 includes some details on the way how the project integrates into mediawiki/Wikidata, including some security-related considerations

Working test environment
Working recommender server on WMF Cloud VPS: https://recommender.wmcloud.org/recommender
Usage example

curl -d '{"properties":[],"types":[]}' https://recommender.wmcloud.org/recommender

Instructions on setting up on WMF Cloud VPS: https://github.com/martaannaj/RecommenderServer#setting-up-on-cloud-vps

Post-deployment
Team: WMDE, Wikidata
Primary Contact: @karapayneWMDE

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript
sbassett changed the task status from Open to Stalled.Oct 5 2021, 5:27 PM
sbassett triaged this task as Low priority.
sbassett subscribed.

Stalling this review for now pending further discussion at T285098. We may still be able to complete this review this quarter (October to December 2021) if a clear path to production, stewardship, etc are determined soon.

@sbassett Opening this request was meant as an indication of WMDE understanding the "fast track" deployment is not an option. Apologies for not being clear about it. I've said it explicitly on T285098 now.

@sbassett Opening this request was meant as an indication of WMDE understanding the "fast track" deployment is not an option. Apologies for not being clear about it. I've said it explicitly on T285098 now.

Ok, just to set expectations, as this system is currently architected with the service living on wmcs and wanting to communicate directly with Wikimedia production (which is still the case unless I see any public plans to update that model), the security review will very likely come back with a high or critical overall risk, requiring WMF c-level acceptance of any residual risk.

as this system is currently architected with the service living on wmcs

This is not how the system is currently architected.
The main architecture of note here is 1) an already deployed MediaWiki extension that makes an API call to a service and 2) the service
The Service can run anywhere, the idea of T285098 was to first deploy this to wmcs to prove the system is better than the existing system and then to go with a rollout in production and needing to allocate production resources for what is initially an A/B test.
As we seemingly are not able to do this per the discussion in that ticket, this the target of this service will be deployment to wikimedia production.

and wanting to communicate directly with Wikimedia production

This is probably relevant but in both a production and wmcs deployment situation this communication is one way.
This is production communication with this service (the recommender API)
This is as described in the description of T285098 including example payloads

$response = MediaWikiServices::getInstance()->getHttpRequestFactory()->post(
			$this->schemaTreeSuggesterUrl,
			[
				'postData' => json_encode( [
					'Properties' => $properties,
					'Types' => $types
				] ),
				'timeout' => 1
			],
			__METHOD__
		);

And the data would look something like this:

{
  "properties": [
    "P1"
    "P2452",
    "P3513551"
  ],
  "types": [
    "Q13",
    "Q245125",
    "Q1314"
  ]
}

Quick follow up incase the intent of this ticket was misunderstood.
This is a security review request for deploying the service to Wikimedia Production, not to WMCS, as that was ruled out as an option in T285098 (at least as far as we can tell)

Quick follow up incase the intent of this ticket was misunderstood.
This is a security review request for deploying the service to Wikimedia Production, not to WMCS, as that was ruled out as an option in T285098 (at least as far as we can tell)

Fair enough, we'll be sure to characterize the review via that lens. For now, the Security-Team is attempting to determine the best path forward, be it a vendor proposal and review or attempting to schedule this review with existing Foundation resources. We should have an answer for everyone soon.

Great stuff, thanks @sbasset. Any success in finding out how to crunch that one?

Hey @WMDE-leszek - We're still working through some possibilities for engaging a vendor for this work. Hopefully I can have an answer in another week or so for you and your team. If the vendor path falls through, we'd likely need to schedule this review for early next quarter (January 2022 - March 2022), but there are options for risk acceptance/ownership if that scheduling estimate does not align with your desired production deployment date.

sbassett changed the task status from Stalled to In Progress.Nov 4 2021, 8:36 PM
sbassett assigned this task to Reedy.
sbassett raised the priority of this task from Low to Medium.
sbassett moved this task from Upcoming Quarter Planning Queue to In Progress on the secscrum board.

Thanks @sbassett. If getting a vendor review does not work out, WMDE is interested in accepting/owning the risk of the service (especially for the initially intended four-week period), as 2022 is quite beyond our launch schedule plans indeed.

Hey @WMDE-leszek - we're going to have @Reedy give this a first look for a security review. Hopefully they can have a report deliverable for you later this quarter or early next. At that point we can reassess any additional needs.

https://github.com/martaannaj/RecommenderServer/issues/2 filed to add a security policy to the repo (especially as it's not being (primarily?) maintained by WMDE).

Also the fact that no CI stuff seems to be being run in GitHub isn't a good sign.

Sure, it's built around blubber, but it's not in Gerrit and therefore not running under our CI.

https://github.com/martaannaj/RecommenderServer/issues/3 filed for that

I don't know how comprehensive the tests actually are (maybe the lack of is also an issue), but not running them at all isn't good either.

And no sort of even basic linting done either...

https://github.com/martaannaj/RecommenderServer/issues/5 filed for doing that as a GitHub Action

https://github.com/martaannaj/RecommenderServer/issues/4 filed for running gosec as a GitHub Action too

For the security policy, I created a first draft. I would prefer to use phabricator for any security issues. Smaller issues can go as issues on github. https://github.com/martaannaj/RecommenderServer/blob/main/SECURITY.md

I have tried adding the CI (linting + gosec) to github now. This is triggered upon pushing to main and at a weekly interval.
We still need to ad the running of tests to the CI.

There are also still some minor issues to be fixed based on the linting/gosec report. These are specifically related to error checks not happening as they should and some unused code. We should find some time to fix these by the end of the week.

sbassett changed the task status from In Progress to Stalled.Nov 16 2021, 4:53 PM

Stalling until more security/linting automation has been officially set up in CI. We'll then plan to use the results of some of that tooling, in addition to some manual review/pen-testing, to formulate a final application security review deliverable on this task.

Thanks.

I will note that advocating security bugs be published in public isn't the best. Especially when you refer to (severe) security vulnerability, and leaving it to peoples judgement. What counts as severe? Against what scale are we comparing?

I would also generally advise on keeping issues on github open and then only using it for some stuff, while suggesting people file a bug here for other stuff. The barrier to entry to create an account here is a little higher, and is more work for people to report something.

We can create a specific project on Phabricator for it if it's helpful for longer term maintenance and management. That's easily and trivially done.

If this is to be deployed into Wikimedia production (I know, that's TBD and down the line), the canonical repo will need presumably moving to Gerrit, and therefore the issue tracker is not going to be GitHub.

Also, for your instructions.... They need to create an account (via wikitech or Wikipedia et al), then find https://phabricator.wikimedia.org/maniphest/task/edit/form/75/ via the menu at the top...

But using that form you can't add other projects initially, they have to save it, and then add them. This isn't obvious, but is done for specific reasons.

It's also not clear where you want people to Add Michaelcochez, Martaannaj and Wikidata to the issue. We don't want Wikidata adding as a subscriber, for example, which happens regularly (not specifically wikidata, but certainly adding random projects). It creates more work for people triaging bugs.

And then "Security updates are only provided for the latest version of this project." - What's the latest version? There's no tags, no releases. So maybe you're meaning HEAD of the main branch?

If it helps, and while it is intended and/or is deployed for Wikimedia purposes, reports could just be made to security@wikimedia.org like any other Wikimedia project.

We went forward fixing multiple aspects:

  1. Linting, go-sec and running the tests is now part of the CI in the main branch on github (@sbassett ) We fixed the flagged issues.
    1. as part of this, we added several test cases back which we had removed to make the repository more lean (we assumed that less code to review would be better).
    2. as part of this process we also removed some more code, which was not used by the serving code specifically.
  2. We modified the SECURITY.md file and also refer to it from the main README.md

@Reedy a project on Phabricator might indeed be a good idea.

@Michaelcochez - Thanks for getting gosec set up within the project's Github CI. Just reviewing some recent runs, it doesn't seem like it's found much, which is good, and we'd likely rate that as low risk for now, but I'll let @Reedy make that call as this is his review.

Another tool that might be helpful is go-kart, which is somewhat of a complement/alternative to gosec FWIU, and it looks like there's a convenient way to set it up as a Github action here. semgrep also has a golang policy ("p/golang") consisting of about 24 rules right now. I'd also recommend using at least some tool to scan for vulnerable packages in addition to Github's recent Advisories support for golang. Nancy or even the free/foss tier of snyk should work, though the latter obviously has some limits re: tests per month, etc. Talking with some snyk sales reps recently, they are allegedly coming out with a pure non-profit license, which I'm hopeful might work well and be less limited for the entire Wikimedia developer community/ecosystem.

I have now added gokart. The github action was not working out of the box, because of some missing configuration parameters in the example. I opened a pull request for that.

Then, I also added nancy to scan packages and enabled Dependabot alerts.

It seems I cannot configure semgrep as a github action, and I am uncomfortable giving the website access to my github account.

I have now added gokart. The github action was not working out of the box, because of some missing configuration parameters in the example. I opened a pull request for that.

Great.

Then, I also added nancy to scan packages and enabled Dependabot alerts.

Great.

It seems I cannot configure semgrep as a github action, and I am uncomfortable giving the website access to my github account.

Yes, I wouldn't set up any version of semgrep that depended upon semgrep.dev (or untrusted images) except for maybe talking to their registry. I think the worst case would be manually setting up a github action that uses a python image, installing semgrep via pip (or whatever) and then running the cli like: semgrep --config=p/golang --metrics=off. I believe this should just pull the golang policy from their registry and not report any pseudonymous feedback back to semgrep.dev. Anyhow, this is more a suggestion with both gosec and gokart running for SAST.

And if any of these tools become too noisy, they can likely be disabled or further tweaked, especially if there are noisy rules.

Thanks @sbassett . I didn't realize it was possible to run semgrep without posting the results to their service. They actually have a configuration available which can be used without that feature. I have now configured that.

Semgrep actually found an issue. The issue flagged is that we do start an http server without TLS (meaning http instead of https). It is easy to change this to the version with a certificate and key, but in the current setting traffic to https://recommender.wmcloud.org/recommender gets forwarded to an internal port on the server hosting the schematree, which is not reachable directly from the outside world.

So, my question is:

  1. should we solve this by also having this internal service use https ?
  2. and if so, where would i get a certificate/key for that?
  1. should we solve this by also having this internal service use https ?
  2. and if so, where would i get a certificate/key for that?

I believe it'd be a similar setup to wmcloud, i.e. a reverse proxy to the app, if this service-related doc is correct. This would be a good thing to confirm with SRE, likely within the context of a new service request.

@sbassett Is that something which should be checked now, during the security readiness review, or only later upon deployment?

I have added the TLS option to the implementation, but the fact that we still allow starting a http version remains flagged.

@sbassett Is that something which should be checked now, during the security readiness review, or only later upon deployment?

I have added the TLS option to the implementation, but the fact that we still allow starting a http version remains flagged.

I believe that in the context of a Wikimedia production service deploy, this rule would likely be a false positive result, but again, something to confirm with SRE. For a local dev environment, it also seems unnecessary. The only time it would be a true positive, in my opinion, would be if the service did not have any kind of reverse proxy for TLS termination, i.e. it was directly exposed to the internet.

We're going to resolve this for now as low risk since none of the new security tooling added to the Github repo has returned any medium+ risk actionable issues. One caveat would be noting (in the README or wherever) as a kinda-false-positive (and possibly suppressing) the TLS issue found by semgrep so as not to cause any future concern. Otherwise, consider this unblocked from an Application Security Reviews perspective.

The new Github security policy LGTM, +1.