Develop canonical/single record of origin, machine readable list of all repos deployed to WMF sites
Open, Medium, Public

Restricted Application added a subscriber: Aklapper.

@Aklapper, please add any use cases that might be of interest to you. I'll do the same.

This is definitely not everything, but one part of the puzzle is searching the operations/puppet repo for "git::clone" to get this kind of list:

modules/jupyterhub_old/manifests/init.pp: git::clone { $wheels_repo:
modules/labs_vagrant/manifests/init.pp: git::clone { 'vagrant':
modules/wikimetrics/manifests/base.pp: git::clone { 'wikimetrics-deploy':
modules/wikimetrics/manifests/base.pp: git::clone { 'wikimetrics':
modules/releases/manifests/init.pp: git::clone { 'mediawiki/core':
modules/releases/manifests/init.pp: git::clone { 'mediawiki/tools/release':
modules/scap/manifests/l10nupdate.pp: git::clone { 'mediawiki/core':
modules/scap/manifests/l10nupdate.pp: git::clone { 'mediawiki/extensions':
modules/scap/manifests/l10nupdate.pp: git::clone { 'mediawiki/skins':
modules/scap/manifests/master.pp: git::clone { 'operations/mediawiki-config':
modules/beta/manifests/autoupdater.pp: git::clone { 'beta-mediawiki-core':
modules/beta/manifests/autoupdater.pp: git::clone { 'beta-portal':
modules/beta/manifests/autoupdater.pp: git::clone { 'beta-mediawiki-extensions':
modules/beta/manifests/autoupdater.pp: git::clone { 'beta-mediawiki-skins':
modules/beta/manifests/autoupdater.pp: git::clone { 'mediawiki/vendor':
modules/puppet_compiler/manifests/init.pp: git::clone { 'operations/puppet':
modules/puppet_compiler/manifests/init.pp: git::clone { 'labs/private':
modules/git/spec/defines/clone_spec.rb:describe 'git::clone' do
modules/git/manifests/install.pp: git::clone{$title:
modules/git/manifests/clone.pp:# Definition: git::clone
modules/git/manifests/clone.pp:# git::clone { 'my_clone_name':
modules/git/manifests/clone.pp:# git::clone { 'analytics/wikimetrics':
modules/git/manifests/clone.pp:define git::clone(
modules/vagrant/manifests/mediawiki.pp: git::clone { 'mediawiki/vagrant':
modules/eventschemas/manifests/init.pp:# [*ensure*] Passed directly to git::clone. Default: latest.
modules/eventschemas/manifests/init.pp: git::clone { 'mediawiki/event-schemas':
modules/snapshot/manifests/cron/wikidatadumps/common.pp: git::clone { 'DCAT-AP':
modules/statistics/manifests/discovery.pp: git::clone { 'wikimedia/discovery/golden':
modules/statistics/manifests/wmde/wdcm.pp: git::clone { 'analytics/wmde/WDCM':
modules/statistics/manifests/wmde/graphite.pp: git::clone { 'wmde/scripts':
modules/statistics/manifests/wmde/graphite.pp: git::clone { 'wmde/toolkit-analyzer-build':
modules/statistics/manifests/aggregator/projectview.pp: git::clone { 'aggregator_projectview_data':
modules/statistics/manifests/compute.pp: git::clone { 'statistics_mediawiki':
modules/statistics/manifests/sites/analytics.pp: git::clone { 'analytics.wikimedia.org':
modules/statistics/manifests/sites/stats.pp: git::clone { 'wikistats-v2':
modules/statistics/manifests/aggregator.pp: git::clone { 'aggregator_code':
modules/service/manifests/deploy/gitclone.pp: git::clone { $repository:
modules/jupyterhub/manifests/init.pp: git::clone { $deploy_repository:
modules/extdist/manifests/init.pp: git::clone {'labs/tools/extdist':
modules/extdist/manifests/init.pp: git::clone { 'integration/composer':
modules/visualdiff/manifests/init.pp: git::clone { 'integration/visualdiff':
modules/contint/manifests/phpunit.pp: git::clone { 'jenkins CI phpunit':
modules/contint/manifests/slave_scripts.pp: git::clone { 'jenkins CI slave scripts':
modules/contint/manifests/composer.pp: git::clone { 'jenkins CI Composer':
modules/authdns/spec/classes/authdns_spec.rb: 'define git::clone($directory, $origin, $branch,$owner,$group) {}',
modules/authdns/manifests/init.pp: git::clone { $workingdir:
modules/reportupdater/manifests/init.pp: git::clone { 'analytics/reportupdater':
modules/reportupdater/manifests/job.pp: git::clone { $repository_name:
modules/noc/manifests/init.pp: git::clone { 'operations/software/dbtree':
modules/tendril/manifests/init.pp: git::clone { 'operations/software/tendril':
modules/quarry/manifests/base.pp: git::clone { 'analytics/quarry/web':
modules/sentry/manifests/init.pp: git::clone { 'operations/software/sentry':
Binary file modules/role/manifests/grafana/.base.pp.swp matches
modules/role/manifests/labs/ores/staging.pp: git::clone { 'ores-wm-config':
modules/role/manifests/labs/db/common.pp: git::clone { 'operations/mediawiki-config':
modules/role/manifests/xhgui/app.pp: git::clone { 'operations/software/xhprof':
modules/role/manifests/xhgui/app.pp: git::clone { 'operations/software/xhgui':
modules/mediawiki_singlenode/manifests/init.pp: git::clone { 'vendor':
modules/mediawiki_singlenode/manifests/init.pp: git::clone { 'mediawiki':
modules/mediawiki_singlenode/manifests/mwextension.pp: git::clone { $name:
modules/profile/manifests/ci/gitcache.pp: git::clone { 'operations/puppet':
modules/profile/manifests/ci/gitcache.pp: git::clone { 'mediawiki/core':
modules/profile/manifests/ci/gitcache.pp: git::clone { 'mediawiki/vendor':
modules/profile/manifests/zuul/server.pp: git::clone { 'integration/config':
modules/profile/manifests/analytics/refinery/source.pp: git::clone { 'refinery_source':
modules/profile/manifests/switchdc.pp: git::clone { 'operations-switchdc':
modules/profile/manifests/discovery_dashboards/production.pp: git::clone { 'wikimedia/discovery/rainbow':
modules/profile/manifests/discovery_dashboards/production.pp: git::clone { 'wikimedia/discovery/twilightsparql':
modules/profile/manifests/discovery_dashboards/production.pp: git::clone { 'wikimedia/discovery/prince':
modules/profile/manifests/discovery_dashboards/production.pp: git::clone { 'wikimedia/discovery/wetzel':
modules/profile/manifests/discovery_dashboards/production.pp: git::clone { 'wikimedia/discovery/wonderbolt':
modules/profile/manifests/discovery_dashboards/development.pp: git::clone { 'wikimedia/discovery/rainbow':
modules/profile/manifests/discovery_dashboards/development.pp: git::clone { 'wikimedia/discovery/twilightsparql':
modules/profile/manifests/discovery_dashboards/development.pp: git::clone { 'wikimedia/discovery/prince':
modules/profile/manifests/discovery_dashboards/development.pp: git::clone { 'wikimedia/discovery/wetzel':
modules/profile/manifests/discovery_dashboards/development.pp: git::clone { 'wikimedia/discovery/wonderbolt':
modules/profile/manifests/microsites/annualreport.pp: git::clone { 'wikimedia/annualreport':
modules/profile/manifests/microsites/design.pp: git::clone { 'design/landing-page':
modules/profile/manifests/microsites/design.pp: git::clone { 'design/style-guide':
modules/profile/manifests/microsites/wikibase.pp: git::clone { 'wikibase/wikiba.se-deploy':
modules/profile/manifests/microsites/transparency.pp: git::clone { 'wikimedia/TransparencyReport':
modules/profile/manifests/microsites/transparency.pp: git::clone { 'wikimedia/TransparencyReport-private':
modules/profile/manifests/microsites/research.pp: git::clone { 'research/landing-page':
modules/profile/manifests/calico/builder.pp: git::clone{ 'operations/calico-containers':
modules/profile/manifests/calico/builder.pp: git::clone { 'operations/calico-cni':
modules/profile/manifests/calico/builder.pp: git::clone { 'operations/calico-k8s-policy-controller':
modules/profile/manifests/openstack/base/nodepool/service.pp: git::clone { 'integration/config':
modules/profile/manifests/wmcs/tenants/libraryupgrader.pp: git::clone {'labs/libraryupgrader':
modules/profile/manifests/performance/site.pp: git::clone { 'performance/docroot':
modules/profile/manifests/kubernetes/deployment_server.pp: git::clone { 'operations/deployment-charts':
modules/profile/manifests/docker/builder.pp: git::clone { 'operations/docker-images/production-images':
modules/profile/manifests/grafana.pp: git::clone { 'operations/software/grafana/simple-json-datasource':
modules/toollabs/manifests/images.pp: git::clone { 'operations/docker-images/toollabs-images':
modules/toollabs/manifests/composer.pp: git::clone { 'composer':
modules/puppetmaster/manifests/gitprivate.pp: git::clone { 'operations/private':
modules/puppetmaster/manifests/gitclone.pp: git::clone { 'labs/private':
modules/puppetmaster/manifests/gitclone.pp: git::clone {
modules/puppetmaster/manifests/base_repo.pp: git::clone { 'operations/puppet':
modules/testreduce/manifests/init.pp: git::clone { 'mediawiki/services/parsoid/testreduce':
modules/wikilabels/manifests/web.pp: git::clone { 'wikilabels-wikimedia-config':
modules/geowiki/manifests/init.pp: git::clone { 'geowiki-scripts':
modules/geowiki/manifests/private_data.pp: git::clone { 'geowiki-data-private':
modules/phragile/manifests/init.pp: git::clone { 'phragile':
modules/phragile/manifests/init.pp: git::clone { 'composer':
modules/wikistats/manifests/init.pp: git::clone { 'operations/debs/wikistats':
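
To turn a listing like the one above into something machine readable, the quoted resource titles could be pulled out with a small script. This is only a rough sketch: the function name `clone_titles` and the regular expression are my own, and it assumes the input is grep-style `path: match` output. Note that a title is not always the repository name (e.g. 'jenkins CI phpunit'), and titles given as variables such as `$repository` cannot be resolved this way, so the result is a starting point rather than a complete list.

```python
import re

# Matches a resource declaration such as: git::clone { 'mediawiki/core':
# Titles given as variables ($repository) do not match and are skipped.
CLONE_RE = re.compile(r"git::clone\s*\{\s*'([^']+)'\s*:")


def clone_titles(grep_lines):
    """Map each quoted git::clone title to the manifests declaring it."""
    titles = {}
    for line in grep_lines:
        path, sep, rest = line.partition(":")
        if not sep or not path.endswith(".pp"):
            # Skips spec files and lines like "Binary file ... matches".
            continue
        if rest.lstrip().startswith("#"):
            # Skips documentation comments inside manifests.
            continue
        match = CLONE_RE.search(rest)
        if match:
            titles.setdefault(match.group(1), []).append(path)
    return titles
```

Feeding it the grep output line by line gives a dictionary keyed by title, which also makes duplicates visible (for instance 'mediawiki/core' is declared in several manifests).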

And then you'll have to add all the things that are deployed via scap.

https://phabricator.wikimedia.org/source/tool-ci/browse/master/build_table.py;3ed885277a38ddb5c7c8a6c3c9b666fb7b13ae10$49 is how my tools currently get a list of Wikimedia deployed extensions.

Some of the repositories listed in P6909 are only deployed in Cloud Services (e.g. extdist), not production.

greg triaged this task as Medium priority. Apr 23 2018, 4:41 PM

I made a bunch of updates today on https://www.mediawiki.org/w/index.php?title=Developers/Maintainers&action=history and corresponding extension wiki home pages. However, help with sorting out these items is welcome:

Deployed on WMF servers according to https://phabricator.wikimedia.org/diffusion/MREL/browse/master/make-wmf-branch/config.json and https://phabricator.wikimedia.org/source/mediawiki-config/browse/master/wmf-config/extension-list but the extension wiki page does not use {{TNT|OnWikimedia}} and it is not listed on https://www.mediawiki.org/wiki/Developers/Maintainers :

  • extensions/CongressLookup
  • extensions/FileExporter
  • extensions/intersection
  • extensions/JADE
  • extensions/LdapAuthentication
  • extensions/PropertySuggester
  • extensions/WikimediaBadges

Extension wiki page uses {{TNT|OnWikimedia}} and it is listed as deployed on https://www.mediawiki.org/wiki/Developers/Maintainers but not in https://phabricator.wikimedia.org/diffusion/MREL/browse/master/make-wmf-branch/config.json and https://phabricator.wikimedia.org/source/mediawiki-config/browse/master/wmf-config/extension-list :

Extension wiki page uses {{TNT|OnWikimedia}} but it is neither listed as deployed on https://www.mediawiki.org/wiki/Developers/Maintainers nor in https://phabricator.wikimedia.org/diffusion/MREL/browse/master/make-wmf-branch/config.json and https://phabricator.wikimedia.org/source/mediawiki-config/browse/master/wmf-config/extension-list :

  • Extension:DataModel
  • Extension:DataTypes
  • Extension:DataValues
  • Extension:DataValuesCommon
  • Extension:DataValuesInterfaces
  • Extension:Diff
  • Extension:FundraisingEmailUnsubscribe
  • Extension:ValueView
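
The comparison described above is essentially set arithmetic over three sources. A minimal sketch, assuming each source has already been reduced to a collection of bare extension names (the function name `cross_check` and the result keys are hypothetical, and how the names are extracted from each source is left out):

```python
def cross_check(branched, configured, wiki_tagged):
    """Report disagreements between three views of "what is deployed".

    branched    -- names from the branch-cutting config (config.json)
    configured  -- names from wmf-config/extension-list
    wiki_tagged -- names whose wiki page uses the OnWikimedia template
    """
    branched, configured = set(branched), set(configured)
    tagged = set(wiki_tagged)
    # Treated as deployed only when both deployment sources agree.
    deployed = branched & configured
    return {
        "deployed_but_untagged": sorted(deployed - tagged),
        "tagged_but_not_deployed": sorted(tagged - (branched | configured)),
    }
```

With illustrative inputs such as `cross_check({"JADE"}, {"JADE"}, {"Diff"})`, "JADE" would land in the first bucket and "Diff" in the second.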

It's mediawiki/extensions/intersection.

Extension wiki page uses {{TNT|OnWikimedia}} but it is neither listed as deployed on https://www.mediawiki.org/wiki/Developers/Maintainers nor in https://phabricator.wikimedia.org/diffusion/MREL/browse/master/make-wmf-branch/config.json and https://phabricator.wikimedia.org/source/mediawiki-config/browse/master/wmf-config/extension-list :

  • Extension:DataModel
  • Extension:DataTypes
  • Extension:DataValues
  • Extension:DataValuesCommon
  • Extension:DataValuesInterfaces
  • Extension:Diff
  • Extension:ValueView

These are all PHP libraries that should be in [[Category:PHP libraries]], not listed as MediaWiki extensions afaik. Someone more familiar with Wikidata should confirm.

@Aklapper, I've been spending a bit of time on this and one of the directions I'm leaning is to have the record of origin (ROO) be an output of the deployment review process (Review Queue today). My thought is that the ROO could be used by deployers as an authoritative source to determine whether or not a new extension, service, etc. should be deployed. I spoke to Tyler a bit about this and it seems like it would be of value.

What are some other use cases for the ROO? One thing that's mentioned is its need to be machine readable. What systems would you see consuming this list? FWIW, I too think it should be machine readable. I only ask to tease out some of the use cases that I may not be aware of.

On a related topic, I'm planning to spin up work around augmenting/revamping the review process.

What systems would you see consuming this list?

All this still won't fix outdated information per se, but I'd say outdated information is less likely when there is one place to update instead of several. My latest example was two days ago, trying to find out who should fix https://phabricator.wikimedia.org/T203427 : I went to mw:Developers/Maintainers which listed "Discovery" for "Kartographer". On the Discovery IRC channel I was told that's not correct anymore (was corrected here).

Quiddity mentioned this in Unknown Object (Task). Oct 25 2018, 7:04 AM

Recently had a discussion with @mark and @faidon about some work that SRE is planning to do. Specifically the creation of a Service Catalog. On the surface this seems like it would potentially become the ROO mentioned in this task.

Ladsgroup renamed this task from Develop canonical/single record of origin, machine readable list of all repos deployed to WMF sites. to Develop canonical/single record of origin, machine readable list of all repos deployed to WMF sites. Apr 3 2023, 7:29 PM

Posting this draft comment that I wrote at some point:

For MediaWiki extensions:

  • For the next deployment, the reference of the repositories we deploy is the tool we use to cut the wmf branches. That was in Gerrit as mediawiki/tools/release.git in the file make-wmf-branch/config.json.
  • For the currently deployed version, the tool above registers the extensions as submodules of the mediawiki/core wmf branch for the week. Hence we can get the list by looking at .gitmodules for each wmf branch.

The problem is that we can branch an extension weekly but not deploy it. Jade got branched constantly until it got archived and never saw production. There is an extension-list file in mw config which can be used instead, I think.
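
For the second point, reading the submodule list out of a wmf branch could look roughly like the sketch below. It assumes the usual git submodule file layout (`[submodule "name"]` sections with `path` and `url` keys); the function names are my own, and the file content has to be obtained separately, e.g. from a checkout of the branch. As noted above, being a submodule only shows that something was branched, not that it is enabled anywhere.

```python
import re

SECTION_RE = re.compile(r'^\[submodule "(?P<name>[^"]+)"\]$')
OPTION_RE = re.compile(r"^(?P<key>[\w-]+)\s*=\s*(?P<value>.*)$")


def parse_gitmodules(text):
    """Return {submodule name: {option: value}} from submodule file text."""
    modules = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or comment
        section = SECTION_RE.match(line)
        if section:
            current = modules.setdefault(section.group("name"), {})
            continue
        option = OPTION_RE.match(line)
        if option and current is not None:
            current[option.group("key")] = option.group("value").strip()
    return modules


def submodule_paths(text, prefixes=("extensions/", "skins/")):
    """List submodule paths that look like extensions or skins."""
    return sorted(
        module["path"]
        for module in parse_gitmodules(text).values()
        if module.get("path", "").startswith(prefixes)
    )
```

The output of `submodule_paths` could then be compared against the extension-list file to spot repositories that are branched every week but never loaded.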

I did create some hacky code that may help: it queries the API for each wiki to determine the deployment status of skins/extensions. https://gitlab.wikimedia.org/acooper/extusage. There is a dump of the historical data from October here: https://docs.google.com/spreadsheets/d/1SBU6sPHSrkWmxLbMaUu1WoEVEPVQTmSHUgo_DVT7c4c/edit?usp=sharing

The main gap I am currently aware of is fundraising tech and any other wiki that does not allow non-authenticated users to access this data (which is rare I think).
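
For readers who have not used it, the per-wiki query can be done with the standard MediaWiki `action=query&meta=siteinfo&siprop=extensions` API call. The sketch below is not taken from the linked tool; the function names and the User-Agent string are placeholders, and it only works for wikis that allow anonymous API reads, which is the gap mentioned above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def siteinfo_url(api_base):
    """Build the siteinfo query URL for a wiki's api.php endpoint."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "extensions",
        "format": "json",
    }
    return api_base + "?" + urlencode(params)


def extension_names(response):
    """Group installed extension names by type from a parsed response."""
    by_type = {}
    for ext in response.get("query", {}).get("extensions", []):
        by_type.setdefault(ext.get("type", "unknown"), []).append(ext["name"])
    return {kind: sorted(names) for kind, names in by_type.items()}


def fetch_extensions(api_base, user_agent="roo-sketch/0.1 (placeholder)"):
    """Query one wiki; needs network access and anonymous read rights."""
    request = Request(siteinfo_url(api_base), headers={"User-Agent": user_agent})
    with urlopen(request, timeout=30) as reply:
        return extension_names(json.load(reply))
```

Calling `fetch_extensions("https://www.mediawiki.org/w/api.php")` would return the extensions that wiki reports, grouped by type; repeating this per wiki and merging the results gives a deployment matrix.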

The Security Team is also thinking about and prioritizing a public dashboard containing this information together with missing security controls per repo.

Beyond extensions/skins, we don't currently have a solution to tracking all deployed code like microservices/k8s. I am not sure if that is part of the requirement for this ticket.

Thanks for sharing. There is some discussion spinning up in DX regarding developing a "Service Catalog". It's in very early discussions, but it might make sense to collaborate a bit on this.

Thank you both! Let me know if I can help with anything. Eventually we would love to have a catalogue of Python services to audit for security issues and update semi-automatically.

Something to consider: there is a lot of overlap between this ticket and SBOMs, which are already better standardized, and there are even tools to build them.
https://www.cisa.gov/sbom

http://federalregister.gov/documents/2021/05/17/2021-10460/improving-the-nations-cybersecurity

We probably could repurpose this to build some SBOMs and then see what's missing.
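
To give an idea of what repurposing a repo list into an SBOM might look like, here is a bare-bones sketch that wraps repository names in a CycloneDX-style JSON document. It is an assumption on my part that CycloneDX would be the chosen format; the function name is hypothetical, the output has not been validated against the schema, and a real SBOM would carry versions, hashes and dependency data that a plain repo list does not have.

```python
import json


def minimal_sbom(repos):
    """Wrap repository names in a minimal CycloneDX-style document."""
    components = [
        {"type": "application", "name": name}
        for name in sorted(set(repos))  # de-duplicate and order the input
    ]
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": components,
    }


if __name__ == "__main__":
    # Names taken from the puppet listing earlier in this task.
    print(json.dumps(minimal_sbom(["mediawiki/core", "operations/puppet"]), indent=2))
```

Comparing such a skeleton with what existing SBOM tools produce would show fairly quickly which fields are still missing.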

hashar closed this task as Resolved. Jan 13 2025, 3:19 PM
hashar claimed this task.

Since for various projects I need:

  • a canonical and machine readable list of MediaWiki extensions and skins deployed on production (or about to be deployed to production)
  • having a wmf branch being cut is a prerequisite to have the code shipped to production
  • Release-Engineering-Team owns the process

I am hereby decreeing the configuration used to cut the branch to be the canonical list.

Web view: https://gitlab.wikimedia.org/repos/releng/release/-/blob/main/make-release/settings.yaml
Raw file: https://gitlab.wikimedia.org/repos/releng/release/-/raw/main/make-release/settings.yaml

While that covers MediaWiki repos, we need a machine readable catalog for all deployed projects, e.g. the repos deployed via puppet that are listed in the example above: T190891#4086712. This ties into using SBOMs and being able to quickly find out whether a CVE affects production, plus easing maintenance of the codesearch indexing, which is currently a mess. So that is covered for your use case, but I don't think this is fully addressed yet.

Noteworthy: Blubber is capable of producing SBOMs (blubber docs) for software running through the deployment pipeline to production (though I'm unsure if it's used by anything at this moment).

Thanks. That's great news, I want to eventually start using SBOMs more broadly (to complement debmonitor and other tooling we have). Here is the attempt to adopt them in MediaWiki: T359634: Adopt Software Bill of Materials (SBOM) for MediaWiki, but that's one side of it. Codesearch also needs a catalog of repos to index.

hashar subscribed.

@Aklapper Curious if we can move this ticket away from the Quality Engineering backlog - I don't see that there's action from my team on this and I'm trying to clean up that backlog in prep for incoming projects.

@SLong-WMF: Sure you can revert T190891#6665514. :) Would be very good though to have at least one team (or codebase) associated to an open task.
I can imagine that this is in the Release-Engineering-Team realm again.