Develop canonical/single record of origin, machine readable list of all repos deployed to WMF sites
Open, Medium, Public

Event Timeline

Restricted Application added a subscriber: Aklapper.

@Aklapper, please add any use cases that might be of interest to you. I'll do the same.

This is definitely not all, but one part of the puzzle is searching operations/puppet repo for "git::clone" to get this kind of list:

modules/jupyterhub_old/manifests/init.pp: git::clone { $wheels_repo:
modules/labs_vagrant/manifests/init.pp: git::clone { 'vagrant':
modules/wikimetrics/manifests/base.pp: git::clone { 'wikimetrics-deploy':
modules/wikimetrics/manifests/base.pp: git::clone { 'wikimetrics':
modules/releases/manifests/init.pp: git::clone { 'mediawiki/core':
modules/releases/manifests/init.pp: git::clone { 'mediawiki/tools/release':
modules/scap/manifests/l10nupdate.pp: git::clone { 'mediawiki/core':
modules/scap/manifests/l10nupdate.pp: git::clone { 'mediawiki/extensions':
modules/scap/manifests/l10nupdate.pp: git::clone { 'mediawiki/skins':
modules/scap/manifests/master.pp: git::clone { 'operations/mediawiki-config':
modules/beta/manifests/autoupdater.pp: git::clone { 'beta-mediawiki-core':
modules/beta/manifests/autoupdater.pp: git::clone { 'beta-portal':
modules/beta/manifests/autoupdater.pp: git::clone { 'beta-mediawiki-extensions':
modules/beta/manifests/autoupdater.pp: git::clone { 'beta-mediawiki-skins':
modules/beta/manifests/autoupdater.pp: git::clone { 'mediawiki/vendor':
modules/puppet_compiler/manifests/init.pp: git::clone { 'operations/puppet':
modules/puppet_compiler/manifests/init.pp: git::clone { 'labs/private':
modules/git/spec/defines/clone_spec.rb:describe 'git::clone' do
modules/git/manifests/install.pp: git::clone{$title:
modules/git/manifests/clone.pp:# Definition: git::clone
modules/git/manifests/clone.pp:# git::clone { 'my_clone_name':
modules/git/manifests/clone.pp:# git::clone { 'analytics/wikimetrics':
modules/git/manifests/clone.pp:define git::clone(
modules/vagrant/manifests/mediawiki.pp: git::clone { 'mediawiki/vagrant':
modules/eventschemas/manifests/init.pp:# [*ensure*] Passed directly to git::clone. Default: latest.
modules/eventschemas/manifests/init.pp: git::clone { 'mediawiki/event-schemas':
modules/snapshot/manifests/cron/wikidatadumps/common.pp: git::clone { 'DCAT-AP':
modules/statistics/manifests/discovery.pp: git::clone { 'wikimedia/discovery/golden':
modules/statistics/manifests/wmde/wdcm.pp: git::clone { 'analytics/wmde/WDCM':
modules/statistics/manifests/wmde/graphite.pp: git::clone { 'wmde/scripts':
modules/statistics/manifests/wmde/graphite.pp: git::clone { 'wmde/toolkit-analyzer-build':
modules/statistics/manifests/aggregator/projectview.pp: git::clone { 'aggregator_projectview_data':
modules/statistics/manifests/compute.pp: git::clone { 'statistics_mediawiki':
modules/statistics/manifests/sites/analytics.pp: git::clone { '':
modules/statistics/manifests/sites/stats.pp: git::clone { 'wikistats-v2':
modules/statistics/manifests/aggregator.pp: git::clone { 'aggregator_code':
modules/service/manifests/deploy/gitclone.pp: git::clone { $repository:
modules/jupyterhub/manifests/init.pp: git::clone { $deploy_repository:
modules/extdist/manifests/init.pp: git::clone {'labs/tools/extdist':
modules/extdist/manifests/init.pp: git::clone { 'integration/composer':
modules/visualdiff/manifests/init.pp: git::clone { 'integration/visualdiff':
modules/contint/manifests/phpunit.pp: git::clone { 'jenkins CI phpunit':
modules/contint/manifests/slave_scripts.pp: git::clone { 'jenkins CI slave scripts':
modules/contint/manifests/composer.pp: git::clone { 'jenkins CI Composer':
modules/authdns/spec/classes/authdns_spec.rb: 'define git::clone($directory, $origin, $branch,$owner,$group) {}',
modules/authdns/manifests/init.pp: git::clone { $workingdir:
modules/reportupdater/manifests/init.pp: git::clone { 'analytics/reportupdater':
modules/reportupdater/manifests/job.pp: git::clone { $repository_name:
modules/noc/manifests/init.pp: git::clone { 'operations/software/dbtree':
modules/tendril/manifests/init.pp: git::clone { 'operations/software/tendril':
modules/quarry/manifests/base.pp: git::clone { 'analytics/quarry/web':
modules/sentry/manifests/init.pp: git::clone { 'operations/software/sentry':
Binary file modules/role/manifests/grafana/.base.pp.swp matches
modules/role/manifests/labs/ores/staging.pp: git::clone { 'ores-wm-config':
modules/role/manifests/labs/db/common.pp: git::clone { 'operations/mediawiki-config':
modules/role/manifests/xhgui/app.pp: git::clone { 'operations/software/xhprof':
modules/role/manifests/xhgui/app.pp: git::clone { 'operations/software/xhgui':
modules/mediawiki_singlenode/manifests/init.pp: git::clone { 'vendor':
modules/mediawiki_singlenode/manifests/init.pp: git::clone { 'mediawiki':
modules/mediawiki_singlenode/manifests/mwextension.pp: git::clone { $name:
modules/profile/manifests/ci/gitcache.pp: git::clone { 'operations/puppet':
modules/profile/manifests/ci/gitcache.pp: git::clone { 'mediawiki/core':
modules/profile/manifests/ci/gitcache.pp: git::clone { 'mediawiki/vendor':
modules/profile/manifests/zuul/server.pp: git::clone { 'integration/config':
modules/profile/manifests/analytics/refinery/source.pp: git::clone { 'refinery_source':
modules/profile/manifests/switchdc.pp: git::clone { 'operations-switchdc':
modules/profile/manifests/discovery_dashboards/production.pp: git::clone { 'wikimedia/discovery/rainbow':
modules/profile/manifests/discovery_dashboards/production.pp: git::clone { 'wikimedia/discovery/twilightsparql':
modules/profile/manifests/discovery_dashboards/production.pp: git::clone { 'wikimedia/discovery/prince':
modules/profile/manifests/discovery_dashboards/production.pp: git::clone { 'wikimedia/discovery/wetzel':
modules/profile/manifests/discovery_dashboards/production.pp: git::clone { 'wikimedia/discovery/wonderbolt':
modules/profile/manifests/discovery_dashboards/development.pp: git::clone { 'wikimedia/discovery/rainbow':
modules/profile/manifests/discovery_dashboards/development.pp: git::clone { 'wikimedia/discovery/twilightsparql':
modules/profile/manifests/discovery_dashboards/development.pp: git::clone { 'wikimedia/discovery/prince':
modules/profile/manifests/discovery_dashboards/development.pp: git::clone { 'wikimedia/discovery/wetzel':
modules/profile/manifests/discovery_dashboards/development.pp: git::clone { 'wikimedia/discovery/wonderbolt':
modules/profile/manifests/microsites/annualreport.pp: git::clone { 'wikimedia/annualreport':
modules/profile/manifests/microsites/design.pp: git::clone { 'design/landing-page':
modules/profile/manifests/microsites/design.pp: git::clone { 'design/style-guide':
modules/profile/manifests/microsites/wikibase.pp: git::clone { 'wikibase/':
modules/profile/manifests/microsites/transparency.pp: git::clone { 'wikimedia/TransparencyReport':
modules/profile/manifests/microsites/transparency.pp: git::clone { 'wikimedia/TransparencyReport-private':
modules/profile/manifests/microsites/research.pp: git::clone { 'research/landing-page':
modules/profile/manifests/calico/builder.pp: git::clone{ 'operations/calico-containers':
modules/profile/manifests/calico/builder.pp: git::clone { 'operations/calico-cni':
modules/profile/manifests/calico/builder.pp: git::clone { 'operations/calico-k8s-policy-controller':
modules/profile/manifests/openstack/base/nodepool/service.pp: git::clone { 'integration/config':
modules/profile/manifests/wmcs/tenants/libraryupgrader.pp: git::clone {'labs/libraryupgrader':
modules/profile/manifests/performance/site.pp: git::clone { 'performance/docroot':
modules/profile/manifests/kubernetes/deployment_server.pp: git::clone { 'operations/deployment-charts':
modules/profile/manifests/docker/builder.pp: git::clone { 'operations/docker-images/production-images':
modules/profile/manifests/grafana.pp: git::clone { 'operations/software/grafana/simple-json-datasource':
modules/toollabs/manifests/images.pp: git::clone { 'operations/docker-images/toollabs-images':
modules/toollabs/manifests/composer.pp: git::clone { 'composer':
modules/puppetmaster/manifests/gitprivate.pp: git::clone { 'operations/private':
modules/puppetmaster/manifests/gitclone.pp: git::clone { 'labs/private':
modules/puppetmaster/manifests/gitclone.pp: git::clone {
modules/puppetmaster/manifests/base_repo.pp: git::clone { 'operations/puppet':
modules/testreduce/manifests/init.pp: git::clone { 'mediawiki/services/parsoid/testreduce':
modules/wikilabels/manifests/web.pp: git::clone { 'wikilabels-wikimedia-config':
modules/geowiki/manifests/init.pp: git::clone { 'geowiki-scripts':
modules/geowiki/manifests/private_data.pp: git::clone { 'geowiki-data-private':
modules/phragile/manifests/init.pp: git::clone { 'phragile':
modules/phragile/manifests/init.pp: git::clone { 'composer':
modules/wikistats/manifests/init.pp: git::clone { 'operations/debs/wikistats':
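
For anyone who wants to turn grep output like the above into something machine readable, here is a minimal sketch in Python. It assumes lines in the usual "path:matched text" grep form and only picks up resources whose title is a quoted literal; titles given as variables (e.g. $repository), commented-out examples, and the "Binary file ... matches" line are skipped. The function name literal_clone_titles is made up for this example. Note that a resource title is not always the repository name (e.g. 'jenkins CI phpunit'), so the result is a starting point rather than an authoritative list.

```python
import re

# Matches a git::clone resource whose title is a non-empty quoted literal, e.g.
#   modules/noc/manifests/init.pp: git::clone { 'operations/software/dbtree':
CLONE_RE = re.compile(r"git::clone\s*\{\s*'([^']+)'\s*:")


def literal_clone_titles(grep_lines):
    """Return the sorted, unique literal git::clone titles in grep-style lines."""
    titles = set()
    for line in grep_lines:
        # grep prints "path:matched text"; split at the first colon only.
        _path, sep, text = line.partition(":")
        if not sep:
            continue  # no colon at all, e.g. "Binary file ... matches"
        if text.lstrip().startswith("#"):
            continue  # commented-out usage example inside a manifest
        match = CLONE_RE.search(text)
        if match:
            titles.add(match.group(1))
    return sorted(titles)


sample = [
    "modules/noc/manifests/init.pp: git::clone { 'operations/software/dbtree':",
    "modules/git/manifests/clone.pp:# git::clone { 'my_clone_name':",
    "modules/jupyterhub/manifests/init.pp: git::clone { $deploy_repository:",
    "Binary file modules/role/manifests/grafana/.base.pp.swp matches",
    "modules/scap/manifests/master.pp: git::clone { 'operations/mediawiki-config':",
]
print(literal_clone_titles(sample))
```

Resolving the variable titles would need a look at the calling manifests or hiera data, which this sketch does not attempt.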

And then you'll have to add all the things that are deployed via scap.

;3ed885277a38ddb5c7c8a6c3c9b666fb7b13ae10$49 is how my tools currently get a list of Wikimedia deployed extensions.

Some of the repositories listed in P6909 are only deployed in Cloud Services (e.g. extdist), not production.

greg triaged this task as Medium priority. Apr 23 2018, 4:41 PM

I made a bunch of updates today on and corresponding extension wiki home pages, however help with sorting out these items is welcome:

Deployed on WMF servers according to and but the extension wiki page does not use {{TNT|OnWikimedia}} and it is not listed on :

  • extensions/CongressLookup
  • extensions/FileExporter
  • extensions/intersection
  • extensions/JADE
  • extensions/LdapAuthentication
  • extensions/PropertySuggester
  • extensions/WikimediaBadges

Extension wiki page uses {{TNT|OnWikimedia}} and it is listed as deployed on but not in and :

Extension wiki page uses {{TNT|OnWikimedia}} but not listed as deployed on but not in and :

  • Extension:DataModel
  • Extension:DataTypes
  • Extension:DataValues
  • Extension:DataValuesCommon
  • Extension:DataValuesInterfaces
  • Extension:Diff
  • Extension:FundraisingEmailUnsubscribe
  • Extension:ValueView

It's mediawiki/extensions/intersection.

Extension wiki page uses {{TNT|OnWikimedia}} but not listed as deployed on but not in and :

  • Extension:DataModel
  • Extension:DataTypes
  • Extension:DataValues
  • Extension:DataValuesCommon
  • Extension:DataValuesInterfaces
  • Extension:Diff
  • Extension:ValueView

These are all PHP libraries that should be in [[Category:PHP libraries]], not listed as MediaWiki extensions afaik. Someone more familiar with Wikidata should confirm.

@Aklapper, I've been spending a bit of time on this and one of the directions I'm leaning is to have the ROO be an output of the deployment review process (Review Queue today). My thought is that the ROO could be used by deployers as an authoritative source to determine whether or not new extensions, services, etc. should be deployed. I spoke to Tyler a bit about this and it seems like it would be of value.

What are some other use cases for the ROO? One thing that's mentioned is that it needs to be machine readable. What systems would you see consuming this list? FWIW, I too think it should be machine readable. I only ask to tease out some of the use cases that I may not be aware of.

On a related topic, I'm planning to spin up work around augmenting/revamping the review process.

What systems would you see consuming this list?

All this still won't fix outdated information per se but I'd say it's less likely to have outdated information when there is one place to update instead of several places. My latest example was two days ago trying to find out who to fix : I went to mw:Developers/Maintainers which listed "Discovery" for "Kartographer". On the Discovery IRC channel I was told that's not correct anymore (was corrected here).

Quiddity mentioned this in Unknown Object (Task). Oct 25 2018, 7:04 AM

Recently had a discussion with @mark and @faidon about some work that SRE is planning to do. Specifically the creation of a Service Catalog. On the surface this seems like it would potentially become the ROO mentioned in this task.

Ladsgroup renamed this task from Develop canonical/single record of origin, machine readable list of all repos deployed to WMF sites. to Develop canonical/single record of origin, machine readable list of all repos deployed to WMF sites. Apr 3 2023, 7:29 PM

Posting this draft comment that I wrote at some point:

For MediaWiki extensions:

  • For the next deployment, the reference for the repositories we deploy is the tool we use to cut the wmf branches. That was in Gerrit as mediawiki/tools/release.git, in the file make-wmf-branch/config.json.
  • For the currently deployed version, the tool above registers the extensions as submodules of the mediawiki/core wmf branch for the week. Hence we can get the list by looking at .gitmodules for each wmf branch.
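
As a rough illustration of the second bullet, the following Python sketch reads the text of a .gitmodules file and lists the submodules registered under extensions/. It is only a sketch: the function names are invented for this example, the sample content is made up, and the URLs use a placeholder host. The text itself could be obtained with something like `git show <branch>:.gitmodules` in a mediawiki/core checkout.

```python
import re

SECTION_RE = re.compile(r'^\[submodule "(.+)"\]$')
OPTION_RE = re.compile(r"^([\w-]+)\s*=\s*(.*)$")


def parse_gitmodules(text):
    """Map each submodule name to a dict of its options (path, url, ...)."""
    submodules = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or comment
        section = SECTION_RE.match(line)
        if section:
            current = submodules.setdefault(section.group(1), {})
            continue
        option = OPTION_RE.match(line)
        if option and current is not None:
            current[option.group(1)] = option.group(2)
    return submodules


def extension_names(text):
    """Sorted names of submodules whose path starts with extensions/."""
    prefix = "extensions/"
    names = []
    for options in parse_gitmodules(text).values():
        path = options.get("path", "")
        if path.startswith(prefix):
            names.append(path[len(prefix):])
    return sorted(names)


# Made-up sample content; the host below is a placeholder, not a real remote.
SAMPLE = """
[submodule "extensions/Echo"]
    path = extensions/Echo
    url = https://example.org/r/mediawiki/extensions/Echo
[submodule "extensions/Cite"]
    path = extensions/Cite
    url = https://example.org/r/mediawiki/extensions/Cite
[submodule "skins/Vector"]
    path = skins/Vector
    url = https://example.org/r/mediawiki/skins/Vector
"""
print(extension_names(SAMPLE))
```

The same approach would work for skins by changing the prefix.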

The problem is that we can branch an extension weekly but not deploy it. Jade got branched constantly until it got archived and never saw production. There is an extension-list file in the mw config which can be used instead, I think.
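
To make the "branched but not deployed" case concrete, here is a small Python sketch that compares a list of branched extension names against the entries of an extension-list file. The entry shape used here, one path per line such as $IP/extensions/Name/extension.json, is an assumption for illustration and should be checked against the real file; the function names and the sample data are likewise made up.

```python
import re

# Assumed entry shape (not verified against the real file):
#   $IP/extensions/Name/extension.json   or   $IP/skins/Name/skin.json
ENTRY_RE = re.compile(r"^\$IP/(?:extensions|skins)/([^/]+)/")


def names_from_extension_list(text):
    """Collect the extension/skin directory names listed in the file text."""
    names = set()
    for raw in text.splitlines():
        match = ENTRY_RE.match(raw.strip())
        if match:
            names.add(match.group(1))
    return names


def branched_but_not_listed(branched, extension_list_text):
    """Names that were branched but do not appear in the extension list."""
    listed = names_from_extension_list(extension_list_text)
    return sorted(set(branched) - listed)


# Made-up sample data mirroring the Jade case described above.
EXTENSION_LIST = """
$IP/extensions/Cite/extension.json
$IP/extensions/Echo/extension.json
$IP/skins/Vector/skin.json
"""
print(branched_but_not_listed(["Cite", "Echo", "JADE"], EXTENSION_LIST))
```

Being listed in that file would still not say on which wikis an extension is enabled, so this only narrows the gap.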

I did create some hacky code that may help: it queries the API for each wiki to determine the deployment status of skins/extensions. There is a dump from the historical data in October here:

The main gap I am currently aware of is fundraising tech and any other wiki that does not allow non-authenticated users to access this data (which is rare I think).
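
For readers unfamiliar with the per-wiki API approach, a minimal sketch in Python follows. It is not the hacky code mentioned above, just an illustration using the standard siteinfo query (action=query, meta=siteinfo, siprop=extensions). The function names, the example.org URL, and the sample response are made up; a real client for Wikimedia sites would also want to send a descriptive User-Agent header, which this sketch omits.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def siteinfo_url(api_base):
    """Build a siteinfo query URL asking for the installed extensions."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "extensions",
        "format": "json",
    }
    return f"{api_base}?{urlencode(params)}"


def installed_by_type(response):
    """Group installed component names by the type the API reports."""
    grouped = {}
    for entry in response.get("query", {}).get("extensions", []):
        grouped.setdefault(entry.get("type", "other"), []).append(entry["name"])
    return {kind: sorted(names) for kind, names in grouped.items()}


def fetch_installed(api_base):
    """Fetch and group the components of one wiki (performs a network call)."""
    with urlopen(siteinfo_url(api_base)) as reply:
        return installed_by_type(json.load(reply))


# Made-up response fragment in the shape returned by siprop=extensions.
SAMPLE_RESPONSE = {
    "query": {
        "extensions": [
            {"type": "parserhook", "name": "Cite"},
            {"type": "other", "name": "Echo"},
            {"type": "parserhook", "name": "Kartographer"},
        ]
    }
}
print(installed_by_type(SAMPLE_RESPONSE))
```

Running fetch_installed over every wiki's api.php endpoint and merging the results would give a per-wiki deployment matrix, with the gap noted above for wikis that do not expose this data to anonymous users.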

The Security Team is also thinking about and prioritizing a public dashboard containing this information together with missing security controls per repo.

Beyond extensions/skins, we don't currently have a solution to tracking all deployed code like microservices/k8s. I am not sure if that is part of the requirement for this ticket.

Thanks for sharing. There is some discussion spinning up in DX regarding developing a "Service Catalog". It's in very early discussions, but it might make sense to collaborate a bit on this.

Thank you both! Let me know if I can help with anything. Eventually we would love to have a catalogue of Python services to audit for security issues and do updates semi-automatically.

Something to consider: there is a lot of overlap between this ticket and SBOMs, which are already better standardized, and there are even tools to build them.

We probably could repurpose this to build some SBOMs and then see what's missing.