Find out whether / how to measure developer activity in Wikimedia code repositories not hosted in Wikimedia Git/Gerrit but on GitHub
Closed, Declined (Public)

Description

(In the context of T160430)

Wikimedia has some "GitHub-only" code repositories.
Find out how to differentiate (exclude) those repositories that are mirrors-only (maybe there is no trivial way).
Also, open question: What about stuff that Wikimedia forked? Exclude or not? (Similar problem with measuring activity in pulled upstream repos in Gerrit)

Event Timeline

how to differentiate (exclude) those repositories that are mirrors-only

Link header response for WM repos on GitHub says &page=1964>; rel="last" so there are 1964 repos.
For a comparison of numbers:

$:acko\> ssh aklapper@gerrit.wikimedia.org -p 29418 gerrit ls-projects | wc -l
1736
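
A minimal sketch of pulling that number out of a Link header, in case the count needs to be re-checked later. It assumes the request was made with per_page=1 (the comment above does not say so), so that the last page number equals the number of repositories. The function name last_page and the curl call shown in the comment are illustrative and were not used in this task.

```shell
# Extract the page number of the rel="last" link from a Link header value.
# Reads the header on stdin and prints the number (prints nothing if absent).
last_page() {
    sed -n 's/.*[?&]page=\([0-9][0-9]*\)[^>]*>; rel="last".*/\1/p'
}

# Hypothetical usage (needs network access; assumes per_page=1):
#   curl -sI "https://api.github.com/orgs/wikimedia/repos?per_page=1" \
#       | grep -i '^link:' | last_page
```

Comparing that number with the `wc -l` output from Gerrit only gives a rough idea, since the two lists overlap only partly.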

There seems to be no single approach to getting a complete list of the repos that are only mirrors: the descriptions use different spellings, and only some mirrored repos have the homepage key set:

$:acko\> wget -q "https://api.github.com/orgs/wikimedia/repos?page=XY&per_page=100" -O fooXY.json
$:acko\> grep -r "Github mirror" . | wc -l
1594
$:acko\> grep -r "GitHub mirror" . | wc -l
2
$:acko\> grep -r "actual code is hosted" . | wc -l
1595
$:acko\> grep -r "\"homepage\": \"https://gerrit.wikimedia.org" . | wc -l
575
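
The XY in the wget command above stands for the page number. A loop along the following lines could fetch every page. This is only a sketch: the function names fetch_page and fetch_all_pages are made up here, and it assumes that the API answers with an empty JSON array once the page number is past the last page.

```shell
# Default fetcher: download one page of the organisation's repository list.
# $1 = page number, $2 = output file
fetch_page() {
    wget -q "https://api.github.com/orgs/wikimedia/repos?page=$1&per_page=100" -O "$2"
}

# Fetch pages 1, 2, 3, ... into foo01.json, foo02.json, ... until an empty
# JSON array comes back. Prints the number of non-empty pages.
# $1 = name of the fetch function (optional, defaults to fetch_page)
fetch_all_pages() {
    fetch=${1:-fetch_page}
    page=1
    while :; do
        out=$(printf 'foo%02d.json' "$page")
        "$fetch" "$page" "$out" || return 1
        # An empty array (ignoring whitespace) means we are past the last page.
        if [ "$(tr -d '[:space:]' < "$out")" = "[]" ]; then
            rm -f "$out"
            break
        fi
        page=$((page + 1))
    done
    echo $((page - 1))
}
```

Passing the fetcher as an argument keeps the loop testable without network access. Unauthenticated API requests are rate-limited, so a real run may need a pause between pages or a token.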

Also some repos have no description at all:

$:acko\> grep -r "\"description\": null" . | wc -l
68
$:acko\> cat fooXY.json | jq '.[] | select(.description == null) | .name'       ## list GitHub repos with empty description
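
The separate grep checks above could be folded into one jq pass that labels each repository. This is a sketch, not what was run for the numbers above: the function name classify_repos and the class labels are invented here, and the "fork" class relies on the boolean fork field that the GitHub API returns for each repository.

```shell
# Print "<class><TAB><name>" for every repository in the given JSON pages.
# Classes: mirror      - description matches either spelling seen above
#          undescribed - description is null
#          fork        - GitHub marks the repository as a fork
#          other       - everything else
classify_repos() {
    jq -r '.[]
        | (.description // "") as $d
        | (if   ($d | test("github mirror|actual code is hosted"; "i")) then "mirror"
           elif .description == null then "undescribed"
           elif .fork then "fork"
           else "other" end) + "\t" + .name' "$@"
}

# Hypothetical usage: count repositories per class.
#   classify_repos foo*.json | cut -f1 | sort | uniq -c
```

The case-insensitive match covers both the "Github mirror" and "GitHub mirror" spellings. Mirrors with an unrelated or missing description would still end up in the wrong class.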

In practical terms, I think this task should be blocked by another one, "Create a list of Featured Projects for new developers", so that we can see whether we have any GitHub-only projects in that list.

If we have any GitHub-only projects, then we can see whether having an export/mirror in Gerrit makes sense. If not, then we can check the metrics problem again.

In other words, I think putting time on this task before having Featured Projects is not a good use of time.

  • Created list of GitHub repositories (output from jq -r '.[] | .full_name' foo01.json > github.list after concatenation)
  • Created list of Gerrit repositories (output from ssh aklapper@gerrit.wikimedia.org -p 29418 gerrit ls-projects > gerrit.list)
  • Stripped wikimedia/ prefix in github.list
  • Replaced / by - in gerrit.list
  • Sorted entries in both files alphabetically
  • Ran diff and showed only the changed lines (via grep, nothing else)
  • - means in Gerrit only, + means in GitHub only:

$:acko\> diff -dipu0 gerrit.list github.list | egrep '^[+-]'
--- gerrit.list 2017-05-04 12:55:01.216117052 +0200
+++ github.list 2017-05-04 12:43:44.289274339 +0200
-3d2png-deploy
+AFNetworking
-All-Projects-In-Phabricator
-All-Users
-analytics-analytics.wikimedia.org
+analytics-editor-geocoding
+analytics-fundraising
+analytics-fundraising-dashboard
-analytics-kraken
+analytics-quarry-puppet
+analytics-query-service
-analytics-websites_maintenance
-analytics-wmde
+analytics.wikimedia.org
+android-mwlogin
+AnimatedGIFImageSerialization
+ansible-deploy
+aosp-morelangs-ime
+aosp-morelangs-ime-dictionaries
+apps-android-commons
-apps-ios-wikipedia
+arc-lamp
+arcanist
+Assert
+authoid
+bingle
+BlocksKit
+bunyan-syslog-udp
+camus
+cassandra-codec
+cassandra-metrics-collector
+change-propagation
+ChromeWikimediaDebug
+citoid
+citolytics
-css-sanitizer
+CocoaLumberjack
+community-tech-tools
+composer-merge-plugin
+content-type
+CopyPatrol
+Cyberbot_II
+data-warehouse
+dClass
+DeadlinkChecker
+dump-scheduler-eval
+dumpgrepper
+eslint-config-node-services
+eslint-config-wikimedia
+extensions-Limn
+FirefoxWikimediaDebug
-gerrit
+git-client-plugin
+git-fat
-graphs-shared
-HtmlFormatter
+grunt-banana-checker
+grunt-stylelint
+grunt-tyops
+hpple
+htcp-purge
+html-formatter
+html-metadata
+html5depurate
+htmldumper
+hyperswitch
+ifttt
+incubator-cordova-android
+incubator-cordova-ios
+incubator-cordova-js
-integration-gerrit-commit-message-validator
-integration-phantomjs
-integration-phpcs
+java-morelangs
+java-mwapi
+jquery-client
+jquery-tipsy
+jquery.i18n
+jquery.ime
+jquery.uls
+jquery.webfonts
+jscs-preset-wikimedia
+json-stable-stringify
+KafkaSSE
+kasocki
+kraken
+kraken-puppet
-labs-incubator
-labs-tools-quarrybot-enwiki
-labs-tools-stashbot
-labs-tools-Wikimedia-Emoji-Bot
-labs-tools-wikiviewstats
-mapdata
+lcm-dashboard
+limitation
+limn
+limn-data
+limn-debugging-data
+limn-deploy
+limn-editor-engagement-data
+limn-fundraising-data
+limnpy
+malu
-maps-ClearTables
-maps-meddo
+MathJax
+MathJax-node
+mathoid
-mediawiki-core
+mediawiki-bots
+mediawiki-bots-PHPWikiBot
+mediawiki-containers
+mediawiki-core-vendor
+mediawiki-docker
-mediawiki-extensions-3D
+mediawiki-extensions-2ColConflict
+mediawiki-extensions-AddMetaAndTitle
+mediawiki-extensions-AmazonLookup
-mediawiki-extensions-AutoGallery
-mediawiki-extensions-BlueSpiceExtendedFilelist
-mediawiki-extensions-BlueSpiceInsertTemplate
-mediawiki-extensions-BlueSpiceMultiUpload
-mediawiki-extensions-BlueSpiceSignHere
-mediawiki-extensions-BlueSpiceSubPageTree
+mediawiki-extensions-BSExtendedSearch
+mediawiki-extensions-CategoryMagicWords
-mediawiki-extensions-CategoryWatch
+mediawiki-extensions-Censor
+mediawiki-extensions-CentralNotice-BannerProxy
+mediawiki-extensions-CloudSearch
+mediawiki-extensions-ContextComments
-mediawiki-extensions-CreatePageUw
+mediawiki-extensions-CustomMagic
+mediawiki-extensions-DataModel
-mediawiki-extensions-DebateTree
+mediawiki-extensions-DebianISOCodes
+mediawiki-extensions-DetectLanguage
+mediawiki-extensions-DisplayTitle
+mediawiki-extensions-EImage
+mediawiki-extensions-ELearnware
+mediawiki-extensions-ExtensionStatus
-mediawiki-extensions-FileExporter
-mediawiki-extensions-FileImporter
+mediawiki-extensions-FlashPlayer
+mediawiki-extensions-GoogleTagManager
+mediawiki-extensions-HaloTripleStoreConnector
+mediawiki-extensions-HotCat
-mediawiki-extensions-Ids
-mediawiki-extensions-ImageRating
+mediawiki-extensions-ImportBibliography
+mediawiki-extensions-InterwikiExistence
+mediawiki-extensions-Ipernity
+mediawiki-extensions-Isbn
+mediawiki-extensions-ISO3166
+mediawiki-extensions-ISO639
+mediawiki-extensions-JsonData-JsonSchema
+mediawiki-extensions-KeepSearches
-mediawiki-extensions-LdapGroups
+mediawiki-extensions-LightboxGallery
+mediawiki-extensions-ListTransclusions
-mediawiki-extensions-MagicNumberedHeadings
+mediawiki-extensions-MailChimpSubscription
-mediawiki-extensions-MessageCommons
+mediawiki-extensions-MetaDescriptionTag
+mediawiki-extensions-MirrorTools
+mediawiki-extensions-MobileSections
-mediawiki-extensions-MoveToCommons
-mediawiki-extensions-MoveToCommonsClient
+mediawiki-extensions-MultiAudioVideo
-mediawiki-extensions-NamespacePopups
+mediawiki-extensions-Notifications
+mediawiki-extensions-NoUnwrap
-mediawiki-extensions-OrphanedTalkPages
+mediawiki-extensions-PageCredits
-mediawiki-extensions-PageLanguageApi
-mediawiki-extensions-PageNameFormula
-mediawiki-extensions-PagePopups
+mediawiki-extensions-PDBHandler
+mediawiki-extensions-PerformanceMonitor
+mediawiki-extensions-PhpTagsDebugger
-mediawiki-extensions-Pickle
-mediawiki-extensions-PlanOut
+mediawiki-extensions-PipVideoJs
+mediawiki-extensions-PlaceNewSection
+mediawiki-extensions-PrefixExport
+mediawiki-extensions-PropertySuggester
+mediawiki-extensions-ProtectedTitles
+mediawiki-extensions-PubSubHubbubSubscriber
+mediawiki-extensions-PurposeCentricSearch
+mediawiki-extensions-QueryResult
+mediawiki-extensions-RawImageHandler
+mediawiki-extensions-RealNames
+mediawiki-extensions-SemanticDummyEditor
+mediawiki-extensions-ShareThisWidget
+mediawiki-extensions-ShortUrlApi
-mediawiki-extensions-ShowMe
+mediawiki-extensions-ShrinkTheWeb
+mediawiki-extensions-SignupAPI
+mediawiki-extensions-SimpleSamlAuth
+mediawiki-extensions-SMWEnrich
+mediawiki-extensions-SMWHalo
+mediawiki-extensions-SOLRSearch
+mediawiki-extensions-StoryParagraph
+mediawiki-extensions-SubpageWatchlist
+mediawiki-extensions-SwedishCollation
+mediawiki-extensions-TemplateDocumentation
-mediawiki-extensions-TopLists
+mediawiki-extensions-VersionView
+mediawiki-extensions-ViewportMetrics
+mediawiki-extensions-VisualWiki
+mediawiki-extensions-WhatsNearby
+mediawiki-extensions-WhichImageIsBetter
+mediawiki-extensions-WikibaseClient
+mediawiki-extensions-WikibaseLib
+mediawiki-extensions-WikibaseView
+mediawiki-extensions-WikiCortex
+mediawiki-extensions-WikiEduDashboard
+mediawiki-extensions-WikiFarm
-mediawiki-extensions-WikimediaPageViewInfo
+mediawiki-extensions-WikivotePageSchemas
+mediawiki-extensions-WikivoyageMessages
-mediawiki-libs
-mediawiki-libs-Assert
-mediawiki-libs-etcd
-mediawiki-libs-RemexHtml
-mediawiki-libs-ScopedCallback
-mediawiki-libs-Timestamp
-mediawiki-libs-WaitConditionLoop
+mediawiki-libs-FileOgg
+mediawiki-node-services
-mediawiki-php-FastStringSearch
-mediawiki-php-wikidiff
-mediawiki-services-citoid
-mediawiki-services-eventstreams-deploy
-mediawiki-services-html5depurate
-mediawiki-services-mathoid
-mediawiki-services-ores-editquality
-mediawiki-services-ores-wikiclass
-mediawiki-services-parsoid
+mediawiki-services-zotero
-mediawiki-skins-Athena
-mediawiki-skins-Material
-mediawiki-skins-Poncho
+mediawiki-title
-mediawiki-tools-git-remote
-mediawiki-tools-scap
+metrics
+MGSwipeTableCell
-nfsd
+nagf
+node-rcstream
+node-rdkafka
+node-serviceworker
+node-serviceworker-proxy
+node-txstatsd
+nodejs-driver
+NSDate-Extensions
+OAStackView
+officeit-puppet
-oojs-core
+oojs-router
-operations-calico-cni
-operations-calico-k8s-policy-controller
+operations-debs-apertium
+operations-debs-cg3
+operations-debs-contenttranslation-apertium-af-nk
+operations-debs-contenttranslation-apertium-api
-operations-debs-contenttranslation-apertium-arg
-operations-debs-contenttranslation-apertium-arg-cat
-operations-debs-contenttranslation-apertium-cat
-operations-debs-contenttranslation-apertium-fra
-operations-debs-contenttranslation-apertium-fra-cat
+operations-debs-contenttranslation-apertium-hi-ur
-operations-debs-contenttranslation-apertium-spa
-operations-debs-contenttranslation-apertium-spa-arg
-operations-debs-contenttranslation-apertium-spa-cat
-operations-debs-contenttranslation-apertium-swe-dan
-operations-debs-contenttranslation-apertium-swe-nor
-operations-debs-contenttranslation-foma
-operations-debs-contenttranslation-hfst-ospell
+operations-debs-dropwizard-metrics
+operations-debs-ffmpeg2theorawmf
-operations-debs-geckodriver
-operations-debs-libav
+operations-debs-lttoolbox
-operations-debs-mcrouter
-operations-debs-mtail
-operations-debs-openssl11
-operations-debs-osmborder
-operations-debs-phantomjs
-operations-debs-prometheus-apache-exporter
-operations-debs-prometheus-memcached-exporter
-operations-debs-prometheus-redis-exporter
-operations-debs-prometheus-snmp-exporter
-operations-debs-pybal
-operations-debs-pykube
-operations-debs-python-confluent-kafka
-operations-debs-python-mmh3
-operations-debs-python-sprockets
-operations-debs-python-sprockets-clients-statsd
+operations-debs-python-statsd
-operations-debs-python-ua-parser
-operations-debs-StatsD
+operations-debs-stud
+operations-debs-txstatsd
+operations-debs-yammer-metrics
+operations-deployment
-operations-docker-images
+operations-docker-images-toolabs-images
-operations-dumps-import-tools
-operations-dumps-statusapi
-operations-gerrit-plugins
-operations-mediawiki-config-fonts
-operations-mediawiki-multiversion
-operations-puppet
-operations-puppet-cdh
+operations-puppet-cassandra
-operations-puppet-jmxtrans
-operations-puppet-kafka
-operations-puppet-modules
-operations-puppet-zookeeper
-operations-software-cumin
-operations-software-etcd-mirror
+operations-software-grafana
-operations-software-hhvm_exporter
-operations-software-hhvm-dev
+operations-software-mwprof
+operations-software-mwprof-reporter
-operations-software-prometheus_jmx_exporter
-operations-software-statsdlb
-operations-software-varnish-libvmod-header
-operations-software-varnish-varnishkafka
-operations-software-varnish-varnishkafka-testing
-operations-software-xhprof
-operations-switchdc
-operations-wheels-paws-internal
+parsoid
+parsoid-dom-utils
+peformance-docroot
-performance-visualmetrics-docker
-performance-WebPageTest
-phabricator-arcanist
-phabricator-phabricator
+phantomjs
+phlogiston
+php-ffs
+php-gpglib
+piwik-sdk-ios
+portals
+preq
+puppet
+puppet-cdh
+puppet-jmxtrans
+puppet-kafka
+puppet-kafka-0.7.2
+puppet-storm
+puppet-zookeeper
+PyBal
+python-diamond
+pywikiapi
+pywikibot-externals-six
-rcstream
+rescue-pxe
-research-ores
-research-ores-deploy
-research-recommendation-api-scap
+restbase
+restbase-mod-table-cassandra
+restbase-mod-table-spec
+restbase-mod-table-sqlite
+restevent
+riemann-jmx
+routeswitch
+scap
+ScopedCallback
-search-ltr
-search-MjoLniR
-secrets
+SelfSizingWaterfallCollectionViewLayout
+service-runner
+service-template-node
+simplei18n
-subversion-svn.wikimedia.org-mediawiki-trunk
+sqoopy
+SSDataSources
+stashbot
+stylelint-config-wikimedia
+subversion
+subversion-svn.wikimedia.org
+subversion-svn.wikimedia.org-mediawiki
+subversion-svn.wikimedia.org-trunk
+swagger-js
+swagger-router
+swagger-ui
+SWStepSlider
+template-expression-compiler
-testing-access-wrapper
+testreduce
+testrepo
+texvcjs
+thumbor
+thumbor-base-engine
+thumbor-conditional-sharpen
+thumbor-djvu-engine
+thumbor-exif-optimizer
+thumbor-ghostscript-engine
+thumbor-multi-handler
+thumbor-page
+thumbor-proxy-engine
+thumbor-proxy-loader
+thumbor-purger
+thumbor-request-storage
+thumbor-result-storage
+thumbor-svg-engine
+thumbor-tiff-engine
+thumbor-video-engine
+thumbor-video-loader
+thumbor-vips-engine
+thumbor-xcf-engine
+Timestamp
+tool-gridengine-status
+TSMessages
+TUSafariActivity
+twcs
+Tweaks
+umapi_client
+user_metrics
+uxprototypes
+varnishkafka
+ve-dirtydiffbot
+ve-needcheck-reporter-bot
-VisualEditor-VisualEditor
+visualmetrics-docker
+WaitConditionLoop
+web-html-stream
+web-stream-util
+Wiki-Class
+WikiFont
+wikimedia-analytics-wikimetrics-deploy
-wikimedia-communications-WP-Victor
-wikimedia-fundraising-civicrm
+wikimedia-fundraising-dashboard
-wikimedia-fundraising-process-control
+wikimedia-IPSet
+wikimedia-logo
-wikimedia-portals
-wikimedia-textcat-demo
+wikimedia-thumbor-djvu-engine
+wikimedia-thumbor-ghostscript-engine
+wikimedia-thumbor-multi-handler
+wikimedia-thumbor-page
+wikimedia-thumbor-proxy-loader
+wikimedia-thumbor-request-storage
+wikimedia-thumbor-tiff-engine
+wikimedia-thumbor-video-loader
+wikimedia-thumbor-xcf-engine
+wikimedia-whatcanidoforwikipedia
-wikimedia-WikimediaShopTools
-wikimedia-wlm-api
+wikimedia.github.io
+wikimediablog-wordpresscom
+WikimediaUI-base
+WikimediaUI-Style-Guide
-wikipedia-gadgets
+wikipedia-ios
+wikipedia-iphone
+WikipediaMobileJ2ME
+WikisourceMobile
-winter
+WiktionaryMobile
+WLMMobile
+wpt-reporter
+www.wikipedia.org
+xhprof
+YapDatabase
+zotero
$:acko\>
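
The list comparison steps above can be sketched as one function. Note that this version uses comm rather than the diff/egrep pipeline that produced the output above, so the "-" and "+" lines come out as two sorted groups instead of interleaved. The function name compare_repo_lists is made up for illustration.

```shell
# Normalise both lists as described in the steps above and print the
# differences: "-name" = in Gerrit only, "+name" = in GitHub only.
# $1 = Gerrit project list, one per line (e.g. "mediawiki/core")
# $2 = GitHub full names, one per line (e.g. "wikimedia/mediawiki-core")
compare_repo_lists() {
    gerrit_sorted=$(mktemp) || return 1
    github_sorted=$(mktemp) || return 1
    # Gerrit: replace every "/" by "-".
    sed 's|/|-|g' "$1" | LC_ALL=C sort -u > "$gerrit_sorted"
    # GitHub: strip the leading "wikimedia/" organisation prefix.
    sed 's|^wikimedia/||' "$2" | LC_ALL=C sort -u > "$github_sorted"
    # comm -23 keeps lines unique to the first file, comm -13 to the second.
    LC_ALL=C comm -23 "$gerrit_sorted" "$github_sorted" | sed 's/^/-/'
    LC_ALL=C comm -13 "$gerrit_sorted" "$github_sorted" | sed 's/^/+/'
    rm -f "$gerrit_sorted" "$github_sorted"
}

# Hypothetical usage:
#   compare_repo_lists gerrit.list github.list
```

Sorting and comparing under the same C locale avoids comm complaining about input order. As the output above shows, many mirrors do not follow the plain "/" to "-" mapping, so a name-based comparison like this can only be a first pass.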

In other words, I think putting time on this task before having Featured Projects is not a good use of time.

True that. My curiosity was too strong though. :)

"Not a good use of time" is not an appropriate expression for this task. My apologies! :)

I just wanted to clarify how much we need this answer and under which circumstances. Regardless, I am reading with curiosity too. It is an interesting problem.

For my records, Mukunda pointed out that mirrored Gerrit projects are listed on https://phabricator.wikimedia.org/r/

I'm going to decline this task for the time being ("declined" because of the "how" in the summary; it could also be "resolved" because of the "whether", to which the answer is "no").
While I found out a few interesting things, this task won't move forward with the current setup. See the dependency tasks in T163576#3232491, which would need to be fixed first to get a basic grip here.
This task can always be reopened once it's less of a mess and requires less complex manual work.

Would T109939 really be the easy way here? It seems very fragile. Surely reading the master configuration from Phabricator or gerrit or whatever does the mirroring is superior?

@Tgr: "whatever does the mirroring": If I understand correctly, replication from Gerrit to GitHub is done by a Gerrit "replication" plugin. wikitech:Gerrit implies that the plugin is com.googlesource.gerrit.plugins.replication (upstream code location), which brought me to T109939...
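
As a rough sketch of what "reading the master configuration" could look like: the replication plugin is configured through a replication.config file in git-config syntax, so git itself can list the configured remotes. The function name list_replication_targets is made up, and where the file lives on the Wikimedia Gerrit host (and whether it is readable from outside) is not covered in this thread, so treat this purely as an assumption.

```shell
# List the remote URLs and project patterns configured for the Gerrit
# replication plugin.
# $1 = path to a replication.config file (git-config syntax)
list_replication_targets() {
    git config -f "$1" --get-regexp '^remote\..*\.(url|projects)$'
}

# Hypothetical usage, with a local copy of the file:
#   list_replication_targets replication.config
```

Each output line has the form "remote.<name>.url <value>" or "remote.<name>.projects <pattern>", which could then be matched against the Gerrit project list instead of guessing from GitHub descriptions.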