
Review and update GitHub mirror repo descriptions with standard text
Open, Lowest, Public


As of Oct 2015, 111 of our 1604 repositories have no link to the Developer_access page in their description:

('wikipedia-iphone', 'An obsolete version of the Wikipedia iPhone app. Please use the current version: ')
('WikipediaMobile', 'Wikipedia on Mobile (PhoneGap)')
('WiktionaryMobile', 'Wiktionary on Mobile!')
('limn', 'A GUI Visualization Toolkit')
('WLMMobile', '')
('WikipediaMobileJ2ME', 'Wikipedia Mobile for J2ME')
('jquery-tipsy', 'Facebook-style tooltips plugin for jQuery')
('incubator-cordova-js', 'Mirror of Apache Cordova js')
('incubator-cordova-android', 'Mirror of Apache Cordova Android')
('incubator-cordova-ios', 'Mirror of Apache Cordova iOS')
('jquery.i18n', 'jQuery based internationalization library')
('jquery.webfonts', 'jQuery based Webfonts library')
('jquery.ime', 'jQuery based input methods library')
('jquery.uls', 'Universal Language Selector ')
('puppet-cdh', "Puppet module for Hadoop and the rest of Cloudera's CDH 5.")
('limnpy', 'A library for creating [limn]( compatible datasources and datafiles')
('sqoopy', 'Python CLI to generate custom sqoop import statements')
('WikisourceMobile', '')
('puppet-kafka-0.7.2', 'A Puppet module for installing and managing Apache Kafka brokers.')
('android-commons', 'WARNING: MIGRATED TO and/or')
('mediawiki-extensions-SemanticMediaWiki', 'Mirror of SMW')
('WikipediaMobileFirefoxOS', 'Moved to')
('kraken-puppet', 'Puppet Repository for the WMF Kraken Analytics cluster.')
('puppet-storm', 'Puppet module to install and manage Storm.')
('xhprof', 'XHProf: A Hierarchical Profiler for PHP — XHProf is a function-level hierarchical profiler for PHP and has a simple HTML based navigational interface. The raw data collection component is implemented in C (as a PHP extension). The reporting/UI layer is all in PHP. It is capable of reporting function-level inclusive and exclusive wall times, memory usage, CPU times and number of calls for each function. Additionally, it supports ability to compare two runs (hierarchical DIFF reports), or aggregate results from multiple runs.')
('android-mwlogin', 'Android library for doing login / authentication')
('java-mwapi', 'Easy access to Mediawiki API from Java')
('dClass', 'Device Classification Engine')
('limn-deploy', '')
('java-morelangs', 'Port of the core functionality and data of jQuery.IME to Java')
('aosp-morelangs-ime-dictionaries', 'Dictionaries for aosp-morelang-ime')
('aosp-morelangs-ime', 'A fork of the AOSP Keyboard with more languages')
('Commons-iOS', 'Migrated to')
('mediawiki-vagrant', 'A virtual MediaWiki development environment, built on Vagrant, VirtualBox, and Puppet.')
('kraken', 'Wikimedia analytics data services platform.')
('riemann-jmx', 'Utility to send hbase metrics to Riemann')
('integration-docroot', 'Github mirror of integration/docroot - our actual code is hosted at')
('PyBal', 'PyBal is a LVS monitor. It monitors Squid or Apache servers and adapts LVS state based on the results.')
('limn-data', 'Example datafiles, datasources, and graphs that Limn can work with')
('limn-editor-engagement-data', 'Editor Engagement and Features Analysis Data in a format that Limn can understand.')
('metrics', 'Deprecated -- Use')
('user_metrics', 'Wikimedia Foundation E3 Team Analysis Code')
('limn-fundraising-data', '')
('limn-debugging-data', 'limn graphs and dashboards used to troubleshoot various analyses')
('umapi_client', 'Client wrapper for Wikipedia User Metrics API.')
('bingle', 'A tool to get bugs from Bugzilla and import them into Mingle and/or Trello.')
('lcm-dashboard', 'Dashboard and data visualization for Language Coverage Matrix data')
('puppet-zookeeper', 'Install and configures a Zookeeper client and/or server.')
('puppet-jmxtrans', 'A Puppet module for jmxtrans.')
('varnishkafka', 'Varnish log collector with Apache Kafka integration.')
('camus', '')
('puppet-kafka', 'Puppet module for Apache Kafka 0.8.')
('analytics-statsd-ganglia', 'A plug-in backend for StatsD providing support for writing statistics to Ganglia')
('restbase-mod-table-cassandra', 'Cassandra table storage backend for RESTBase')
('mathoid', 'Using MathJax and PhantomJS to create SVGs and MathML server side with minimum overhead.')
('wikipedia-ios', 'The official Wikipedia iOS client.')
('labs-private', '"Private" repo for labs. It\'s not actually private, as you can see.')
('git-fat', 'Simple way to handle fat files without committing them to git, supports synchronization using rsync')
('grunt-banana-checker', 'Grunt checker for the Banana JSON i18n system provided by MediaWiki and jquery.i18n')
('restbase', 'Distributed storage with REST API & dispatcher for backend services')
('rcstream', 'Broadcast activity from MediaWiki wikis using')
('htmldumper', 'HTML dump script for Parsoid HTML')
('routeswitch', 'A regexp switch based, efficient router.')
('MathJax', 'Beautiful math in all browsers')
('MathJax-node', 'Mathjax for Node')
('unicodejs', 'A library for working with the Unicode standard')
('simplei18n', 'No frills internationalization engine for use with PHP projects.')
('wikimediablog-wordpresscom', "Wikimedia blog theme, mirrored from WordPress's SVN")
('preq', 'Yet another promising node-request wrapper')
('git-client-plugin', 'Jenkins git client plugin')
('nagf', 'Not another Graphite frontend.')
('texvcjs', 'A LaTeX validator/translator for TeX strings embedded in wikitext')
('cassandra-codec', 'Cassandra data encoding / decoding, in particular Java BigInteger varints and Decimals')
('node-txstatsd', 'Modified version of for WMF specific txstatsd constraints.')
('dumpgrepper', 'A MediaWiki XML dump grepper tool')
('html-metadata', '')
('composer-merge-plugin', 'Merge one or more additional composer.json files at Composer runtime')
('swagger-router', 'A swagger 2 based router with support for assembling multiple APIs from swagger fragments. Developed in support of RESTBase.')
('ChromeWikimediaDebug', 'Wikimedia debug extension for Chrome')
('arc-lamp', "A set of scripts for getting useful performance data out of HHVM's Xenon extension.")
('FirefoxWikimediaDebug', 'Wikimedia debug add-on for Firefox')
('restbase-mod-table-sqlite', 'SQLite backend for RESTBase')
('swagger-ui', 'Swagger UI is a dependency-free collection of HTML, Javascript, and CSS assets that dynamically generate beautiful documentation from a Swagger-compliant API.')
('service-runner', 'Generic nodejs service supervisor')
('service-template-node', 'Template for creating MediaWiki Services in Node.js')
('utfnormal', 'Unicode normalization functions.')
('Wiki-Class', 'A library for performing automatic detection of assessment classes of Wikipedia article text.')
('service-mobileapp-node', 'Note: Development is happening on Gerrit now. The repo is synced to Sorry, the name is provided automatically by the sync script. This repo will be removed soon.')
('ve-dirtydiffbot', 'Bot to look for dirty diffs by crawling random Wikipedia articles.')
('jquery-client', 'User-agent detection.')
('ifttt', 'Flask web app providing an IFTTT Channel Protocol API for featured content on Wikimedia wikis')
('ansible-deploy', 'Ansible deploy utilities')
('json-stable-stringify', 'deterministic JSON.stringify() with custom sorting to get deterministic hashes from stringified results')
('at-ease', 'A safe alternative to PHP\'s "@" error control operator.')
('AhoCorasick', 'A PHP implementation of the Aho-Corasick string search algorithm')
('ve-needcheck-reporter-bot', 'Reports last 24 hours of ve-needcheck edits to #mediawiki-visualeditor channel')
('cassandra-metrics-collector', 'Cassandra JMX metrics collector')
('wikimedia-logo', '')
('restbase-mod-table-spec', 'Shared spec and tests for RESTBase table storage')
('grunt-tyops', 'A grunt task to check files for typos and fail if any are found')
('', 'A fork of')
('WrappedString', 'Compacting text containing redundant wrappers.')
('phlogiston', 'A script to pull Phabricator task history out of MySQL and denormalize it for reporting')
('swagger-js', 'Javascript library to connect to swagger-enabled APIs via browser or nodejs')
('html5depurate', 'This is an HTTP frontend for the HTML 5 parser. It parses some input text and returns the reserialized HTML.')
('CLDRPluralRuleParser', 'PHP library to parse CLDR plural rule syntax.')
('htcp-purge', 'Node module to purge caches over the HTCP protocol, as implemented by Squid')
('SelfSizingWaterfallCollectionViewLayout', "SelfSizingWaterfallCollectionViewLayout is a UICollectionViewLayout subclass that organizes items of dynamic height into a grid of variable columns (as if you're winning at Tetris upside-down). Designed for use alongside AutoLayout and iOS8 self-sizing cell technologies.")
('community-tech-tools', 'Random scripts and tools used internally by the Community Tech team')
('jscs-preset-wikimedia', 'JSCS preset for Wikimedia.')
('bunyan-syslog-udp', 'A udp-only syslog stream for bunyan')
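
For anyone wanting to regenerate a list like this, here is a minimal sketch of the filtering step. It is not the script that produced the list above; the function name `missing_marker`, the marker string, and the `example-mirror` entry are assumptions made for illustration. It takes `(name, description)` pairs in the same shape as the list and keeps those whose description does not contain the marker.

```python
def missing_marker(repos, marker="Developer_access"):
    """Return the (name, description) pairs whose description lacks `marker`.

    `repos` is an iterable of (name, description) tuples; a description of
    None is treated as empty.
    """
    return [(name, desc) for name, desc in repos if marker not in (desc or "")]


# Two entries copied from the list above, plus one hypothetical entry
# (not a real repository) whose description already contains the marker.
repos = [
    ("limn", "A GUI Visualization Toolkit"),
    ("WLMMobile", ""),
    ("example-mirror", "Mirror of example. See Developer_access to contribute."),
]

for pair in missing_marker(repos):
    print(pair)
```

The pairs themselves would have to come from somewhere, for instance the GitHub API's repository listing for the organization; that fetching step is deliberately left out here.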

Should these be replaced with some standard text? Or should all our GitHub mirrors have similar descriptions (which is more useful for navigation)?

Since early 2017, GitHub repositories can also have topics; adding those to the most important repos would also be nice.

Event Timeline

greg raised the priority of this task from to Medium.
greg updated the task description.
greg added a project: Wikimedia-GitHub.
greg added subscribers: greg, Krenair.

(I already went through a bunch of mediawiki-extensions- and wikimedia- ones which were obviously mirrors. I don't think most of these are mirrors, though.)

demon lowered the priority of this task from Medium to Lowest. (Sep 1 2016, 10:19 PM)
demon added a subscriber: demon.

Isn't this kind of a dupe of T109939 which talks about standardizing them?

Do these all have descriptions in Gerrit, or do we risk overwriting a useful description with a less useful one if we auto-update them?

It would be nice to get stats of how many of these projects actually use GitHub, as opposed to just fooling contributors into believing they are using it. E.g. looks rather sad.

@Tgr: That sounds related to T163576 (if my interpretation of "using it" is similar)?

It would be nice to get stats of how many of these projects actually use GitHub, as opposed to just fooling contributors into believing they are using it. E.g. looks rather sad.

This would be nice. We've always assumed that we should replicate everything to GitHub. Maybe we shouldn't, and should only replicate things that people care about (for a really generous definition of the word care). Would certainly simplify repo management :)
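
A rough sketch of how such stats could be computed, assuming we already have the repository objects returned by the GitHub REST API's repository listing. The fields `open_issues_count` (which counts open pull requests as well as issues) and `forks_count` are part of those objects; treating "any open issue/PR or any fork" as "used" is only one possible interpretation, and the sample data below is made up.

```python
def summarize_usage(repos):
    """Split repos into those with some GitHub-side activity signal and those without.

    `repos` is a list of dicts shaped like GitHub API repository objects.
    The heuristic (open issues/PRs or forks > 0) is an assumption, not an
    agreed definition of "using GitHub".
    """
    used, unused = [], []
    for repo in repos:
        signals = repo.get("open_issues_count", 0) + repo.get("forks_count", 0)
        (used if signals > 0 else unused).append(repo["name"])
    return {"total": len(repos), "used": used, "unused": unused}


# Hypothetical sample data, not real numbers for any repository.
sample = [
    {"name": "repo-a", "open_issues_count": 3, "forks_count": 12},
    {"name": "repo-b", "open_issues_count": 0, "forks_count": 0},
    {"name": "repo-c", "open_issues_count": 0, "forks_count": 1},
]

print(summarize_usage(sample))
```

Pull requests that sit open and unanswered would still count as "used" here, so a stricter measure might look at whether anyone ever responded to them.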

Since early 2017, GitHub repositories can also have topics. It would be nice to add them to our major software projects at least.
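
For reference, a minimal sketch of setting topics through the GitHub REST API, which replaces a repository's topics via `PUT /repos/{owner}/{repo}/topics` with a JSON body of the form `{"names": [...]}`. The helper name `build_topics_request`, the `wikimedia` owner, the example topics, and the token placeholder are assumptions. The code only builds the request and does not send anything.

```python
import json
import urllib.request


def build_topics_request(owner, repo, topics, token="YOUR_TOKEN"):
    """Build (but do not send) a request that replaces a repo's topics."""
    # Topics are lowercase with hyphens instead of spaces, so normalize them.
    names = [t.strip().lower().replace(" ", "-") for t in topics]
    return urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}/topics",
        data=json.dumps({"names": names}).encode("utf-8"),
        method="PUT",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical example: owner and topics are illustrative only.
req = build_topics_request("wikimedia", "restbase", ["REST", "Node JS "])
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually apply it: urllib.request.urlopen(req) with a real token.
```

Note that this call replaces the whole topic list rather than appending to it, so any existing topics would need to be included in `topics`.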

The Gerrit admin page and the Phabricator repo page both utterly suck as landing pages. I think it would be better to use the description as it's normally intended, use the URL to link to the project page on (or leave it empty if there isn't one), and put everything else in CONTRIBUTORS (see T136863#3714110).
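
A sketch of what that could look like against the GitHub REST API, where `PATCH /repos/{owner}/{repo}` accepts `description` and `homepage` fields. The helper name `build_metadata_request`, the owner, the example values, and the token placeholder are assumptions, and sending an empty `homepage` when there is no project page is assumed to clear the field. Nothing is sent; the code only builds the request.

```python
import json
import urllib.request


def build_metadata_request(owner, repo, description, homepage=None, token="YOUR_TOKEN"):
    """Build (but do not send) a request updating a repo's description and URL."""
    payload = {
        "description": description,
        # Empty string when there is no project page (assumed to clear the field).
        "homepage": homepage or "",
    }
    return urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}",
        data=json.dumps(payload).encode("utf-8"),
        method="PATCH",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical example: the description is taken from the list in the task
# description; the owner and the absence of a project page are illustrative.
req = build_metadata_request("wikimedia", "utfnormal", "Unicode normalization functions.")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```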