Improve the per-programming-language listings for our tools
Closed, Resolved (Public)

Description

Newcomers often ask "What can I work on in language foo?". We should have an easier way (or ways) for them to find out.

Most of our extensions are in PHP/JavaScript, so it's not worth adding anything specifically in the {{extension}} infoboxes.

Instead, we should go through the other tools, and make sure they are listed in the existing (or new) per-language sections and pages.
E.g. mw:Manual:Pywikibot should be linked from mw:Python or mw:API:Client_code or similar.

Potential targets:

Likely sources of info:

Aims:

  • Minimize outdated-ness
    • Minimize redundancy (so there are only 1 or 2 locations to manually check/update, and to send newcomers to)
    • Maximize automation (using openhub or github listings)
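
As a rough illustration of the "maximize automation" aim, a per-language listing could be derived from tool metadata rather than maintained by hand. The sketch below is only an assumption about how that might look: the record layout and the `group_by_language` helper are hypothetical, and the two sample entries reuse tools named elsewhere in this task instead of data pulled from openhub or github.

```python
from collections import defaultdict

# Illustrative records only; a real listing would come from an external
# source such as a repository index. The field names are assumptions.
TOOLS = [
    {"name": "Pywikibot", "languages": ["Python"]},
    {"name": "Huggle", "languages": ["C++", "Python"]},
]


def group_by_language(tools):
    """Map each language to the sorted names of the tools that use it."""
    listing = defaultdict(list)
    for tool in tools:
        for language in tool["languages"]:
            listing[language].append(tool["name"])
    # Sort languages and tool names so the generated listing is stable.
    return {language: sorted(names) for language, names in sorted(listing.items())}


if __name__ == "__main__":
    for language, names in group_by_language(TOOLS).items():
        print(f"{language}: {', '.join(names)}")
```

A tool written in several languages appears once under each of them, so a newcomer can find it from any language they know.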

Event Timeline

I was looking at that, but it seems to be:

However, possibly updating that site, and just providing a few onwiki links to quick searches like the above (for various languages), would be the best way to resolve this task? Definitely an option. :-)

What about offering a list of categories to be added in relevant pages?

For instance [[Category:Python]] would be featured at https://www.mediawiki.org/wiki/Python. (I believe it is still not possible to transclude the list of pages in a category, or is it?)
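
Separately from the transclusion question, the members of a category can be listed from outside the wiki through the MediaWiki Action API (`list=categorymembers`). The sketch below only builds the request URL; the `category_members_url` helper is hypothetical, and whether a given category such as Category:Python exists or is populated on mediawiki.org is an assumption.

```python
from urllib.parse import urlencode

# Action API endpoint for mediawiki.org.
API_ENDPOINT = "https://www.mediawiki.org/w/api.php"


def category_members_url(category, limit=50):
    """Build an Action API URL that lists the pages in the given category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmlimit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(category_members_url("Python"))
```

Fetching that URL and reading the `query.categorymembers` entries of the JSON response would give the page titles for a per-language list.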

  • slightly inaccurate (e.g. it lists Huggle as "Visual Basic", but Huggle has been in C++ and Python for a while),

Anyone can fix the repositories: https://www.openhub.net/p/huggle/enlistments

RobLa-WMF added subscribers: brion, tstarling.

I'm adding this to our queue in TechCom-RFC, since @tstarling, @brion, and I agreed it would be a good conversation for us to have, and we think it /might/ be an appropriate conversation to have at E224.

@Quiddity - would you be available for this week's ArchCom RFC office hour (E226)? We'd love to use this week as an opportunity to take a very informal survey of:

  1. What languages are we using?
  2. What languages should we use?
  3. What languages should we avoid?

That said, it'd be "informal"; we won't try to agree on any answers to those questions. The main goal would be to use this as an opportunity to identify which languages need clearer documentation, and hopefully help you prioritize your work in the coming quarter. Would this be helpful?

It'd also be nice to distinguish a few things:

  • are we talking about source languages, or runtime environments?
  • is a main aim "accessibility to new informal contributors from the internet", or "accessibility for debugging/patching for the Wikimedia Foundation ops staff in case of production meltdown" or something else?
  • are there any particular issues with particular languages/runtimes with regard to safety, reliability, performance, memory footprint, etc., or are we mainly concerned with the above issues?
  • are there any distinctions based on how a tool or service is used? how critical it is to production; how it gets started/managed, how it gets maintained?

Two major examples recently:

There was a discussion on the adoption of a Java-based service (Html5Depurate) for production use, which would replace the C-based tidy utility (not sure if we shell out or use a PHP extension now). Lots of concerns about memory usage, performance, and the ability of ops to debug & manage the service came up, as well as long-term questions of how we should manage services.

There was also a discussion on the adoption of a Rust-based utility for converting from an older browser automation protocol to a newer one for Firefox browser testing. While Rust compiles to native code with no external runtime requirements, the packaging was a major concern, as the default build uses the Cargo dependency manager, which is kind of like Rust's version of npm in that it pulls stuff from the internet to build by default.

I didn't pay much attention to the details of either conversation, so I am not sure of the outcomes, but it's probably worth looking at those conversations in more detail and involving the ops folks in any such discussion that affects what we run in production.

@RobLa-WMF yes, I can be there. However I think we're mixing 2 different issues...
Re: the discussions about WMF official support-levels for various languages/etc, that should probably be a different task? I was just mentioning this task as a tangent, in the ops thread.

@brion This task is intended to be about "accessibility to new informal contributors from the internet" - I want to help newcomers more easily find projects they can help with, based on their preexisting knowledge of a programming-language.

Thanks -- my inclination is to decouple this from RfC then, if this is meant to be a survey of available projects for contributors.

@RobLa-WMF yes, I can be there.

Wonderful! Let's make that the plan, then!

However I think we're mixing 2 different issues...
Re: the discussions about WMF official support-levels for various languages/etc, that should probably be a different task? I was just mentioning this task as a tangent, in the ops thread.

Yeah, I admit to seizing on this task as an opportunity to continue that other conversation. As you pointed out in that thread, having good documentation of the status quo is an important prerequisite to then answering the questions in T136866#2428055 and T136866#2430698

So, let's focus tomorrow's conversation on helping newcomers identify "What can I work on in language foo" or "I'm really great with language foo, bar, and baz; anything for me to help out with?" @Quiddity, will that help you with this task?

Qgil triaged this task as Low priority. Jul 5 2016, 9:05 PM

Assuming Low priority, better than no priority. Please adjust if needed.

RobLa-WMF raised the priority of this task from Low to Medium. Jul 5 2016, 10:26 PM

Given that getting this documented is an important prerequisite to then answering the questions in T136866#2428055 and T136866#2430698, I'd like to raise the priority to at least normal.

Are only strictly programming languages supposed to be listed?
I would be in favour of listing all computer languages we use, which would also include

  • CSS
    • CSS preprocessor
  • HTML
  • XML
  • SQL
  • YAML
  • Puppet uses its own language, IIANM
  • ...

Also, for languages where we use well-known frameworks, I would list those frameworks as a sublist. The same goes for templating languages.

In a nutshell, we should have all the languages we use listed on some kind of index page, with a subpage of details for each. The amount of detail would vary per language, based on its needs...

The general motivation for my involvement in this was a debate on the operations list about which programming languages should be supported on our production cluster. That said, that's not the only question people wanting to be involved will likely encounter. More inline:

Are only strictly programming languages supposed to be listed?

On the Programming languages page, yes. That doesn't preclude anyone from creating a different page with a more exhaustive list of computer technologies with dedicated grammars. Both pages would likely be useful.

I would be in favour of listing of all computer languages we use, which would also include [CSS, HTML, XML, SQL, Puppet, etc]

Where would you recommend putting this list?

So, let's focus tomorrow's conversation on helping newcomers identify "What can I work on in language foo" or "I'm really great with language foo, bar, and baz; anything for me to help out with?" @Quiddity, will that help you with this task?

Sounds good.

"To add to Quiddity's observation from yesterday: "So, let's focus tomorrow's conversation on helping newcomers identify "What can I work on in language foo"" I'd explore how and in what programming languages ... and especially in Wikidata/Wikibase" re today's Wikimedia-Office hour at 14:30

Outcomes of the meeting, from my perspective:

  • I will experiment with some large tables/lists, to see what layout and details make the most sense. Probably at https://www.mediawiki.org/wiki/Programming_languages. I'll update this task with a request for feedback, in a few weeks, once those are ready.
  • I will examine the other existing pages, and try to add some (existing (preferred) or new) navboxes to aid in their discovery.
  • I will do some minor updating of existing pages.
  • I will contemplate categorization, as an alternative solution, if the table idea appears impractical, or too hard to maintain or too hard to read.
  • I will not focus on openhub/github for now, but might provide search links. Will re-evaluate this, later on.

(and replace I with I/we as desired. :)

We had a great conversation on this topic in E226: ArchCom RFC Meeting W27: per-programming-language listings for our tools (2016-07-06, #wikimedia-office) The full log of that conversation: P3350.

We spent the first part of the conversation talking about where the entry points are, and some of the offsite tools that make it possible to see which programming languages are in use in the MediaWiki code base. @Legoktm suggested "so I think the first step would be taking the list at https://www.mediawiki.org/wiki/Upstream_projects#Invented_Here and others and sorting them by programming languages into a table or something?". @Smalyshev suggested using categories or something resistant to neglect.

I then asked "if some developer (WMF included) wanted to deploy an extension involving non-PHP code, what would they need to consider?", which I then clarified as "server side extension". The initial reaction was that MediaWiki extensions are PHP, but we started exploring deployment deviations. The conversation veered into the complexity of making Puppet manifests for things to deploy. I whined about mw:Puppet being a dead link (thanks @Quiddity for fixing that!). We then talked a little bit about the distinction between wikitech.wikimedia.org and mediawiki.org, and wrapped up with an aside about "contributors" being more than just "people with Gerrit commit access".

RobLa-WMF moved this task from Inbox to Watching on the TechCom board.

Qgil moved this task from Backlog to July on the Developer-Relations (Jul-Sep-2016) board.

@Quiddity: Would you like to move this task to September on the workboard? Or punt to Oct-Dec?

What needs to happen to consider this Developer-Advocacy quarterly goal done? Is this list still good, and has there been any progress?

Outcomes of the meeting, from my perspective:

  • I will experiment with some large tables/lists, to see what layout and details make the most sense. Probably at https://www.mediawiki.org/wiki/Programming_languages. I'll update this task with a request for feedback, in a few weeks, once those are ready.
  • I will examine the other existing pages, and try to add some (existing (preferred) or new) navboxes to aid in their discovery.
  • I will do some minor updating of existing pages.
  • I will contemplate categorization, as an alternative solution, if the table idea appears impractical, or too hard to maintain or too hard to read.
  • I will not focus on openhub/github for now, but might provide search links. Will re-evaluate this, later on.

(and replace I with I/we as desired. :)

What needs to happen [...]

I need to clone myself.

:)

The idea behind identifying some tasks as individual quarterly goals is to set a higher priority and a timeline for them. I know theory and practice frequently collide, but if I can help by stopping other work that Can Wait a couple of weeks, let me know.

I wonder whether @srishakatux could help here at some point, since the goal of this task is ultimately to help newcomers find a task or project to work on, based on their skills.

Summary: we could reactivate this task by connecting it to Onboarding New Developers and the Technical Collaboration Guidance.

As we have seen, this is a huge task. In order to have some tangible progress, I propose to start addressing the parts that can help our upcoming Onboarding New Developers program. Also our work on the Technical Collaboration Guidance. How?

  • I have suggested the need to define in more detail the expectations for project information pages. One possibility is that those project information pages would feature an infobox (one that we designed somewhere; I need to find it) with a field for the programming languages involved, which would add the corresponding categories to the page.
  • Featured Projects (a new concept brought by the Onboarding New Developers program) could be required to follow these expectations for their landing page. Because they are looking for new contributors, specifying the programming languages involved is essential. If we do it in the right way, then contributors looking for, e.g., Python projects would be able to get a list of the Featured Projects available.

This would allow us to start somewhere with a clear short term output: if you want to become a Featured Project, you need to specify the programming languages of your project.
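
To make the infobox idea concrete: the field for programming languages would effectively be mapped to one category link per language. A real infobox would do this in wikitext or a template module, so the Python below is only a sketch of the logic, and the `category_links` helper and the "Category:<language>" naming are assumptions.

```python
def category_links(languages):
    """Return one [[Category:...]] link per distinct language, in input order."""
    seen = []
    for language in languages:
        name = language.strip()
        # Skip blank entries and repeats so each category is added once.
        if name and name not in seen:
            seen.append(name)
    return "\n".join(f"[[Category:{name}]]" for name in seen)


if __name__ == "__main__":
    print(category_links(["Python", "JavaScript"]))
```

With categories generated this way, a project page tagged with Python would show up in the Python category without anyone editing a separate list.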

Unassigning myself for now, partially because the scope increased beyond what I predicted, and possibly in sensible ways. If nobody else picks it up in the meantime, I'll plan to revisit it in January (Q3) or April (Q4).

Qgil lowered the priority of this task from Medium to Low. Sep 19 2017, 9:13 AM

Toolhub toolinfo annotations and the evolving set of categories for browsing tools in Toolhub will also help in this area. Programming language is one of the proposed attributes to be included in the taxonomy, but in Toolhub, users can also curate their own lists of tools based on whatever criteria they want. See: T195681, T308030.

TBurmeister claimed this task.

I'm going to boldly close this due to the following completed work and evolution of the situation:

  1. People can browse repos by programming language in GitHub.
  2. Both the Developer Portal and mw:New_developers offer a curated list of projects that can welcome new contributors, by programming language or with programming language indicated in the project summary
  3. Toolhub supports annotations and curated lists, both of which can be used to group tools by programming language, though we decided not to make programming language part of the taxonomy / controlled vocabulary (justification for that is here). Example query for "python".

There is a larger issue of populating the list of projects on pages like mw:New_developers to be more robust, but that is beyond the scope of this task and is covered by other tasks like T312164. The issue mentioned in the original task description of how Extensions are documented is covered in T194714 and related tasks.