Create an authoritative and well promoted catalog of Wikimedia tools
Open, MediumPublic

Assigned To
Authored By
Ricordisamoa
Oct 15 2015, 8:29 PM


I can share what I've heard from the mostly-non-technical program organizers I work with about the problems that should be addressed.

Is the existing catalog not well-known / well-promoted?

Yes. First, which existing catalog? The list of tools on tool labs, Hay's tool directory, or Magnus' remix of Hay's directory? From what I know, it seems Hay's is the best known, the remix seems best liked, but most people are not aware of any of them. It hurts to see people manually count/collect/track things because they are unaware that tools to do so already exist.

Does it have too little information about its entries?

Yes. I think part of the problem is that many tools have unanticipated uses by users unknown to the tool creator. For example, GLAMify is useful for GLAM campaigns, for follow ups to photo competitions such as Wiki Loves Earth, and for article improvement competitions. Wikimetrics is useful for counting the number of pages created at an event, but also for measuring the survival of a cohort of editors.
Another part is that one person may say a tool is good for judging writing competitions and another may be looking for tools good for scoring editing contests, and the description and the search will pass like ships in the night.

Does it not include many tools people are looking for?

Yes. There are plenty of useful tools that aren't included on any of the existing lists. Some of these are the tools people are looking for, but I suspect there are more that are useful that people are not looking for, since the people do not know those tools exist.

I would like:

  • A way to add keywords, and improve the descriptions, for each tool. [problem#1]
  • Either a semi-formal ontology, or at least some sort of guidance/suggestions with keywords, to reduce some of the problems like plurals and synonyms. [problem#1]
  • A way to add tools myself, without having to sign-up for x different repositories/trackers.
  • A way to add/show screenshots, for us visual-thinkers, who don't remember what the tool was called, but do remember what it looks like. [Perhaps, via Extension:PageImages accessing any existing screenshots (which we'd encourage/add) in the primary documentation pages at mediawiki/wikitech?]
  • The ability to add tools that are not hosted on our infrastructure. But clearly marking them as "off-wikimedia". [to avoid privacy confusions]
  • The ability to add links to documentation about historic/defunct tools, so that people can at least learn about them. But clearly marking them as "not available". [cf. some great items in Atlasowa's page, linked in the task description.]
  • A link pointing to the sourcecode for each, and perhaps something indicating what license it is under.
  • A link pointing to a feedback page for each, to encourage wikilove from users.
  • A link to where we (especially non-developers) can help with UI translation for each, if the tool is configured for that.
  • A pony. [tradition]

_
If we include Gadgets, it gets more complicated, because many are inherently limited to particular wikis (some by design, some by accident). Plus Gadgets as a whole will hopefully be further complexified soon, with "Global Gadgets". (Plus I'm still rooting for my proposed design upgrade.) But it would be nice to have them searchable, too.

_
n.b. I thought this page [that I stumbled upon] was starting to work out some of that, but I haven't dared experiment with it yet... https://wikitech.wikimedia.org/wiki/User:Magnus_Manske/hay_directory - or maybe that's for the remix directory?

_
problem#1:

  • e.g.1. From a search, there are 39 tools in Hay's directory that have "category", but only 22 have "categories" - if I type "categor" I get the union, 44.
  • e.g.2. Similar problems exist for "image", "images", "file", "files", "media", "multimedia", etc.
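
One way to picture the "guidance with keywords" idea for problem#1 is a small normalization step applied before keywords are indexed. The following is only a minimal sketch: the plural rule is deliberately crude, and the `SYNONYMS` table and the tool names are invented for illustration, not taken from any existing directory.

```python
from collections import defaultdict

# Hypothetical synonym table; a real catalog would curate this by hand.
SYNONYMS = {"file": "media", "image": "media", "multimedia": "media"}


def normalize(keyword: str) -> str:
    """Lowercase, strip a simple English plural, then map synonyms."""
    word = keyword.strip().lower()
    if word.endswith("ies") and len(word) > 4:
        word = word[:-3] + "y"  # categories -> category
    elif word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        word = word[:-1]  # images -> image
    return SYNONYMS.get(word, word)


def build_index(tools: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each normalized keyword to the set of tool names carrying it."""
    index = defaultdict(set)
    for name, keywords in tools.items():
        for keyword in keywords:
            index[normalize(keyword)].add(name)
    return dict(index)


# Made-up tool names, for illustration only.
tools = {
    "tool-a": ["category"],
    "tool-b": ["Categories", "images"],
    "tool-c": ["file"],
}
index = build_index(tools)
```

With this, a search for "category" finds both tool-a and tool-b, and "image", "file" and "media" all land in the same bucket. A naive suffix rule like this will mangle some words, which is one argument for a curated vocabulary rather than pure string tricks.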

This class of problem is the sort of thing that Semantic MediaWiki, Wikibase, and Cargo seek to solve. All of those are potential solutions, but not likely to be quickly adopted on the Wikimedia production cluster.

But SMW is already installed on Wikitech; isn't that where tools are meant to be documented?

If the idea is that every tool (whether hosted on Tool Labs, or elsewhere) has a page on Wikitech (which seems a sensible idea to me), then would it not be reasonably straightforward to just use Semantic MediaWiki and PageForms to build a system of browsing and editing tools' metadata? It seems to me that the building blocks for such a thing are already in place.

T53642: Get rid of SemanticMediaWiki/SRF/SF from wikitech.wikimedia.org is something that we have been working towards over the last 6-9 months and actually expect to complete in the coming fiscal year. One reason for this is mentioned in T62886#2909198: modern SMW versions are dependent on Composer in a way that is not trivial to deploy on the Wikimedia production cluster.

I wonder whether the progress on Toolsadmin now supports creating and documenting tools and https://toolsadmin.wikimedia.org/tools/ are steps in the direction of "an authoritative and well promoted catalog of Wikimedia tools" or a separate development.

They are at least steps towards making it easier for things on Toolforge to be documented and for that documentation to be both collaboratively edited and shared. The sharing is currently done with https://toolsadmin.wikimedia.org/tools/toolinfo/v1/toolinfo.json which is only a subset of the data that Striker can collect now, but a good start.
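
For readers who have not looked at a toolinfo.json feed, here is a rough sketch of how a consumer might group records by keyword. The field names (`name`, `title`, `description`, `url`, `keywords`) and the comma-separated `keywords` string are assumptions about the format, and the two sample records are invented; check the real feed before relying on any of this.

```python
import json

# Invented sample records; field names are assumed, not confirmed by this task.
SAMPLE = """
[
  {"name": "example-counter", "title": "Example Counter",
   "description": "Counts pages created at an event.",
   "url": "https://example.org/counter",
   "keywords": "statistics, events"},
  {"name": "example-uploader", "title": "Example Uploader",
   "description": "Uploads media in bulk.",
   "url": "https://example.org/uploader",
   "keywords": "upload, media, events"}
]
"""


def tools_by_keyword(raw: str) -> dict[str, list[str]]:
    """Group tool names by keyword from a toolinfo-style JSON document."""
    records = json.loads(raw)
    if isinstance(records, dict):
        # Accept a single record as well as a list of records.
        records = [records]
    result: dict[str, list[str]] = {}
    for record in records:
        # Assumption: keywords are stored as one comma-separated string.
        for keyword in record.get("keywords", "").split(","):
            keyword = keyword.strip().lower()
            if keyword:
                result.setdefault(keyword, []).append(record["name"])
    return result


by_keyword = tools_by_keyword(SAMPLE)
```

The sample is parsed from an inline string so the sketch runs without network access; a real consumer would fetch the published feed instead.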

I volunteer to help with any kind of content needs for this project, whether it's creating descriptions, keywords, a taxonomy, etc.

I worked on this at my previous job, where we used .yml files for each repo to keep track of information:

https://github.com/openopps/openopps-platform/blob/dev/.about.yml - example
Example of tool directory: http://brigade.codeforamerica.org/brigade/projects
Example of tool directory: https://18f.gsa.gov/what-we-deliver/

Cool directories for external collections that @MelodyKramer showed me:

The interesting thing about both of these is that they appear to be curated, in that there is a use-case-driven organization, which is a bit different from the freeform tags of toolinfo.json.

I would be happy to assist in some attempt at this problem, but I do not have the free time to actually commit to doing anything close to the majority of the work.


This wish is still missing a reasonable description of what use cases need to be solved and in what order. As it stands now there is a very large problem space that has a few partial solutions, but no clear description of what would be better. That in my mind is the first part of the problem that needs to be tackled.

  • Hay's Directory solves a problem: it allows the developer of a tool to maintain a standardized collection of metadata that can be aggregated by a central system and displayed to others.
  • The toolinfo system in toolsadmin (Striker) solves a related problem: it allows the technical community to collaborate on adding tags and updating the description of a toolinfo.json record for a given tool hosted on Toolforge if the tool's maintainers have made an initial record.
  • Magnus' hay directory user page solves a similar problem to the one solved by toolsadmin: it allows the technical community (users with accounts on Wikitech) to add and edit toolinfo.json metadata in a central wiki page. This allows collaborative editing, but with a user interface that is somewhat lacking.
  • Neither toolsadmin nor Magnus' page helps typical Wikimedia users who do not have a technical contributor account participate in creating or curating toolinfo records.
  • None of these solutions enforce a common taxonomy for tools.
  • None of these solutions are particularly good at answering human questions like "How can I do X?" or "What is the most powerful Y?"

Many good points have been made thus far about why a better system would be nice. There are also some well informed opinions here about how making Yet Another Thing to solve the problem is likely to fail without a larger social component. This does not seem to me like the kind of problem that can be solved purely with software. It's just as much a culture and time problem. Tool creators/maintainers need to want to advertise. Tool users need to want to find new/different solutions for their workflows. Everyone needs to want to keep the information up to date.

I recently listed and researched GLAM-, Commons- and Wikidata-oriented tools in the context of SDC General (see T180197: [Epic] Support needed changes to volunteer tools for Wikimedia Commons and Wikidata that will benefit from operating with structured data on Commons). FWIW, I also categorized these myself in order to be able to group them better by functionality and their place in general workflows on Commons and Wikidata. Categories I outlined are:

  • get source media / metadata
  • source data cleaning
  • matching with Wikidata
  • media upload
  • data upload
  • "enhance - categorization"
  • admin / moderation
  • curation / organization
  • bulk / quick editing
  • generate attribution
  • statistics (Commons)
  • statistics (Wikidata)
  • reuse / visualization
  • search

You can explore the entire tool spreadsheet here; might be helpful. https://docs.google.com/spreadsheets/d/1GVR0jghBWuAGqJaT7KVXigMYWWNzdnrnwI9nWqfJrCo/edit#gid=0

@SandraF_WMF , the spreadsheet is not "shared". I'm assuming you meant to turn on public visibility for that spreadsheet.

Thanks for the heads up! I have made the spreadsheet publicly accessible now.

Really nice, thanks for sharing.

@Abit You may want to share the Sheet you created a few years back!

How have I not shared that here yet? This list is incomplete and mostly out of date, but I was interested in tools that program organizers used to manage, track, and measure their programs. It is here: https://docs.google.com/spreadsheets/d/1iUvZZStf8k6RYdJYl2DLIxRdtCgWSYzEay5cCQg8MyE/edit?usp=sharing

Ha, came here to add James :)

xSavitar updated the task description.
xSavitar edited projects, added Cloud-Services; removed Tools.

How the heck did cloud service remove itself by me trying to improve on the text in the task? :(. Adding it back.

Hmmm... This is weird, now the "Tools" project tag has been removed :(, why is this happening? Is it that the "Tools" tag and the "Cloud-Services" tag can't be on the same ticket? I didn't deliberately remove these tags, I just tried editing the task to improve on it and then they got removed on their own? :(

<threadjack>

In T115650#3909826, @D3r1ck01 wrote:

How the heck did cloud service remove itself by me trying to improve on the text in the task? :(. Adding it back.

This is a Phabricator "feature" that is not obvious at all, but makes some sense once it is explained. The Cloud-Services project is an umbrella project with things like Toolforge, Cloud-VPS, and Tools as sub projects. This nesting can go down multiple additional levels (e.g. Cloud-VPS (Quota-requests) is a child of the Cloud-VPS project). When a child project is on a task it shows up in the search results (and workboards if the child project is also a milestone project) for all of the parent and grandparent projects as well. Phabricator only shows the most deeply nested child project on the task itself.
</threadjack>
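
The display rule described in the threadjack can be sketched roughly as follows. This is not Phabricator's actual code: the `PARENT` table is a simplified, hypothetical model built from the project names mentioned above, and the function names are made up for illustration.

```python
# Hypothetical child -> parent table, using the projects named in the comment.
PARENT = {
    "Tools": "Cloud-Services",
    "Toolforge": "Cloud-Services",
    "Cloud-VPS": "Cloud-Services",
    "Cloud-VPS (Quota-requests)": "Cloud-VPS",
}


def ancestors(project: str) -> list[str]:
    """Return the parent, grandparent, ... of a project, nearest first."""
    found = []
    while project in PARENT:
        project = PARENT[project]
        found.append(project)
    return found


def displayed_tags(tags: list[str]) -> list[str]:
    """Keep only tags that are not an ancestor of another tag on the task."""
    hidden = {ancestor for tag in tags for ancestor in ancestors(tag)}
    return [tag for tag in tags if tag not in hidden]


def matches_search(tags: list[str], project: str) -> bool:
    """A task is found under a project if it carries it or any descendant."""
    return any(project == tag or project in ancestors(tag) for tag in tags)
```

Under this model a task tagged with both Cloud-Services and Tools would display only Tools, yet still turn up in a search for Cloud-Services.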

Harej raised the priority of this task from Low to Medium. Feb 7 2018, 3:18 AM

Hey everyone, there's a page on Meta about Toolhub, https://meta.wikimedia.org/wiki/Toolhub

Of note, we have published a data model here: https://meta.wikimedia.org/wiki/Toolhub/Data_model. The data model is the list of different ways to describe each tool. Can you think of more ways tools can be described? What pieces of information help let you know that you've found the tool you're looking for?

Do you have a "Collect feedback around the data model page" kind of task? That'd be easier to point to people, add to Tech News, etc. Thanks.

T186382 I think would be the most relevant task.

What's up? (@Harej)

This project stalled out due to lack of software engineers to work on implementation. The Cloud Services team asked for Software Engineers in both the fiscal year 2017-2018 and 2018-2019 Wikimedia Foundation annual planning cycles, but did not win the "requisition number lottery" either time. I will be asking again in the 2019-2020 annual planning process for new staff to help build projects like this one and to take care of existing Cloud Services software projects that also have no assigned staff like Quarry and PAWS. Maybe the 3rd time I will find a way to be more persuasive. :)

I see, I wish you luck then :)

Quiddity removed a subscriber: MelodyKramer.

I am (slowly) getting work restarted on this project. Building a "minimum viable product" version of the Toolhub catalog is a fiscal year 2020-2021 goal for the Wikimedia Foundation's Technical Engagement team. I will be acting as the project lead with help from @srishakatux and others. I plan on posting periodic project updates at https://meta.wikimedia.org/wiki/Toolhub as work progresses.