[Session] What i've learned from being a Wikimedia tool dev for nine years
Closed, Resolved, Public

Description

  • Title of session:

What i've learned from being a Wikimedia tool dev for nine years

  • Session description:

Depictor, Tools Directory, VizQuery and Structured Search are some of the tools I've built over the last nine years. In this workshop, I will share my experience of developing tools for Wikimedia, discussing the challenges I've faced and the lessons I've learned along the way. This session will also be an opportunity to explore how we can improve the tool development process going forward and to discuss other participants' own experiences and ideas.

Event Timeline

Husky renamed this task from [Session] What i've learned being a Wikimedia tool developer for nine years to [Session] What i've learned from being a Wikimedia tool developer for nine years. (Mar 23 2023, 12:33 PM)
Husky renamed this task from [Session] What i've learned from being a Wikimedia tool developer for nine years to [Session] What i've learned from being a Wikimedia tool dev for nine years.
Husky updated the task description.
Husky moved this task from Backlog to Proposed sessions on the Wikimedia-Hackathon-2023 board.
Husky updated the task description.

Notes from the Etherpad:

What I've learned from being a Wikimedia tool dev for nine years

(I don't know who made these notes, but thank you very much! - HK)
Date & time: Saturday, May 20th at 11:30 am EEST / 8:30 am UTC

Relevant links

haykranen.nl

Participants ~40

Presenter

Hay Kranen (@Husky)

Notes

Two parts to the talk: life before becoming a tool dev, and how he sees it now

First started building tools in 2014, but the story really goes back to 1983 (when he was born)
fascinated with digital technology from an early age (particularly light switches)

first computer: Commodore 64 microcomputer
from age 6 or 7 to 15 wrote programs (showed an example in QBasic to make, load, and save drawings)
pivotal moment in 1998: got internet access (skipped dial-up and went straight to cable)
wrote his first website which was a shining example of the design style of the time
also wrote an anti-mobile phone website, and made a website for his father's art (which is still online! -> https://www.bykr.org/dutchart/)
(remarkable thing about websites: they can stay online forever)

decided to pursue career in web development
made a documentary film about Piet Mondrian and Theo van Doesburg as part of his studies; while doing research he discovered Wikipedia, and found that he could edit it and add information

Reading: "Because Internet - Understanding the New Rules of Language" - Gretchen McCulloch
Gretchen McCulloch's concept of "old internet people": people who saw the internet in its infancy, are still excited about the possibilities, believe in the concept of the internet as a global community

2013: hired as the first "Wikipedian in Residence" at the national library/archives of the Netherlands
https://nl.wikipedia.org/wiki/Gebruiker:Husky#Wikipedian_in_Residence
https://en.wikipedia.org/wiki/Wikipedia:GLAM/National_Library_and_National_Archives_of_the_Netherlands
at this point started to build his first Wikimedia-related tools
noticed quickly that it was difficult to get metrics about what you were doing; decided to build tools for the national library (e.g., a way to see the number of external links to national library sites on Wikimedia sites)

https://toolhub.wikimedia.org/tools/hay-exturl
https://hay.toolforge.org/exturl
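One possible way to get at this kind of metric is the MediaWiki Action API's `list=exturlusage` query, which lists pages containing links to a given URL pattern. The sketch below only builds the request URL; it is not necessarily how the exturl tool works, and the function name, the wiki host and the `kb.nl` domain are illustrative assumptions.

```python
from urllib.parse import urlencode


def exturlusage_url(wiki_host: str, domain: str, limit: int = 50) -> str:
    """Build a MediaWiki API URL that lists pages linking to `domain`.

    Hypothetical helper for illustration; it performs no network request.
    """
    params = {
        "action": "query",
        "list": "exturlusage",  # pages that contain a given external URL
        "euquery": domain,      # the URL pattern to search for
        "eulimit": limit,       # results per request
        "format": "json",
    }
    return f"https://{wiki_host}/w/api.php?{urlencode(params)}"


# Example values only: a wiki host and a library domain to look for.
print(exturlusage_url("nl.wikipedia.org", "kb.nl"))
```

A real count would also need to follow the API's continuation parameters until all result pages have been fetched.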

remade https://www.wikipedia.nl homepage

Created https://hay.toolforge.org/directory/
(See current list of tools at https://toolhub.wikimedia.org/search?author__term=Hay%20Kranen&ordering=-score&page=1&page_size=12)

not every tool was successful

first official tool: KB Permalink (2014)
JS bookmarklet to generate a permalink to Dutch National Library catalogue records

Acropolis theory of tool development:

(digression here to give a brief history of the Parthenon)
he was surprised at the number of times the building has been damaged, destroyed, repurposed, rebuilt
how does that relate to tool development?

    both use iterative development: building on old foundations, building and rebuilding, changing and reusing things

whenever he builds a tool, he doesn't want to start from scratch; he'll take a recent project, copy the foundations and build on that. can also build on other people's work (e.g., VizQuery (https://hay.toolforge.org/vizquery/), a tool that he built on top of an existing library for parsing SPARQL in JS)

another aspect of the Parthenon's history: a history of war and violence
how does that relate?

many of his tools were built due to his own frustration/irritation: a history of irritation and itches to scratch.

example:  his frustration at the problem of "tool discovery". Everyone was using different standards to document the metadata of their tools, and there was no centralized location.

wrote a proposal and a code example; now, Toolhub (https://toolhub.wikimedia.org/) is built on the same .json specification that he developed
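To make the idea of a shared metadata record concrete, here is a minimal sketch of what a toolinfo-style entry could look like, with a check for the fields a directory would need. The set of required fields is an assumption based on my reading of the published toolinfo schema, and the record values are made up; check the schema itself before relying on either.

```python
import json

# Assumed required fields; verify against the actual toolinfo schema.
REQUIRED_FIELDS = ("name", "title", "description", "url")


def missing_fields(record: dict) -> list[str]:
    """Return the assumed-required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Hypothetical record for illustration; not a real tool.
example_record = {
    "name": "example-tool",
    "title": "Example Tool",
    "description": "Shows what a tool metadata record might contain.",
    "url": "https://example.org/tool/",
}

print(json.dumps(example_record, indent=2))
print("missing:", missing_fields(example_record))
```

The point of a shared format is that a crawler can read such a file from any tool's site and index it without knowing anything else about the tool.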

tools are the work of many hands:

Structured Search (https://hay.toolforge.org/sdsearch/) was built in response to requests/nagging from other people and has been translated (i18n) into many languages
Another example:  Depictor (https://hay.toolforge.org/depictor/)  -- has recently passed the "1 million edits made using this tool" mark
Created a set of challenges, tools that encourage people to participate and add information

another way to collaborate: use GitHub to make giving feedback as easy as possible for users (dealing with GitHub is, in his opinion, easier than Phabricator). Whenever he receives a bug report or feature request, he makes sure it's filed as an issue and assigned to himself, so that it can eventually be done (perhaps years in the future)

tools, like the Parthenon, can be damaged by neglect

difficult to keep up to speed with changes in development, best practices

he's also good at starting things, but not necessarily good at finishing them

also a problem with knowledge loss:  forgetting what he was trying to do (to deal with this, writes error messages to his "future self", e.g. "THIS DOES NOT WORK PLEASE FIX")
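A small sketch of the "message to your future self" idea, under the assumption that a slightly longer message than "THIS DOES NOT WORK PLEASE FIX" is worth the extra typing. The function and the unfinished feature are hypothetical; the point is that the error says what is missing and where to pick the work up again.

```python
def parse_export(raw: str) -> list[str]:
    """Split a comma-separated export into trimmed fields.

    Hypothetical helper: semicolon-separated input is deliberately
    left unfinished to show the note-to-future-self pattern.
    """
    if ";" in raw:
        # Note to future self: say what is broken and what the next step is.
        raise NotImplementedError(
            "THIS DOES NOT WORK YET, PLEASE FIX: semicolon-separated "
            "exports are not parsed. Next step: split on ';' here and "
            "add a test with a real export file."
        )
    return [part.strip() for part in raw.split(",")]


print(parse_export("depicts, creator, date"))
```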

dealing with "tool fatigue" (periods of heavy activity followed by lack of activity)

his thoughts in 2023:

he'd thought about discussing "best practices," but after thinking about it isn't sure that there's any such thing for tool builders.  everyone tends to do it their own way.  having guidelines and restrictions would have discouraged him from keeping active

Q&A

Comment: a mono-repo containing all of a developer's tools can make it difficult for other people to contribute. Does he have thoughts on the balance between developer convenience and helping the user?

Optimizing for contributions would mean working more on documentation (which he does for some of his tools), but with WMF tools he has so many running that it's hard to know where to begin. Might be worth separating his most used tools into separate repos. One thing he's tried to do is to keep the tools as loosely coupled as possible. (His older tools were more tightly coupled, but over time he learned that this was a maintenance nightmare.) One small drawback of proper separation is that it requires more duplicated dependency updates.

Comment: Luca is glad (humorously) to see that the Dutch Central Library also had trouble with permalinks (had the same issue with the Italian National Library)

He's learned how important it is to have a set of best practices (e.g., things should work on mobile; descriptions should be simple to understand). But we shouldn't assume that the best practices that we have in the WM community are used widely outside it (e.g., permalinks)

Another thing he learned from the WM community: looks aren't everything. An ugly website can be well constructed and interesting. (Good permalinks, metadata, semantically structured content, accessible, etc.)

Q: How did you become a Wikipedian in Residence?

A: After making his first edit on the Dutch Wikipedia he became more involved with the Dutch Wikipedia community and developed a reputation externally as "the Wikipedia guy". When there was a vacancy he applied, and he feels that his history / connections / network played a role in his selection.

Q/Comment: What is the cause of his "great age of neglect?" (He hasn't done much tool work in the past three years) Why is he more motivated at some times than at others?

A: His periods of greatest activity occurred after events like the hackathon.

Q: Any advice for getting more code contributions from others?

A: The most helpful contributions for him are really good bug reports: reproducible issues, with specific links and details about what went wrong. He hasn't received many useful code contributions from others.

The more you show as a contributor that you're invested in the problem you're trying to solve, the more interested he will be in helping to fix the problem. (The less work that he needs to do, the better.)

Comment from a contributor's POV: The contributor is also trying to figure out how invested the maintainer is in maintaining the tool (also doesn't want to put a lot of effort in if the maintainer is not interested.) Contributor will often start with something small (even fixing a typo), and grow into larger contributions over time. Recommendation: Don't feel obliged to write detailed documentation at the start, but DO create a "how to run this tool" section in the Readme.md or similar, and maybe placeholders where more documentation is wanted (that contributors could help with).

Hay: when he's working on projects outside of WM, always working with a project manager, which helps him maintain a balance between various necessary tasks; when working on a private tool, it's harder to manage time (it all comes down to time)

Q: Has anyone in the room had an external contributor make a significant contribution?

Bryan Davis: has had various levels of external contributions: typo fixes and a lot of similarly small contributions. Much rarer to get major feature additions.

Q: Or minor additions?

BD: Yes, some minor additions. E.g., a team that uses http://tools.wmflabs.org/versions/ made a number of "scratch their own itch" feature additions to make it even better for their needs.
But in general, what helps: version control, READMEs, a clear place for people to report bugs, a place where people can give notes (contributor.doc or something similar). Try to be nice when people show up, even if what they're giving you isn't quite what you want. Try to be open and accepting.

Comment: One thing that would help contributors: an overview of the codebase

Hay: includes a "philosophy" section in his readmes (e.g. see https://hay.github.io/dataknead/#philosophy), and it helps him to see people describe why they did what they did, their approach, etc. Worthwhile to document intentions as well as actions. For instance, explain why you've chosen to use a particular version of a library or framework (or chosen not to).

Bryan: Thanks for Tool Directory, it gave James and I many good bases to build upon. You had good ideas and shared them in great ways.
Hay: Thanks for the way you approached the collaboration. You communicated clearly about what you liked and intended to keep, and what you were considering dropping and why. It made collaboration pleasurable.