Unconference notes:
MediaWiki federation
TechConf 2019
- Tgr: Some people are working on a project for having a wiki cluster on cloud services - can be used to test out new project ideas
- Foundation had an exploratory phase 'til 2008, several new projects got created (Wikiversity, Wiktionary, Wikinews...)
- After 2008 exploration largely stopped
- Experimenting with a project incubator wiki farm
- TGR: hook wikis with central sources!?
- [slide] What are the things a wiki might want to get from a central source?
- InstantCommons and other foreign file repos (and global image usage)
- Wikidata and other uses of a Wikibase client-server relationship
- content sharing
- identity sharing
- imported content (and ExternalUserNames)
- global user prefs
- in the future, maybe cross-wiki page forks per T113004
- maybe global templates and gadgets?
- https://phabricator.wikimedia.org/T216112 "support data sharing in complex networks of MediaWiki wikis"
- TGR: Most of the functionality currently assumes you're in the same DB cluster
- Hoping we can move to a world where we rely mostly on APIs
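To make the "rely mostly on APIs" idea concrete, here is a minimal sketch of reading a page from a central wiki through the Action API instead of a shared database. The parameters used (action=query, prop=revisions, rvslots, formatversion=2) are standard Action API parameters; the helper names and the central.example host are assumptions made up for illustration, and no network request is made here.

```python
import json
from urllib.parse import urlencode


def build_content_query(api_url, title):
    """Build an Action API URL asking for the latest wikitext of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": title,
        "format": "json",
        "formatversion": "2",
    }
    return api_url + "?" + urlencode(params)


def extract_wikitext(response_body):
    """Return the wikitext from a formatversion=2 response, or None if the page is missing."""
    data = json.loads(response_body)
    pages = data.get("query", {}).get("pages", [])
    if not pages or pages[0].get("missing"):
        return None
    revisions = pages[0].get("revisions", [])
    if not revisions:
        return None
    return revisions[0]["slots"]["main"]["content"]


# Hypothetical central wiki endpoint, for illustration only.
url = build_content_query("https://central.example/w/api.php", "Template:Infobox")
```

The response body fetched from that URL (with any HTTP client) would be passed to extract_wikitext; a None result is the signal to fall back to another source.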
Discussion
- BD: This has been a discussion for a long time.
- BD: This has a lot of overlap in general with what Birgit has been calling the small wiki toolkits.
- We have a platform for collaborative knowledge, but on small wikis we do not have rules / governance / workflows. We don't have tools to help patrol, deal with spam etc
- AdamShortland: From the Commons/Wikidata POV: data from Commons are just on Wikidata, yeah! A lot of services are implemented based on a database lookup.
- Once things get updated, how do caches get invalidated? All this work for multi-wiki support needs some APIs that are nice to use
- Wikidata has items and properties, Commons has MediaInfo. How do you get the link? That has to be in Wikidata.
- A use case for a small wiki could be: I want my entire template namespace to come from another wiki. Or even 3 remote places, potentially falling back to a local implementation?
- Piotr: For mobile frontend / mobile web development, we came up with the idea of a content provider - we just point to production
- Works well to retrieve templates, articles etc
- The problem is the frontend - we need to overwrite base URL that Ajax requests go to - CSRF
- Addshore: We have entity sources which are essentially the same thing as content provider
- Piotr: You can use your own database, MCS, or the MW media API
- Proxying: How to do proxying for production RESTBase
- AM: We have 3 different implementations?! Entity sources in Wikibase, Content Provider in Mobile Frontend, and ForeignAPI in MediaWiki core
- AS:
- BD: InstantCommons has 2 functional modes
- AS: mediawiki/core with all the refactoring that happened recently will make it easier to implement a potential federated / cross wiki system.
- tgr: Taking InstantCommons as an example, there is not much of a way to find reuses of content. A file might be deleted for copyright reasons, but nothing would notify the users (other wikis) of that content
- mediawiki/core doesn't really have a concept of file authorship (?)
- Piotr: Would be nice to talk about some general... Domain driven design?
- Tgr: URLs ...
- AS: Versioning of APIs - using the action API would be possible, but a lot more effort if there were nicer APIs.
- BD: You could version the action API
- Piotr: Quite difficult to imagine a 3rd party wiki that would like to act as a proxy
- AS: If global templates existed - you could say that one of my sources for templates is going to be English Wikipedia, for example
- P: If we talk about sourcing data from place to place - do I want to have multiple data sources?
- AS: In terms of Wikibase we definitely see use cases for multiple places
- We should keep ... caching and performance in mind.
- What happens when you have 2 wikipedias with the same page title?
- BD: Search being federated. How would you discover templates/modules/gadgets that are out there somewhere else?
- AS: We're going to have to do that for Wikibase. The moment you have a Wikibase that can use properties from both... Combined on the back end, calls out for one or more other things, reconcile and rank the results...
- AM: Do we want to federate MediaWikis or centralize them?
- Take search for example - search it on several wikis - algorithmically complicated.
- If you had everything in a single wiki.
- AS: Our initial version is gonna be the local versions appear first and then you can choose the remote one.
- AM: You could make the same point for templates for example.
- BD: But now you have one single template
- AS: Not having it centralized makes it possible to differentiate between use cases and select / pick solely the ones you are interested in
- There could be a global template source
- ?: Template transclusion?
- BD: Funding. 2030 strategy sort of level.
- To be the infrastructure for open and the hub of free knowledge
- freeknowledge.wikipedia.org ?!! Could federate all the best content in a central place
- One master URL space
- AS: 3rd party users
- Piotr: All this sourcing of data from other places, is that one-way or both ways? Could we write back to a foreign wiki?!
- AS: Wikidata bridge
- Clients have a copy of API and talk to that
- Tgr: going to wikidata to vandalize pages that include data
- Tgr: This (federated editing?) is largely a social problem.
- notifications / messages to the central user page should (?) be propagated
- global user preferences / federated / global user prefs
- page forking
- Daren: You could imagine having a copy of a wiki on a Raspberry Pi for places without internet. Edits would be made there and the content produced upstreamed later, which requires tools to handle the potential merge conflicts.
- BD: main thing missing in data storage is putting an id for the wiki in everything
- Speed - database access is faster than HTTP
- You'd need caching
- Change propagation
- BD: Without the edge caching we do on English Wikipedia, there's no way we could serve any of our content - worrying about whether it needs to be baked in all the way down would probably keep you from getting anything done
- Tgr: What is stage 1 for this?
- Volunteer work
- Global authentication
- Global userpages
- BD: Writing up an RfC about the conceptual idea of federation
- How do you make it clear to users how to edit something federated from another source?
- How do you see changes that have affected something on your watchlist?
- Oversight / deletion / concerns with content going away
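Several points from the discussion (multiple sources with local results first, a wiki id attached to everything, and caching because HTTP is slower than database access) can be combined in one rough sketch. Everything here is hypothetical: FederatedResolver, the wiki ids and the fetcher callables are illustrative names, not existing MediaWiki code, and the fetchers are plain dictionary lookups standing in for real API calls.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

# A fetcher takes a page title and returns its text, or None if the page is absent.
Fetcher = Callable[[str], Optional[str]]


class FederatedResolver:
    """Hypothetical resolver: tries sources in order, caches hits per (wiki_id, title)."""

    def __init__(self, sources: List[Tuple[str, Fetcher]],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.sources = sources          # ordered: local wiki first, then remote wikis
        self.ttl_seconds = ttl_seconds  # how long a cached remote answer is trusted
        self.clock = clock
        self._cache: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def resolve(self, title: str) -> Optional[Tuple[str, str]]:
        """Return (wiki_id, text) from the first source that has the page, else None."""
        now = self.clock()
        for wiki_id, fetch in self.sources:
            key = (wiki_id, title)      # the wiki id is part of every cache key
            cached = self._cache.get(key)
            if cached is not None and now - cached[0] < self.ttl_seconds:
                return wiki_id, cached[1]
            text = fetch(title)
            if text is not None:        # misses are not cached, so new pages show up at once
                self._cache[key] = (now, text)
                return wiki_id, text
        return None

    def invalidate(self, wiki_id: str, title: str) -> None:
        """Drop one cached entry, e.g. when a change notification arrives."""
        self._cache.pop((wiki_id, title), None)


local_pages = {"Template:Stub": "local stub"}
central_pages = {"Template:Stub": "central stub", "Template:Infobox": "central infobox"}
resolver = FederatedResolver([("localwiki", local_pages.get),
                              ("centralwiki", central_pages.get)])
```

With this ordering, resolver.resolve("Template:Stub") is answered by the local wiki even though the central wiki also has that title, which is one possible answer to the "same page title on two wikis" question. Change propagation is reduced here to an explicit invalidate call; a real system would need some notification channel to trigger it.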
Action items:
- Write the RfC
- BD: "Federated RfC should have federated authorship"
- AS: Wikibase side, we've got a lot of stakeholders