The overall question for this investigation: Will the work that we could do on this project over the next six months make a version of "structured data on Commons" happen faster?
If the Wikidata team understands and agrees on how to break the problem down, and we can pick up some of the pieces that they don't have time/resources to do, then that's a good situation for us to help with. But if they're still figuring out how to do some of the foundational work, then we might just be getting in their way.
Essentially: is this a project where adding more people makes it go faster, or slower?
Related to the #6 item on the Wishlist Survey: T120451: Allow categories in Commons in all languages
Translating Commons category names will add extra layers of complexity to an already hard-to-use category system. The best way for Community Tech to contribute towards this goal is to support Wikidata's work on structured data on Commons:
T68108: [Epic] Store media information for files on Wikimedia Commons as structured data
T125822: [Epic] Basic first prototype for structured data support for Commons
In March, Lydia gave us the following list of tasks that would support this work. Some of them have already seen some progress, although none of them are closed.
This investigation ticket is to determine: What can our team actually work on, to help make structured data support happen?
Directly helping:
- T89733: Allow ContentHandler to expose structured data to the search engine. (Discovery is currently working on this)
- T76007: [Epic] add ability to link/refer to foreign items and properties (federation) (ability to use items/properties from Wikidata to make statements on other wikis)
- T127929: [Story] Add a new datatype for linking to creators of artwork and more (smart URI)
- T107595: [RFC] Multi-Content Revisions (with Daniel K)
- Thoughts/concepts on integration of query and search in the context of multimedia metadata
Indirectly helping, comment on these RFCs:
- T487: RfC: Associated namespaces
- T114640: make Parser::getTargetLanguage aware of multilingual wikis
- T114662: RFC: Per-language URLs for multilingual wiki pages
Update, Niharika's meeting with Lydia at Wikimania:
The main thing they want is a new type on Wikidata; right now there are only Items and Properties. The new type would store media info. I asked her whether it would be similar to Item, and she said it will include some of the Item properties plus some new ones, which is why it needs to be a new type.
- Ability to use items/properties from Wikidata to make statements on other wikis: T76007
If I upload a picture of a mango tree on Commons, I should be able to pick what kind of tree it is, what color, etc. from Wikidata options (sort of like an auto-complete interface for specific properties the user chooses). If it's something new, the data has to be added to Wikidata first, and can then be used on the other wikis.
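As a rough illustration of the auto-complete idea, the sketch below builds a request against the existing `wbsearchentities` module of the Wikidata API and pulls ID/label pairs out of a decoded response. The function names `build_suggestion_url` and `labels_from_response` are made up for this example, and nothing here reflects how the eventual Commons interface would actually be built.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_suggestion_url(typed_text, language="en", limit=5):
    """Build a wbsearchentities request URL for what the user has typed so far.

    Hypothetical helper: only the API module and its parameters are real.
    """
    params = {
        "action": "wbsearchentities",
        "search": typed_text,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


def labels_from_response(response):
    """Return (id, label) pairs from a decoded wbsearchentities response."""
    return [(hit["id"], hit.get("label", "")) for hit in response.get("search", [])]


if __name__ == "__main__":
    # Prints the URL an auto-complete widget could request while the user types.
    print(build_suggestion_url("mango tree"))
```

The sketch deliberately stops short of making the network call, so it can be read and run without touching the live API.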
- A new Wikibase datatype for smart URIs: T127929
Wikibase (the software that Wikidata runs on) currently supports these data types: https://www.wikidata.org/wiki/Special:ListDatatypes They want support for a new data type: https://phabricator.wikimedia.org/T127929 Basically, we would accept the user's profile link (for a bunch of possible sources) and display only the relevant handle, while retaining the URI link underneath.
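To make the "display the handle, keep the URI" behaviour concrete, here is a minimal sketch. It assumes the handle is simply the last path segment of the profile link, which will not hold for every source; a real datatype would presumably need per-source rules. The function name `display_handle` and the example.org URL are made up for illustration.

```python
from urllib.parse import urlparse


def display_handle(uri):
    """Return (display_text, uri): show only the handle, keep the full link.

    Assumption: the handle is the last non-empty path segment of the URI.
    """
    parsed = urlparse(uri)
    segments = [s for s in parsed.path.split("/") if s]
    # Fall back to showing the full URI when there is no path segment.
    handle = segments[-1] if segments else uri
    return handle, uri


if __name__ == "__main__":
    text, link = display_handle("https://example.org/people/jane_doe")
    print(text)  # jane_doe
    print(link)  # https://example.org/people/jane_doe
```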
- Multi-Content Revisions T107595 (in close collaboration with Daniel)
Need to talk more to Daniel about this one.
- Thoughts/concepts on integration of query and search in the context of multimedia metadata
Ability to run complex searches from the wiki itself. For example:
{{dog:white|male|poodle}}
should turn up all images of dogs with those specifications. The syntax and logistics of this task are still up in the air and possibly dependent on the first task being completed.
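Since the syntax is undecided, the following is only a sketch of how an expression shaped like the example above could be split into a subject and a list of values before being handed to a search backend. The function name `parse_media_query` and the returned dictionary layout are assumptions made for this example.

```python
import re

# Matches "{{subject:value|value|...}}" as written in the example above.
QUERY_PATTERN = re.compile(r"^\{\{([^:{}|]+):([^{}]*)\}\}$")


def parse_media_query(text):
    """Split a query expression into its subject and pipe-separated values.

    Hypothetical format: the real syntax has not been decided.
    """
    match = QUERY_PATTERN.match(text.strip())
    if match is None:
        raise ValueError("not a recognised query expression: %r" % text)
    subject = match.group(1).strip()
    values = [v.strip() for v in match.group(2).split("|") if v.strip()]
    return {"subject": subject, "values": values}


if __name__ == "__main__":
    print(parse_media_query("{{dog:white|male|poodle}}"))
    # {'subject': 'dog', 'values': ['white', 'male', 'poodle']}
```

Mapping the parsed values onto actual properties (color, sex, breed) would depend on the first task, since the statements would need to refer to Wikidata items.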