The relevant W3C standards are:
* https://www.w3.org/TR/annotation-model/ (for the annotations), and
* https://www.w3.org/TR/annotation-protocol/ (for the HTTP API).
As I understand it, the way the pieces fit together is:
1. On https://commons.wikimedia.org/wiki/File:Douglas_adams_portrait_cropped.jpg we insert in the `<head>`:
```
<link rel="http://www.w3.org/ns/oa#annotationService" href="https://commons.wikimedia.org/wiki/File_annotations:Douglas_adams_portrait_cropped.jpg"/>
```
(This could also appear in an HTTP `Link` header.) Spec: https://www.w3.org/TR/annotation-protocol/#discovery-of-annotation-containers
2. This makes the `File_annotations` page represent an "Annotation Container", which can be retrieved by GET according to https://www.w3.org/TR/annotation-protocol/#container-retrieval -- and in particular, you should be able to use the `Accept` header to request the `application/ld+json` type (note, this is slightly different from the usual `application/json` type).
3. The returned AnnotationContainer should have the following form (see this on the [JSON-LD playground](https://json-ld.org/playground/#/gist/5153eb144a6b5703c58998963324e31f0c91185e03d8115489d595acb1c4a53a)):
```
HTTP/1.1 200 OK
Content-Type: application/ld+json; profile="http://www.w3.org/ns/anno.jsonld"

{
  "@context": [
    "http://schema.org/",
    "http://www.w3.org/ns/anno.jsonld",
    "http://www.w3.org/ns/ldp.jsonld"
  ],
  "id": "https://commons.wikimedia.org/wiki/File_annotations:Douglas_adams_portrait_cropped.jpg",
  "type": [
    "http://www.w3.org/ns/ldp#BasicContainer",
    "AnnotationCollection"
  ],
  "modified": "2017-05-06T12:00:00Z",
  "label": "Annotations for File:Douglas_adams_portrait_cropped.jpg",
  "first": {
    "id": "https://commons.wikimedia.org/wiki/File_annotations:Douglas_adams_portrait_cropped.jpg?page=0",
    "type": "AnnotationPage",
    "items": [
      {
        "id": "https://commons.wikimedia.org/wiki/File_annotations:Douglas_adams_portrait_cropped.jpg#a1",
        "type": "Annotation",
        "body": {
          "type": "http://www.wikidata.org/prop/direct/P1442",
          "source": "https://www.wikidata.org/entity/Q42"
        },
        "target": [
          {
            "selector": {
              "type": "CssSelector",
              "value": "#file img[data-file-width]"
            },
            "source": "https://commons.wikimedia.org/wiki/File:Douglas_adams_portrait_cropped.jpg",
            "state": {
              "type": "TimeState",
              "sourceDate": "2017-05-06T13:30:00Z"
            }
          },
          {
            "source": "https://upload.wikimedia.org/wikipedia/commons/c/c0/Douglas_adams_portrait_cropped.jpg",
            "state": {
              "type": "TimeState",
              "sourceDate": "2017-05-06T13:30:00Z"
            }
          }
        ]
      },
      {
        "id": "https://commons.wikimedia.org/wiki/File_annotations:Douglas_adams_portrait_cropped.jpg#a2",
        "type": "Annotation",
        "body": [
          {
            "type": "TextualBody",
            "format": "text/wikitext",
            "value": "Free '''text''' annotation"
          },
          {
            "type": "TextualBody",
            "format": "text/html; charset=utf-8; profile=\"https://www.mediawiki.org/wiki/Specs/HTML/1.4.0\"",
            "value": "Free <b>text</b> annotation"
          }
        ],
        "target": {
          "selector": {
            "type": "FragmentSelector",
            "conformsTo": "http://www.w3.org/TR/media-frags/",
            "value": "xywh=50,50,640,480"
          },
          "source": "https://upload.wikimedia.org/wikipedia/commons/c/c0/Douglas_adams_portrait_cropped.jpg",
          "state": {
            "type": "TimeState",
            "sourceDate": "2017-05-06T13:30:00Z"
          }
        }
      }
    ]
  },
  "total": 2
}
```
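To make steps 1 to 3 concrete, here is a rough client-side sketch in Python (standard library only). It is only an illustration of how I read the protocol, not an existing implementation: the function names (`discover_container`, `build_container_request`, `list_annotations`) are made up for this example, and it only looks at the page embedded under `first`.

```python
import urllib.request
from html.parser import HTMLParser

ANNOTATION_SERVICE_REL = "http://www.w3.org/ns/oa#annotationService"
ANNO_PROFILE = "http://www.w3.org/ns/anno.jsonld"


class AnnotationServiceFinder(HTMLParser):
    """Collects the href of every <link> whose rel is the annotationService IRI."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # rel is a space-separated token list, so split before comparing.
        rels = (attr.get("rel") or "").split()
        if tag == "link" and ANNOTATION_SERVICE_REL in rels and attr.get("href"):
            self.hrefs.append(attr["href"])


def discover_container(html):
    """Step 1: return the first advertised container URL, or None."""
    finder = AnnotationServiceFinder()
    finder.feed(html)
    return finder.hrefs[0] if finder.hrefs else None


def build_container_request(url):
    """Step 2: prepare (but do not send) a GET that asks for JSON-LD."""
    accept = 'application/ld+json; profile="%s"' % ANNO_PROFILE
    return urllib.request.Request(url, headers={"Accept": accept})


def as_list(value):
    """body and target may each be a single value or an array."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def list_annotations(container):
    """Step 3: return [(annotation id, [target sources])] for the embedded first page."""
    first = container.get("first")
    if not isinstance(first, dict):
        return []  # "first" is only an IRI here; the page would need a separate GET
    result = []
    for item in first.get("items", []):
        # A target can be a bare IRI string or an object with a "source".
        sources = [t if isinstance(t, str) else t.get("source")
                   for t in as_list(item.get("target"))]
        result.append((item["id"], sources))
    return result
```

Passing the request to `urllib.request.urlopen` and the decoded body to `json.loads` would give the `container` dict that `list_annotations` expects; for the example response above it should return `#a1` with two sources and `#a2` with one.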
Open questions:
* Writing a permalink to refer to an image is actually harder than I expected. Although we have nice permalinks for the `File:...` page, that's actually only the metadata for the image. The actual image is served from (say) https://upload.wikimedia.org/wikipedia/commons/c/c0/Douglas_adams_portrait_cropped.jpg **which is then moved to a different archive URL** like https://upload.wikimedia.org/wikipedia/commons/archive/c/c0/20100416225428%21Douglas_adams_portrait_cropped.jpg if/when the image is updated. We apparently can't know the archive URL without predicting when the file is going to get updated. As a fallback, we're using the Memento mechanism, which isn't actually implemented yet (T164654), although there is [Extension:Memento](https://www.mediawiki.org/wiki/Extension:Memento). @GWicke suggests content-hash-based URLs (T149847). Multi-content revisions might also provide a fix.
* We can support multiple targets, but if you include the `File:...jpg` page as a target, it seems you need to include a quite complicated XPath or CSS selector to extract the actual image on the page.
* How to handle multiple resolutions of the image? This is related to the previous item, as the image embedded in the HTML `File:...jpg` page may not be full-size.
* Is there a way to avoid repeating so much? In particular the target array gets very repetitive, especially if we add all the scaled versions of the image as alternate targets.
* ~~We're using the http://schema.org/image relation, but we'd rather use the Wikidata [P18](https://www.wikidata.org/wiki/Property:P18) relation. In order to do that, we need wikidata to export a proper vocabulary (https://phabricator.wikimedia.org/T44063#3241034). Seems like https://github.com/schemaorg/schemaorg/issues/1186#issuecomment-221991582 gives us some ways to do this.~~
** Tweaked to use the Wikidata P1442 relation as a `type` on the `body`, but I'm not convinced this is the correct way to indicate this semantic triple.
* We're not really using the paging interface. If the number of annotations got really large, we might need to figure out how to name each page of annotations.
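On the paging question, one possible naming scheme is simply to continue the `?page=0` convention from the example response. The sketch below is an assumption on my part rather than anything decided: `paginate` and `page_size` are hypothetical names, while `partOf`, `startIndex`, `prev` and `next` are the AnnotationPage properties from the annotation model.

```python
import math


def paginate(container_id, annotations, page_size):
    """Split annotations into AnnotationPage dicts named <container_id>?page=N."""
    if page_size < 1:
        raise ValueError("page_size must be positive")
    page_count = math.ceil(len(annotations) / page_size)

    def page_id(n):
        return "%s?page=%d" % (container_id, n)

    pages = []
    for n in range(page_count):
        start = n * page_size
        page = {
            "id": page_id(n),
            "type": "AnnotationPage",
            "partOf": container_id,
            "startIndex": start,  # index of the first item within the whole collection
            "items": annotations[start:start + page_size],
        }
        # Only link to neighbours that actually exist.
        if n > 0:
            page["prev"] = page_id(n - 1)
        if n < page_count - 1:
            page["next"] = page_id(n + 1)
        pages.append(page)
    return pages
```

With this scheme the container's `first` would point at `?page=0` and `last` at `?page=<page_count - 1>`. Whether MediaWiki query parameters are the right place for the page number is still an open design choice.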