timedtext videoinfo should not return protocol relative src urls
Closed, Resolved (Public)

Description

Follow up to T122736: Provide API to detect TimedText for a video

"timedtext": [
    {
        "src": "//commons.wikimedia.org/w/index.php?title=TimedText:Folgers.ogv.de.srt&action=raw&ctype=text%2Fx-srt",
        "kind": "subtitles",
        "type": "text/x-srt",
        "title": "TimedText:Folgers.ogv.de.srt",
        "provider": "local",
        "srclang": "de",
        "dir": "ltr",
        "label": "Deutsch (de) subtitles"
    },
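
To illustrate why a protocol-relative src is awkward for API consumers, here is a minimal sketch of the extra step a client has to take: resolving the value against a base URL it already knows. The helper name `resolve_src` and the base URL used below are assumptions for illustration, not part of TimedMediaHandler.

```python
from urllib.parse import urljoin


def resolve_src(src: str, base_url: str) -> str:
    """Resolve a possibly protocol-relative URL against base_url.

    Hypothetical helper: a "//host/path" value inherits the scheme
    of base_url, while an absolute URL is returned unchanged.
    """
    return urljoin(base_url, src)


# The src value from the API output above.
src = (
    "//commons.wikimedia.org/w/index.php"
    "?title=TimedText:Folgers.ogv.de.srt&action=raw&ctype=text%2Fx-srt"
)

# Assumed base: the API endpoint the client queried.
absolute = resolve_src(src, "https://en.wikipedia.org/w/api.php")
print(absolute)
```

A client outside a browser has no "current protocol" of its own, so it must pick a base like this itself; returning a full URL from the API avoids that guess.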

Event Timeline

Change 290299 had a related patch set uploaded (by TheDJ):
Generate src timedtext links with current protocol

https://gerrit.wikimedia.org/r/290299

TheDJ triaged this task as Low priority. May 23 2016, 7:38 PM

Why have a src URL at all? The URL follows the standard index.php parameters, and the client should have a library with a class for a wiki page title. That class should be used when fetching, so that the fetch goes through the library's stack correctly, with appropriate authentication for private wikis, logging, retry management, etc. If the client must use this src value, it will need to parse it before creating the necessary title object. (Pywikibot had to do this with Flow API results.)
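
The parsing step described above could look roughly like the following sketch, which pulls the page title back out of a src value using only the Python standard library. The function name `title_from_src` is hypothetical, and the sketch assumes the src keeps the `index.php?title=...` shape shown in the description.

```python
from urllib.parse import parse_qs, urlsplit


def title_from_src(src: str) -> str:
    """Extract the 'title' query parameter from an index.php-style URL.

    Hypothetical helper; raises KeyError if the URL has no title
    parameter (for example, if the endpoint changes shape).
    """
    query = parse_qs(urlsplit(src).query)
    return query["title"][0]


src = (
    "//commons.wikimedia.org/w/index.php"
    "?title=TimedText:Folgers.ogv.de.srt&action=raw&ctype=text%2Fx-srt"
)
title = title_from_src(src)
print(title)
```

Note that the sample output in the description also carries a separate "title" field, so a client may not need to parse src at all to obtain the title.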

Because we will have clients (instant commons) that will use this to get the list of timedtext tracks, and we don't want to tie the implementation of the url building to that remote user-agent.

Especially since I'm also considering switching the endpoint; action=raw is terrible.

Do you mean Instant Commons functionality residing in a MediaWiki service, or a player within a user agent connected to that MediaWiki service?

Proper API clients will likely fetch using the revisions module, so it feels like these src URLs are here to support end-user players, where a little API usage gets some of the info and the player then switches to index.php access. What worries me is that either index.php usage is going to be a mandatory part of timedtext, or hacks will be needed. We did the latter for Flow. It is ugly and fragile, but I'd prefer that to starting to rely on any index.php functionality, especially accessing raw URLs given by the API, which could bypass proxy configuration, etc.
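
As a rough sketch of the revisions-module route mentioned above, a client could build an `action=query` request for the TimedText page instead of following src. The endpoint constant and the helper name `revisions_query_url` are assumptions for illustration; the sketch only constructs the URL and performs no network access, and a real client library would normally do this through its own request stack.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint for the wiki hosting the TimedText page.
API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"


def revisions_query_url(title: str, endpoint: str = API_ENDPOINT) -> str:
    """Build an API URL requesting the latest revision content of a page.

    Hypothetical helper: uses action=query with prop=revisions and
    rvprop=content, returning JSON.
    """
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "titles": title,
        "format": "json",
    }
    return endpoint + "?" + urlencode(params)


url = revisions_query_url("TimedText:Folgers.ogv.de.srt")
print(url)
```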

Change 290299 merged by jenkins-bot:
Generate src timedtext links with current protocol

https://gerrit.wikimedia.org/r/290299

@jayvdb Yes, mostly InstantCommons

I have an outstanding patch that will change the source url to actually serve the subtitle files, so that they are separate from revisions. Revisions are the storage format, but the storage format does not necessarily have to be the output format.

TheDJ moved this task from Doing to Done on the TimedMediaHandler board.
TheDJ removed a project: Patch-For-Review.