File usage by Phonos should be registered and tracked
Closed, Resolved · Public · 2 Estimated Story Points

Description

I just noticed that Phonos does not seem to register file usage when it transcludes a file with the file= syntax.

Take https://en.wikipedia.beta.wmflabs.org/wiki/Phonos_without_any_IPA
Note how the following file is in use:
https://en.wikipedia.beta.wmflabs.org/wiki/File:Voiceover-mathml-example-1.wav

Note how this usage was not registered in the file usage table and the article where it was used is not listed in the file usage section:
https://en.wikipedia.beta.wmflabs.org/wiki/File:Voiceover-mathml-example-1.wav#filelinks

This should be a requirement for launch, as without it, invalidation for such file usages will not work.


Notes

The function for registering file dependencies is ParserOutput::addImage()

renderPhonos() in includes/Phonos.php already has conditional logic for the case where a file is passed, and already has a Parser object available.

We can use Parser::getOutput() to get a ParserOutput object, and then call addImage() as shown below:

$parser->getOutput()->addImage( $options['file'] );

Acceptance criteria

  • Instances of Phonos which use a file correctly register a file dependency

QA

Event Timeline

Restricted Application added a subscriber: Aklapper.

Thinking this can be resolved as part of T324102: Attribute Phonos audio to file or text-to-speech — if a wikilink to the file is present on a page (via an i icon or something?) then that should generate the backlinks automatically?

EDIT: It does not 🙃 We need to use addImage().

I think it should register as a file usage, not a mere link.

Ugh, I just wrapped up T326163: Add page properties for Phonos usage data for purposes of tracking (specifically for identifying unused files on the file system, T324233). File invalidation isn't something we considered, because all parameters to Phonos are used to build the filename. Thus, any change would have to generate a new file. In other words, I don't think we'd ever update existing files, so is there any reason for them to be tracked in this way? Extension:Score, which is similar in many aspects, does not track its files like this either (more precisely, it does call ParserOutput::addImage() but only with the deprecated override_midi option).

At any rate, following T326163, with some additional effort I think we can essentially do the same thing for invalidation using $wgPagePropLinkInvalidations. However I'm not sure it's needed.

I personally was under the impression that what we're basically creating are pseudo-files. They can't and shouldn't ever have File pages or be used in wikilinks. I'm no expert in this area, but if it truly does make sense to track Phonos files like normal files, then we've wasted a lot of time trying to avoid going that route!
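
To illustrate the point above that a filename built from every parameter never needs invalidating, here is a minimal sketch. It is not Phonos's actual naming scheme: the function name, the hash choice, and the parameter names are all assumptions made for the example.

```python
import hashlib
import json


def derived_filename(params, extension="mp3"):
    """Derive a filename from every rendering parameter (hypothetical scheme)."""
    # Sort keys so the same parameters always serialize to the same string.
    canonical = json.dumps(params, sort_keys=True, ensure_ascii=False)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{digest}.{extension}"


# Illustrative parameters only; changing any of them yields a different name.
original = derived_filename({"ipa": "həˈloʊ", "text": "hello", "lang": "en"})
changed = derived_filename({"ipa": "həˈloʊ", "text": "hello", "lang": "en-GB"})
```

Under such a scheme an existing file is never overwritten, so there is nothing to invalidate. That reasoning does not cover file="", where the file is a normal uploaded file that can change or be deleted.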

Pretty sure this task is only about file="". Which indeed should be tracked, otherwise a file may be mistakenly deleted for being unused.

Ah, clearly I read too fast haha! My apologies. Yes, indeed this is a must for file= usage :) It's also a great microtask which I see is tagged with WMF-Internships-2023, so I'll leave this alone :)

TheresNoTime updated Other Assignee, added: S_Mukuti; removed: TheresNoTime.
TheresNoTime updated the task description.
TheresNoTime set the point value for this task to 2.
TheresNoTime updated the task description.

Change 895712 had a related patch set uploaded (by Samtar; author: Samtar):

[mediawiki/extensions/Phonos@master] Phonos.php: Register file usage

https://gerrit.wikimedia.org/r/895712

Change 895712 merged by jenkins-bot:

[mediawiki/extensions/Phonos@master] Phonos.php: Register file usage

https://gerrit.wikimedia.org/r/895712

Mentioned in SAL (#wikimedia-releng) [2023-03-15T09:01:42Z] <dwalden> (deployment-prep) dwalden@deployment-mwmaint02:~$ mwscript maintenance/refreshLinks.php --wiki enwiki --category='Pages that use Phonos' for T327708

@TheresNoTime Should the file usage be registered if it is not an audio file? For example, the file usage for https://en.wikipedia.beta.wmflabs.org/wiki/File:Cover_Lutung_Kasarung_Wikibook.png lists https://en.wikipedia.beta.wmflabs.org/wiki/Phonos_non-audio_file.

Perhaps it does not matter as people are unlikely to do this.

I hadn't thought of that… I guess it makes sense to only register audio file usage.

Actually, I was thinking the opposite: that it does make sense to track the usage, even though it's wrong. Because if someone's trying to use a jpg as an audio file, it's still a usage and is useful to know when e.g. deleting that file (or whatever).

works for me 😹

dom_walden closed this task as Resolved. Edited · Mar 15 2023, 1:33 PM
dom_walden moved this task from QA 🐛 to Done 🏁 on the Community-Tech (CommTech-Sprint-42) board.

OK, that makes sense.

I wrote a script to scrape the files Phonos was using on each page on enwiki beta and check that the page was included in the file usage table (using the list=imageusage API).

This includes files included via both the file and the wikibase parameters.

It also checks that the files listed in the page's prop=images also appear on the page.

I also checked that when Phonos tags were deleted the respective file usage tables were updated so the page is no longer listed.

EDIT: I also checked that Phonos files or wikibase items transcluded in templates also appear in their respective file usage tables (e.g. https://en.wikipedia.beta.wmflabs.org/wiki/Phonos_Template_Usage).

Test environment: https://en.wikipedia.beta.wmflabs.org Phonos 0.1.0 (ba0fbfd) 06:47, 15 March 2023.
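
For anyone wanting to repeat the file usage check described above, a minimal sketch follows. This is not the script used for QA. The /w/api.php path, the helper names, and the shape of the scraped (page, file) pairs are assumptions. The sketch does not follow API continuation, so files with very many usages would need extra handling.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the standard Action API endpoint of the beta wiki named in this task.
API_URL = "https://en.wikipedia.beta.wmflabs.org/w/api.php"


def imageusage_url(file_title, api_url=API_URL):
    """Build a list=imageusage query for one file title, e.g. 'File:Example.wav'."""
    params = {
        "action": "query",
        "list": "imageusage",
        "iutitle": file_title,
        "iulimit": "max",
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def pages_using_file(response):
    """Extract page titles from a decoded list=imageusage response."""
    entries = response.get("query", {}).get("imageusage", [])
    return {entry["title"] for entry in entries}


def missing_usages(expected, response_by_file):
    """Return (page, file) pairs where the page is absent from the file's usage list.

    expected: iterable of (page_title, file_title) pairs scraped from Phonos tags.
    response_by_file: dict mapping file_title to its decoded imageusage response.
    """
    missing = []
    for page, file_title in expected:
        users = pages_using_file(response_by_file.get(file_title, {}))
        if page not in users:
            missing.append((page, file_title))
    return missing


def fetch(url):
    """Fetch and decode one API response (needs network access)."""
    with urlopen(url) as resp:
        return json.load(resp)
```

In use, one would call fetch(imageusage_url(title)) for each file a page's Phonos tags refer to, collect the responses by file title, and pass them to missing_usages(); an empty result means every usage was registered.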

Mentioned in SAL (#wikimedia-releng) [2023-03-31T14:26:25Z] <dwalden> (deployment-prep) dwalden@deployment-mwmaint02:~$ mwscript maintenance/refreshLinks.php --wiki enwiki --category='Pages that use Phonos' for T327708

Mentioned in SAL (#wikimedia-releng) [2023-03-31T14:34:50Z] <dwalden> (deployment-prep) dwalden@deployment-mwmaint02:~$ mwscript maintenance/refreshLinks.php --wiki en_rtlwiki --category='Pages that use Phonos' for T327708