Audio and video elements in Parsoid HTML may have <track> children that expose subtitles. See example HTML at https://www.mediawiki.org/wiki/Specs/HTML/1.6.0#Audio/Video.
<track kind="subtitles" type="text/x-srt" src="https://commons.wikimedia.org/w/index.php?title=TimedText:Folgers.ogv.de.srt&action=raw&ctype=text%2Fx-srt" srclang="de" label="Deutsch (de) subtitles" data-mwtitle="TimedText:Folgers.ogv.de.srt" data-dir="ltr"/>
These should be added to the media endpoint response as structured data. The response should include only the list of subtitle tracks described by the <track> tags, with their URLs and other relevant attributes, not the content of the subtitles themselves.
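As a rough illustration of the extraction described above, the following Python sketch collects the attributes of <track> elements nested in <audio> or <video> elements. It is not the endpoint's actual implementation: the names `TrackCollector` and `extract_tracks` are hypothetical, and the output shape (one dict per track, keyed by the original attribute names) is an assumption, since the response schema is not specified here.

```python
from html.parser import HTMLParser

# Attributes copied from each <track>; the list mirrors the example element.
# The output shape (one dict per track, keyed by attribute name) is hypothetical.
TRACK_ATTRS = ("kind", "type", "src", "srclang", "label", "data-mwtitle", "data-dir")


class TrackCollector(HTMLParser):
    """Collects attributes of <track> elements found inside <audio>/<video>."""

    def __init__(self):
        super().__init__()
        self.media_depth = 0  # > 0 while inside an <audio> or <video> element
        self.tracks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("audio", "video"):
            self.media_depth += 1
        elif tag == "track" and self.media_depth > 0:
            present = dict(attrs)
            # Keep only the known attributes that are actually set on this track.
            self.tracks.append(
                {name: present[name] for name in TRACK_ATTRS if present.get(name) is not None}
            )

    def handle_endtag(self, tag):
        if tag in ("audio", "video") and self.media_depth > 0:
            self.media_depth -= 1


def extract_tracks(html):
    """Return one dict per subtitle track; the subtitle files are never fetched."""
    collector = TrackCollector()
    collector.feed(html)
    collector.close()
    return collector.tracks
```

Calling `extract_tracks` on a <video> element that wraps the example <track> would yield a single entry carrying `src`, `srclang`, `label`, and the other listed attributes. A <track> outside any media element is ignored.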