Add html mode to {{#get_web_data:}}.
With | use xpath, address DOM nodes with XPath; otherwise, use CSS (jQuery-style) selectors. Extend the selector syntax with an optional .attr() tail that selects an attribute rather than the element itself: e.g., h2 a.attr('href') and h2 a.attr (href) are both effectively equivalent to //h2//a/@href.
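To make the .attr() tail concrete, here is a minimal sketch of how such a tail could be split off a selector and turned into an XPath attribute step. It is written in Python purely for illustration; it is not the extension's PHP code. The names ATTR_TAIL, split_attr_tail and simple_css_to_xpath are hypothetical, and the CSS-to-XPath step is deliberately naive: it assumes bare tag names joined by descendant combinators only, whereas a real implementation would delegate that conversion to a proper CSS selector library.

```python
import re

# Hypothetical pattern for the optional tail:
# .attr('name'), .attr("name") or .attr (name), at the very end of the selector.
ATTR_TAIL = re.compile(
    r"""^(?P<base>.*?)\.attr\s*\(\s*(?P<quote>['"]?)(?P<name>[\w:-]+)(?P=quote)\s*\)\s*$"""
)


def split_attr_tail(selector):
    """Return (base_selector, attribute_name or None)."""
    match = ATTR_TAIL.match(selector)
    if match is None:
        # No .attr() tail: the whole string is an ordinary selector.
        return selector.strip(), None
    return match.group("base").strip(), match.group("name")


def simple_css_to_xpath(selector):
    """Naive conversion, assuming only tag names and descendant combinators."""
    base, attr = split_attr_tail(selector)
    tags = base.split()
    if not tags or not all(re.fullmatch(r"[A-Za-z][\w-]*", t) for t in tags):
        raise ValueError("this sketch only handles plain tag names")
    # "h2 a" becomes "//h2//a": each tag is matched at any depth below the previous one.
    xpath = "".join("//" + tag for tag in tags)
    if attr is not None:
        # The .attr() tail becomes a trailing attribute step.
        xpath += "/@" + attr
    return xpath


if __name__ == "__main__":
    for example in ("h2 a.attr('href')", "h2 a.attr (href)", "h2 a"):
        print(example, "->", simple_css_to_xpath(example))
```

Under these assumptions, both spellings of the tail produce the same attribute step, and a selector without a tail is passed through unchanged.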
In both cases, parse the HTML with the standard DOMDocument and DOMXPath classes.
CSS mode requires a new dependency: symfony/css-selector (already required by Semantic MediaWiki).
Below are two links to a site where the proposed features have been implemented: