Including pages that a user wouldn't consider articles in the pageview/top API results drastically reduces the usefulness of any feature built on this data. Solving this problem at the lowest possible API level would let downstream API clients (including middleware services) build features on this data with confidence, without resorting to regexes and heuristics.
For example: https://wikimedia.org/api/rest_v1/metrics/pageviews/top/en.wikipedia/all-access/2016/01/18
"articles": [ { "article": "Main_Page", "views": 19257663, "rank": 1 }, { "article": "Special:Search", "views": 2144393, "rank": 2 }, { "article": "-", "views": 758591, "rank": 3 }, ...