
Adapt search script to scan all wikis for geopoint usages
Closed, Resolved · Public · 3 Estimated Story Points

Description

In T295775, we wrote an Elixir command-line script, search-mapframe-insource, which searches all wikis with an insource search query in order to count mapframe usages. We want to add two more columns to the per-wiki statistics emitted by MapframeSearchInsource (search_insource --maps):

  • How many pages include a mapframe with a "geopoint" and an "ids" external data source.
  • How many pages include a mapframe with a "geopoint" and a "query" external data source.

Implementation steps:

  • Write an insource regex for geopoint maps with ids: /[^|]geopoint[^|]*ids/ (example in dewiki).
  • Write an insource regex for geopoint maps which make a query: /[^|]geopoint[^|]*query/ (example in dewiki).
  • Integrate these regexes into the search-insource script, under a new function.
  • Document the new metrics in our internal analytics catalog.
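
As a rough illustration of the two regexes above, here is a minimal Python sketch (not the Elixir code from the merge request). It checks the patterns against made-up wikitext samples and builds a MediaWiki search API URL that asks only for the total hit count. The helper names, the sample wikitext, and the choice of Python are assumptions for illustration; Python's `re` is only an approximation of the regex dialect used by insource search.

```python
import re
from urllib.parse import urlencode

# The two patterns from the implementation steps, without the /.../ delimiters.
GEOPOINT_IDS = r"[^|]geopoint[^|]*ids"
GEOPOINT_QUERY = r"[^|]geopoint[^|]*query"


def matches(pattern: str, wikitext: str) -> bool:
    """Approximate an insource regex match with an unanchored Python search."""
    return re.search(pattern, wikitext) is not None


def insource_query(pattern: str) -> str:
    """Wrap a regex in the insource:/.../ search syntax."""
    return f"insource:/{pattern}/"


def search_url(api_base: str, pattern: str) -> str:
    """Build a search API URL that requests the total hit count (hypothetical helper)."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": insource_query(pattern),
        "srinfo": "totalhits",
        "srlimit": 1,
        "format": "json",
    }
    return f"{api_base}?{urlencode(params)}"


def total_hits(response: dict) -> int:
    """Read the hit count from a decoded search API response."""
    return response["query"]["searchinfo"]["totalhits"]


# Made-up samples: a geopoint map with ids, one with a query, and a template
# parameter named geopoint that should not be counted.
SAMPLE_IDS = '<mapframe>{"type":"ExternalData","service":"geopoint","ids":"Q64"}</mapframe>'
SAMPLE_QUERY = '<mapframe>{"type":"ExternalData","service":"geopoint","query":"SELECT ?item WHERE { ?item wdt:P31 wd:Q5 }"}</mapframe>'
SAMPLE_TEMPLATE = "{{Infobox|geopoint=1|ids=2}}"

if __name__ == "__main__":
    for name, text in [("ids", SAMPLE_IDS), ("query", SAMPLE_QUERY), ("template", SAMPLE_TEMPLATE)]:
        print(name, matches(GEOPOINT_IDS, text), matches(GEOPOINT_QUERY, text))
    print(search_url("https://de.wikipedia.org/w/api.php", GEOPOINT_IDS))
```

The leading `[^|]` keeps a template parameter such as `|geopoint=` from matching, and `[^|]*` stops the match at the next pipe, so "geopoint" and "ids" (or "query") must appear within the same pipe-free run of text.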

Note that some work has already been merged which generalizes this tool and allows arbitrary search terms. Feel free to write the queries as one-off script runs, or integrate as a specialized module.

Nice to have:

Result
https://gitlab.com/wmde/search-all-wikis/-/merge_requests/5/diffs

Event Timeline

awight set the point value for this task to 3. Sep 29 2022, 1:00 PM