It's not the first time we've looked at this, and it probably won't be the last: the goal of this task is to better understand the sources of fulltext search session abandonment, in the hope of finding opportunities for improvement where possible.
As a search stakeholder, when weighing enhancement options it would be helpful to understand which things are plausibly within the control of the search platform and which are not easily addressed by it.
Currently, fulltext search session abandonment appears to hover near 48% on desktop.
{F57534802, size=full}
Acceptance criteria:
- For each of the top 10 most frequently visited Wikipedias, classify the component(s) most likely responsible for X (50?) fulltext search session abandonments apiece (the sampling approach will probably mix fulltext head queries with some form of random sampling)
- As this is exploratory in part, identify apparent component(s) leading to fulltext search session abandonments
- Split the data based on the original source of the fulltext search, particularly whether a user started fulltext search from a namespace 0 article view versus elsewhere (this could be coarse grained like "not namespace 0" or could be finer grained)
- If possible, indicate possible alternative search or presentation strategy that could be used (this may pertain to both zero results and non-zero results cases) (BONUS if there's any way to automate this)
- Identify whether search could be satisfied with an external search engine / conversational agent with Wikipedia/Wikimedia content (be careful to not overwhelm and beware of filter bubbles)
- Identify whether search could be satisfied with an external search engine / conversational agent with not-Wikipedia/Wikimedia content (be careful to not overwhelm and beware of filter bubbles)
- Analyze and make a report
- Document the approach
- Describe potential next tickets to act on the data (or if data are fully inconclusive or nonactionable, describe why)
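To make the first criterion concrete, here is a minimal sketch of the per-wiki sampling step, mixing head queries with random sampling. This is purely illustrative: the field names (`query`, `query_count`), the 50/50 head/random split, and the in-memory list of abandoned sessions are all assumptions, not the actual data schema or final methodology.

```python
import random

# Hypothetical record shape: one dict per abandoned fulltext session,
# with the query string and how often that query was seen on the wiki.
def sample_sessions(abandoned, n=50, head_fraction=0.5, seed=0):
    """Pick n abandoned sessions for hand-coding: the top head queries
    plus a uniform random sample of the remainder."""
    rng = random.Random(seed)
    ranked = sorted(abandoned, key=lambda s: s["query_count"], reverse=True)
    n_head = int(n * head_fraction)
    head = ranked[:n_head]                      # most frequent ("head") queries
    rest = ranked[n_head:]                      # everything else
    tail = rng.sample(rest, min(n - n_head, len(rest)))  # random slice of the tail
    return head + tail
```

The head/random mix means components that only show up in head queries and components that only show up in the long tail both have a chance of appearing in the 50-session sample; the exact split ratio is a tunable choice, not a recommendation.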
Some notes:
- Commons, Wikidata, and other sister projects excluded due to different interaction patterns
- The target for the analysis could be this task or a wiki page or both; it seems likely that an access-controlled Jupyter notebook or Sheet will be necessary to hand-code the data
Thinking out loud, possible components of this seemingly moderately high abandonment rate include:
- Automata
- Inadequate content for things that probably could realistically exist on the given Wikimedia project if notability criteria are someday met
- Inadequate content for things that probably won't realistically ever exist on a given Wikimedia project given its content policies
- Inadequate search terms
- Overabundant search terms (e.g., natural language query too long)
- Typing accidents
- Copy-paste accidents
- Bad spelling guesses
- "Wrong keyboard" issue
- Referred search sessions that aren't same-site organic fulltext search (the traffic may be organic, or it may be organized activity)
- Non-automata UA spurious calls
- Power-user tools spawning many sessions that are less likely to result in clicks
- Sister search results sidebar clicks (?)
- Users who were actually satisfied by looking at the SERP. {T375387} intends to help identify one kind of "satisfied but abandoned" searcher behavior. Others can be harder to track, such as when search snippets satisfy the search intent or when the UA configuration defeats well-intentioned measurement instruments
- Data collection that inflates the denominator of fulltext search (perhaps when a page is served but the user didn't/couldn't see the SERP; maybe there are also unanticipated redirects or some such thing)
There are probably more potential components at play, but those are some that came to mind.
There can be overlap between these components. And in some cases there can be "fixes" above and beyond the current approach to mitigate them, ranging from updating the measurement/visualization approach to applying different search or presentation strategies. Some of the potential components are likely observable in fulltext head queries, and some likely are not.
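Since the components above can overlap, the hand-coding step probably wants a multi-label schema rather than a single category per session. A minimal sketch, assuming short machine-readable labels that mirror the list above (the label names, the row shape, and the `label_session` helper are all hypothetical, intended only to show the shape of the coding sheet):

```python
# Hypothetical multi-label coding schema; one short label per candidate
# component from the list above. A session may carry several labels.
COMPONENTS = {
    "automata",
    "content_gap_notability_pending",
    "content_gap_out_of_scope",
    "inadequate_terms",
    "overabundant_terms",
    "typing_accident",
    "copy_paste_accident",
    "bad_spelling_guess",
    "wrong_keyboard",
    "referred_session",
    "spurious_ua_call",
    "power_tool_session",
    "sister_search_click",
    "satisfied_on_serp",
    "inflated_denominator",
}

def label_session(session_id, labels):
    """Return one coding row after validating labels against the schema;
    duplicates are collapsed and labels are sorted for stable output."""
    unknown = set(labels) - COMPONENTS
    if unknown:
        raise ValueError(f"unknown component label(s): {sorted(unknown)}")
    return {"session_id": session_id, "labels": sorted(set(labels))}
```

Validating against a fixed vocabulary keeps the hand-coded sheet consistent across coders and makes the later per-component counts trivial to aggregate; new labels can still be added to `COMPONENTS` as the exploratory work surfaces components not anticipated here.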