Multi-word titles fail
Open, Needs Triage | Public | 1 Estimated Story Points | BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

  • Search for an article with multiple words in the title (e.g. enwp.org/Canadian Indian residential school system)

What happens?:
*Tool searches for Canadian_Indian residential school system (which is not a correct title and so shows 0 results)

What should have happened instead?:
*Likely it needs to replace all spaces with underscores rather than just the first one

Software version (skip for WMF-hosted wikis like Wikipedia):

Other information (browser name/version, screenshots, etc.):
*Firefox

Details

Related Changes in GitLab:
Title | Reference | Author | Source Branch | Dest Branch
StartView: Replace all spaces with _, not just first | toolforge-repos/wlh!3 | egardner | T325983 | main

Event Timeline

I haven't looked yet to see whether this bug is in the frontend or backend, but I would suggest using the mediawiki-title (JS/npm) or mwtitle (Rust/cargo) packages for title validation and normalization.

I didn't realize mediawiki-title was available on npm; I will work on a fix for this shortly.

I've posted a basic patch to fix the behavior (my regex was missing the g flag).
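
For anyone following along, here is a minimal sketch of the difference the g flag makes. The function names are hypothetical and chosen for illustration only; the actual code in the wlh repository may be structured differently.

```javascript
// Hypothetical helper names, for illustration only.

// Buggy variant: without the g flag, replace() stops after the first match,
// so only the first space becomes an underscore.
function toDbKeyFirstOnly(title) {
  return title.replace(/ /, "_");
}

// Fixed variant: the g (global) flag replaces every space in the title.
function toDbKey(title) {
  return title.replace(/ /g, "_");
}

const title = "Canadian Indian residential school system";
console.log(toDbKeyFirstOnly(title)); // Canadian_Indian residential school system
console.log(toDbKey(title)); // Canadian_Indian_residential_school_system
```

This only swaps spaces for underscores. It does not do the fuller normalization (capitalization, namespace handling, whitespace trimming) that a dedicated title library would provide.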

I spent some time looking at the mediawiki-title package, but I struggled to find a clear way to convert from a title containing spaces to a title containing underscores. We already get the namespace from the user's selection by relying on the prefixsearch API.

Our bespoke Rust back-end for the WLH app (https://gitlab.wikimedia.org/toolforge-repos/wlh-api) expects the titles to contain underscores and doesn't perform any normalization (though since there is a cargo package I guess we could consider adding this).

I'm open to improvements – in the meantime, once this patch is merged the problem should be fixed.

egardner set the point value for this task to 1. May 10 2023, 12:36 AM